Meaningful application stacks involve many containers, and arm64 images are the exception rather than the rule for many high-quality components. This means the first problem to solve in a hybrid cluster is giving the scheduler hints so it selects the correct node (x86_64 or arm64) when a pod deploys.
This problem was described well in an excellent blog post on building MicroK8s clusters.
The author mentions many of my motivations for building a home k8s lab and introduces the first major issue to be resolved. The use of nodeSelector is described well in that blog, so I won't repeat it here. The important take-away is this code fragment:
```yaml
nodeSelector:
  kubernetes.io/arch: amd64
```
The optimal approach is to modify the deployment files before they are applied. Adding this simple fragment causes the pod to be scheduled on an x86_64 node.
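In context, the fragment belongs under the pod template's spec in the Deployment manifest. A minimal sketch, with placeholder names and image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64   # pin this pod to x86_64 nodes
      containers:
      - name: app
        image: example/app:1.0      # placeholder image
```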
Unfortunately, it can be time-consuming to convert an existing deployment package to include this fragment, especially for pod deployments that end up pulling containers not otherwise labeled amd64. Unless you study every container a pod pulls and carefully select arm64 replacements, the safe assumption is that there will be ephemeral containers that only run on x86_64.
The approach I have been taking is to edit the deployments after they have been applied to provide a hint to the scheduler, and then kill the pods that had been scheduled on arm64 nodes. When the pods get rescheduled, the hint moves them over to the x86_64 nodes for execution.
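As a sketch of that workflow, with placeholder namespace, deployment, and pod names (not the actual object names from any particular package):

```shell
# Add the arch hint to a deployment that has already been applied
kubectl -n example-ns patch deployment example-app --type merge -p \
  '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/arch":"amd64"}}}}}'

# Kill the pods that landed on arm64 nodes; their replacements pick up the hint
kubectl -n example-ns delete pod example-app-5d9c7b6f4-x2xyz
```

In my experience, changing the pod template also triggers a rollout on its own; deleting stragglers just forces the reschedule immediately.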
The first complex effort I've attempted is deploying PostgreSQL in an HA configuration. I may eventually find a way to move the workload over to arm64, but not today. I recently deployed the PostgreSQL Operator for a client and was stunned at how easy it made deploying HA PostgreSQL.
To obtain the x86_64 nodes, I pulled a Dell OptiPlex 960 from the garage (2 cores and 4 GB of memory) and a 17" MacBook Pro running Parallels (3 cores / 6 GB of memory), and installed Ubuntu 20.04.1 Server on both. This provides adequate resources to run a performant HA deployment with local hostpath provisioning. Persistent storage becomes another wrinkle, as we'll soon find out.
I deployed version 4.4.1 of the Operator and then patched the deployments to schedule on the two new nodes: Wendy (the OptiPlex 960) and Clyde (the MacBook Pro).
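To pin a deployment to a named node rather than an architecture, the well-known kubernetes.io/hostname label works the same way as the arch label. The namespace and deployment names here are placeholders, not the Operator's actual object names:

```shell
# Pin a deployment to Wendy by hostname (placeholder names)
kubectl -n example-ns patch deployment example-app --type merge -p \
  '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"wendy"}}}}}'
```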
Hybrid cluster storage
The Operator uses hostpath-provisioner to obtain Persistent Volumes (PVs) to satisfy its required Persistent Volume Claims (PVCs). By default this container is pulled:
And of course, that is an x86_64-only container, and it isn't written to accommodate hybrid cluster resources. hostpath-provisioner needs two variant deployments to work properly in a hybrid cluster.
Reviewing the cdkbot repository, we find there are CHOICES.
But it pays to check the tags CAREFULLY. Note this repository has a latest tag that is two years old, while the 1.0.0 tag is only a year old. Defaulting the tag to latest is a good way to end up with a drinking problem.
And the amd64 variant of that container has the same issue.
This provides a path for creating architecture-aware hostpath deployments, so hostpath-provisioner doesn't end up putting a PVC on the wrong storage.
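A sketch of what the two variants could look like, using the pinned 1.0.0 tags from the cdkbot repository; the namespace, labels, and storage path are assumptions, not the add-on's actual manifest:

```yaml
# amd64 variant: schedules only on x86_64 nodes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostpath-provisioner-amd64
  namespace: kube-system            # assumed namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hostpath-provisioner-amd64
  template:
    metadata:
      labels:
        app: hostpath-provisioner-amd64
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      containers:
      - name: hostpath-provisioner
        image: cdkbot/hostpath-provisioner-amd64:1.0.0  # pinned tag, not latest
        volumeMounts:
        - name: pv-volume
          mountPath: /data/hostpath   # assumed storage path
      volumes:
      - name: pv-volume
        hostPath:
          path: /data/hostpath
```

The arm64 variant is identical apart from the name, the image (cdkbot/hostpath-provisioner-arm64:1.0.0), and `kubernetes.io/arch: arm64` in the nodeSelector.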
There is more nuance to setting up the storage behavior needed. In my case, for the PostgreSQL Operator, I need to modify the deployment so it does not use hostpath-provisioner. Instead, I have large partitions already created on Wendy and Clyde that I can expose as a set of generic PVs and then patch (kubectl edit pv) to be claimed by a specific pod, so I get the placement I desire.
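The patch-after-the-fact step can also be expressed up front: a PV can be pre-bound to a specific claim with claimRef, and a local volume's required nodeAffinity keeps it on the machine that owns the partition. All names, sizes, and paths below are assumptions for illustration:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: wendy-pgdata            # assumed PV name
spec:
  capacity:
    storage: 50Gi               # assumed size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  local:
    path: /data/pgdata          # assumed mount point of the local partition
  claimRef:                     # pre-bind this PV to a specific PVC
    namespace: pgo              # assumed Operator namespace
    name: hippo-pgdata          # assumed PVC name
  nodeAffinity:                 # required for local volumes; keeps the PV on Wendy
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - wendy
```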
I'm still working on getting this deployment finished. I'll cover more nuances of k8s storage in my next log.