Prerequisites
- Basic Knowledge of Kubernetes: Familiarity with pods, deployments, and CRDs
- Golang Development Environment: Go version 1.18 or later
- kubectl and Minikube: For deploying and testing your operator
- Operator SDK: A CLI tool for scaffolding and building Kubernetes Operators
As a Kubernetes user, you’ve likely experienced its power in managing stateless applications. With Kubernetes, you can easily deploy, scale, and manage these applications without much manual intervention.
However, when it comes to stateful applications, things get a bit more complex. These applications require persistent storage, complex configuration, and careful state management. Kubernetes, while powerful, doesn’t natively address these specific needs.
Why are Kubernetes operators needed?
Kubernetes is a fantastic tool for managing your stateless applications. You define the desired state for your applications, and Kubernetes works to maintain it. For instance, you can create an Nginx deployment with 3 replicas, and Kubernetes ensures they're all up and running, ready to handle web traffic.
Stateful applications are a different story: Kubernetes doesn't natively handle their storage, configuration, and state management needs. This is where Kubernetes Operators come to the rescue. They automate the management of stateful applications, handling tasks like:
- Configuration: Ensuring your application is configured correctly.
- Scaling: Automatically scaling your application up or down based on demand.
- Backups: Taking regular backups of your data to protect against failures.
- Upgrades: Seamlessly upgrading your application to newer versions.
For example, a PostgreSQL operator can automate the entire lifecycle of a PostgreSQL database, from initialization to scaling and backups.
In this blog, we'll dive into the world of Kubernetes Operators, exploring how to build your own in Golang. We'll even build a simple "PodWatcher" operator to demonstrate the core concepts.
Key concepts of writing an operator
- Custom resource definitions - A Custom Resource Definition (CRD) extends the Kubernetes API with a custom resource type. For example, we can define a resource like PodWatcher with custom fields such as labelSelector or emailAddress. This lets users create and manage domain-specific objects in Kubernetes.
- Custom resources - Instances of the custom resource type defined by a CRD.
- Controllers - Controllers reconcile the actual state of a resource with its desired state, as defined in the CR. They continuously observe resources, check whether the actual state matches the desired state, and take corrective action when it does not.
- Reconciliation - The core logic of an Operator that ensures the actual state matches the desired state.
- Informers - Cache resource states and notify controllers of changes.
- Watchers - Provide real-time event streams from the Kubernetes API.
- Service accounts and RBAC - Control the permissions an Operator has in the cluster:
  - Service account - An identity for the Operator.
  - Role/ClusterRole - Defines what actions the Operator can take (e.g., list pods, update CRs).
  - RoleBinding/ClusterRoleBinding - Grants the Operator access to resources.
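To make the controller and reconciliation concepts above concrete, here is a minimal sketch that uses only the Go standard library. It does not talk to Kubernetes at all; the `state` type and the `reconcile` function are hypothetical names invented for this illustration, and a real controller would read and change cluster objects instead of a counter.

```go
package main

import "fmt"

// state is a hypothetical stand-in for a resource: just a replica count.
type state struct {
	Replicas int
}

// reconcile compares the actual state with the desired state and returns
// the corrective actions needed to converge, plus the resulting state.
func reconcile(desired, actual state) ([]string, state) {
	var actions []string
	for actual.Replicas < desired.Replicas {
		actual.Replicas++
		actions = append(actions, fmt.Sprintf("start replica %d", actual.Replicas))
	}
	for actual.Replicas > desired.Replicas {
		actions = append(actions, fmt.Sprintf("stop replica %d", actual.Replicas))
		actual.Replicas--
	}
	return actions, actual
}

func main() {
	desired := state{Replicas: 3}
	actual := state{Replicas: 1}

	actions, actual := reconcile(desired, actual)
	for _, a := range actions {
		fmt.Println(a)
	}
	fmt.Println("converged:", actual == desired)
}
```

Calling `reconcile` again once the states match returns no actions, which is the property that makes it safe to run the loop repeatedly.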
Steps to write a Kubernetes Operator in Golang
Now let's implement an operator that reads a PodWatcher resource and alerts users when a pod matching its label selector is restarted.
Step 1: Install the Operator SDK
brew install operator-sdk
Step 2: Create an Operator project
Start by creating an Operator project using the Operator SDK:
operator-sdk init --domain example.com --repo github.com/JonesJefferson/operator-example
This sets up the foundational structure and configuration files for your Operator.
- --domain example.com: Sets the domain suffix for your API groups. Combined with the group name chosen in Step 3, it produces API versions such as core.example.com/v1.
- --repo github.com/JonesJefferson/operator-example: Specifies the Go module path for the project, which is written to go.mod and used in import paths.
Generated File Structure
├── config/
│   ├── crd/
│   ├── default/
│   ├── manager/
│   ├── rbac/
│   ├── samples/
│   └── scorecard/
├── controllers/
│   └── <empty initially>
├── Dockerfile
├── go.mod
├── go.sum
├── main.go
├── Makefile
├── PROJECT
└── README.md
config/ Directory:
- Contains Kubernetes manifests for deploying the Operator, including:
  - CRD definitions (config/crd).
  - RBAC configurations (config/rbac) for the Operator.
  - Deployment manifests for the Operator controller (config/manager).
  - Templates and settings for default resources and namespaces.
controllers/ Directory:
- Placeholder for custom controller logic. Initially empty but will contain controllers for custom resources.
main.go:
- Entry point for the Operator.
- Initializes the manager, which sets up controllers and handles reconciliation.
Makefile:
- Includes useful targets to build, test, and deploy the Operator (e.g., make run, make docker-build).
Step 3: Create the API and controller
operator-sdk create api --group=core --version=v1 --kind=PodWatcher --controller --resource
This command scaffolds two things: the API schema (Go types) for a new custom resource, from which the Custom Resource Definition is generated, and a controller for that resource.
- --group=core: Specifies the API group (core.example.com).
- --version=v1: Defines the API version (core.example.com/v1).
- --kind=PodWatcher: Specifies the custom resource kind (PodWatcher).
- --controller: Indicates that a controller should be generated for this resource.
- --resource: Indicates that the CRD should be scaffolded.
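The way these flags combine into the names you later see in manifests can be sketched in a few lines of Go. The helper names `apiVersion` and `crdName` are made up for this example, and the plural form "podwatchers" is taken from the deploy output shown later in this post.

```go
package main

import "fmt"

// apiVersion joins the --group, --domain and --version values the way they
// appear in a manifest's apiVersion field: <group>.<domain>/<version>.
func apiVersion(group, domain, version string) string {
	return fmt.Sprintf("%s.%s/%s", group, domain, version)
}

// crdName builds the name of the CRD object: <plural>.<group>.<domain>.
func crdName(plural, group, domain string) string {
	return fmt.Sprintf("%s.%s.%s", plural, group, domain)
}

func main() {
	fmt.Println(apiVersion("core", "example.com", "v1"))       // core.example.com/v1
	fmt.Println(crdName("podwatchers", "core", "example.com")) // podwatchers.core.example.com
}
```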
Generated Files
podwatcher-operator/
├── api/
│   └── v1/
│       ├── podwatcher_types.go
│       ├── groupversion_info.go
│       └── zz_generated.deepcopy.go
├── controllers/
│   └── podwatcher_controller.go
api/v1/podwatcher_types.go:
- Contains the Go struct definition for the PodWatcher resource.
- Defines the schema of the resource (spec, status, etc.).
api/v1/groupversion_info.go:
- Registers the API group and version with the Kubernetes scheme.
- Ensures the custom resource can be serialized/deserialized.
controllers/podwatcher_controller.go:
- Contains the reconciliation logic for the PodWatcher custom resource.
- Scaffolds the Reconcile function, which is the heart of the controller.
func (r *PodWatcherReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Reconciliation logic here
	return ctrl.Result{}, nil
}
Step 4: Modify the PodWatcher resource
Modify the PodWatcher API (api/v1/podwatcher_types.go) to include the fields for filtering pods and notification details.
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

// PodWatcherSpec defines the desired state of PodWatcher
type PodWatcherSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	LabelSelector map[string]string `json:"labelSelector,omitempty"`
}

// PodWatcherStatus defines the observed state of PodWatcher
type PodWatcherStatus struct {
	LastPodRestartTime string `json:"lastPodRestartTime,omitempty"`
	// Important: Run "make" to regenerate code after modifying this file
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// PodWatcher is the Schema for the podwatchers API
type PodWatcher struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PodWatcherSpec   `json:"spec,omitempty"`
	Status PodWatcherStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true

// PodWatcherList contains a list of PodWatcher
type PodWatcherList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []PodWatcher `json:"items"`
}

func init() {
	SchemeBuilder.Register(&PodWatcher{}, &PodWatcherList{})
}
After making this change, run make to regenerate the code.
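The json tags are what connect the Go field names to the keys you write in a manifest. As a small illustration, the sketch below copies the spec and status structs into a standalone program and serializes them with the standard library; the `encode` helper is a name invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the spec and status structs from podwatcher_types.go,
// so the example needs no Kubernetes packages.
type PodWatcherSpec struct {
	LabelSelector map[string]string `json:"labelSelector,omitempty"`
}

type PodWatcherStatus struct {
	LastPodRestartTime string `json:"lastPodRestartTime,omitempty"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	spec := PodWatcherSpec{LabelSelector: map[string]string{"app": "nginx"}}
	fmt.Println(encode(spec))               // {"labelSelector":{"app":"nginx"}}
	fmt.Println(encode(PodWatcherStatus{})) // {}
}
```

The second line of output is empty braces because omitempty drops fields that hold their zero value, which is why an unset status does not show up in the serialized object.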
Step 5: Implement the controller logic
Edit the PodWatcher controller in controllers/podwatcher_controller.go:
func (r *PodWatcherReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// Fetch the PodWatcher resource
	var podWatcher appv1.PodWatcher
	if err := r.Get(ctx, req.NamespacedName, &podWatcher); err != nil {
		logger.Error(err, "unable to fetch PodWatcher")
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
The Reconcile function is the core of the controller. It is invoked whenever there is a change (add, update, or delete) to a PodWatcher custom resource or any related resources being watched. In the above snippet we first get the PodWatcher resource. The LabelSelector in the PodWatcher specification determines which pods the operator should monitor.
	// Read the label selector from the PodWatcher spec
	labelSelector := podWatcher.Spec.LabelSelector

	// List all pods in the PodWatcher's namespace
	podList := &corev1.PodList{}
	listOpts := &client.ListOptions{
		Namespace: req.Namespace,
	}
	if err := r.List(ctx, podList, listOpts); err != nil {
		logger.Error(err, "unable to list pods")
		return ctrl.Result{}, err
	}

	for _, pod := range podList.Items {
		// Check if the pod matches the label selector
		matches := true
		for key, value := range labelSelector {
			if pod.Labels[key] != value {
				matches = false
				break
			}
		}
		if matches {
			for _, status := range pod.Status.ContainerStatuses {
				if status.RestartCount > 1 {
					message := fmt.Sprintf("Pod '%s' in namespace '%s' has restarted %d times!", pod.Name, pod.Namespace, status.RestartCount)
					fmt.Println(message)

					// Update PodWatcher status
					podWatcher.Status.LastPodRestartTime = time.Now().String()
					if err := r.Status().Update(ctx, &podWatcher); err != nil {
						logger.Error(err, "failed to update PodWatcher status")
					}
				}
			}
		}
	}

	// Run the next reconciliation one second from now
	return ctrl.Result{RequeueAfter: time.Second}, nil
}
In the above snippet, we list the pods in the PodWatcher's namespace and keep those whose labels match the label selector from the PodWatcher resource. For each container in a matching pod whose restart count is greater than 1, we print the count and record the current time in the PodWatcher status. Finally, RequeueAfter: time.Second schedules the next reconciliation one second later.
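The matching and counting logic can be pulled out into small functions that are easy to test without a cluster. The following is a sketch under that assumption: the `pod` struct is a hypothetical, trimmed-down stand-in for corev1.Pod, and `matchesSelector` and `totalRestarts` are names invented here. Unlike the loop above, `matchesSelector` also requires the label key to be present on the pod.

```go
package main

import "fmt"

// pod is a hypothetical, trimmed-down stand-in for corev1.Pod.
type pod struct {
	Name     string
	Labels   map[string]string
	Restarts []int32 // one restart count per container
}

// matchesSelector reports whether labels contains every key/value pair in
// selector. An empty selector matches every pod.
func matchesSelector(labels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// totalRestarts sums the restart counts of all containers in the pod.
func totalRestarts(p pod) int32 {
	var total int32
	for _, n := range p.Restarts {
		total += n
	}
	return total
}

func main() {
	selector := map[string]string{"app": "nginx"}
	pods := []pod{
		{Name: "web-1", Labels: map[string]string{"app": "nginx"}, Restarts: []int32{2, 0}},
		{Name: "db-1", Labels: map[string]string{"app": "postgres"}, Restarts: []int32{5}},
	}
	for _, p := range pods {
		if matchesSelector(p.Labels, selector) && totalRestarts(p) > 0 {
			fmt.Printf("pod %s has restarted %d times\n", p.Name, totalRestarts(p))
		}
	}
}
```

In the controller itself, the filtering can also be pushed to the API call: controller-runtime's List accepts options such as client.InNamespace(req.Namespace) and client.MatchingLabels(podWatcher.Spec.LabelSelector), which avoids fetching pods that will be discarded anyway.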
Step 6: Build and push the image
make docker-build docker-push IMG=<your-dockerhub-username>/podwatcher-operator:v1.0.0
Step 7: Deploy the Operator
make deploy IMG=<your-dockerhub-username>/podwatcher-operator:v1.0.0
This deploys the Operator together with the resources it needs to run:
build config/default | kubectl apply -f -
namespace/kubernetes-operators-system unchanged
customresourcedefinition.apiextensions.k8s.io/podwatchers.core.example.com unchanged
serviceaccount/kubernetes-operators-controller-manager unchanged
role.rbac.authorization.k8s.io/kubernetes-operators-leader-election-role unchanged
clusterrole.rbac.authorization.k8s.io/kubernetes-operators-manager-role unchanged
clusterrole.rbac.authorization.k8s.io/kubernetes-operators-metrics-auth-role unchanged
clusterrole.rbac.authorization.k8s.io/kubernetes-operators-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/kubernetes-operators-podwatcher-editor-role unchanged
clusterrole.rbac.authorization.k8s.io/kubernetes-operators-podwatcher-viewer-role unchanged
rolebinding.rbac.authorization.k8s.io/kubernetes-operators-leader-election-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-operators-manager-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-operators-metrics-auth-rolebinding unchanged
service/kubernetes-operators-controller-manager-metrics-service unchanged
deployment.apps/kubernetes-operators-controller-manager configured
Step 8: Apply the PodWatcher CR
apiVersion: core.example.com/v1
kind: PodWatcher
metadata:
  name: podwatcher-example
spec:
  labelSelector:
    app: nginx
Save the manifest above as podwatcher.yaml, then apply it:
kubectl apply -f podwatcher.yaml
Step 9: Create ClusterRole and ClusterRoleBinding to allow operator to list pods
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-watcher-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-watcher-rolebinding
subjects:
  - kind: ServiceAccount
    name: kubernetes-operators-controller-manager
    namespace: kubernetes-operators-system
roleRef:
  kind: ClusterRole
  name: pod-watcher-role
  apiGroup: rbac.authorization.k8s.io
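To see what this ClusterRole does and does not permit, it can help to model the rule evaluation in plain Go. This is a deliberately simplified sketch: the `rule` type and the `allowed` function are hypothetical, and real RBAC evaluation also handles resource names, subresources, and non-resource URLs.

```go
package main

import "fmt"

// rule is a hypothetical, simplified copy of one RBAC policy rule.
type rule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether list holds s or the wildcard "*".
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s || v == "*" {
			return true
		}
	}
	return false
}

// allowed reports whether any rule grants verb on resource in apiGroup.
func allowed(rules []rule, apiGroup, resource, verb string) bool {
	for _, r := range rules {
		if contains(r.APIGroups, apiGroup) && contains(r.Resources, resource) && contains(r.Verbs, verb) {
			return true
		}
	}
	return false
}

func main() {
	// The single rule from the pod-watcher-role ClusterRole above.
	rules := []rule{{
		APIGroups: []string{""},
		Resources: []string{"pods"},
		Verbs:     []string{"list", "watch"},
	}}
	fmt.Println(allowed(rules, "", "pods", "list"))   // true
	fmt.Println(allowed(rules, "", "pods", "delete")) // false
}
```

As an alternative to hand-written manifests, Kubebuilder-style projects can usually declare the same permission with a marker comment above Reconcile, such as `// +kubebuilder:rbac:groups="",resources=pods,verbs=get;list;watch`, and regenerate the role in config/rbac with make manifests. Whether that fits depends on how your project was scaffolded.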
And we're done!
To test it, create a faulty pod that restarts endlessly and watch the logs of the operator pod:
jones.jefferson@OPLPT043 ~ % kubectl logs -f kubernetes-operators-controller-manager-55967d886b-p8gw8 -n kubernetes-operators-system
2024-12-11T14:17:54Z INFO setup starting manager
2024-12-11T14:17:54Z INFO controller-runtime.metrics Starting metrics server
2024-12-11T14:17:54Z INFO setup disabling http/2
2024-12-11T14:17:54Z INFO starting server {"name": "health probe", "addr": "[::]:8081"}
I1211 14:17:54.942626 1 leaderelection.go:250] attempting to acquire leader lease kubernetes-operators-system/525f3881.example.com...
2024-12-11T14:17:55Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8443", "secure": true}
I1211 14:18:25.606874 1 leaderelection.go:260] successfully acquired lease kubernetes-operators-system/525f3881.example.com
2024-12-11T14:18:25Z DEBUG events kubernetes-operators-controller-manager-55967d886b-p8gw8_e8c06ec2-85d1-47ba-845d-c9aa3f2a3eda became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"kubernetes-operators-system","name":"525f3881.example.com","uid":"1d3c41a8-2f84-42ce-9aae-ff852afed66e","apiVersion":"coordination.k8s.io/v1","resourceVersion":"17951"}, "reason": "LeaderElection"}
2024-12-11T14:18:25Z INFO Starting EventSource {"controller": "podwatcher", "controllerGroup": "core.example.com", "controllerKind": "PodWatcher", "source": "kind source: *v1.PodWatcher"}
2024-12-11T14:18:25Z INFO Starting Controller {"controller": "podwatcher", "controllerGroup": "core.example.com", "controllerKind": "PodWatcher"}
2024-12-11T14:18:25Z INFO Starting workers {"controller": "podwatcher", "controllerGroup": "core.example.com", "controllerKind": "PodWatcher", "worker count": 1}
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
Pod 'ubuntu-pod' in namespace 'default' has restarted 13 times!
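The same message repeats because the operator reconciles every second and reports the current restart count each time. One possible refinement, sketched below with the standard library only, is to remember the last count reported per container and print only when it grows. The `restartTracker` type and the key format are hypothetical, and because the state lives in memory it is lost when the operator pod restarts; persisting it in the PodWatcher status would be one way around that.

```go
package main

import "fmt"

// restartTracker remembers the last restart count reported per container
// so that each new restart is reported once. The type is hypothetical.
type restartTracker struct {
	seen map[string]int32
}

func newRestartTracker() *restartTracker {
	return &restartTracker{seen: make(map[string]int32)}
}

// observe returns true only when count is higher than the last value
// reported for key, and records the new value.
func (t *restartTracker) observe(key string, count int32) bool {
	if count <= t.seen[key] {
		return false
	}
	t.seen[key] = count
	return true
}

func main() {
	tracker := newRestartTracker()
	// Simulate four reconcile passes over the same container.
	for _, count := range []int32{13, 13, 13, 14} {
		if tracker.observe("default/ubuntu-pod/main", count) {
			fmt.Printf("Pod 'ubuntu-pod' in namespace 'default' has restarted %d times!\n", count)
		}
	}
}
```

With these simulated counts the message is printed twice, once for 13 and once for 14, instead of on every pass.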
You've successfully built a Kubernetes Operator to manage PodWatcher resources using Golang. This example can be extended to include more complex logic, such as creating other Kubernetes resources or integrating external APIs. Operators empower you to encapsulate application logic in a declarative and Kubernetes-native way, enhancing automation and maintainability.
To view the entire code referenced in this blog, please visit: https://github.com/JonesJefferson/operator-example