In the previous parts of this blog series we’ve covered various ways to manipulate and test Kubernetes YAML manifests using Ruby. But what if you’re using Helm to manage your cluster?
Helm helps you manage Kubernetes applications, but it can be frustrating. Not only do you have to generate a new nested YAML file in order to create another set of nested YAML files, but often it won’t quite generate what you want. You’re then stuck with either writing the whole thing out by hand, or trying to get the chart modified to handle your case - usually via yet another addition to the already long list of parameters.
A case in point is the stable/cluster-autoscaler chart, which we want to reuse to run the Brightbox Autoscaler on Kubernetes clusters within Brightbox. Using the values.yaml file:
autoDiscovery:
  clusterName: kubernetes.cluster.local
cloudProvider: brightbox
image:
  repository: brightbox/cluster-autoscaler-brightbox
  tag: dev
  pullPolicy: Always
tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
extraArgs:
  v: 4
  stderrthreshold: info
  logtostderr: true
  cluster-name: kubernetes.cluster.local
  skip-nodes-with-local-storage: true
podDisruptionBudget: |
  maxUnavailable: 1
podAnnotations:
  prometheus.io/scrape: 'true'
  prometheus.io/port: '8085'
rbac:
  create: true
resources:
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi
envFromSecret: brightbox-credentials
priorityClassName: "system-cluster-critical"
dnsPolicy: "Default"
Helm will generate manifests that very nearly run the Brightbox autoscaler. Nearly, but not quite.
I0204 14:08:07.543489 1 leaderelection.go:247] failed to acquire
lease kube-system/cluster-autoscaler
E0204 14:08:10.080196 1 leaderelection.go:331] error
retrieving resource lock kube-system/cluster-autoscaler:
leases.coordination.k8s.io "cluster-autoscaler" is forbidden: User
"system:serviceaccount:kube-system:release-brightbox-cluster-autoscaler"
cannot get resource "leases" in API group "coordination.k8s.io" in the
namespace "kube-system"
So close!
The RBAC generator within this version of the chart has a bug: it doesn’t grant the correct permissions over the lease objects. We need to add
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["create"]
- apiGroups: ["coordination.k8s.io"]
  resourceNames: ["cluster-autoscaler"]
  resources: ["leases"]
  verbs: ["get", "update"]
to the bottom of the list of rules in the ClusterRole resource. We can’t easily do this with a kubectl patch because, as you’ll notice from the API, the rules list has no merge tag and can only be replaced in its entirety.
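To illustrate, a hypothetical strategic merge patch like this (the default kubectl patch type) would swap in just our two new rules and discard every rule the chart generated:
$ kubectl patch clusterrole release-brightbox-cluster-autoscaler --patch '
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["create"]
- apiGroups: ["coordination.k8s.io"]
  resourceNames: ["cluster-autoscaler"]
  resources: ["leases"]
  verbs: ["get", "update"]
'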
Additionally, the chart adds a Service resource that we have no use for and want to remove. But there isn’t a parameter for that!
Fortunately we can fix both these issues with a little bit of Ruby:
require 'yaml'

result = []
YAML.load_stream(ARGF) do |resource|
  case resource['kind']
  when "Service"
    # Drop the unwanted Service resource entirely
    next
  when "ClusterRole"
    # Append the missing lease permissions to the rules list
    resource['rules'] <<
      {
        "apiGroups" => ["coordination.k8s.io"],
        "resources" => ["leases"],
        "verbs" => ["create"]
      } <<
      {
        "apiGroups" => ["coordination.k8s.io"],
        "resourceNames" => ["cluster-autoscaler"],
        "resources" => ["leases"],
        "verbs" => ["get", "update"]
      }
  end
  result << resource
end
print YAML.dump_stream(*result)
Now we can install the manifests by running the output of Helm through the Ruby filter.
$ helm template release stable/cluster-autoscaler --namespace kube-system -f values.yaml |
    ruby fix_cluster_autoscaler.rb |
    kubectl -n kube-system apply -f -
poddisruptionbudget.policy/release-brightbox-cluster-autoscaler created
serviceaccount/release-brightbox-cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/release-brightbox-cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/release-brightbox-cluster-autoscaler created
role.rbac.authorization.k8s.io/release-brightbox-cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/release-brightbox-cluster-autoscaler created
deployment.apps/release-brightbox-cluster-autoscaler created
and we’re running:
I0204 14:18:56.802671 1 leaderelection.go:252] successfully acquired
lease kube-system/cluster-autoscaler
I0204 14:18:56.803683 1 event.go:281]
Event(v1.ObjectReference{Kind:"Lease", Namespace:"kube-system",
Name:"cluster-autoscaler", UID:"25553d84-bb33-4be9-a55b-79490ce8ee91",
APIVersion:"coordination.k8s.io/v1", ResourceVersion:"13859814",
FieldPath:""}): type: 'Normal' reason: 'LeaderElection'
release-brightbox-cluster-autoscaler-55b79b9464-q8qbx became leader
You can also use the same script to repair a Helm installation of the autoscaler that isn’t working properly:
$ kubectl -n kube-system get clusterrole \
    release-brightbox-cluster-autoscaler -o yaml |
    ruby fix_cluster_autoscaler.rb |
    kubectl -n kube-system replace -f -
clusterrole.rbac.authorization.k8s.io/release-brightbox-cluster-autoscaler replaced
Here we extract the current YAML definition of the faulty ClusterRole from Kubernetes, run it through the Ruby patch filter, then replace the resource on Kubernetes.
With the correct permission granted, the program obtains the lease and carries on.
Ruby’s well-developed YAML and JSON libraries, along with its scripty nature, allow you to patch and adjust the output of other tools with ease.
Here I’ve used an external shell pipeline, but Ruby’s process management tools would allow you to embed the call out to Helm and include the values parameters within the Ruby program. Truly configuration as code.
You may like to try and write that as an exercise.
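If you’d like a head start, here’s a rough sketch of one possible shape, using Ruby’s standard Open3 and Tempfile libraries. The values hash is abbreviated and the whole thing is illustrative rather than a finished tool:
require 'open3'
require 'tempfile'
require 'yaml'

# Chart values embedded as Ruby data - abbreviated here; the full hash
# would mirror the values.yaml file above.
values = {
  "cloudProvider" => "brightbox",
  "envFromSecret" => "brightbox-credentials",
  "rbac" => { "create" => true }
}

# Render the chart, handing Helm the values via a temporary file.
manifests = Tempfile.create(["values", ".yaml"]) do |file|
  file.write(YAML.dump(values))
  file.flush
  output, status = Open3.capture2(
    "helm", "template", "release", "stable/cluster-autoscaler",
    "--namespace", "kube-system", "-f", file.path
  )
  abort "helm template failed" unless status.success?
  output
end

# Patch the manifest stream just as fix_cluster_autoscaler.rb does.
resources = YAML.load_stream(manifests).compact # skip any empty documents
resources.reject! { |r| r["kind"] == "Service" }
resources.each do |r|
  next unless r["kind"] == "ClusterRole"
  r["rules"] << { "apiGroups" => ["coordination.k8s.io"],
                  "resources" => ["leases"], "verbs" => ["create"] }
  r["rules"] << { "apiGroups" => ["coordination.k8s.io"],
                  "resourceNames" => ["cluster-autoscaler"],
                  "resources" => ["leases"], "verbs" => ["get", "update"] }
end

# Apply the patched manifests by feeding them to kubectl on stdin.
_, status = Open3.capture2(
  "kubectl", "-n", "kube-system", "apply", "-f", "-",
  stdin_data: YAML.dump_stream(*resources)
)
abort "kubectl apply failed" unless status.success?
Tempfile gives Helm the real values file it expects, while the Ruby hash remains the single source of truth for the configuration.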
Brightbox have been managing web deployments large and small for over a decade. If you’re interested in the benefits of Kubernetes but want us to handle managing and monitoring it for you, drop us a line.