Bug: hetzner cluster(s) dead after first config change after initial setup #2063

@gebi

Current Behaviour

I followed this how-to to set up a simple Claudie cluster on Hetzner: https://community.hetzner.com/tutorials/kubernetes-with-claudie
The initial cluster creation works, and I can e.g. list all nodes of the new cluster.
But every change I tried after that re-created most nodes (including all control plane nodes) and left all API servers offline (connection refused).

The only change I made was scaling the Helsinki worker nodes down from 2 to 1 and reapplying the InputManifest with kubectl apply -f hetzner-config.yaml

      - name: cmpt-hel
        providerSpec:
          name: hetzner-secret
          region: hel1
          zone: hel1-dc14
-        count: 2
+        count: 1
        serverType: cx23
        image: ubuntu-24.04
        storageDiskSize: 50

After the change there is no API server anymore (the API server is down on all control plane nodes):

% kubectl get nodes
The connection to the server 46.62.205.26:6443 was refused - did you specify the right host or port?
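
To rule out a problem with a single endpoint, the same check can be run against every control plane address. The following is a minimal sketch, assuming Python 3 is available on the workstation; the IPs are the ones from the Terraform outputs in this report, and the helper names `port_open` and `probe` are made up for illustration. It only tests whether TCP port 6443 accepts connections and says nothing about why it does not.

```python
import socket

# Control plane addresses taken from the Terraform outputs in this report.
CONTROL_PLANE_IPS = ["46.62.205.26", "157.180.67.73", "204.168.243.233"]


def port_open(host: str, port: int = 6443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def probe(hosts, port: int = 6443) -> dict:
    """Map each host to whether its API server port accepts connections."""
    return {host: port_open(host, port) for host in hosts}
```

Calling `probe(CONTROL_PLANE_IPS)` returns one boolean per address; in the state described above, every entry would be expected to be `False`.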

In the mgmt cluster, the Hetzner autoscaler component is restarting with backoff because of the same error:

W0421 00:51:50.592234       1 feature_gate.go:352] Setting GA feature gate DynamicResourceAllocation=false. It will be removed in a future release.
F0421 00:51:50.645866       1 main.go:409] Failed to get nodes from apiserver: Get "https://46.62.205.26:6443/api/v1/nodes": dial tcp 46.62.205.26:6443: connect: connection refused
stream closed: EOF for claudie/autoscaler-hetzner-cluster-0uk1gcw-5649cf7884-lkk8m (cluster-autoscaler)

I found the following output in claudie/terraformer-667b9f4556-r278f:terraformer

hetzner-cluster-0uk1gcw          - name              = "hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02d" -> null
hetzner-cluster-0uk1gcw          - server_id         = 127580549 -> null
hetzner-cluster-0uk1gcw          - size              = 50 -> null
hetzner-cluster-0uk1gcw        }
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw      # hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att must be replaced
hetzner-cluster-0uk1gcw    -/+ resource "hcloud_volume_attachment" "hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att" {
hetzner-cluster-0uk1gcw          ~ id        = "105469553" -> (known after apply)
hetzner-cluster-0uk1gcw          ~ server_id = 127580555 # forces replacement -> (known after apply) # forces replacement
hetzner-cluster-0uk1gcw            # (2 unchanged attributes hidden)
hetzner-cluster-0uk1gcw        }
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw      # hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att will be destroyed
hetzner-cluster-0uk1gcw      # (because hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att is not in configuration)
hetzner-cluster-0uk1gcw      - resource "hcloud_volume_attachment" "hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att" {
hetzner-cluster-0uk1gcw          - automount = false -> null
hetzner-cluster-0uk1gcw          - id        = "105469551" -> null
hetzner-cluster-0uk1gcw          - server_id = 127580549 -> null
hetzner-cluster-0uk1gcw          - volume_id = 105469551 -> null
hetzner-cluster-0uk1gcw        }
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw    Plan: 5 to add, 0 to change, 8 to destroy.
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw    Changes to Outputs:
hetzner-cluster-0uk1gcw      ~ cmpt-hel-y1lzsl0_hetzner-secret_81495bda85d1a99d9f8bcddba4320596 = {
hetzner-cluster-0uk1gcw          ~ hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01 = "62.238.4.55" -> (known after apply)
hetzner-cluster-0uk1gcw          - hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02 = "62.238.15.8"
hetzner-cluster-0uk1gcw        }
hetzner-cluster-0uk1gcw      ~ ctrl-hel-ce9v4e5_hetzner-secret_81495bda85d1a99d9f8bcddba4320596 = {
hetzner-cluster-0uk1gcw          ~ hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01 = "46.62.205.26" -> (known after apply)
hetzner-cluster-0uk1gcw          ~ hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02 = "157.180.67.73" -> (known after apply)
hetzner-cluster-0uk1gcw          ~ hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03 = "204.168.243.233" -> (known after apply)
hetzner-cluster-0uk1gcw        }
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Destroying... [id=105469551]
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Destroying... [id=105469553]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destroying... [id=127580541]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destroying... [id=127580542]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destroying... [id=127580543]
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Destruction complete after 8s
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Destruction complete after 8s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destroying... [id=127580549]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destroying... [id=127580555]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still destroying... [id=127580543, 10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still destroying... [id=127580541, 10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still destroying... [id=127580542, 10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destruction complete after 16s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destruction complete after 16s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destruction complete after 16s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creating...
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creating...
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creating...
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still destroying... [id=127580549, 10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still destroying... [id=127580555, 10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destruction complete after 16s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Destruction complete after 16s
hetzner-cluster-0uk1gcw    hcloud_volume.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume: Destroying... [id=105469551]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creating...
hetzner-cluster-0uk1gcw    hcloud_volume.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume: Destruction complete after 0s
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still creating... [10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still creating... [10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still creating... [10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creation complete after 10s [id=127582166]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creation complete after 12s [id=127582165]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creation complete after 12s [id=127582167]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Still creating... [10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_server.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596: Creation complete after 12s [id=127582194]
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Creating...
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Still creating... [10s elapsed]
hetzner-cluster-0uk1gcw    hcloud_volume_attachment.hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596_volume_att: Creation complete after 16s [id=105469553]
hetzner-cluster-0uk1gcw    ╷
hetzner-cluster-0uk1gcw    │ Warning: Argument is deprecated
hetzner-cluster-0uk1gcw    │ 
hetzner-cluster-0uk1gcw    │   with hcloud_server.hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596,
hetzner-cluster-0uk1gcw    │   on hetzner-cluster-0uk1gcw-hetzner-secret-node-81495bda85d1a99d9f8bcddba4320596.tf line 20, in resource "hcloud_server" "hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01_hetzner-secret_81495bda85d1a99d9f8bcddba4320596":
hetzner-cluster-0uk1gcw    │   20:           datacenter    = "hel1-dc14"
hetzner-cluster-0uk1gcw    │ 
hetzner-cluster-0uk1gcw    │ The datacenter attribute is deprecated and will be removed after 1 July
hetzner-cluster-0uk1gcw    │ 2026. Please use the location attribute instead. See
hetzner-cluster-0uk1gcw    │ https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters.
hetzner-cluster-0uk1gcw    │ 
hetzner-cluster-0uk1gcw    │ (and 13 more similar warnings elsewhere)
hetzner-cluster-0uk1gcw    ╵
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw    Apply complete! Resources: 5 added, 0 changed, 8 destroyed.
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw    Outputs:
hetzner-cluster-0uk1gcw    
hetzner-cluster-0uk1gcw    cmpt-hel-y1lzsl0_hetzner-secret_81495bda85d1a99d9f8bcddba4320596 = {
hetzner-cluster-0uk1gcw      "hetzner-cluster-0uk1gcw-cmpt-hel-y1lzsl0-01" = "62.238.15.8"
hetzner-cluster-0uk1gcw    }
hetzner-cluster-0uk1gcw    cmpt-nbg-7gb4vtq_hetzner-secret_81495bda85d1a99d9f8bcddba4320596 = {
hetzner-cluster-0uk1gcw      "hetzner-cluster-0uk1gcw-cmpt-nbg-7gb4vtq-01" = "116.203.46.6"
hetzner-cluster-0uk1gcw    }
hetzner-cluster-0uk1gcw    ctrl-hel-ce9v4e5_hetzner-secret_81495bda85d1a99d9f8bcddba4320596 = {
hetzner-cluster-0uk1gcw      "hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-01" = "157.180.67.73"
hetzner-cluster-0uk1gcw      "hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-02" = "204.168.243.233"
hetzner-cluster-0uk1gcw      "hetzner-cluster-0uk1gcw-ctrl-hel-ce9v4e5-03" = "46.62.205.26"
hetzner-cluster-0uk1gcw    }
2026-04-21T00:26:23Z INF Cluster build successfully Nats-Msg-Id=423ca6a1-6fa4-4f1f-99ff-711788cc36e8 claudie-internal-cluster-name=hetzner-cluster claudie-internal-input-manifest-name=claudie-multi-cloud-k8s cluster=hetzner-cluster-0uk1gcw module=terraformer terraform-stage=UPDATE_INFRASTRUCTURE
2026-04-21T00:26:23Z INF Task processed Nats-Msg-Id=423ca6a1-6fa4-4f1f-99ff-711788cc36e8 claudie-internal-cluster-name=hetzner-cluster claudie-internal-input-manifest-name=claudie-multi-cloud-k8s module=terraformer
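
The log above is long, so a small script can help to see which resources were destroyed and then created again in the same apply. The following is a minimal sketch, assuming the Terraformer output has been saved as text; the function names `summarize` and `replaced` are hypothetical, and the regular expression only covers the two line shapes that appear in this log (`Destroying... [id=...]` and `Creation complete after ... [id=...]`).

```python
import re
from collections import defaultdict

# Matches Terraform apply progress lines such as
#   hcloud_server.<name>: Destroying... [id=127580542]
#   hcloud_server.<name>: Creation complete after 12s [id=127582165]
# "Still destroying..." and "Destruction complete" lines are ignored.
EVENT = re.compile(
    r"(?P<type>hcloud_\w+)\.(?P<name>\S+): "
    r"(?P<event>Destroying\.\.\.|Creation complete after \S+) \[id=(?P<id>\d+)\]"
)


def summarize(log_text: str) -> dict:
    """Return {resource address: {"destroyed": id, "created": id}}."""
    result = defaultdict(dict)
    for m in EVENT.finditer(log_text):
        address = f'{m["type"]}.{m["name"]}'
        kind = "destroyed" if m["event"].startswith("Destroying") else "created"
        result[address][kind] = m["id"]
    return dict(result)


def replaced(log_text: str) -> list:
    """Resource addresses that were both destroyed and created."""
    return sorted(
        address
        for address, ids in summarize(log_text).items()
        if "destroyed" in ids and "created" in ids
    )
```

Run against the log above, `replaced` would be expected to list the three ctrl-hel servers, the cmpt-hel-01 server and its volume attachment, which lines up with "5 added" in the apply summary. Only cmpt-hel-02 and its volume were removed from the configuration.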

Expected Behaviour

The cluster API should still be available after scaling the Helsinki worker nodes from 2 to 1 ;)

Steps To Reproduce

I followed the linked tutorial and compared it with https://docs.claudie.io/v0.12.0/getting-started/detailed-guide/#claudie-deployment; the steps seem to be the same.

kind create cluster --name claudie-mgmt
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.20.0/cert-manager.yaml
kubectl apply -f https://github.com/berops/claudie/releases/download/v0.12.0/claudie.yaml
kubectl apply -f claudie-manifest.yaml -n claudie

Everything is fine at this point, node listing works:

% kubectl get nodes
NAME                  STATUS   ROLES           AGE     VERSION
cmpt-hel-y1lzsl0-01   Ready    <none>          3m5s    v1.34.0
cmpt-hel-y1lzsl0-02   Ready    <none>          3m5s    v1.34.0
cmpt-nbg-7gb4vtq-01   Ready    <none>          3m11s   v1.34.0
ctrl-hel-ce9v4e5-01   Ready    control-plane   5m59s   v1.34.0
ctrl-hel-ce9v4e5-02   Ready    control-plane   5m      v1.34.0
ctrl-hel-ce9v4e5-03   Ready    control-plane   4m6s    v1.34.0

Change the Helsinki worker node count from 2 to 1 and reapply with kubectl apply -f; the cluster never recovers.
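
When reproducing this, it may be worth comparing the node-to-IP mapping before and after the reapply. In the Terraformer log in this report, the same public addresses come back attached to different node names. The following is a minimal sketch of that comparison; the values are copied from the "Changes to Outputs" and final "Outputs" sections of the log with the `hetzner-cluster-0uk1gcw-` prefix dropped for readability, and `moved_addresses` is a hypothetical helper name. Whether this shuffle is related to the API servers staying down is not established here.

```python
def moved_addresses(before: dict, after: dict) -> dict:
    """For each IP present in both maps, return (old node, new node)
    if the node holding that IP changed."""
    owner_after = {ip: node for node, ip in after.items()}
    moves = {}
    for node, ip in before.items():
        new_owner = owner_after.get(ip)
        if new_owner is not None and new_owner != node:
            moves[ip] = (node, new_owner)
    return moves


# Node -> public IP before the reapply (from "Changes to Outputs").
before = {
    "ctrl-hel-ce9v4e5-01": "46.62.205.26",
    "ctrl-hel-ce9v4e5-02": "157.180.67.73",
    "ctrl-hel-ce9v4e5-03": "204.168.243.233",
    "cmpt-hel-y1lzsl0-01": "62.238.4.55",
    "cmpt-hel-y1lzsl0-02": "62.238.15.8",
}
# Node -> public IP after the reapply (from the final "Outputs").
after = {
    "ctrl-hel-ce9v4e5-01": "157.180.67.73",
    "ctrl-hel-ce9v4e5-02": "204.168.243.233",
    "ctrl-hel-ce9v4e5-03": "46.62.205.26",
    "cmpt-hel-y1lzsl0-01": "62.238.15.8",
}

for ip, (old, new) in sorted(moved_addresses(before, after).items()):
    print(f"{ip}: {old} -> {new}")
```

With the values from this report, all three control plane addresses are reported as moved, including 46.62.205.26, the address kubectl was still trying to reach.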
