This guide walks you through installing and configuring Talos Linux, including generating secrets, creating configuration files, deploying the control plane, bootstrapping the cluster, exporting kubeconfig, deploying the Cilium CNI, and upgrading the cluster.
Prerequisites:

- `talosctl` command-line tool installed.
- `helm` command-line tool installed.
- DNS setup:
  - `talos-api.whezzel.com` is an A record pointing to the Talos VIP (10.69.20.10).
  - Each node has its own A record.
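As an illustration only, the corresponding entries in a BIND-style zone file for `whezzel.com` might look like the following. The VIP and talos01's address come from this guide; the second node address is a placeholder:

```
talos-api  IN  A  10.69.20.10
talos01    IN  A  10.69.20.11
talos02    IN  A  10.69.20.12  ; placeholder
```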
For the initial installation, download the SecureBoot ISO from the Talos image factory. The basic configuration (without any system extensions or other config options) uses the image ID 6905bc709e5947573a4ec2d11723b58882936d3d0e15c708f7d78f0c689684a5. Since this is just the boot ISO, it doesn't require special configuration.
> **Note:** If you need additional system extensions to support specific hardware, generate a new image ID at https://factory.talos.dev/.
Download the ISO:
```shell
VERSION=<VERSION>
IMAGE_ID=6905bc709e5947573a4ec2d11723b58882936d3d0e15c708f7d78f0c689684a5
wget -c "https://factory.talos.dev/image/${IMAGE_ID}/v${VERSION}/metal-amd64-secureboot.iso" -O "metal-amd64-secureboot-v${VERSION}.iso"
```

> **Note:** Replace `<VERSION>` with the Talos version you intend to install, e.g., `1.10.6`.
Boot each node from this ISO to start Talos in maintenance mode (running entirely in memory) before applying any configuration.
Generate secrets for Talos Linux:

```shell
talosctl gen secrets --output secrets.yaml
```

Create configuration files for the cluster:
```shell
talosctl gen config prod \
  --with-secrets secrets.yaml \
  https://talos-api.whezzel.com:6443 \
  --force --output ~/repos/talos
```

Move the talosconfig file to the correct location:
```shell
mkdir --parents ~/.talos
cp talosconfig ~/.talos/config
```

Patch machine configurations for each node:
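Each `--patch` file used below is a YAML fragment merged into the base configuration. The real contents depend on your environment, but as a hypothetical sketch, a per-node patch such as `patches/talos01.yaml` might pin the hostname, node address, shared VIP, and install disk (the interface name and disk are assumptions; the addresses are the ones given in this guide):

```yaml
machine:
  network:
    hostname: talos01
    interfaces:
      - interface: eth0        # assumed interface name
        addresses:
          - 10.69.20.11/24     # node address from this guide
        vip:
          ip: 10.69.20.10      # shared Talos VIP
  install:
    disk: /dev/sda             # assumed install disk
```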
```shell
talosctl machineconfig patch controlplane.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-controlplane.yaml \
  --patch @patches/talos01.yaml \
  --output talos01.yaml
talosctl machineconfig patch controlplane.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-controlplane.yaml \
  --patch @patches/talos02.yaml \
  --output talos02.yaml
talosctl machineconfig patch controlplane.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-controlplane.yaml \
  --patch @patches/talos03.yaml \
  --output talos03.yaml
talosctl machineconfig patch worker.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-worker.yaml \
  --patch @patches/talos04.yaml \
  --output talos04.yaml
talosctl machineconfig patch worker.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-worker.yaml \
  --patch @patches/talos05.yaml \
  --output talos05.yaml
talosctl machineconfig patch worker.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-worker.yaml \
  --patch @patches/talos06.yaml \
  --output talos06.yaml
talosctl machineconfig patch worker.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-worker.yaml \
  --patch @patches/shared-worker-nvidia.yaml \
  --patch @patches/talos07.yaml \
  --output talos07.yaml
talosctl machineconfig patch worker.yaml \
  --patch @patches/shared.yaml \
  --patch @patches/shared-worker.yaml \
  --patch @patches/shared-worker-nvidia.yaml \
  --patch @patches/talos08.yaml \
  --output talos08.yaml
```

Because a node has no certificate during the initial deployment, use the `--insecure` flag to bypass authentication:
```shell
talosctl apply-config \
  --nodes talos01 \
  --file talos01.yaml \
  --insecure \
  --endpoints 10.69.20.11
```

> **Note:** The `--insecure` flag is required for the initial deployment because the node does not yet have a certificate, and authentication would otherwise fail. The VIP (`talos-api.whezzel.com`) is not active until the cluster is bootstrapped, so the node's IP (10.69.20.11) is used as the endpoint.
Bootstrap the cluster using the first control plane node:
```shell
talosctl bootstrap \
  --nodes talos01 \
  --endpoints 10.69.20.11
```

Now that the cluster has been bootstrapped, set the endpoint to the Talos API VIP:

```shell
talosctl config endpoint talos-api.whezzel.com
```

Apply configurations to the remaining nodes (use `--insecure`, as their certificates have not yet been issued):
```shell
talosctl apply-config -n talos02 --file talos02.yaml --insecure
talosctl apply-config -n talos03 --file talos03.yaml --insecure
talosctl apply-config -n talos04 --file talos04.yaml --insecure
talosctl apply-config -n talos05 --file talos05.yaml --insecure
talosctl apply-config -n talos06 --file talos06.yaml --insecure
talosctl apply-config -n talos07 --file talos07.yaml --insecure
talosctl apply-config -n talos08 --file talos08.yaml --insecure
```

Export the kubeconfig to interact with the cluster:

```shell
talosctl kubeconfig ~/.kube/config
```

With the Flannel CNI disabled, deploy Cilium as the CNI using its Helm chart.
Create a values.yaml file for Cilium:
```yaml
cni:
  exclusive: false
bgpControlPlane:
  enabled: true
cgroup:
  autoMount:
    enabled: false
  hostRoot: "/sys/fs/cgroup"
ipam:
  mode: "kubernetes"
k8sServiceHost: "localhost"
k8sServicePort: "7445"
kubeProxyReplacement: true
kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"
securityContext:
  capabilities:
    ciliumAgent: ["CHOWN","KILL","NET_ADMIN","NET_RAW","IPC_LOCK","SYS_ADMIN","SYS_RESOURCE","DAC_OVERRIDE","FOWNER","SETGID","SETUID"]
    cleanCiliumState: ["NET_ADMIN","SYS_ADMIN","SYS_RESOURCE"]
image:
  pullPolicy: "IfNotPresent"
```

Deploy Cilium:
```shell
helm upgrade --install cilium cilium \
  --repo https://helm.cilium.io \
  --namespace cilium \
  --create-namespace \
  --values ./helm/cilium/values.yaml
```

After deploying Cilium, your cluster is ready for workloads. For load balancer services via BGP, refer to the Cilium documentation: https://docs.cilium.io/en/stable/installation/k8s-install-helm/.
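As a minimal sketch of the load-balancer side of a BGP setup, Cilium can be given a pool of addresses to hand out to `LoadBalancer` services. The CIDR below is hypothetical, and the CRD schema varies between Cilium releases, so verify it against the documentation for your version:

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
    - cidr: "10.69.40.0/24"   # hypothetical LoadBalancer address range
```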
To upgrade Talos Linux, follow these steps to ensure a smooth and safe process. Perform upgrades one node at a time to maintain cluster availability.
Check the image IDs for your nodes to ensure they match the desired version and configuration.
For ARM-based control plane nodes (talos01, talos02, talos03):

```shell
curl -s -X POST --data-binary @customization/arm.yaml https://factory.talos.dev/schematics | jq -r '.id'
```

For AMD64-based worker nodes (talos04, talos05, talos06):

```shell
curl -s -X POST --data-binary @customization/longhorn.yaml https://factory.talos.dev/schematics | jq -r '.id'
```

For AMD64-based worker nodes with an Nvidia GPU and a Coral TPU (talos07, talos08):

```shell
curl -s -X POST --data-binary @customization/longhorn-coral-nvidia.yaml https://factory.talos.dev/schematics | jq -r '.id'
```

> **Note:** Ensure the customization/*.yaml files exist and contain the correct system extensions for your nodes. Update them if necessary to match the target Talos version.
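These schematic files follow the image factory's customization format. As a sketch, a Longhorn-oriented file such as `customization/longhorn.yaml` might request the iSCSI and util-linux extensions (the extension list is an assumption; confirm what your setup actually needs):

```yaml
customization:
  systemExtensions:
    officialExtensions:
      - siderolabs/iscsi-tools       # assumed: needed by Longhorn
      - siderolabs/util-linux-tools  # assumed: needed by Longhorn
```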
Review and update the patch files (./patches/shared-controlplane.yaml, ./patches/shared-worker.yaml, ./patches/shared-worker-nvidia.yaml) to include the correct image IDs and any other required changes for the new Talos version.
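For instance, a shared patch might carry the install details that will replace the commented-out defaults. Everything here is a placeholder to illustrate the shape — substitute your own disk, image ID, and version:

```yaml
machine:
  install:
    disk: /dev/sda   # placeholder; match each node's install disk
    image: factory.talos.dev/metal-installer-secureboot/<image_id>:v1.11.1
    wipe: false
```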
Regenerate the configuration files for the cluster:
```shell
talosctl gen config prod \
  --with-secrets secrets.yaml \
  https://talos-api.whezzel.com:6443 \
  --force --output ~/repos/talos
```

Edit the generated controlplane.yaml and worker.yaml files to comment out the install section, as custom installation details are defined in the patch files:
```diff
--- controlplane.yaml
+++ controlplane.yaml
@@ -187,10 +187,10 @@
     #     enabled: true # Enable the KubeSpan feature.
     # Used to provide instructions for installations.
-    install:
-        disk: /dev/sda # The disk used for installations.
-        image: ghcr.io/siderolabs/installer:<version> # Allows for supplying the image used to perform the installation.
-        wipe: false # Indicates if the installation disk should be wiped at installation time.
+    # install:
+    #     disk: /dev/sda # The disk used for installations.
+    #     image: ghcr.io/siderolabs/installer:<version> # Allows for supplying the image used to perform the installation.
+    #     wipe: false # Indicates if the installation disk should be wiped at installation time.
```

```diff
--- worker.yaml
+++ worker.yaml
@@ -187,10 +187,10 @@
     #     enabled: true # Enable the KubeSpan feature.
     # Used to provide instructions for installations.
-    install:
-        disk: /dev/sda # The disk used for installations.
-        image: ghcr.io/siderolabs/installer:<version> # Allows for supplying the image used to perform the installation.
-        wipe: false # Indicates if the installation disk should be wiped at installation time.
+    # install:
+    #     disk: /dev/sda # The disk used for installations.
+    #     image: ghcr.io/siderolabs/installer:<version> # Allows for supplying the image used to perform the installation.
+    #     wipe: false # Indicates if the installation disk should be wiped at installation time.
```

Reapply the patches to generate updated machine configuration files for each node, as described in Step 3: Generate Machine Configuration Files.
Upgrade each node one at a time, replacing {image_factory_id} with the appropriate image ID from Step 1 and {version} with the target Talos version (e.g., 1.11.1):
```shell
talosctl upgrade -n talos01 --image factory.talos.dev/metal-installer/{image_factory_id}:v{version}
talosctl upgrade -n talos02 --image factory.talos.dev/metal-installer/{image_factory_id}:v{version}
talosctl upgrade -n talos03 --image factory.talos.dev/metal-installer/{image_factory_id}:v{version}
talosctl upgrade -n talos04 --image factory.talos.dev/metal-installer-secureboot/{image_factory_id}:v{version}
talosctl upgrade -n talos05 --image factory.talos.dev/metal-installer-secureboot/{image_factory_id}:v{version}
talosctl upgrade -n talos06 --image factory.talos.dev/metal-installer-secureboot/{image_factory_id}:v{version}
talosctl upgrade -n talos07 --image factory.talos.dev/metal-installer-secureboot/{image_factory_id}:v{version}
talosctl upgrade -n talos08 --image factory.talos.dev/metal-installer-secureboot/{image_factory_id}:v{version}
```

> **Note:** Monitor each node's upgrade using `talosctl get members` to ensure the node rejoins the cluster successfully before proceeding to the next node.
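The `--image` argument is simply the schematic ID and target version assembled into an installer reference. A tiny sketch, using the boot-ISO schematic ID from earlier in this guide and an example version as stand-ins:

```shell
# Example values only -- substitute the schematic ID from Step 1
# and your actual target version.
IMAGE_FACTORY_ID="6905bc709e5947573a4ec2d11723b58882936d3d0e15c708f7d78f0c689684a5"
VERSION="1.11.1"

# SecureBoot nodes use the metal-installer-secureboot image path;
# non-SecureBoot nodes use metal-installer instead.
INSTALLER_IMAGE="factory.talos.dev/metal-installer-secureboot/${IMAGE_FACTORY_ID}:v${VERSION}"
echo "${INSTALLER_IMAGE}"
```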
Perform a dry run to preview the Kubernetes upgrade changes:
```shell
talosctl upgrade-k8s --nodes talos01 --dry-run
```

If the dry-run output is satisfactory, proceed with the Kubernetes upgrade:

```shell
talosctl upgrade-k8s --nodes talos01
```

> **Note:** Perform the Kubernetes upgrade on a control plane node (e.g., talos01). Ensure all nodes have been upgraded to the new Talos version before running the Kubernetes upgrade.
To reset a node:
```shell
talosctl --nodes <node> reset --system-labels-to-wipe EPHEMERAL,STATE --reboot --graceful=false
```

To prune unused container images on a node, start a privileged debug pod on it and run `crictl` against the host's containerd socket:

```shell
kubectl debug -n kube-system -it --image alpine --profile=sysadmin node/<node>
# Inside the debug pod:
apk add cri-tools
export CONTAINER_RUNTIME_ENDPOINT=unix:///host/run/containerd/containerd.sock
crictl rmi --prune
```