feat: move guides

2026-04-27 16:56:50 -05:00
parent 78da2d0e42
commit 33e887348b
5 changed files with 18 additions and 0 deletions


@@ -0,0 +1,79 @@
---
title: Talos Upgrade 1.12.0
description: Steps followed for the v1.12.0 upgrade process
hero:
  tagline: Steps followed for the v1.12.0 upgrade process
  image:
    file: https://raw.githubusercontent.com/siderolabs/docs/3989ed11f0622252d7cee03b3ba3a3052be242d7/public/images/talos.svg
---
The upgrade to this version was more extensive, as there has been a migration to using configuration documents. This required rewriting the configuration document as a series of patches and providing a deterministic generation command for the different host types. In addition, there was a change to the storage layout to separate ceph, local-path, and ephemeral storage on the NUC hosts.
## Preparation
The NUC hosts are to be wiped because of the storage reconfiguration. For the RPis only the first command, with the proper image, was needed; the new configuration format could be applied later. Both the node and the disks have to be removed from the cluster.
The following command is used to upgrade the image. This was done first to ensure the node would still boot into v1.12.0 after the wipe, so the updated configuration documents could be used.
```bash
talosctl upgrade --nodes 10.232.1.23 --image factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0
```
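Once the node reboots, the installed version can be confirmed before wiping:
```bash
# Confirm the node reports v1.12.0 before proceeding with the wipe
talosctl -n 10.232.1.23 version
```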
Wipe command.
```bash
talosctl reset --system-labels-to-wipe EPHEMERAL,STATE --reboot -n 10.232.1.23
```
## Remove old references
Remove the node from the cluster.
```bash
kubectl delete node talos-9vs-6hh
```
Exec into the rook-ceph-tools container in order to remove the disk from the cluster.
```bash
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
```
Inside the rook-ceph-tools container remove the OSD/disk and the node:
```bash
ceph osd tree
```
```bash
ceph osd out 0
```
```bash
ceph osd purge 0 --yes-i-really-mean-it
```
```bash
ceph osd crush remove talos-9vs-6hh
```
## Apply new configuration
The wiped node should now be in maintenance mode and ready to be configured. Use the generate command in the README of the talos-config repo to produce the configuration to be applied.
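A minimal sketch of what that generation might look like; the cluster name, endpoint, and patch file names here are assumptions, and the authoritative command lives in the repo README:
```bash
# Hypothetical example only: cluster name, endpoint, and patch paths are assumptions.
talosctl gen config home-cluster https://10.232.1.10:6443 \
  --output-types worker \
  --output generated/worker-nuc.yaml \
  --config-patch @patches/common.yaml \
  --config-patch @patches/worker-nuc.yaml
```
Then apply the generated file to the node, which is still in maintenance mode: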
```bash
talosctl apply-config -f generated/worker-nuc.yaml -n 10.232.1.23 --insecure
```
Add the required labels if Talos does not add them:
```yaml
node-role.kubernetes.io/bgp: '65020'
node-role.kubernetes.io/local-storage-node: local-storage-node
node-role.kubernetes.io/rook-osd-node: rook-osd-node
```
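These can also be set manually with kubectl; a sketch assuming the node rejoined under the same name as before (substitute the actual node name otherwise):
```bash
# Label values copied from the list above; node name is an assumption
kubectl label node talos-9vs-6hh \
  node-role.kubernetes.io/bgp='65020' \
  node-role.kubernetes.io/local-storage-node=local-storage-node \
  node-role.kubernetes.io/rook-osd-node=rook-osd-node
```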
## Verification
Verify the disks have been created:
```bash
talosctl get disks -n 10.232.1.23
```
Verify the mount paths and volumes are created:
```bash
talosctl -n 10.232.1.23 ls /var/mnt
```
```bash
talosctl -n 10.232.1.23 get volumestatuses
```
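As a final sanity check, confirm the node has rejoined the cluster and reports Ready:
```bash
# The node may come back under a newly generated hostname
kubectl get nodes -o wide
```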


@@ -0,0 +1,72 @@
---
title: Talos Upgrade Generic
description: Steps followed for the standard upgrade process
hero:
  tagline: Steps followed for the standard upgrade process
  image:
    file: https://raw.githubusercontent.com/siderolabs/docs/3989ed11f0622252d7cee03b3ba3a3052be242d7/public/images/talos.svg
---
This is the standard upgrade process for Talos. It is relatively simple: verify cluster health, run the upgrade commands, and verify again.
## Health Check
### Etcd
Check the status of etcd and ensure there is a leader and no errors.
```bash
talosctl -n 10.232.1.11,10.232.1.12,10.232.1.13 etcd status
```
### Ceph
Check that Ceph is healthy.
Either browse to the [webpage](https://ceph.alexlebens.net/#/dashboard), or run the following commands in the tools container.
```bash
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
```
Inside the rook-ceph-tools container check the status:
```bash
ceph status
```
### Cloudnative-PG
Check the status of the Cloudnative-PG clusters to ensure they are all healthy. There is potential data loss if a worker node has a failure or the local volume isn't reattached.
[Dashboard](https://grafana.alexlebens.net/d/cloudnative-pg/cloudnativepg)
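The cluster states can also be checked from the CLI; a sketch using the CloudNativePG Cluster custom resource:
```bash
# Lists all CNPG clusters with their instance counts and current status
kubectl get clusters.postgresql.cnpg.io -A
```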
### Garage
Check the status of the Garage cluster to ensure there is no data loss in the local S3 store. A failure of this cluster would result in the loss of short-term WALs.
[Dashboard](https://garage-webui.alexlebens.net/)
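Alternatively, the Garage CLI can report node and layout health from inside a Garage pod; the namespace, pod name, and binary path here are assumptions:
```bash
# Adjust namespace, pod name, and binary path to the actual Garage deployment
kubectl -n garage exec -it garage-0 -- /garage status
```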
## Upgrade
Reference the [config repo](https://gitea.alexlebens.dev/alexlebens/talos-config/src/branch/main) for the exact commands and links to the factory page, and update the image versions. Each host type has its own image string.
As an example, to upgrade a NUC node:
```bash
talosctl upgrade --nodes 10.232.1.23 --image factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0
```
## Apply new configuration
Use the generate command in the README of the talos-config repo to produce the configuration to be applied.
As an example, to apply the generated config to a NUC node:
```bash
talosctl apply-config -f generated/worker-nuc.yaml -n 10.232.1.23
```
## Verification
Verify everything is healthy on the dashboard:
```bash
talosctl -n 10.232.1.23 dashboard
```
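A broader cluster health check can also be run against a control plane node, using one of the control plane IPs from the etcd check above:
```bash
# Runs Talos' built-in cluster health checks
talosctl -n 10.232.1.11 health
```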