diff --git a/src/content/docs/guides/talos-upgrade.md b/src/content/docs/guides/talos-upgrade-1-12-0.md
similarity index 100%
rename from src/content/docs/guides/talos-upgrade.md
rename to src/content/docs/guides/talos-upgrade-1-12-0.md
diff --git a/src/content/docs/guides/talos-upgrade-generic.md b/src/content/docs/guides/talos-upgrade-generic.md
new file mode 100644
index 0000000..033e9b2
--- /dev/null
+++ b/src/content/docs/guides/talos-upgrade-generic.md
@@ -0,0 +1,68 @@
+---
+title: Talos Upgrade
+description: Steps followed for the standard upgrade process
+---
+
+This is the standard upgrade process for Talos. It is relatively simple: verify health, run the commands, and verify again.
+
+## Health Check
+
+### Etcd
+
+Check the status of etcd and ensure there is a leader and no errors.
+
+```bash
+talosctl -n 10.232.1.11,10.232.1.12,10.232.1.13 etcd status
+```
+
+### Ceph
+
+Check that Ceph is healthy.
+
+Either browse to the [dashboard](https://ceph.alexlebens.net/#/dashboard), or run the following commands in the tools container:
+
+```bash
+kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
+```
+
+Inside the rook-ceph-tools container, check the status:
+```bash
+ceph status
+```
+
+### Cloudnative-PG
+
+Check the status of the Cloudnative-PG clusters to ensure they are all healthy. Data can be lost if a worker node fails or its local volume is not reattached.
+
+[Dashboard](https://grafana.alexlebens.net/d/cloudnative-pg/cloudnativepg)
+
+### Garage
+
+Check the status of the Garage cluster to ensure the local S3 store has not lost data. If this cluster fails, short-term WALs will be lost.
+
+[Dashboard](https://garage-webui.alexlebens.net/)
+
+## Upgrade
+
+See the [config repo](https://gitea.alexlebens.dev/alexlebens/talos-config/src/branch/main) for the exact commands and links to the factory page, and update the image versions. Each node type has its own image string.
+
+For example, to upgrade a NUC node:
+```bash
+talosctl upgrade --nodes 10.232.1.23 --image factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0
+```
+
+## Apply New Configuration
+
+Use the generate command in the README of the talos-config repo to produce the configuration to apply.
+
+For example, to apply that generated config to a NUC node:
+```bash
+talosctl apply-config -f generated/worker-nuc.yaml -n 10.232.1.23
+```
+
+## Verification
+
+Verify everything is healthy on the dashboard:
+```bash
+talosctl -n 10.232.1.23 dashboard
+```
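
The per-node steps in the guide (check etcd, upgrade, apply the generated config) could be chained roughly as follows. This is a minimal sketch and not part of the committed file: the `upgrade_node` and `run` helpers and the `DRY_RUN` switch are hypothetical names introduced here, the image string still has to be taken from the config repo, and the sketch assumes `talosctl` is already configured for the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not taken from the talos-config repo.
set -euo pipefail

# Control plane addresses as used in the etcd check of the guide.
CONTROL_PLANE="10.232.1.11,10.232.1.12,10.232.1.13"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: upgrade_node NODE_IP IMAGE CONFIG_FILE
upgrade_node() {
  local node="$1" image="$2" config="$3"
  # Health check: etcd should report a leader and no errors.
  run talosctl -n "$CONTROL_PLANE" etcd status
  # Upgrade the node to the image for its node type.
  run talosctl upgrade --nodes "$node" --image "$image"
  # Apply the freshly generated configuration.
  run talosctl apply-config -f "$config" -n "$node"
}
```

Running it with `DRY_RUN=1` first prints the three commands without touching the cluster, which makes it easy to confirm the node address and image string before the real run. The Ceph, Cloudnative-PG, and Garage checks are left manual here because the guide relies on their dashboards.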