feat: move guides
src/content/docs/guides/Talos/talos-upgrade-1-12-0.mdx
---
title: Talos Upgrade 1.12.0
description: Steps followed for the v1.12.0 upgrade process
hero:
  tagline: Steps followed for the v1.12.0 upgrade process
  image:
    file: https://raw.githubusercontent.com/siderolabs/docs/3989ed11f0622252d7cee03b3ba3a3052be242d7/public/images/talos.svg
---

The upgrade to this version was more extensive, as Talos has migrated to using configuration documents. This required rewriting the configuration document as a series of patches and providing a deterministic generation command for the different host types. In addition, the storage layout changed to separate ceph, local-path, and ephemeral storage on the NUC hosts.

## Preparation
The NUC hosts need to be wiped because of the storage reconfiguration. For the RPis, only the first command with the proper image was needed; the new configuration format could be applied later. Both the node and its disks have to be removed from the cluster.

The following command upgrades the image. This was done first to ensure that the node would still boot v1.12.0 after the wipe and could use the updated configuration documents.

```bash
talosctl upgrade --nodes 10.232.1.23 --image factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0
```

The wipe command:

```bash
talosctl reset --system-labels-to-wipe EPHEMERAL,STATE --reboot -n 10.232.1.23
```

## Remove Old References

Remove the node from the cluster.
```bash
kubectl delete node talos-9vs-6hh
```
Exec into the rook-ceph-tools container in order to remove the disk from the cluster.
```bash
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
```
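If the label-selector lookup feels brittle, kubectl can also exec into the Deployment directly and let it resolve a ready pod. This is a hedged alternative, assuming the standard `rook-ceph-tools` Deployment name from the Rook chart:

```bash
# Assumes the standard rook-ceph-tools Deployment name from the Rook chart
NAMESPACE=rook-ceph
kubectl -n "$NAMESPACE" exec -it deploy/rook-ceph-tools -- bash
```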
Inside the rook-ceph-tools container remove the OSD/disk and the node:
```bash
ceph osd tree
```

```bash
ceph osd out 0
```

```bash
ceph osd purge 0 --yes-i-really-mean-it
```

```bash
ceph osd crush remove talos-9vs-6hh
```
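The removal steps above can be sketched as one parameterized sequence, run inside the rook-ceph-tools container. The OSD id and node name below are this guide's example values; substitute the id shown by `ceph osd tree`:

```bash
# Example values from this guide; substitute the OSD id shown by `ceph osd tree`
OSD_ID=0
NODE_NAME=talos-9vs-6hh

ceph osd out "$OSD_ID"
ceph osd purge "$OSD_ID" --yes-i-really-mean-it
ceph osd crush remove "$NODE_NAME"
```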

## Apply New Configuration

The wiped node should now be in maintenance mode and ready to be configured. Use the generate command in the README of the talos-config repo to produce the configuration to be supplied.

```bash
talosctl apply-config -f generated/worker-nuc.yaml -n 10.232.1.23 --insecure
```
Add the required labels if Talos does not add them:
```yaml
node-role.kubernetes.io/bgp: '65020'
node-role.kubernetes.io/local-storage-node: local-storage-node
node-role.kubernetes.io/rook-osd-node: rook-osd-node
```
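If the labels are missing, they can presumably be added by hand with kubectl; a sketch, using this guide's example node name:

```bash
# Hypothetical fallback: apply the labels manually if Talos did not set them
NODE=talos-9vs-6hh
kubectl label node "$NODE" \
  node-role.kubernetes.io/bgp='65020' \
  node-role.kubernetes.io/local-storage-node=local-storage-node \
  node-role.kubernetes.io/rook-osd-node=rook-osd-node
```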
## Verification
Verify the disks have been created:
```bash
talosctl get disks -n 10.232.1.23
```
Verify the mount paths and volumes are created:
```bash
talosctl -n 10.232.1.23 ls /var/mnt
```

```bash
talosctl -n 10.232.1.23 get volumestatuses
```

src/content/docs/guides/Talos/talos-upgrade-generic.mdx

---
title: Talos Upgrade Generic
description: Steps followed for the standard upgrade process
hero:
  tagline: Steps followed for the standard upgrade process
  image:
    file: https://raw.githubusercontent.com/siderolabs/docs/3989ed11f0622252d7cee03b3ba3a3052be242d7/public/images/talos.svg
---

This is the standard upgrade process for Talos. It is relatively simple: check cluster health, run the upgrade commands, and verify the result.

## Health Check
### Etcd
Check the status of etcd; ensure there is a leader and there are no errors.

```bash
talosctl -n 10.232.1.11,10.232.1.12,10.232.1.13 etcd status
```
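As a complementary check, `talosctl etcd members` lists the members and their IDs. The assumption here is that any one control plane node can be queried:

```bash
# Complementary check: list etcd members via a control plane node
NODE=10.232.1.11
talosctl -n "$NODE" etcd members
```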
### Ceph
Check that Ceph is healthy: either browse to the [webpage](https://ceph.alexlebens.net/#/dashboard), or run the following commands in the tools container.

```bash
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
```
Inside the rook-ceph-tools container, check the status:
```bash
ceph status
```

### CloudNativePG

Check the status of the CloudNativePG clusters to ensure they are all healthy. There is potential for data loss if a worker node fails or its local volume is not reattached.

[Dashboard](https://grafana.alexlebens.net/d/cloudnative-pg/cloudnativepg)
### Garage
Check the status of the Garage cluster to ensure there is no data loss in the local S3 store. If this cluster fails, short-term WALs will be lost.

[Dashboard](https://garage-webui.alexlebens.net/)
## Upgrade
Reference the [config repo](https://gitea.alexlebens.dev/alexlebens/talos-config/src/branch/main) for the exact commands and links to the factory page, and update the image versions there. Each host type has its own image string.

As an example to upgrade a NUC node:
```bash
talosctl upgrade --nodes 10.232.1.23 --image factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0
```
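A rolling upgrade across several nodes of the same type can be sketched as a loop that waits for the cluster to settle between nodes. This is only a sketch: the node list below is hypothetical (only 10.232.1.23 appears in this guide), and `talosctl health` is assumed as the wait step:

```bash
# Sketch: upgrade nodes one at a time, waiting for the cluster to settle between
# Node list is hypothetical; image string is this guide's NUC example
IMAGE="factory.talos.dev/metal-installer/495176274ce8f9e87ed052dbc285c67b2a0ed7c5a6212f5c4d086e1a9a1cf614:v1.12.0"
for NODE in 10.232.1.21 10.232.1.22 10.232.1.23; do
  talosctl upgrade --nodes "$NODE" --image "$IMAGE"
  talosctl -n "$NODE" health --wait-timeout 10m
done
```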

## Apply New Configuration

Use the generate command in the README of the talos-config repo to produce the configuration to be supplied.

As an example, to apply the generated configuration to a NUC node:
```bash
talosctl apply-config -f generated/worker-nuc.yaml -n 10.232.1.23
```
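To preview what would change before committing it, `talosctl apply-config` supports a dry-run mode (hedged: flag availability depends on the talosctl version in use):

```bash
# Preview the configuration change without applying it
NODE=10.232.1.23
talosctl apply-config -f generated/worker-nuc.yaml -n "$NODE" --dry-run
```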
## Verification
Verify everything is healthy in the dashboard:
```bash
talosctl -n 10.232.1.23 dashboard
```
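Beyond the dashboard, the installed version can be confirmed directly with the standard `talosctl version` command against the node:

```bash
# Confirm the node reports the upgraded Talos version
NODE=10.232.1.23
talosctl -n "$NODE" version
```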