A year ago I built a NAS to reduce my reliance on cloud services, and set up an arr stack. I went with TrueNAS Scale, which was on Bluefin at the time. In the past 12 months, TrueNAS Scale has been through FOUR major OS versions, with a fifth already announced. At least one of those involved a release train switch so, despite diligently checking for updates in the dashboard, I was left in the dust with an obsolete OS, and didn’t find out until it was already a huge hassle to upgrade.
I’ve been really happy with the utility and benefit of having this tool, but holy smokes how is anybody supposed to keep up with all of this? This is far from my only hobby, and I simply do not have the time, patience, or interest for a constant race to keep up with vetting new release versions and fixing what breaks every 3 weeks. I have enough tinkering hobbies as it is.
On top of that, there’s the whole blow up with TrueCharts, which has also left me with an entire suite of obsolete albatrosses around my NAS that I need to deal with. Am I still waiting for them to figure out an upgrade path? I don’t even know anymore.
Sorry for the rant, but I guess what I’m looking for is: how do you keep up with the constant maintenance and updates, and where do I go from here, in February 2025, with a system running Bluefin 22.12, a 32TB ZFS pool (RAIDZ1) that has to remain intact, and a handful of TrueCharts apps that I don’t want to lose the data from (e.g. Jellyfin configs/watch history)?
Constant maintenance? What’s that?
Here’s my setup:
- OS - openSUSE Leap - I upgrade when I remember
- software - Docker images in a docker compose file; upgrading is a simple docker command, and I’ll only do it if I need something in the update
- hardware - old desktop; I’ll upgrade when I have extra hardware
I honestly don’t think about it. I run updates when I get to it (every month or so), and I’ll do an OS upgrade a little while after a new release is available (every couple years?). Software gets updated periodically if I’m bored and avoiding more important things. And I upgrade one thing at a time, so I don’t end up with everything breaking at once. BTRFS snapshots means I can always roll back if I break something.
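For anyone who wants to script the "one thing at a time" habit, here is a minimal Python sketch. The function name and the three callbacks (`upgrade`, `is_healthy`, `rollback`) are hypothetical placeholders, not part of any tool mentioned in this thread; what they actually run (a docker command, a snapshot rollback) is up to you.

```python
from typing import Callable, Dict, Iterable


def upgrade_one_at_a_time(
    services: Iterable[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Dict[str, str]:
    """Upgrade services sequentially and stop at the first failure.

    All three callbacks are assumptions supplied by the caller.
    """
    results: Dict[str, str] = {}
    for name in services:
        upgrade(name)
        if is_healthy(name):
            results[name] = "upgraded"
        else:
            # Undo only the service that just changed, then stop so the
            # remaining services stay on their known-good versions.
            rollback(name)
            results[name] = "rolled back"
            break
    return results
```

Stopping at the first failure means there is only ever one recent change to investigate.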
I don’t even know what TrueCharts is. Maybe that’s your issue?
I use NixOS so if an update breaks, I just roll back. And since it’s effectively a rolling release distribution there isn’t any risk of being left behind on an outdated version.
Just subscribe to the release channel. How you do that varies by OS and software, but it is worth it.
Use tools that are universal. For example, I have not used TrueNAS Scale because they did not support native Docker at the time. OS-specific solutions are more likely to break than universal ones (TrueCharts vs Docker).
To get up and running again after a complete failure, I can just download the latest config and data from my backup, set up any distro that supports Docker, and my system is running again.
I do OS upgrades when they are available, usually within 1 or 2 days and containers are updated with watchtower daily.
You might want to think about running a “stable” or “LTS” OS and spin up things in Docker instead. That way you only have to do OS level updates very rarely.
I learned this the hard way as well… I did a big OS update on mine once and it broke almost every application running on it. Docker worked perfectly still. I transferred everything I could to Docker after that.
I have everything containerized (Podman) on my Debian PC and use Diun to check for updates and send notifications to a Discord server that I monitor. I do all of my updates manually so I don’t update unless I have time to troubleshoot; if it breaks I still have the configs and data so I can delete the container and start over.
I also do monthly backups to cold storage (yeah, they should be weekly/biweekly but it’s just personal data that I’m okay with losing). I don’t use a RAID config or BTRFS/ZFS like some do, so it’s pretty easy to just set it and forget it. It really depends on what you’re trying to do, how bulletproof it needs to be, and how you like to organize things.
I’ve got backups. Haven’t updated or looked at my server in months. If I’m ever compromised by missing security updates, I just load a backup and regenerate all keys.
I don’t put any critical data on public facing servers.
Ansible.
How does that help here?
For automating maintenance and updates? How exactly does it not?
In the business world it’s pretty common to do staged or switchover upgrades: test the new version in a lab environment and iron out the install/config details, then upgrade a single production server and test with a small group of users. Or build new servers with the new stuff and have a set of users run on them for a while; that way you can always just move those users back to a known good server.
How do you do this at home? VMs for lots of stuff, or duplicate hardware for NAS type stuff (I’ve read of running TrueNAS in a VM).
To borrow from the preparedness community: if you have 1 you have none, if you have 2 you have 1. As an example, the business world often runs mission-critical systems in a redundant setup in regionally-different data centers, so a storm won’t take them down. The question is how to reproduce this idea in a home lab environment.
This is not practical for a home setup. Not because it would be expensive for more hardware or whatever, but because as soon as you have multiple systems doing the same thing, their state diverges, and for pretty much anything that is popular for selfhosting you cannot merge them again or migrate users between them without losing anything. Distributed databases alone are a huge pita, and maintaining such redundant setups would be a million times more effort than just making sure that you can easily and quickly atomically roll back failed updates.
If it works, I don’t update unless I’m bored or something. I also spread things out on multiple machines, so there’s less chance of stuff happening like you describe with the charts feature going away. My NAS is pretty much just a NAS now.
You can probably back up your configs/data, upgrade, then deploy Jellyfin again, restore, and reconfigure. You should probably back up the data on your ZFS pool too. That said, I recently updated to the latest TrueNAS Scale from a ~5 year old FreeBSD version of TrueNAS and the pools still worked fine (none of the “apps” or jails worked, obviously). The upgrade process even ported my service configurations over. I didn’t care about much of the data in the pools, so I only backed up the most important stuff.
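The "back up your configs first" step can be as simple as a timestamped archive. This is a rough Python sketch using only the standard library; `archive_config` is a made-up name, and where your app config actually lives on disk is an assumption you would need to fill in yourself. It is probably safest to stop the app before archiving so its database is not copied mid-write.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def archive_config(config_dir: Path, backup_dir: Path) -> Path:
    """Write config_dir into a timestamped .tar.gz under backup_dir.

    backup_dir should live outside config_dir (ideally on another disk).
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{config_dir.name}-{stamp}.tar.gz"

    with tarfile.open(str(target), "w:gz") as tar:
        # arcname keeps paths relative, so the archive restores anywhere.
        tar.add(str(config_dir), arcname=config_dir.name)

    # Re-open the archive to confirm it is readable before trusting it.
    with tarfile.open(str(target), "r:gz") as tar:
        if not tar.getnames():
            raise RuntimeError(f"{target} is empty")
    return target
```

Calling it with the config directory and a backup location returns the path of the new archive, which you can then copy off the machine before touching the OS.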
I don’t update unless I’m bored
Hahahaha, one of my kind!
My upgrades usually occur because I’m setting up a new system anyway, that way my effort is building for tomorrow in addition to the upgrades, and I get testing time to ensure changeover is pretty smooth.
I have RSS feeds for my main service updates so I know what new features I have. The services mostly run in Podman containers and update automatically each Monday. I also have daily backups (timed to run just before the update on Monday) in case anything does break.
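The ordering is the important part: the backup runs first, and the Monday update only happens if the backup succeeded. A small Python sketch of that logic, where `run_daily` and the `backup`/`update` callbacks are hypothetical stand-ins for whatever actually does the work:

```python
from datetime import date
from typing import Callable

MONDAY = 0  # date.weekday() numbers Monday as 0


def run_daily(
    today: date,
    backup: Callable[[], bool],
    update: Callable[[], None],
) -> str:
    """Back up every day; update on Mondays, and only after a good backup."""
    if not backup():
        # Without a fresh backup there is nothing to restore from,
        # so skip the update instead of risking it.
        return "backup failed, update skipped"
    if today.weekday() == MONDAY:
        update()
        return "backup ok, updated"
    return "backup ok"
```

A timer or cron job would call this once a day with `date.today()`.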
If it breaks, I fix it depending on how much I want/need it. Mostly it’s a matter of half an hour, and with my current NixOS/Podman system I haven’t needed to fix anything yet this year, so it breaks infrequently.
Also why are you using Kubernetes on a single host if you want minimal maintenance? XD
My recommendation is to switch to just managing containers. You should be able to export the volumes out of Kubernetes and import them as normal volumes; as long as they’re mounted in the right place you keep your data, and if it doesn’t work, just try again. It’s not like you need to destroy the current system to slowly replace it.
Edit: I also recommend updating and rebooting frequently; this stops updates and unstable configurations from piling up.
Use Debian LTS or Ubuntu LTS (10 years support with free Ubuntu Pro). Turn on automatic unattended updates. Upgrade OS when you’re bored one of those years.
Keywords:
- Debian
- Ubuntu
- LTS
- ZFS
- Docker (compose)
Unraid + Unifi network equipment. Everything is scheduled and automatic, with the exception of large Unraid updates, but those are only every ~6 months. Every night the mover runs from the cache SSD -> HDD array, then it checks for plugin updates, then Docker container updates. On Monday mornings it also does an SSD trim, and on the 1st of the month an array parity check/repair.
After all that if it’s Monday morning, Unifi will check for firmware then software updates.
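Written out as logic, that schedule looks roughly like the Python sketch below. The task names are just labels copied from the description above; the real jobs are configured in the Unraid and Unifi schedulers, and `tasks_for` is a made-up helper, not anything those products provide.

```python
from datetime import date
from typing import List


def tasks_for(day: date) -> List[str]:
    """Return the ordered maintenance tasks for the night ending on `day`."""
    is_monday = day.weekday() == 0  # Monday is 0 in date.weekday()

    # These run every night, in this order.
    tasks = ["mover", "plugin updates", "container updates"]
    if is_monday:
        tasks.append("ssd trim")
    if day.day == 1:
        tasks.append("parity check")
    if is_monday:
        # Network gear goes last, after the server tasks have finished.
        tasks += ["unifi firmware", "unifi software"]
    return tasks
```

Most nights only the first three tasks run, so a breakage is easy to pin on a specific night and job.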
A Docker container will get a breaking update maybe once a year, and then I just go look at the documentation and see what needs to change in the config to fix it.
In life? Amphetamines.
I have automatic updates on everything. If it breaks, I fix it when I have time. If I don’t, it remains broken.
I could also just not do updates, but I like new features.
Is it exposed to the internet?
Mine is local only so I’m not as diligent with updates. I push them like once every 2-3 weeks. Some containers update automatically, but some don’t because in the past that has broken associated scripts.