Proxmox Thread - Home Lab / Virtualization

The processor in the M920x Tiny (i5-9500T) is adequate and so is the RAM (for a start). What you would end up having trouble with is storage, since it's a mini PC. For setting up RAID it is advised to have drives of the same make and model. What you can do is boot Proxmox from a 2.5-inch SSD and place 2x NVMe drives in RAID.
Considering that you would want to offload all your cloud data here, Nextcloud would do the trick, but the storage limitations would remain. Hence, you can also look into SFF machines offering the same specifications and a future upgrade path, as they would have 2-3 or more SATA ports. These would also provide you with a dedicated PCIe slot, and in some cases multiple PCIe slots, in case you want to add a network adapter or GPU.
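For the two-NVMe RAID, a ZFS mirror is the usual route on Proxmox. A hedged sketch follows; the device names and the storage ID "tank" are assumptions, and the first command wipes the drives:

```shell
# Create a ZFS mirror pool named "tank" over the two NVMe drives.
# WARNING: destroys existing data on them; verify device names with `lsblk` first.
zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1

# Register it with Proxmox as ZFS storage:
pvesm add zfspool tank -pool tank
```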
 
I am preferring the M920x Tiny due to its small form factor. My current storage requirement is 1-2 TB, and long term it should not go beyond 6-8 TB. How I am planning to solve this: get a cheap 250 GB NVMe SSD [this], put it in an NVMe enclosure, and connect it to the PC via a USB 3.2 port. I will install Proxmox on the 250 GB NVMe and keep the OS and VMs on it. I will add a PCIe-to-dual-NVMe card [this] to add one 2 TB M.2 NVMe and one 2 TB M.2 SATA. I will use this M.2 Key A+E (WiFi) to M.2 NVMe adapter [this] to hook up another 2 TB M.2 NVMe (this works, as confirmed by @napstersquest). I will also add 2 TB M.2 NVMe drives to the dedicated slots. With this I will be able to reach 10 TB of storage at max. The only downside is that I won't be able to use the PCIe port to hook up a dual Gigabit NIC card.
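Tallying the plan above (the slot split is my reading of the post, so treat the labels as assumptions):

```python
# Rough tally of the planned data drives, sizes in TB.
# The boot drive (250 GB USB NVMe carrying Proxmox + VMs) is kept separate.
data_drives_tb = {
    "PCIe adapter, M.2 NVMe": 2,
    "PCIe adapter, M.2 SATA": 2,
    "WiFi A+E slot adapter, M.2 NVMe": 2,
    "dedicated M.2 slots, 2 x 2 TB NVMe": 4,
}
boot_drive_tb = 0.25
total_tb = sum(data_drives_tb.values())
print(total_tb)  # 10, matching the ~10 TB maximum in the post
```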
 
I had a humbling experience today. Hopefully, there's something in this wall of text that might be helpful to someone else. TLDR at the end.

Each of my proxmox clusters has a Lenovo Tiny that I use as a supervisor node. This supervisor node's job is to report which nodes are online and offline through /etc/pve/.members and this information is pulled in through ssh from another Tiny that's in my automation/monitoring/stats cluster.
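The .members file is JSON written by pmxcfs. A minimal sketch of how the pulled file might be parsed on the monitoring side; the JSON layout shown is the usual format, but the field values are made up and worth verifying against your own cluster:

```python
import json

def online_nodes(members_json: str) -> dict[str, bool]:
    """Map node name -> online flag from /etc/pve/.members content."""
    data = json.loads(members_json)
    return {name: bool(info.get("online", 0))
            for name, info in data.get("nodelist", {}).items()}

# Sample in the usual pmxcfs layout (values here are illustrative only):
sample = '''{
  "nodename": "pve1", "version": 4,
  "nodelist": {
    "pve1": {"id": 1, "online": 1},
    "pve2": {"id": 2, "online": 0}
  }
}'''
print(online_nodes(sample))  # {'pve1': True, 'pve2': False}
```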

That other Tiny automates shutdowns and startups since I don't have consistent power available to me. We see multiple power outages a day.

Today, the automated startup failed for one of my clusters after a power outage. When I went to see what happened, I saw that the supervisor node's ethernet lights were off. I have all my systems facing back-to-front for both das blinkenlights and the aesthetics of cable management:


[attachment: photo_2024-08-17 12.35.27.jpeg]


Reseated the power connector, but still no ethernet lights. Tested the power connector with a multimeter and saw 0V between the centre pin and the outside. A dead power adapter? But how? I have a 5000VA mainline stabilizer connected to an inverter, which is then connected to an online UPS, with MCBs and surge protectors in between all of them. No surge should've made it through.

I spent most of the night turning half of the house upside down looking for a spare Lenovo adapter — I had a spare Tiny so I should've had a power adapter for it somewhere. Didn't find it anywhere.

So I cut off the connector so that I could splice in another adapter. But I wanted to make sure this adapter was dead — and it wasn't! Measured 20V on the bare stripped wire.

I went to check the continuity on the other half of the wire I'd cut off, the one with the connector. There was none. No continuity between the centre pin and the bare wire I'd stripped off. Was there a loose or bad connection from fatigue?

I tried twisting the strain relief and intermittently heard a beep from the multimeter. A few more beeps later and I realize the centre pin is not a power pin, the power contacts are on the inside of the connector, and the centre pin is probably just a sense pin.

The power adapter was perfectly fine. I hastily spliced the wire back together, checked for 20V, found it, and plugged it in. But still no ethernet lights.

So now I'm thinking I need to swap in my spare Tiny. Took this one out and started disassembling it since I needed the processor, memory and storage from it. Briefly considered it might not be starting because of a bad BIOS battery. I've seen that in the past. Battery tests 3.3V. Put it back in and tried powering it on with just one stick of RAM and processor and nothing else.

It powered up! Unbelievable. Did this mean that something died and prevented it from starting up? Tried two sticks of RAM, and it turned on. Plugged in the cooling fan, it turned on. Slotted in the SSD, it turned on. Added the ethernet adapter, and it turned on. What? I closed up the Tiny and it powered up just fine.

Connected the ethernet cables, no lights. Brought in a spare keyboard and monitor, and I see Proxmox's login screen. Got in and ran ip a. All interfaces were accounted for, except the physical ones showed NO CARRIER.

Seven hours in, and there was nothing wrong with this Tiny. It was powering on just fine; I just couldn't see the front fascia with how it was installed, so I went by just the ethernet lights and wrongly assumed it was not powering on. The first lesson learned: check both front and rear lights before assuming something is dead.

But why were the ethernet lights off? Turns out when I redid the aesthetic cable management a few months ago, I plugged the cable for this Tiny into a switch that would be powered off during extended power outages.

During such outages, the network is powered off five whole minutes after the cluster is powered off, plenty of time for the automation cluster to see that the nodes went offline. The node in charge of safe shutdowns and startups will only execute a cluster startup if all the nodes are offline, since it does the startup by toggling smart plugs and you don't want to be in a situation where a running node abruptly loses power.
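A minimal sketch of that startup gate, assuming the node states come from the supervisor's .members data; the function names are mine, not the author's:

```python
def safe_to_start(node_states: dict[str, bool]) -> bool:
    """True only when every known node is confirmed offline.

    An empty/unknown state map is treated as unsafe: 'no data'
    is not the same as 'all offline'.
    """
    if not node_states:
        return False
    return all(not online for online in node_states.values())

def maybe_start_cluster(node_states, toggle_plugs):
    # Only toggle the smart plugs on when doing so cannot cut power
    # to a node that is still running.
    if safe_to_start(node_states):
        toggle_plugs(on=True)
        return True
    return False

# e.g. maybe_start_cluster({"pve1": False, "pve2": False}, my_plug_toggler)
```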

I shifted focus to the automation cluster and the Tiny responsible for shutdowns and startups. Uptime was a few hours when it should've been a few months. Something happened. Somehow a surge got through the many surge protectors, the stabilizer, the inverter, and the eye-wateringly-expensive online-freaking-UPS that's not in eco mode, and this Tiny was restarted.

When it restarted, it could not communicate with the supervisor node since that didn't have network connectivity. Without any way of knowing which nodes were online, it never executed the automated startup sequence. And that is exactly how I intended things to work but forgot all about it.

My brain feels like mush right now. I don't know what safeguards I need to put in place to prevent this from reoccurring, I just know that I need more safeguards. Also, I need to move the Tinys to a DC power system since apparently, AC isn't reliable even with a low-six-digit investment in power rectification. That'll be the other lesson learned.

TLDR: Consider shifting focus to agriculture, it'll be relaxing.
 
Why do you think it was caused by a power surge and not some kind of software/hardware bug? If the automation cluster is just some old thin clients, they could have failed due to any number of other reasons: the power rails on the thin client itself, some other hardware defect, or a bug in the kernel/Proxmox/whatever. Maybe even memory errors, because you're not using ECC memory.
 
It is my oldest Tiny but it's still a socketed Lenovo M910Q, I don't expect to see hardware failures with that product line. Also, previous uptimes were in the range of months because I only ever turned them off during >7h power outages. Also also the uptime placed the restart within minutes of when power was restored by the electricity department here.

Also also also, I haven't seen any failures from the Dells or Lenovos I've owned in all these years, it's always been the gaming focused desktop-class motherboards that died randomly.

The thin clients that I have are on the less-important home networking cluster; all of my own stuff runs on socketed systems.
 
Consider shifting focus to agriculture, it'll be relaxing.
r/sysadmin famously jokes about sheep herding as the trade to take up after sysadmin. Seems somebody did it successfully.

Within agriculture, indoor farming, aquaponics, etc. are very much trending in some places now, and they involve a lot of electronics, sensors, data monitoring and collection. Very interesting field.
 
Why do you think it was caused by a power surge

Between your comment here and a couple on r/homelab, I took a look at the UPS logs — it didn't occur to me earlier because I thought I had plenty of backup power.

Anyway, it was the UPS: it was configured to cut off power 2 minutes after a low-battery alarm. I've now changed it to 30 minutes. I probably need new batteries.
 
That's an interesting story.

Somewhat unrelated, but I don't really ever rely on automations; I do things manually even if an automated option exists. I am sure I don't possess the software knowledge of many people here, and I also don't have the need for automations since I don't run multiple PCs/clients/servers, but I do recommend doing things manually if possible. Call me paranoid, but I don't trust computers.

Skynet is an eventuality if the human race survives long enough.
 
Scratching my head here. Need expert advice!

I have an x64 PC as a Proxmox server, running multiple containers and VMs.

Some of the containers include: Nextcloud (Turnkey), Immich (via Docker-Portainer CT).

I have two main storage pools: ssd and hdd both run ZFS (RAIDZ1/2).

I have some user mapping done for the unprivileged Nextcloud and Immich containers, which is working fine as far as I can tell.

I also have some bind mounts, which are accessible inside CT and both CTs run fine (except for below).

I have a backup job for all containers in snapshot mode, which runs at 5:30 AM every day. Total of 6-7 VMs/containers.

The local-lvm is 100GB, and I do get a warning like this when I manually try to take a backup:
INFO: create storage snapshot 'vzdump'
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
Logical volume "snap_vm-103-disk-0_vzdump" created.
WARNING: Sum of all thin volume sizes (400.00 GiB) exceeds the size of thin pool pve/data and the amount of free space in volume group (16.00 GiB).
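That autoextend warning points at /etc/lvm/lvm.conf. A hedged config fragment; the thresholds below are example values, not a recommendation for this specific setup:

```shell
# In /etc/lvm/lvm.conf, activation section: auto-extend the thin pool
# when it reaches 80% full, growing it by 20% each time (this only helps
# if the volume group actually has free space to grow into):
#   thin_pool_autoextend_threshold = 80
#   thin_pool_autoextend_percent = 20

# Check the current values without editing the file:
lvmconfig activation/thin_pool_autoextend_threshold
lvmconfig activation/thin_pool_autoextend_percent
```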

Many times, all the VMs get stuck in a locked state and do not start.
I see this error in the history for the Start job: TASK ERROR: CT is locked (snapshot)
I have to restart the host to get them to start.
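As an aside on the locked state: a stale snapshot lock can usually be cleared without rebooting the host. Container ID 103 here is just taken from the backup log above:

```shell
# Clear a leftover lock on a container after a failed/interrupted vzdump.
# (For full VMs the equivalent is `qm unlock <vmid>`.)
pct unlock 103
```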

Sometimes it happens for all VMs, sometimes only for Immich, or Nextcloud. Other containers/VMs are HomeAssistant VM (ttek helper script), debian LXC for docker, File Sharing Server (Turnkey), Alpine LXC for docker.
 
Have you checked what the disk usage is inside each of your VMs? Sometimes log files get super bloated.

Those warnings are to be expected when you overprovision your storage, but that shouldn't be an issue unless your VMs actually end up using all that storage.
 
None of the VMs have crossed 10% disk usage; allocations are 16/32/64 GB and would not exceed local-lvm even if they all became full.
Only thing that has 16GB allocation on the host is:
root@proxmox:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                   16G     0   16G   0% /dev
tmpfs                 3.2G  1.6M  3.2G   1% /run
/dev/mapper/pve-root   94G  5.3G   84G   6% /
tmpfs                  16G   34M   16G   1% /dev/shm
 
Started moving the services running on Raspberry Pis to Proxmox. I have only a few VMs, and on the Summary page of a VM the Memory Usage graph shows high memory utilization. But when I check inside the OS (Linux), it is actually using very little memory. Any idea why it is showing like this?
 
Single-event upsets (SEUs) from cosmic rays/alpha particles. Try a Faraday cage.
 
Someone on reddit did a similar setup with 2 nodes, with the other acting as a backup node.
Reddit iCloud Replacement

My main issue with moving completely out of the cloud is that with a home solution, you still have a single point of failure. If the data is important, a 3-2-1 backup rule is generally recommended (3 copies on 2 different media, with 1 offsite).
 