Proxmox Thread - Home Lab / Virtualization

AwAcS · Aug 14, 2024

The i5-9500T in the M920x Tiny is adequate, and so is the RAM (for a start). Where you would end up having trouble with a mini PC is storage. For a RAID setup, it is advised to use drives of the same make and model. What you can do is boot Proxmox from a 2.5-inch SSD and place 2 x NVMe drives in RAID.
Considering that you want to offload all your cloud data here, Nextcloud would do the trick, but the storage limitations would remain. Hence, you could also look into SFF machines with the same specifications and a future upgrade path, as they also have two, three, or more SATA ports. These also provide a dedicated PCIe slot, and in some cases multiple PCIe slots, in case you want to add a network adapter or GPU.

inc0d3r · Aug 15, 2024

AwAcS said:
The i5-9500T in the M920x Tiny is adequate, and so is the RAM (for a start). Where you would end up having trouble with a mini PC is storage. For a RAID setup, it is advised to use drives of the same make and model. What you can do is boot Proxmox from a 2.5-inch SSD and place 2 x NVMe drives in RAID.
Considering that you want to offload all your cloud data here, Nextcloud would do the trick, but the storage limitations would remain. Hence, you could also look into SFF machines with the same specifications and a future upgrade path, as they also have two, three, or more SATA ports. These also provide a dedicated PCIe slot, and in some cases multiple PCIe slots, in case you want to add a network adapter or GPU.

I prefer the M920x Tiny due to its small form factor. My current storage requirements are 1-2 TB, and long term they should not go beyond 6-8 TB. My plan is to get a cheap 250 GB NVMe SSD [this], put it in an NVMe enclosure, and connect it to the PC via a USB 3.2 port. I will install Proxmox on the 250 GB NVMe and keep the OS and VMs on it. I will add a PCIe to dual NVMe card [this] to add one 2 TB M.2 NVMe and one 2 TB M.2 SATA drive. I will use this M.2 Key A+E (WiFi) to M.2 NVMe adapter [this] to hook up another 2 TB M.2 NVMe (this works, as confirmed by @napstersquest). I will also add 2 TB M.2 NVMe to the dedicated slots. With this I will be able to reach 10 TB of storage at most. The only downside is that I won't be able to use the PCIe port for a dual Gigabit NIC card.

rsaeon · Aug 17, 2024

I had a humbling experience today. Hopefully, there's something in this wall of text that might be helpful to someone else. TLDR at the end.

Each of my proxmox clusters has a Lenovo Tiny that I use as a supervisor node. This supervisor node's job is to report which nodes are online and offline through /etc/pve/.members and this information is pulled in through ssh from another Tiny that's in my automation/monitoring/stats cluster.
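
A minimal sketch of how the supervisor's report could be read, assuming /etc/pve/.members is JSON with a "nodelist" map whose entries carry an "online" flag. The sample text, the node names, and the helper name node_status are made up for illustration; the post does not show the actual script.

```python
import json

# Hypothetical sample of what the supervisor's /etc/pve/.members might contain.
SAMPLE_MEMBERS = """
{
  "nodename": "supervisor",
  "version": 7,
  "cluster": {"name": "lab", "version": 3, "nodes": 3, "quorate": 1},
  "nodelist": {
    "supervisor": {"id": 1, "online": 1, "ip": "10.0.0.1"},
    "node-a": {"id": 2, "online": 0, "ip": "10.0.0.2"},
    "node-b": {"id": 3, "online": 1, "ip": "10.0.0.3"}
  }
}
"""


def node_status(members_text):
    """Return {node_name: True/False} from the text of a .members file."""
    data = json.loads(members_text)
    nodelist = data.get("nodelist", {})
    # A missing "online" key is treated as offline.
    return {name: bool(info.get("online")) for name, info in nodelist.items()}


status = node_status(SAMPLE_MEMBERS)
print(status)
```

On the monitoring machine, the text would presumably come from something like running cat /etc/pve/.members over ssh on the supervisor; that transport step is left out here so the sketch runs anywhere.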

That other Tiny automates shutdowns and startups since I don't have consistent power available to me. We see multiple power outages a day.

Today, the automated startup failed for one of my clusters after a power outage. When I went to see what happened, I saw that the supervisor node's ethernet lights were off. I have all my systems facing back-to-front for both das blinkenlights and the aesthetics of cable management.

Reseated the power connector, but still no ethernet lights. Tested the power connector with a multimeter and saw 0V between the centre pin and outside. A dead power adapter? But how? I have a 5000VA mainline stabilizer connected to an inverter which is then connected to an online ups, with MCBs and surge protectors in between all of them. No surge should've made it through.

I spent most of the night turning half of the house upside down looking for a spare Lenovo adapter — I had a spare Tiny so I should've had a power adapter for it somewhere. Didn't find it anywhere.

So I cut off the connector so that I could splice in another adapter. But I wanted to make sure this adapter was dead — and it wasn't! Measured 20V on the bare stripped wire.

I went to check the continuity on the other half of the wire I'd cut off, the one with the connector. There was none. No continuity between the centre pin and the bare wire I'd stripped off. Was there a loose or bad connection from fatigue?

I tried twisting the strain relief and intermittently heard a beep from the multimeter. A few more beeps later and I realize the centre pin is not a power pin, the power contacts are on the inside of the connector, and the centre pin is probably just a sense pin.

The power adapter was perfectly fine. I hastily spliced the wire back together, checked for 20V, found it, and plugged it in. But still no ethernet lights.

So now I'm thinking I need to swap in my spare Tiny. Took this one out and started disassembling it since I needed the processor, memory and storage from it. Briefly considered it might not be starting because of a bad BIOS battery. I've seen that in the past. Battery tests 3.3V. Put it back in and tried powering it on with just one stick of RAM and processor and nothing else.

It powered up! Unbelievable. Did this mean that something died and prevented it from starting up? Tried two sticks of RAM, and it turned on. Plugged in the cooling fan, it turned on. Slotted in the SSD, it turned on. Added the ethernet adapter, and it turned on. What? I closed up the Tiny and it powered up just fine.

Connected the ethernet cables, no lights. Brought in a spare keyboard and monitor and I see Proxmox's login screen. Got in and ran ip a. All interfaces were accounted for, except the physical ones showed NO-CARRIER.

It's been seven hours into this and there was nothing wrong with this Tiny. It was powering on just fine, I just couldn't see the front fascia with how it was installed so I went by just the ethernet lights and wrongly assumed it was not powering on. The first lesson learned: check both front and rear lights before assuming something is dead.

But why were the ethernet lights off? Turns out when I redid the aesthetic cable management a few months ago, I plugged the cable for this Tiny into a switch that would be powered off during extended power outages.

During such outages, the network is powered off five whole minutes after the cluster is powered off, plenty of time for the automation cluster to see that the nodes went offline. The node in charge of safe shutdowns and startups will only execute a cluster startup if all the nodes are offline, since it does the startup by toggling smart plugs and you don't want to be in a situation where a running node abruptly loses power.

I shifted focus to the automation cluster and the Tiny responsible for shutdowns and startups. Uptime was a few hours; it should've been a few months. Something happened. Somehow a surge got through the many surge protectors, the stabilizer, the inverter, and the eye-wateringly-expensive online-freaking-ups that's not in eco mode, and this Tiny was restarted.

When it restarted, it could not communicate with the supervisor node since that didn't have network connectivity. Without any way of knowing which nodes were online, it never executed the automated startup sequence. And that is exactly how I intended things to work but forgot all about it.
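
The rule described above (start the cluster only when every node is known to be offline, and do nothing when the supervisor cannot be reached) can be sketched as follows. The function name and the use of None for "supervisor unreachable" are assumptions for illustration, not the actual automation code.

```python
from typing import Mapping, Optional


def should_start_cluster(status: Optional[Mapping[str, bool]]) -> bool:
    """Decide whether it is safe to power the cluster on via the smart plugs.

    status maps node name -> online flag, or is None when the supervisor
    could not be queried (for example, its switch is powered off).
    """
    if not status:
        # Unknown or empty state: refuse to act, since toggling a smart plug
        # could cut power to a node that is actually running.
        return False
    # Safe only when no node reports as online.
    return not any(status.values())


print(should_start_cluster(None))
print(should_start_cluster({"node-a": False, "node-b": False}))
print(should_start_cluster({"node-a": True, "node-b": False}))
```

Failing closed like this is what kept the startup from running in the story: with the supervisor unreachable, the state was unknown, so nothing was toggled.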

My brain feels like mush right now. I don't know what safeguards I need to put in place to prevent this from reoccurring, I just know that I need more safeguards. Also, I need to move the Tinys to a DC power system since apparently, AC isn't reliable even with a low-six-digit investment in power rectification. That'll be the other lesson learned.

TLDR: Consider shifting focus to agriculture, it'll be relaxing.

variablevector · Aug 17, 2024

Why do you think it was caused by a power surge and not some kind of software/hardware bug? If the automation cluster is just some old thin clients, they could have failed for any number of other reasons, like the power rails on the thin client itself, some other hardware defect, or a bug in the kernel/Proxmox/whatever. Maybe even memory errors, because you're not using ECC memory.

rsaeon · Aug 17, 2024

It is my oldest Tiny but it's still a socketed Lenovo M910Q, I don't expect to see hardware failures with that product line. Also, previous uptimes were in the range of months because I only ever turned them off during >7h power outages. Also also the uptime placed the restart within minutes of when power was restored by the electricity department here.

Also also also, I haven't seen any failures from the Dells or Lenovos I've owned in all these years, it's always been the gaming focused desktop-class motherboards that died randomly.

The thin clients that I have are on the less-important home networking cluster, all of my own stuff are socketed systems.

TEUser2K1 · Aug 17, 2024

rsaeon said:
Consider shifting focus to agriculture, it'll be relaxing.

r/sysadmin is famous for sheep herding (joke) as a trade after sysadmin. Seems somebody did it successfully.

In agriculture, indoor farming, aquaponics, etc. are trending in some places now and involve a lot of electronics, sensors, data monitoring and collection, etc. A very interesting field.

rsaeon · Aug 17, 2024

variablevector said:
Why do you think it was caused by a power surge

Between your comment here and a couple on r/homelab, I took a look at the UPS logs — it didn't occur to me earlier because I thought I had plenty of backup power.

Anyway it was the UPS, it was configured to cut off power 2 minutes after a low battery alarm, I've now changed it to 30 minutes. I probably need new batteries.

Decadent_Spectre · Aug 17, 2024

That's an interesting story.

Somewhat unrelated but I don't really ever rely on automations and do things manually even if an automated option exists. Now I am sure I don't possess the software knowledge of many people and also I don't have the need for automations as I don't run multiple PCs/clients/servers but I do recommend that if possible one should do things manually. Call me paranoid but I don't trust computers.

Skynet is an eventuality if the human race survives long enough.

napstersquest · Sep 26, 2024

Scratching my head here. Need expert advice!

I have an x64 PC as a Proxmox server, running multiple containers and VMs.

Some of the containers include: Nextcloud (Turnkey), Immich (via Docker-Portainer CT).

I have two main storage pools: ssd and hdd both run ZFS (RAIDZ1/2).

I have some user mapping done for the unprivileged Nextcloud and Immich containers, which is working fine as far as I can tell.

I also have some bind mounts, which are accessible inside CT and both CTs run fine (except for below).

I have a backup job for all containers in snapshot mode, which runs at 5:30 AM every day. It covers a total of 6-7 VMs/containers.

The local-lvm is 100GB, and I do get a warning like this when I manually try to take a backup:

INFO: create storage snapshot 'vzdump'
  WARNING: You have not turned on protection against thin pools running out of space.
 WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "snap_vm-103-disk-0_vzdump" created.
 WARNING: Sum of all thin volume sizes (400.00 GiB) exceeds the size of thin pool pve/data and the amount of free space in volume group (16.00 GiB).

Many times all the VMs get stuck in a locked state and do not start.
I see this error in the history for the Start job: TASK ERROR: CT is locked (snapshot)
I have to restart the host to get them to start.

Sometimes it happens for all VMs, sometimes only for Immich or Nextcloud. The other containers/VMs are a HomeAssistant VM (tteck helper script), a Debian LXC for Docker, a File Sharing Server (Turnkey), and an Alpine LXC for Docker.
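
For readers hitting the same "CT is locked" error, here is a small sketch of checking a guest config for a leftover lock and building the matching unlock command. It assumes the lock shows up as a "lock:" line in the top section of the guest's config text; the sample config and the ID 103 (borrowed from the log above) are illustrative only. pct unlock and qm unlock are real Proxmox commands, but clearing a lock while a backup is still running is risky, so treat this as a sketch rather than a fix.

```python
from typing import List, Optional

# Hypothetical container config text with a leftover lock.
SAMPLE_CONFIG = """\
arch: amd64
hostname: immich
lock: snapshot
memory: 4096
"""


def find_lock(config_text: str) -> Optional[str]:
    """Return the lock type from the main config section, or None if unlocked."""
    for line in config_text.splitlines():
        if line.startswith("["):
            # Snapshot sections begin here; only the main section matters.
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lock":
            return value.strip()
    return None


def unlock_command(vmid: int, is_container: bool = True) -> List[str]:
    """Build (but do not run) the unlock command for a container or VM."""
    tool = "pct" if is_container else "qm"
    return [tool, "unlock", str(vmid)]


lock = find_lock(SAMPLE_CONFIG)
if lock is not None:
    print("locked:", lock, "->", " ".join(unlock_command(103)))
```

The command is only printed, not executed, so it is easy to review before running anything on the host.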

rsaeon · Sep 27, 2024

Have you checked what the disk usage is inside each of your VMs? Sometimes log files get super bloated.

Those warnings are to be expected when you overprovision your storage, but that shouldn't be an issue unless your VMs actually end up using all that storage.
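
To make the overprovisioning point concrete, here is a rough sketch of the arithmetic behind that warning. The pool size, the per-volume sizes, and the used figure are hypothetical; only the 400 GiB total is chosen to match the log above.

```python
def thin_pool_report(pool_gib, volume_sizes_gib, used_gib):
    """Compare what was promised to guests with what the thin pool holds."""
    provisioned = sum(volume_sizes_gib)
    return {
        "provisioned_gib": provisioned,
        # How many times over the pool has been promised to guests.
        "overcommit_ratio": provisioned / pool_gib,
        # True is what triggers the "Sum of all thin volume sizes" warning.
        "overprovisioned": provisioned > pool_gib,
        # The number that actually matters: real blocks written to the pool.
        "pool_used_pct": 100 * used_gib / pool_gib,
    }


# Hypothetical figures: a 100 GiB pool, volumes adding up to 400 GiB,
# and 12 GiB actually written.
report = thin_pool_report(100, [16, 32, 64, 64, 64, 64, 96], 12)
print(report)
```

The warning only says the promises exceed the pool; trouble starts when the used percentage approaches 100, which is why checking actual usage inside each guest is the useful first step.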

napstersquest · Sep 27, 2024

None of the VMs have crossed 10% disk usage. The allocations are 16/32/64 GB and do not exceed the local-lvm size even if all of them fill up.
Only thing that has 16GB allocation on the host is:
root@proxmox:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                   16G     0   16G   0% /dev
tmpfs                 3.2G  1.6M  3.2G   1% /run
/dev/mapper/pve-root   94G  5.3G   84G   6% /
tmpfs                  16G   34M   16G   1% /dev/shm

bobbyprajan · Sep 29, 2024

I started moving the services running on my Raspberry Pis to Proxmox. I have only a few VMs, and the Memory Usage graph on each VM's Summary page shows high memory utilization. But when I check inside the OS (Linux), it is actually using very little memory. Any idea why it is showing like this?

vishalrao · Sep 29, 2024

bobbyprajan said:
I started moving the services running on my Raspberry Pis to Proxmox. I have only a few VMs, and the Memory Usage graph on each VM's Summary page shows high memory utilization. But when I check inside the OS (Linux), it is actually using very little memory. Any idea why it is showing like this?

Memory "over commit" maybe?

AI to the rescue -> https://chatgpt.com/share/66f95793-8c54-8003-b959-08d6ea41b157

bobbyprajan · Sep 29, 2024

vishalrao said:
Memory "over commit" maybe?

AI to the rescue -> https://chatgpt.com/share/66f95793-8c54-8003-b959-08d6ea41b157

Thank you, I thought it was due to some misconfiguration here
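
One possible reason for this kind of mismatch (an assumption, not something confirmed in this thread) is that "used" memory looks very different depending on whether the Linux page cache is counted. The sketch below shows both views from /proc/meminfo-style text; the sample numbers are made up.

```python
# Hypothetical excerpt in the format of /proc/meminfo from inside a guest.
SAMPLE_MEMINFO = """\
MemTotal:        4028440 kB
MemFree:          201340 kB
MemAvailable:    3301200 kB
Buffers:          102400 kB
Cached:          2900000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into {key: value_in_kb}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


def used_views(info):
    """Return used-memory percentages with and without cache counted."""
    total = info["MemTotal"]
    return {
        # Everything that is not completely free, cache included.
        "used_counting_cache_pct": round(100 * (total - info["MemFree"]) / total, 1),
        # What applications could not reclaim if they asked for memory.
        "used_excluding_cache_pct": round(100 * (total - info["MemAvailable"]) / total, 1),
    }


views = used_views(parse_meminfo(SAMPLE_MEMINFO))
print(views)
```

If a graph reports the first figure while tools inside the guest report something closer to the second, the VM can look nearly full from outside while being mostly idle inside.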

Mann · Oct 2, 2024

rsaeon said:
I had a humbling experience today. Hopefully, there's something in this wall of text that might be helpful to someone else. TLDR at the end.

Single-event upsets (SEUs) from cosmic rays/alpha particles. Try a Faraday cage.

manu007 · Oct 2, 2024

inc0d3r said:
I prefer the M920x Tiny due to its small form factor. My current storage requirements are 1-2 TB, and long term they should not go beyond 6-8 TB. My plan is to get a cheap 250 GB NVMe SSD [this], put it in an NVMe enclosure, and connect it to the PC via a USB 3.2 port. I will install Proxmox on the 250 GB NVMe and keep the OS and VMs on it. I will add a PCIe to dual NVMe card [this] to add one 2 TB M.2 NVMe and one 2 TB M.2 SATA drive. I will use this M.2 Key A+E (WiFi) to M.2 NVMe adapter [this] to hook up another 2 TB M.2 NVMe (this works, as confirmed by @napstersquest). I will also add 2 TB M.2 NVMe to the dedicated slots. With this I will be able to reach 10 TB of storage at most. The only downside is that I won't be able to use the PCIe port for a dual Gigabit NIC card.

Someone on reddit did a similar setup with 2 nodes, with the other acting as a backup node.
Reddit iCloud Replacement

My main issue with moving completely out of the cloud is that with a home solution, you still have a single point of failure. If the data is important, the 3-2-1 backup rule is generally recommended (3 copies on 2 different media, with 1 offsite).
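
The 3-2-1 rule is easy to express as a check. The sketch below is illustrative only; the Copy type, the media labels, and the example plans are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # where this copy lives (hypothetical label)
    media: str     # kind of storage, e.g. "nvme", "hdd", "cloud"
    offsite: bool  # True if it is stored away from home


def meets_3_2_1(copies):
    """At least 3 copies, on at least 2 kinds of media, at least 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


two_node_home_setup = [
    Copy("main node", "nvme", offsite=False),
    Copy("backup node", "nvme", offsite=False),
]
with_offsite = two_node_home_setup + [Copy("remote backup", "cloud", offsite=True)]

print(meets_3_2_1(two_node_home_setup))
print(meets_3_2_1(with_offsite))
```

A two-node setup at home falls short on all three counts in this example, which is the single-point-of-failure concern in a nutshell.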

burntwingzZz · Nov 22, 2024

Total noob on Proxmox/Linux etc., but I want to jump on the Proxmox/home NAS bandwagon.

How do we set up storage? I have a small 250 GB SATA SSD, a 2 TB NVMe SSD, and multiple old SATA HDDs of various sizes from 160 GB to 1 TB (over time I intend to move to larger drives or even SSDs). I'm not looking for RAID setups but would like to store a lot of media etc. The hard drives are old and could fail, and over time they might be replaced by other old hard drives of various capacities. On Windows I encountered an HDD failure and was able to recover almost 100% of the data (3 TB) using TestDisk from a live Linux with the image method: I mounted the image and ran chkdsk, which was able to fix the errors. But suppose something in TrueNAS/ZFS/Ceph gets corrupted: without RAID, would I be able to recover the data from a single disk in the same way?

To summarize: what would be the better storage option where multiple VMs can be hosted with their storage requirements, and which also covers file shares/shared network folders, ISO storage for OS/live OS/PXE boot, and maybe Nextcloud etc.?

Will I be able to directly pass through an NTFS hard drive? Does it change the file format, and will the data be retained, or would everything be formatted?

I have an Intel quad-port NIC; I don't have an HBA card. The system is a Ryzen 1700 on a B350 board with the Intel NIC.

My VMs would be: a firewall (Sophos Home Edition/OPNsense/pfSense), a VLAN for LAN (WiFi and home devices), a dedicated VLAN for VPN-like services, and a dedicated VLAN for Nextcloud/private-cloud-like apps. Internal apps would be in a dedicated VLAN.

One of the most critical aspects for me is being able to oversubscribe my CPU cores across multiple containers or small app services:

4 -cores for Sophos ( it has its own requirements)
8-cores for Eve-ng / GNS3 vm

remaining 4 cores i want to over subscribe across multiple services/app containers etc.

++bump
++bump
@rockyo27 any help on storage and CPU oversubscription?

Rayman · Nov 23, 2024

It's hard to understand exactly what you want, but I will try to answer a few points for starters.
Yes, you can configure storage easily, but it's tricky if you want Proxmox to use an existing drive that has data on it. Ideally you would wipe it after plugging it in, but there is a workaround if you don't want to format the disk. I don't recall the website right now.
Yes, you can overcommit your CPU cores without any issues, regardless of whether you are using VMs or LXCs. If you want to set up LXCs easily, you can check out the community scripts for Proxmox (tteck passed away, so the project was taken up by other people).

Now, coming to the actual implementation, it's difficult to say what will work with your hardware. You will have to test out a lot of stuff. So happy tinkering!!!

rsaeon · Nov 23, 2024

burntwingzZz said:
Will i be able to directly passon NTFS hard drive /it changes the file format and data will be retained

Yes, it would appear as a native drive to the VM however Proxmox wouldn't be able to access it except through the VM.

For example if you passthrough a 1tb drive to a nas vm, and set it up as a share, proxmox can access that share when the vm is online, but it won't be able to read the drive directly even when the vm is offline.

When you pass through a drive or any hardware, it becomes unavailable to the host as long as it is attached to the vm. Think of it as slicing out a piece of the host and attaching it to the vm.

Any backups you do of the VM with Proxmox will only back up the VM's system drive, not the passed-through drives.

I don't use ZFS with proxmox mostly because I don't understand it.

On my Chia cluster, I set up a ssd as the main drive and put 8gb for root for Proxmox to use during install and the remaining is used up as a virtual disk for the Windows vm. Another vm on the same machine is OpenMediaVault which also has a virtual disk on the same ssd. All of the Chia drives are passed to OMV, which shares them as read-only. Chia node/wallet runs on the Windows VM and accesses the plots through samba.

So you can skip ZFS and use a single SSD for both Proxmox and the VM virtual disks, and then pass through any number of additional disks to the VMs separately. These disks can be SATA, USB, or NVMe.

But since you're not using ZFS, you'll need some kind of redundancy in the form of backups.

burntwingzZz said:
one of the most critical aspect i would like is to over subscribe my cpu cores for multiple containers or small apps services

This part is pretty easy. Some of my 6C12T nodes have running VMs that total over 100 cores. I stop overprovisioning when the average CPU usage of the node is around 70%.
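
As a rough illustration of that rule of thumb, here is a sketch that computes the vCPU overcommit ratio and applies the 70% cut-off. The function names and the sample VM list are hypothetical; the 12 threads and the roughly 100 vCPUs come from the numbers above.

```python
def overcommit_ratio(vcpus_per_guest, host_threads):
    """Total vCPUs handed out to guests divided by the host's hardware threads."""
    return sum(vcpus_per_guest) / host_threads


def can_add_guest(avg_cpu_pct, threshold_pct=70.0):
    """Rule of thumb: keep adding guests while average host CPU stays below the threshold."""
    return avg_cpu_pct < threshold_pct


# Hypothetical node: 25 guests with 4 vCPUs each on a 6C12T host.
guests = [4] * 25
ratio = overcommit_ratio(guests, host_threads=12)
print(f"{sum(guests)} vCPUs on 12 threads, ratio {ratio:.2f}")
print(can_add_guest(55.0))
print(can_add_guest(70.0))
```

The ratio on its own says little; the measured average CPU usage of the node is what decides when to stop, since most small services sit idle most of the time.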

NJ06 · Nov 29, 2024

Hi, I started dabbling in homelab a few months back. I have a tiny PC with a 12th-gen Intel CPU as my Proxmox node and added a NAS PC running TrueNAS a week back.
I need advice from you guys about a peculiar issue I am facing. My TrueNAS PC shuts down on its own after some time. There are no errors when I restart it using the hardware power button.
On my Proxmox node, I have a VM running a media server using Docker, along with some LXCs for AdGuard, NPM, etc. The node runs fine as long as the media server VM is not running. But when I run it, the whole machine becomes unreachable, even though the tiny PC is still on. I have had it do this even while watching a video from the media server through Plex. It seems high read/write load crashes it.
Any help, especially on the NAS machine shutting down, would be much appreciated.
