There's a healthy interest in Chia farming among members here, but the atmosphere in this forum is a little unkind towards farming/mining in general, so I'm flooded with private messages instead. There's valuable information that should be shared publicly, however, so I'll update this thread intermittently with new developments and discoveries about plotting/farming Chia, without dragging in the names of other members who are also plotting/farming and looking for ways to streamline/optimize their setups.
Up until now, I've been creating these 101GB K32 plot files under Windows 10 VMs across a cluster of Proxmox hypervisors, because my little server farm has prior commitments and cannot afford a bare-metal install on any of its systems. That costs some performance, but the loss should be minimal, and you will see better numbers if you follow along with a native install.
The most powerful systems I have are built around a Ryzen 5900X (purchased locally), an Aorus B550 Pro V2 motherboard (Amazon) and a 64GB (2x32GB) kit of G.Skill DDR4-3600 (PrimeABGB). The system drive is a pulled 512GB Samsung PM961 sourced from a refurbishing dealer and installed on PCH lanes. Temporary storage is handled by 3x 1TB Silicon Power P34A80 (Amazon), all on CPU lanes: one in an M.2 slot and two in an M.2-to-PCIe adapter that MSI bundles with some of its motherboards.
This B550 has a surprisingly flexible BIOS that allows you to bifurcate the x16 CPU lanes into an x8 + x8 or x4 + x4 + x4 + x4 configuration. If I were doing this all over again, I'd choose the Aorus B550 Master: it's the only AM4 motherboard that has both of its M.2 slots connected to the CPU, with the first x16 slot operating at x8 in that configuration. That lets you install three NVMe drives without any expensive/rare PCIe adapters (the third one only needs a simple x4 adapter). B550 motherboards also appear to be perfectly happy with the primary GPU installed in PCH lanes.
Going back to the 1TB drives: they each have a 150GB SLC cache, so in a software RAID configuration they're good for 450GB of continuous writes before speeds drop to around 750MB/s each, or 2250MB/s combined in Windows Storage Spaces or Linux mdadm RAID. (Samsung's 970 series are best suited for this, both the 1TB Evo and the 1TB Pro; the 980 Pro is a downgrade from even the 970 Evo.) The NVMe drives are passed through with their IOMMU groups directly to the virtual machine.
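For anyone reproducing the passthrough on Proxmox, it goes roughly like this. The PCI address and VM ID below are placeholders for illustration, not my actual values, and it assumes IOMMU is already enabled in the BIOS and on the kernel command line (amd_iommu=on):
# List PCI devices per IOMMU group to find the NVMe controllers.
$ for d in /sys/kernel/iommu_groups/*/devices/*; do g=${d#*/iommu_groups/}; g=${g%%/*}; echo "group $g: $(lspci -nns ${d##*/})"; done
# Pass one NVMe controller (example address) straight through to VM 100.
$ qm set 100 -hostpci0 0000:01:00.0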
The other VM on the 5900X system is OpenMediaVault with MergerFS, assigned 2 cores and 4GB of memory. The main plotting VM is assigned all 24 threads and 56GB, leaving 4GB for the hypervisor. Having the storage VM and the plotting VM on the same system lets me use a virtualized 10G Ethernet link between the two to transfer the plots, and the storage VM can later be migrated to another system for long-term farming.
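In case it helps someone replicating the setup, one way to get a host-internal 10G-class link like this on Proxmox is to put both VMs on a host-only bridge with virtio NICs. A minimal sketch, with vmbr1 and the VM IDs 100/101 as placeholders:
# /etc/network/interfaces on the hypervisor: a bridge with no physical port,
# so plot transfers between the two VMs never leave the host.
auto vmbr1
iface vmbr1 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0
# Give each VM a virtio NIC on that bridge:
$ qm set 100 -net1 virtio,bridge=vmbr1
$ qm set 101 -net1 virtio,bridge=vmbr1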
People aiming for over 20TB/day prefer Intel's Clear Linux, but I have not been able to get it working without issues in a VM, and I lost about three precious days trying various configurations (Linux is not my forte). Yesterday, I decided to revisit installing Linux because of apparent performance improvements brought about by an unofficial code optimization published here:
https://github.com/pechy/chiapos/tree/combined
I installed Ubuntu Server 21.04, along with standard installs of chia-blockchain and Swar-Chia-Plot-Manager in the home directory. Resist the temptation to install them outside of the home folder (it causes permission issues) or to rename these folders after you've configured them, because that will break the install and/or the Python virtual environment (I also learned this the hard way). The 3x 1TB drives were configured with optimally aligned partitions using 95% of the available space, assembled in RAID 0 and formatted XFS, mounted with defaults,discard,noatime in /etc/fstab. I'll later set up a cron job for a daily fstrim to keep the SSDs performant.
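For reference, the preparation sequence looks roughly like this; device names (nvme0n1 etc.), the array name and the mount point are examples, so adjust them to your own system:
# Partition each drive with optimal alignment, using 95% of its space
# (the unpartitioned remainder acts as extra over-provisioning).
$ sudo parted --script -a optimal /dev/nvme0n1 mklabel gpt mkpart primary 0% 95%
$ sudo parted --script -a optimal /dev/nvme1n1 mklabel gpt mkpart primary 0% 95%
$ sudo parted --script -a optimal /dev/nvme2n1 mklabel gpt mkpart primary 0% 95%
# Build the RAID 0 array and format it XFS.
$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/nvme0n1p1 /dev/nvme1n1p1 /dev/nvme2n1p1
$ sudo mkfs.xfs /dev/md0
# /etc/fstab entry (in practice, use the UUID reported by blkid):
/dev/md0  /mnt/md0  xfs  defaults,discard,noatime  0  0
# Daily trim, e.g. in root's crontab:
@daily /usr/sbin/fstrim -v /mnt/md0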
With 6 threads and 3584MB of memory assigned in the Swar config file (a sketch of the relevant job settings follows the timings below), these are the numbers I achieved for a single K32 plot with the official chiapos (Chia Proof of Space executable):
$ cat /mnt/md0/logs/official_chiapos* | grep Time
Time for phase 1 = 4965.465 seconds. CPU (216.350%) Sat Jun 5 12:45:43 2021
Time for phase 2 = 3413.913 seconds. CPU (99.980%) Sat Jun 5 13:42:37 2021
Time for phase 3 = 6781.013 seconds. CPU (99.940%) Sat Jun 5 15:35:38 2021
Time for phase 4 = 363.775 seconds. CPU (99.970%) Sat Jun 5 15:41:41 2021
That's about 4 hours and 18 minutes per plot.
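Those two values sit in the per-job section of Swar's config.yaml. A trimmed sketch of what a job block looks like; the name and directories are made-up placeholders, only the threads and memory figures match what I described above, and the exact key names should be checked against the config.yaml.example bundled with the Swar repo:
jobs:
  - name: md0_k32
    size: 32
    threads: 6
    memory_buffer: 3584
    temporary_directory: /mnt/md0/tmp
    destination_directory: /mnt/plots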
After compiling and substituting the chiapos from pechy's combined branch, the improvements were substantial:
$ cat /mnt/md0/logs/pechy_combined* | grep Time
Time for phase 1 = 2970.301 seconds. CPU (286.930%) Sun Jun 6 01:38:11 2021
Time for phase 2 = 3069.988 seconds. CPU (101.820%) Sun Jun 6 02:29:21 2021
Time for phase 3 = 2625.610 seconds. CPU (145.270%) Sun Jun 6 03:13:07 2021
Time for phase 4 = 247.394 seconds. CPU (103.440%) Sun Jun 6 03:17:14 2021
That's 2 hours and 28 minutes! Overall a substantial improvement at minimal extra CPU usage. Memory usage is apparently much higher according to other people's reports, though I have not been able to log this through the CLI.
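For anyone who wants to try the fork, a sketch of one way to build and substitute it, assuming git, cmake and a C++ toolchain are installed and chia-blockchain lives in the home directory as above:
# Grab the combined branch of the fork.
$ git clone -b combined https://github.com/pechy/chiapos ~/chiapos
# Build and install it inside chia's virtual environment, replacing the stock chiapos.
$ cd ~/chia-blockchain
$ . ./activate
$ pip install ~/chiapos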
That's it for now. More after I figure out Input/Output errors for CIFS shares under Linux.