It's always exciting to follow along with HEDT builds.
Thank you so much for taking the time to share your views here. You are one of the most well-versed and expert Proxmox users on the forum, and your insights will greatly help me in this project.
Do you know if your application scales linearly with more cores in a single system? There are very few that do, and almost all computing these days benefits more from additional workers than from additional cores.
Yes, most of my applications are able to scale linearly with more cores. I will need to do some practical testing on machines with 64 or more cores to see the real-world performance improvements, though.
One advantage that I have is that my data files all share the same schema, so they can be broken down into smaller pieces and the workflow can be run in parallel on those individual pieces. For example, instead of running a single workflow on a 10 GB data file, I can first split the file into two parts of 5 GB each and then run two separate workflows on these 5 GB files to get the output sooner.
In both cases the workflow stays exactly the same, the input files keep the same format (they are just half the size), and the output has the same format as well. The total running time goes down because both processes run in parallel, rather than the second half waiting for the first to complete.
This cannot be done in each and every case, but it can be done for more than 90% of my data files, which are all related to historical stock market data.
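To make the splitting idea above a bit more concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: the rows are already loaded in memory as dictionaries, the column name `close` and the function names `split_rows`, `partial_close_stats`, `combine` and `run_in_parallel` are all made up for this example, and the "workflow" is a trivial average that can be computed piece by piece. My real workflows are different, and this approach only applies to the ones whose results can be merged after the fact.

```python
from concurrent.futures import ProcessPoolExecutor


def split_rows(rows, n_parts):
    """Split a list of rows into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks get one additional row each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def partial_close_stats(rows):
    """Hypothetical stand-in workflow: (sum, count) of the 'close' column."""
    closes = [float(r["close"]) for r in rows]
    return sum(closes), len(closes)


def combine(partials):
    """Merge the per-chunk (sum, count) pairs into one overall average."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


def run_in_parallel(chunks, workflow, executor_cls=ProcessPoolExecutor):
    """Run the same workflow on every chunk, one worker per chunk.

    Results come back in the same order as the chunks.
    """
    with executor_cls(max_workers=len(chunks)) as pool:
        return list(pool.map(workflow, chunks))
```

In use, this would be something like `combine(run_in_parallel(split_rows(rows, 2), partial_close_stats))`. On Windows the call has to sit under an `if __name__ == "__main__":` guard, because worker processes re-import the main script. The speed-up will only approach 2x if the workflow is really CPU bound; if the disk or memory is the limit, splitting the file will not help much.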
I started out with 16C/32T systems but I found myself being limited by memory (256GB for HEDT) well before I became CPU limited. Subsequently, I dropped down to 6C/12T with 128GB each and even then CPU usage didn't reach 50% before memory became a limiting factor.
I will need to do similar testing myself to get a clear picture of the thresholds at which my workflows start to hit a bottleneck, whether from CPU, memory, or SSD I/O limits.
For my use case (CPU crunching/web scraping) it made more sense to have a large number of smaller machines than a smaller number of high-core-count machines. Parts were cheaper, downtime was much lower, ROI was far quicker, and power consumption was much lower. With larger servers, more of your productivity goes down if any maintenance or troubleshooting needs to be done. I have about a dozen 128GB machines now and it's barely a blip in output if any one of them goes down for a reboot.
These points make a lot of sense. Having multiple machines with lower specifications definitely has its advantages. It just depends on the workload. If this approach suits my requirements better, then I will definitely explore the idea of running smaller virtual machines on different servers, instead of having one single huge virtual machine using up the hardware of multiple servers.
With Proxmox, a VM is configurable when it's offline but not while running. However, I've never needed to scale back resources because Proxmox is very good at dealing with provisioning.
I am totally fine with having to shut the VM down and take it offline before resizing it, depending on my requirements at that moment. I don't really need all these hardware resources when I am not running backtesting or some other heavy data analysis workflow. At those times, a simple 8-core VM is more than sufficient for all my routine activities. Therefore I will be physically turning these server machines on and off according to the processing power I need, and I will reconfigure the Windows VM size accordingly from time to time. All of this is to save on electricity costs, as there is no point in keeping all these servers running while they are not being used for data analysis.
For example, I had 50 dual-core Windows VMs running off a single quad core system with 32GB of memory. These were used for downvoting my enemies on reddit, and they worked very well.
You must be KIDDING ME, right?
But if you are serious, then I would tell your enemies on reddit that it is a very bad idea to mess with a guy who has so many powerful VMs at his disposal.
As for monitors, I have five and I want to add two more some time when my budget allows. Most of those screens are used for real-time monitoring of resources across all of my servers. Eight screens would be a dream.
Thank God, I have finally found someone who understands the practical value of having lots of screen real estate. Any type of real-time monitoring becomes so much easier and more effortless when you are able to spread that information across multiple screens. Otherwise, just the task of continuously flipping between multiple tabs and windows creates additional stress.
Thanks again for your input. I request you to please keep visiting this thread in the future as well and keep on guiding people like me who want to build their own home servers.
Thanks a lot.