Need help with Debian C states

badwhitevision

Forerunner
I recently built a homelab with the following specs:

1. i3-14100 with stock cooler.
2. MSI Pro H610 M-E.
3. 16GB Crucial DDR5 4800 MHz.
4. Gigabyte 256 GB SSD Gen 3 (GSM2NE3256GNTD).
5. HPE 12 TB.
6. Seagate Exos 7E10 8 TB.
7. Cooler Master 450W MWE Bronze PSU.
8. 3 12V 3 pin Chassis Fans.

At idle, the current system on Debian with all my containers pulls 50 W and
Code:
powertop
only shows up to C2_ACPI.
Powertop.png

Considering this too high, and wanting to work out whether the cause was hardware or software, I tried Ubuntu Live and could immediately see states down to C7. (However, it was shown as just C7 and not C7_ACPI.) Even at C7, there was no appreciable drop in power measured at the wall; it was still idling around 40-45 W.

Then, thinking the OS difference might be the cause, I tried Debian Live and found that there, too, the deepest state shown was C2_ACPI.

Running
Code:
powertop --auto-tune
does not change wall power consumption meaningfully in any of the three operating systems.

Running
Code:
turbostat
gives CPU power consumption (PkgWatt) at idle to be as low as 4 or 6W.

Now, I am confused as to whether the C7 in Ubuntu is equal to the C2_ACPI in Debian.
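
One way to compare the two systems is to read what the kernel itself reports instead of relying on the labels powertop prints. The following is only a minimal sketch, assuming the standard Linux sysfs layout (`/sys/devices/system/cpu/cpuidle/current_driver` and `/sys/devices/system/cpu/cpu0/cpuidle/stateN/` with `name`, `desc` and `latency` files). The function names are made up for illustration and are not part of any tool.

```python
from pathlib import Path

# Standard sysfs location for CPU information (assumed to be present).
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_idle_states(cpu_root=CPU_ROOT, cpu="cpu0"):
    """Return (driver, states): the active cpuidle driver and one CPU's idle states."""
    cpu_root = Path(cpu_root)
    driver = read_text(cpu_root / "cpuidle" / "current_driver")
    # Sort numerically so that state10 comes after state2.
    state_dirs = sorted(
        (cpu_root / cpu / "cpuidle").glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = [
        {
            "index": int(d.name[len("state"):]),
            "name": read_text(d / "name"),
            "desc": read_text(d / "desc"),
            "latency_us": read_text(d / "latency"),
        }
        for d in state_dirs
    ]
    return driver, states


if __name__ == "__main__":
    driver, states = list_idle_states()
    print("cpuidle driver:", driver)
    for s in states:
        print(f"state{s['index']}: {s['name']} ({s['desc']}), "
              f"exit latency {s['latency_us']} us")
```

Running this in both the Debian and the Ubuntu live sessions should show whether they use the same driver and expose the same list of states. If the names and descriptions differ, the labels are probably not directly comparable between the two.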

A few points to note:

The 2 spinning rust drives are made to keep spinning always. I have changed the firmware to ensure that no spindown happens.
Similarly, the system is set to never suspend and runs in power saving mode. However, changing this to Balanced or Performance also doesn't have any appreciable difference, so there's that.
BIOS has the settings for power management at Auto. Changing them all to manual and C10 also doesn't change anything at the OS level.
Power measurements are made via a smart plug whose power factor I had to calibrate, so there's that too.

What are my expectations/questions?

1. What exactly is C2_ACPI, and why doesn't the system go any deeper than that? I know ACPI stands for Advanced Configuration and Power Interface, but I have no understanding of what exactly it does.
2. Why the difference between Ubuntu and Debian live when Ubuntu is literally based off of Debian?
3. What configurations/mods can I do to reduce this to my expected power levels (25-30 W)?
4. I had a drive die because of the frequent spindown caused by a USB enclosure I was using previously (or so I believe), and since then I've configured my drives to run 100% of the time. Should I change this? If I do, wouldn't the frequent spinning up and down of the drives also contribute to power draw?
5. When running Debian Live, I am getting the same power level as when I am running my full container setup. Does this mean all my containers are optimised for power, or can this only be ascertained after actually confirming that the OS is the problem?
6. Fellow homelab users who are not on a hypervisor, what OS do you use and how has your power consumption fared?

Thank you for taking the time to help a fellow member out.

Cheers!!
 
3.) I am not sure about the Linux part of your question, but with a few percent error of the meter itself, the power levels seem correct for your hardware.

Consider that a Bronze PSU will have 60% or less efficiency below about 30% load.

Each HDD will draw about 6-8W when motors are active but heads are not. Add the fans, about 2W each for four fans (8W total), and a CPU, 6-10W. VRM efficiency will be around 70% at low loads (all switched supplies share this characteristic), but for the chosen board it may be lower (as it uses very low quality VRM), let's say this is drawing about 10-12W from the power supply including the SSD, memory and chipset as well. Adding all of this up gets you to about 35-ish watts. With the low PSU efficiency at this load, 50W from the wall is about right.
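
The estimate above can be written out as a quick sum. This is only a rough sketch using the ranges quoted in this post. It assumes the 10-12 W figure already includes the CPU and its VRM losses along with the SSD, memory and chipset, and the 60% and 70% PSU efficiencies are illustrative values, not measurements.

```python
# Per-component DC draw as (low, high) in watts, taken from the ranges above.
# Assumption: the 10-12 W entry already covers the CPU plus VRM losses,
# the SSD, the memory and the chipset.
COMPONENTS = {
    "hdd_1": (6.0, 8.0),
    "hdd_2": (6.0, 8.0),
    "fans_x4": (8.0, 8.0),
    "cpu_vrm_ssd_ram_chipset": (10.0, 12.0),
}


def dc_load_range(components):
    """Sum the low and high estimates to get the DC-side load range in watts."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def wall_power(dc_watts, psu_efficiency):
    """Convert a DC-side load to wall power for a given PSU efficiency (0-1]."""
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in the range (0, 1]")
    return dc_watts / psu_efficiency


if __name__ == "__main__":
    low, high = dc_load_range(COMPONENTS)
    print(f"DC load: {low:.0f}-{high:.0f} W")
    for eff in (0.60, 0.70):  # assumed low-load efficiencies
        print(f"at {eff:.0%} efficiency: "
              f"{wall_power(low, eff):.0f}-{wall_power(high, eff):.0f} W at the wall")
```

With these inputs the DC side comes to 30-36 W, which at 60-70% efficiency works out to roughly 43-60 W at the wall, so a 50 W reading sits inside that range.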

I would consider something like a 200W Meanwell PSU and a Pico-PSU to quickly drop the losses in the PSU. You can save maybe 8-10 watts right there. You're not going to like it, but the only other way to drop this further is to get the drives to sleep (might save around 10W for those durations). If you can't do that, this is pretty much the max. C states will not help you counter VRM losses. I'm not sure if this helps answer your question.

4.) There is a balance between "Never" and "Too often". I use hd-idle (apt install hd-idle) to let the drives sleep after an hour. I've experienced what you have, it's because of low-end Chinese junk being sold on Indian marketplaces. Lost three drives that way.
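
To get a feel for what letting the drives sleep is worth, the saving can be averaged over a day. This is a sketch only: the 10 W saving and the 50 W awake figure come from this thread, while the number of hours the drives actually stay asleep is a pure assumption that depends on how often something touches them.

```python
HOURS_PER_YEAR = 24 * 365


def average_power(awake_watts, sleep_saving_watts, hours_asleep_per_day):
    """Average wall power when the drives are asleep for part of each day."""
    if not 0 <= hours_asleep_per_day <= 24:
        raise ValueError("hours_asleep_per_day must be between 0 and 24")
    asleep_fraction = hours_asleep_per_day / 24
    return awake_watts - sleep_saving_watts * asleep_fraction


def annual_kwh(watts):
    """Energy used in a year at a constant average power, in kWh."""
    return watts * HOURS_PER_YEAR / 1000


if __name__ == "__main__":
    hours_asleep = 12  # hypothetical; depends entirely on the workload
    avg = average_power(50, 10, hours_asleep)
    saved = annual_kwh(50) - annual_kwh(avg)
    print(f"average draw: {avg:.1f} W, saving about {saved:.1f} kWh per year")
```

If the containers wake the drives every few minutes, the hours asleep will be close to zero and so will the saving, so it is worth checking what is accessing the disks before relying on this.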
 
Thank you, cranky, for your detailed answer. Could you point me to your sources on VRM chips and how one grades them? I'd like to make sure that in my next build I can weigh power draw against ROI.

A point on the PSU: aren't the ratings supposed to be measured at 20%, 50% and 100% load, with the unit at least 80% efficient at each of these? I read this while searching for the balance between price and efficiency. Not that it matters in any case, since my build is drawing somewhere around 10% now.
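
For reference, the load fraction can be checked in a couple of lines. This is a sketch assuming the roughly 35 W DC estimate from the reply above and the 450 W rating from the parts list. The test points are the three mentioned in this post.

```python
RATED_WATTS = 450                       # PSU rating from the parts list
RATING_TEST_POINTS = (0.20, 0.50, 1.00)  # load points mentioned above


def load_fraction(dc_watts, rated_watts=RATED_WATTS):
    """Fraction of the PSU's rated output that the system is drawing."""
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    return dc_watts / rated_watts


def below_lowest_test_point(fraction, test_points=RATING_TEST_POINTS):
    """True if the load is lower than every point the rating is measured at."""
    return fraction < min(test_points)


if __name__ == "__main__":
    frac = load_fraction(35)  # DC-side estimate from the earlier reply
    print(f"load: {frac:.1%} of rated output")
    print("below lowest rated test point:", below_lowest_test_point(frac))
```

At roughly 8% load the unit is running below the lowest of those three points, so the rated efficiency figures say little about how it behaves at this idle load.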

As for the pico PSU, one reason I avoided them was the lack of connectors for HDDs and in any case, I'm not in a situation to sink more money into this build right now, so that will have to wait.

I shall look into hd-idle to optimise drive power this week.

Also tagging rsaeon for any pointers.