Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow RPM to be sent directly to hardware, having no fan PID loop #32

Open
Krellan opened this issue Feb 22, 2023 · 8 comments
Open

Allow RPM to be sent directly to hardware, having no fan PID loop #32

Krellan opened this issue Feb 22, 2023 · 8 comments

Comments

@Krellan
Copy link
Member

Krellan commented Feb 22, 2023

Right now, the architecture of phosphor-pid-control assumes each thermal zone will have 2 cascaded PID loops:

  • The thermal loop, to decide on the necessary fan RPM to keep the system cool
  • The fan PID loop, to decide on the correct PWM percentage value, to command to the fan motors, to achieve that desired RPM

Newer hardware, with a smart fan controller, has the possibility of accepting a RPM value sent directly to hardware. The controller then becomes responsible for dealing with the fan motors to achieve that. It would often be more desirable to do this. The workaround is to have a dummy fan PID loop that is 1:1 input and output, but this is cumbersome.

To fully realize this feature, we will need:

  • Linux hwmon/kernel support for allowing RPM to be written directly, like PWM is today (this may involve toggling a mode in the hardware, not just writing a value, because some smart fan controllers operate in either "RPM mode" or "PWM mode" and you have to tell them what mode you are using in advance)
  • OpenBMC D-Bus sensor interface support for allowing RPM to be written directly, like PWM is today (perhaps a separate Control interface becomes necessary, as writing to an OpenBMC sensor merely fakes out the value (useful for testing), and does not actually push a value to the hardware)
  • A way for phosphor-pid-control to know that it should operate in this mode (perhaps this would be obvious by simply not having any fan PID loop in the thermal zone)

Anything else I forgot?

@Krellan
Copy link
Member Author

Krellan commented Feb 27, 2023

As for the dummy fan PID loop, here's how:

  • Set the feed-forward gain coefficient to 1.

  • Set all other coefficients to zero.

This gives you a PID loop that has no purpose but to make a 1:1 copy of its input to its output.

Use this as your fan PID loop. Set the output to your RPM sensors, not your PWM sensors.

However, a big gotcha. You will need to modify the sensor drivers accordingly, because usually the RPM sensor is not settable directly. You might need to perform the same customizations that were done for fan PWM sensors, to set up a control interface, such that the writing can be sent directly to the hardware, and not just to the sensor value itself (which would have no effect on the actual hardware).

Look at the special relationship between the TachSensor and FanSensor classes in the dbus-sensors package for more details: https://github.com/openbmc/dbus-sensors/blob/master/src/FanMain.cpp

@harveyquta
Copy link
Contributor

harveyquta commented Mar 1, 2023

I have a question about Fan RPM but not very relevant to this issue.
If there are two or more fan models in one case, it seems that using PWM control is better? Even if the smart fan controller has RPM mode. Because the fan's max/min RPM may not the same.

Another question, if using RPM control, it seems the second PID(fan PID loop) has to be ignored, like what you setting. Because the sys file fanX_input isn't the monitor RPM value(Maybe there are some differences between every smart fan controller), it cannot use fanX_input as input to calculate PID. Is my understanding correct?

@Krellan
Copy link
Member Author

Krellan commented Mar 10, 2023

That is a good point. If min/max RPM is different between your fans, then you have different models of fans installed in the same system. I would think that the best approach then would be to use more than one thermal zone. Use one thermal zone per fan, unless you have fans of identical model running in parallel to cool the same components. It is OK (and encouraged) to reuse the same temperature sensor as input into multiple thermal zones. Then, your PID loops, in each thermal zone, could adapt to the differences in your fan models accordingly.

As for using RPM control, as with a smart fan controller that does not need PWM, I am not sure about what you are asking. You want the fan PID loop to be a no-op. The units of it do not matter, because you are just multiplying it by 1, which will result in the output equalling the input, regardless of what units are in use. You could probably use the same sensor for both the input and the output in this case, since the PID loop would not be making any differences to the calculation, I think (I have not tested this approach, though).

@Krellan
Copy link
Member Author

Krellan commented Apr 4, 2023

To dummy out a PID loop, here's some advice and correction to the above. There's a difference between copying input to output, and copying setpoint to output.

To copy input to output, set the KP coefficient to 1, set the setpoint to 0, set everything else zero.

To copy setpoint to output, disregarding input, set the feed-forward Gain coefficient to 1, set everything else zero.

This gives you a 1:1 copy, without doing any further processing.

@namjoshiniks
Copy link

Thanks Josh! One nice to have feature is for swampd to provide a way to run open loops.

Today, it seems if we want to disable the fan loop, we still need to provide mock tach sensors (with rpm set to 0 ) as input (along with the other things you mentioned above), to make it work like an open loop.

@Krellan
Copy link
Member Author

Krellan commented Apr 4, 2023

Sounds reasonable. One way to specify an open loop would be to have a PID loop with an Inputs array that is completely empty (or the Inputs field itself missing). With no configured inputs at all, this could tell phosphor-pid-control to just blindly copy the setpoint to the output, and do no further processing. Right now, this configuration is disallowed and considered an error, so this wouldn't break any existing behavior.

@Krellan
Copy link
Member Author

Krellan commented May 17, 2023

Thought about it a little more. Another limitation came up, in the way that RPM sensors are currently handled.

There's no way to tell the difference between the desired value (as set by the user) and the actual value (as coming from the hardware). This is not important now for PWM, because PWM changes are essentially one-way, from user to hardware. A properly working fan hardware driver will not unilaterally change the PWM value on its own.

However, this will become important when we gain the ability to set the RPM value directly, by bypassing the PWM-to-RPM pipeline. There needs to be a way to maintain visibility into both of these values, to distinguish the desired RPM from the actual physical RPM, because the fan motor can not respond instantly. There will always be a little delay. Knowing this delay is important for a thermal engineer doing thermal tuning.

One nice thing about the PWM sensors is that they have a separate Control interface, which is in a separate D-Bus object. This is in addition to the normal D-Bus Sensor object, which provides the traditional read-only Value interface. This is a good solution for PWM, as it separates outbound control from inbound value. We will need to replicate something like this design, if we are to successfully add the ability to set RPM directly.

@Krellan
Copy link
Member Author

Krellan commented Jun 8, 2023

The usefulness of this feature also applies to another use case as well.

Sometimes, the RPM value is not needed at all by the user. The thermal PID loops are tuned to output the desired PWM percentage, which is sent directly to the fan hardware. RPM does not factor into the calculations at all. This is another case in which the user will need to create a dummy fan PID loop, to just do a 1:1 copy of the setpoint to the output.

It would be nice to make the configuration of phosphor-pid-control flexible enough that the existence of the second stage of the PID loop pipeline becomes optional, just to avoid the overhead of having to run a dummy PID loop in its place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants