Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add core temperatures for Rockchip RK3588 SoC #1411

Merged
merged 1 commit into from
Aug 29, 2024

Conversation

segabor
Copy link
Contributor

@segabor segabor commented Mar 11, 2024

My attempt to map SoC core temperatures to virtual cores. Rather a hack than a well reasoned change. There must be a proper way to determine chipset so temperature mapping can be done if condition is met (a refactor will be needed).
Also, data[0] is not set. We have two candidates here, center_thermal and soc_thermal values.

soc_temp_values_under_load

SoC Thermal Drivers

sensors utility bundled with lm-sensors package detected the following thermal drivers

npu_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

center_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

bigcore1_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

soc_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  (crit = +115.0 C)

nvme-pci-0100
Adapter: PCI adapter
Composite:    +41.9 C  (low  =  -0.1 C, high = +71.8 C)
                       (crit = +89.8 C)

gpu_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

littlecore_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

bigcore0_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

Future Additions

  • NPU temperature
  • GPU temperature

References

@segabor
Copy link
Contributor Author

segabor commented Mar 11, 2024

This PR addresses #1404

@BenBE BenBE added the Linux 🐧 Linux related issues label Mar 11, 2024
@BenBE BenBE linked an issue Mar 11, 2024 that may be closed by this pull request
@BenBE
Copy link
Member

BenBE commented Mar 11, 2024

I think, instead of hacks on top of other hacks, the overall libsensors-to-cpucore mapping should be addressed properly.

Comment on lines 216 to 242
/* Map temperature values to Rockchip cores
*
* littlecore -> cores 1..4
* bigcore0 -> cores 5,6
* bigcore1 -> cores 7,8
*/
if (existingCPUs == 8 && String_eq(chip->prefix, "littlecore_thermal")) {
data[1] = temp;
data[2] = temp;
data[3] = temp;
data[4] = temp;
coreTempCount+=4;
continue;
}
if (existingCPUs == 8 && String_eq(chip->prefix, "bigcore0_thermal")) {
data[5] = temp;
data[6] = temp;
coreTempCount+=2;
continue;
}
if (existingCPUs == 8 && String_eq(chip->prefix, "bigcore1_thermal")) {
data[7] = temp;
data[8] = temp;
coreTempCount+=2;
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better solution here, instead of hard-coding CPU counts would be to determine the cores associated with each of the groups and assigning the values from there.

Also: There's some indentation issue …

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix the indentation. You're right about the code, it's just a POC. I like @Explorer09 's idea about syncing with the team behind libsensors.

@Explorer09
Copy link
Contributor

I think, instead of hacks on top of other hacks, the overall libsensors-to-cpucore mapping should be addressed properly.

I believe the libsensors-to-cpu-core mapping shouldn't be handled in htop's side. Perhaps we can ask this question in libsensors upstream to see if there is a portable way to describe the mappings.

My impression is that libsensors.conf is supposed to do the job. Or, maybe, ACPI thermal zone (even though I personally hate ACPI).
Once we have a "protocol" to read the mapping from, we would no longer need these chipset-specific hacks.

@BenBE
Copy link
Member

BenBE commented Mar 13, 2024

@Explorer09 Mind to open a discussion with the libsensors people and link it here for reference? It's not just this bug, but quite a few more, that basically boil down to how to perform this mapping. TIA.

@Wer-Wolf
Copy link

Maybe it would be better to create some sort of naming convention for the temperature channels used for the cpu cores.

I am strongly against adding driver-specific code to libsensors, it was done once and proved to be an maintainance nightmare.

With a proper naming convention in place, sensors.conf can be used to do the necessary adjustments.

@BenBE
Copy link
Member

BenBE commented Mar 14, 2024

We'd also strongly prefer not to deal with any of the driver-specific stuff in htop. And honestly, there are two places things could go: Either into sysfs alongside each sensor providing the core/thread information we need (can figure the rest from procfs -> cpuinfo). Or we get this information from the library we use for fetching the temperature information.

Both have their drawbacks, but the alternative would be, that each user of lmsensors would have to write their own code to figure this out instead, which is effectively worse …

@BenBE
Copy link
Member

BenBE commented Mar 14, 2024

@segabor Please squash your commits in this PR. Cf. the styleguide for details.

@BenBE
Copy link
Member

BenBE commented Mar 23, 2024

The latest rebase looks strange …

@segabor
Copy link
Contributor Author

segabor commented Mar 23, 2024

Sorry for the mess @BenBE , hope it's fixed now

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mess seems fixed now …

linux/LibSensors.c Outdated Show resolved Hide resolved
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor code style stuff and a small suggestion.

linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
@isarrider
Copy link

@segabor highly appreacted as I have many rk3588...

@BenBE BenBE added this to the 3.4.0 milestone Aug 29, 2024
@segabor segabor marked this pull request as ready for review August 29, 2024 12:52
@BenBE BenBE merged commit 6de810f into htop-dev:main Aug 29, 2024
18 checks passed
@isarrider
Copy link

wohoo - so happy that my thank you triggered a merge ;)

@farlongsignal
Copy link

thank you for this, i easily adapted this for Qualcomm SDM845 (Oneplus 6 enchilada running PostmarketOS with mainline kernel), it reports temperatures in similar way.
Although i could imagine this getting really redundant if you wanted to support all the other possible Qualcomm chips.
Here in case someone wants it.

@BenBE
Copy link
Member

BenBE commented Sep 18, 2024

thank you for this, i easily adapted this for Qualcomm SDM845 (Oneplus 6 enchilada running PostmarketOS with mainline kernel), it reports temperatures in similar way. Although i could imagine this getting really redundant if you wanted to support all the other possible Qualcomm chips. Here in case someone wants it.

Could you please link the exact commit for future reference? That way you allow people even when your repo at some point contains more commits to easily find the exact change you are referring to in your comment. TIA.

Also looking at your commit I notice a few improvements that could/should be made:

  • Use sscanf in the code assigning the actual values to avoid redundancy. Alternatively you could String_startsWith(str, "cpu") && String_eq(str + 4, "_thermal") and pull the actual index (with range check) from str[3]-'0'+1 if it's a valid digit.
  • Proper alignment in the list of possible drivers at the top.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Linux 🐧 Linux related issues
Projects
None yet
6 participants