uavobjectmanager: Pack UAVObjEvent, saves 4% CPU for whatever the hell. #2211

Open
wants to merge 1 commit into
base: next

glowtape
Member

This and #2210 bring CPU usage on F3 back to the levels of a few weeks ago.

It works. Why? I have no idea.

@tracernz
Member

tracernz commented May 20, 2018

Seems like a secondary effect? Unaligned access is slower (otherwise why wouldn't the compiler pack everything?).
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihjffga.html

> Unaligned word or halfword loads or stores add penalty cycles. A byte aligned halfword load or store adds one extra cycle to perform the operation as two bytes. A halfword aligned word load or store adds one extra cycle to perform the operation as two halfwords. A byte-aligned word load or store adds two extra cycles to perform the operation as a byte, a halfword, and a byte. These numbers increase if the memory stalls. A STR or STRH cannot delay the processor because of the write buffer.
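
To make the quoted rule concrete, here is a minimal sketch that maps an address and access size to the extra load cycles it describes. The function name `unaligned_load_penalty` is made up for illustration, and the sketch assumes no memory stalls and only models loads, since the quote says buffered stores do not delay the processor.

```c
#include <assert.h>
#include <stdint.h>

/* Extra cycles for a load, following the quoted Cortex-M4 rule.
 * Hypothetical helper for illustration; assumes no memory stalls.
 * size_bytes must be 1 (byte), 2 (halfword) or 4 (word). */
unsigned unaligned_load_penalty(uintptr_t addr, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);

    if (size_bytes == 2) {
        /* Byte-aligned halfword: done as two bytes, one extra cycle. */
        return (addr & 1u) ? 1u : 0u;
    }
    if (size_bytes == 4) {
        if ((addr & 3u) == 0) {
            return 0u;          /* word aligned: no penalty */
        }
        if ((addr & 1u) == 0) {
            return 1u;          /* halfword aligned: two halfwords */
        }
        return 2u;              /* byte aligned: byte, halfword, byte */
    }
    return 0u;                  /* single bytes are always aligned */
}
```

By this rule alone, packing a struct can only add cycles per access, which is why the measured saving looks like a secondary effect.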

@mlyle
Member

mlyle commented May 20, 2018

@tracernz -- one possible explanation is that it uses core-coupled memory more efficiently (and more stuff ends up in CCM as a result).

Sometimes on modern stuff packing is advantageous because it uses caches better, but that's not really a factor on M4.
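
For a sense of how much memory packing can free up, here is a minimal sketch comparing a padded and a packed layout. The `padded_event` and `packed_event` structs and `queue_bytes_saved` are hypothetical and are not the real `UAVObjEvent` definition; `__attribute__((packed))` assumes GCC or Clang.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>

/* Hypothetical event layout; NOT the real UAVObjEvent definition. */
struct padded_event {
    uint8_t  type;     /* offset 0, followed by 3 bytes of padding */
    uint32_t obj_id;   /* offset 4 */
    uint16_t inst_id;  /* offset 8 */
    uint8_t  flags;    /* offset 10, followed by 1 byte of tail padding */
};

/* Same fields with all padding removed (GCC/Clang attribute). */
struct __attribute__((packed)) packed_event {
    uint8_t  type;     /* offset 0 */
    uint32_t obj_id;   /* offset 1: no longer word aligned */
    uint16_t inst_id;  /* offset 5: no longer halfword aligned */
    uint8_t  flags;    /* offset 7 */
};

static_assert(sizeof(struct packed_event) < sizeof(struct padded_event),
              "packing should shrink the struct");

/* RAM saved when a queue holds `depth` packed events instead of padded ones. */
size_t queue_bytes_saved(size_t depth)
{
    return depth * (sizeof(struct padded_event) - sizeof(struct packed_event));
}
```

On a typical 32-bit ARM ABI this is 12 bytes padded versus 8 bytes packed, so a 10-entry queue of these hypothetical events would save 40 bytes. Whether a saving of that size changes what lands in CCM depends on the target's allocation pattern.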

@tracernz
Member

> one possible explanation is that it uses core-coupled memory more efficiently (and more stuff ends up in CCM as a result).

I didn't think we used all the CCM on any targets? Maybe that's changed in the last few months.

> Sometimes on modern stuff packing is advantageous because it uses caches better, but that's not really a factor on M4.

Yes, but we don't need to worry about fitting into cache lines when there's no cache. 😛

@glowtape
Member Author

I don't think it's CCM related. Yeah, on the SPRF3e we run out, but when I allocate everything in normal RAM, the CPU savings don't disappear. I'm not sure what else would slip into CCM to account for it. On F4 we have plenty of CCM free after boot, and the packing still has an effect there, too.

@glowtape glowtape added this to the Inconceivable milestone May 21, 2018
@mlyle mlyle modified the milestones: Inconceivable, Ludicrous May 22, 2018