Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AArch32: vpop (sreg variant) had faulty mult_addr ordering #6605

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Sleigh-InSPECtor
Copy link
Contributor

As part of a research project testing the accuracy of the SLEIGH specifications compared to real hardware, we observed an unexpected behaviour in the vpop instruction for both, AArch32 (ARM:LE:32:v8) & Thumb (ARM:LE:32:v8T).

According to the manual, it loads multiple consecutive Advanced SIMD and floating-point register file registers from the stack. However, we noticed the output was incorrect.


e.g, for AArch32 with,

Instruction: 0x020abd7c, vpopvc {s0,s1}
initial_memory: { "0x0": [ 0xbb, 0xda, 0x60, 0xca, 0xe8, 0x7c, 0x26, 0xcf ] }

We get:

Hardware: { "q0": 0xcf267ce8ca60dabb, "sp": 0x8 }
Patched Spec: { "q0": 0xcf267ce8ca60dabb, "sp": 0x8 }
Existing Spec: { "q0": 0xca60dabbcf267ce8, "sp": 0x8 }


e.g, for Thumb with,

Instruction: 0xbdec020a, vpop {s0,s1}
initial_memory: { "0x0": [ 0xbb, 0xda, 0x60, 0xca, 0xe8, 0x7c, 0x26, 0xcf ] }

We get:

Hardware: { "q0": 0xcf267ce8ca60dabb, "sp": 0x8 }
Patched Spec: { "q0": 0xcf267ce8ca60dabb, "sp": 0x8 }
Existing Spec: { "q0": 0xca60dabbcf267ce8, "sp": 0x8 }


Note: The patched spec does not introduce any disassembly changes to the best of our knowledge.


vpopSdList: "{"^buildVpopSdList^"}" is TMode=0 & D22 & c1215 & c0007 & buildVpopSdList [ regNum=(c1215<<1)+D22-1; counter=c0007-1; ]
{ mult_addr = sp; sp = sp + c0007 * 4; build buildVpopSdList; }
{ mult_addr = sp + ((c0007 - 1) * 4); sp = sp + c0007 * 4; build buildVpopSdList; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should counter be used in place of c0007 - 1 for clarity?

@@ -4676,14 +4676,14 @@ vpushSd64List: "{"^buildVpushSd64List^"}" is TMode=1 & thv_D22 & thv_c1215 & thv
}

buildVpopSdList: Sreg is counter=0 & Sreg [ regNum=regNum+1; ]
{ tmp:4 = *mult_addr; Sreg = zext(tmp); mult_addr = mult_addr + 4; }
{ tmp:4 = *mult_addr; Sreg = zext(tmp); mult_addr = mult_addr - 4; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operational encoding for vldm uses addition, not subtraction. The sleigh for the vldm* instructions do the same thing. So why is this instruction that is an alias for vldm using subtraction?

@Sleigh-InSPECtor
Copy link
Contributor Author

Sleigh-InSPECtor commented Jul 15, 2024

Yeah, checking through the implementation more carefully, it looks like the actual discrepancy between the vpop semantics and the vldm semantics is caused by a missing build statement.

The existing unpatched Ghidra generates Pcode that looks something like:

020abdec "vpop {s0,s1}"

mult_addr = sp
sp = sp + 0x8:4
tmp:4 = *[ram]mult_addr
s1 = tmp
mult_addr = mult_addr + 0x4:4
tmp:4 = *[ram]mult_addr
s0 = tmp

i.e., s0 and s1 are loaded in the wrong order.

The old 'fix' kept the operations in the same order, but adjusted the way mult_addr was computed so the values for s0 and s1 were loaded from the correct addresses.

The PR has been updated with a much simpler fix that simply adds the missing build so that the operations are ordered correctly (matching the vldm and being closer to reference manual):

Pcode after new patch:

mult_addr = sp
sp = sp + 0x8:4
tmp:4 = *[ram]mult_addr
s0 = tmp
mult_addr = mult_addr + 0x4:4
tmp:4 = *[ram]mult_addr
s1 = tmp

The same thing is true for #6604 (which has also been updated).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: Processor/ARM Status: Triage Information is being gathered
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants