Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitread: Don't overwrite last frame #1335

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tmichalak
Copy link
Contributor

@tmichalak tmichalak commented May 19, 2020

This PR targets the following issue #1285

Background

The bitstream can be divided into 4 parts:

  • Header
  • Fpga logic configuration prefix packets
  • Configuration data aka bits
  • Fgpa logic configuration suffix packets

bitread cares only about the configuration data part and fills in the underlying data structure which maps the frame addresses with the configuration words. Since FDRI writes can be in type 2 packets, where the FAR address is auto-incremented, the part.yaml file is used to make sure the frame addresses are actually present in the part.

Analysis

By looking at the output of bittool list_config_packets with the bitstream provided the configuration part for the BLOCK RAM column in the bottom part of the fpga is as follows:

 [Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00000  << BOTTOM Row= 0 Column= 0 Minor= 0 Type=Block RAM

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
       (...) << configuration words for frame c00000
       0 

[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00000 << BOTTOM Row= 0 Column= 0 Minor= 0 Type=Block RAM

[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
1e640f57 

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
       (...) << configuration words for frame c00001
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00001 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
12d781cb 

(packets for frames between c00002 and 0xc0017e)

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
    8000        0        0        0  
   (...) << configuration words for frame c0017f
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
  c0017f 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
f8a17b70 

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
      (...) << zero frame
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
  c0017f 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
22075150 

There seems to be a zero frame (frame with all 0s) at the end of the configuration data.
Such a zero frame is present at the end of the configuration words for each row.
The bug only applies to the last row which happened to configure the targeted BRAM_R tile.
This is because between the previous rows there was a write to the CMD register:

[Write Type=1 Address= 4 Length=         1 Reg="Command"]

which realigns the addresses. This packet is not present after the configuration data for the last row.

Bug

Due to the fact that the there is no WCFG write to the CMD register the zero frame is used to overwrite the content of the last configuration frame.

Fix

The fix makes sure the last frame does not get overwritten by adding a flag that signifies the end of the configuration data packets just before the zero frame.

Testing

Added a google unit test that reads the bitstream and checks that that the last configuration frame contains the word that was being overwritten before.

@litghost
Copy link
Contributor

So I don't really understand what the original problem was. Can you make a better description for the issue and the solution? Maybe some diagrams?

Copy link
Contributor

@litghost litghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like a better explaination for what the original problem was, and how this change fixes it. Maybe some diagrams? At a minimum, a comment in the code.

@tmichalak
Copy link
Contributor Author

@litghost I added more details to the PR description.

@litghost
Copy link
Contributor

litghost commented May 26, 2020

Bug

Due to the fact that the there is no write to the CMD register the zero frame is used to overwrite the content of the last configuration frame.

I think I know what is going on, and it has to do with how bitread models the FAR address register. I believe your analysis is not correct. If my analysis is correct, then the solution present in this PR is not correct. I'll take a look at the FAR address modelling code, and come back with a comment in a bit.

@litghost
Copy link
Contributor

litghost commented May 26, 2020

@tmichalak:

I believe the bug is actually located here: https://github.com/SymbiFlow/prjxray/blob/master/lib/include/prjxray/xilinx/configuration.h#L253

This switch statement likely needs to advance "frame_address_register" by 1 frame when the CRC command is recieved. I'll annotate your example:

 [Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00000  << BOTTOM Row= 0 Column= 0 Minor= 0 Type=Block RAM

FAR = 0xc00000 because it was set via the "Frame Address" write

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
       (...) << configuration words for frame c00000
       0 

FAR = 0xc00001, because "Frame Data Input" write advanced it.

[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00000 << BOTTOM Row= 0 Column= 0 Minor= 0 Type=Block RAM

FAR = 0xc00000 because it was set via the "Frame Address" write.

[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
1e640f57 

FAR = 0xc00001 because it was set via the "Frame Address" write. Not that next write is a "Frame Data Input", because CRC has already advanced the FAR to the correct location!

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
       (...) << configuration words for frame c00001
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
       c00001 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
12d781cb 

(packets for frames between c00002 and 0xc0017e)

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:
       0        0        0        0 
    8000        0        0        0  
   (...) << configuration words for frame c0017f
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
  c0017f 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
f8a17b70 

At this point the FAR = 0xc00180 because the CRC command advanced it 1 frame! So this zero frame is writing into 0xc00180, not overwriting 0xc0017f! This is where I believe the bug occurred.

[Write Type=1 Address= 2 Length=       101 Reg="Frame Data Input"]
Data in hex:c0017f
       0        0        0        0 
      (...) << zero frame
       0 
[Write Type=1 Address= 1 Length=         1 Reg="Frame Address"]
Data in hex:
  c0017f 
[Write Type=1 Address= 0 Length=         1 Reg="CRC"]
Data in hex:
22075150 
GitHub
Documenting the Xilinx 7-series bit-stream format. - SymbiFlow/prjxray

@tmichalak
Copy link
Contributor Author

@litghost
First of all this example is only for the last row, but such scenario applies to all other rows as well. I guess a broader picture is needed for this. So for the entire bitstream the critical frames look as follows:

WCFG Command

FAR 0
FDRI writes for FAR 0
FAR 0
CRC for FAR 0

FDRI writes for 1
FAR 1
CRC for FAR 1

FDRI writes n-1
FAR n-1
CRC for n-1

FDRI write n
FAR n
CRC for n

FDRI write zero frame
FAR n
CRC for zero frame and FAR n

WCFG command

FAR 40000 (0 in next row)
FDRI writes for FAR 40000
FAR 40000
CRC for FAR 40000

FDRI writes for 40001
FAR 40001
CRC for FAR 40001

FDRI writes 40002
....

FAR = 0xc00001 because it was set via the "Frame Address" write. Not that next write is a "Frame Data Input", because CRC has already advanced the FAR to the correct location!

This is true, but only for the first frame in a configuration row and this is how it is implemented. For the rest it is advanced by the FDRI write. Why? Because of how the configuration data is layed out in the bitstream (presented in the snippet above) and the fact that Type 2 writes don't have explicit FAR writes in between frames (it is auto-incremented).

This switch statement likely needs to advance "frame_address_register" by 1 frame when the CRC command is recieved.

Won't work for Type 2 writes where there is CRC only at the end of the packet.

At this point the FAR = 0xc00180 because the CRC command advanced it 1 frame! So this zero frame is writing into 0xc00180, not overwriting 0xc0017f! This is where I believe the bug occurred.

Incorrect. First of all there is no 0xc00180 frame in the FPGA, the FAR counter in this particular case advances to 0 since it reaches the end of the available address range. Secondly look at the broader snippet and all above points.

This being said I stand by my point. Maybe I should be more precise in the bug root cause which should say:
Due to the fact that the there is no WCFG write to the CMD register the zero frame is used to overwrite the content of the last configuration frame.

This entire discussion proves that there should be an extension to the bitstream format chapter of the documentation where all of the above should go into.

@litghost
Copy link
Contributor

#1335 (comment)

Your more detailed example doesn't change my point. The FAR is clearly advanced when the CRC command is issued, and if the bitread modeled that, no special logic would be required.

@litghost
Copy link
Contributor

Type 2 packets remember the previous register target (e.g. FDRI vs CRC vs CMD), but the FAR register behavior is likely independent of the Type 1/2 behavior.

@tmichalak
Copy link
Contributor Author

Analyzed the case with bits2rbt and bitread (with aux support required for bits2rbt) (PR)
To compare the original RBF (vivado generated) with the reconstructed (bits2rbt generated).

Without the fix, ie where the bits files are equal the RBF test fails:

573766c573766
< 00000000000000001000000000000000
---
> 00000000000000000000000000000000
573866c573866
< 11111000101000010111101101110000
---
> 00100010000001110101000101010000
bits2rbt_bram_xc7 FAIL

clearly there is one missing bit (bit_00c0017f_004_15) which causes the CRC value to be invalid

With the fix the bit bit_00c0017f_004_15 is in bits, but the RBT comparison fails on different CRCs:

573812c573812
< 00000000000000000000000000000000
---
> 00000000000000000001001110101111
573866c573866
< 11111000101000010111101101110000
---
> 01000011101111001000011010011110
573871a573872
> 00000000000000001000000000000000
573917,573918c573918
< 00000000000000000000000000000000
< 00000000000000000000000000000000
---
> 00000000000000000001001110101111
573972c573972
< 00100010000001110101000101010000
---
> 01000011101111001000011010011110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants