Vitis HLS synthesis front-end OOM crash for some MVAU configs #1214

Open
fpjentzsch opened this issue Oct 15, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@fpjentzsch
Collaborator

fpjentzsch commented Oct 15, 2024

In some cases, clang memory usage explodes during the initial (front-end) phase of HLS synthesis, and the clang process gets killed. All the user sees is ERROR: [HLS 200-1715] Encountered problem during source synthesis in vitis_hls.log, with no further explanation.

Example configs I have tried with the test_fpgadataflow_mvau_rtlsim unit test:

#@pytest.mark.parametrize("mem_mode", ["internal_decoupled"])
#@pytest.mark.parametrize("act", [DataType["UINT7"]])
#@pytest.mark.parametrize("wdt", [DataType["INT7"]])
#@pytest.mark.parametrize("idt", [DataType["INT8"]])
#@pytest.mark.parametrize("nf", [5])
#@pytest.mark.parametrize("sf", [720])
#@pytest.mark.parametrize("mw", [3600])
#@pytest.mark.parametrize("mh", [400])

Result: RAM usage explodes to > 40 GB as soon as the following remark appears in \sol1\.autopilot\db\a.g.ld.0.bc.clang.err.log:

remark: /home/felixj/WD/finn/deps/finn-hlslib/mvau.hpp:234:33: Applying array_partition to 'accu': Complete partitioning on dimension 1. Complete partitioning on dimension 2.

It is followed by:

clang: error: unable to execute command: Killed
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 7.0.0
Target: fpga64-xilinx-none
Thread model: posix
InstalledDir: /tools/Xilinx/Vitis_HLS/2022.2/lnx64/tools/clang-3.9-csynth/bin
clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

This suggests that the problem occurs during or directly after array partitioning. Tested with Vitis HLS 2022.2 and 2023.1.
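Since vitis_hls.log gives no hint about the cause, a small script can cross-check the two logs for this failure pattern. This is only a minimal sketch: the function name diagnose_frontend_crash is hypothetical, the marker strings are copied from the logs quoted in this issue, and the "likely OOM" conclusion is an assumption based on the observed RAM usage, not something clang reports.

```python
import re

# Marker strings copied from the logs quoted in this issue.
HLS_ERROR = "[HLS 200-1715]"
CLANG_KILLED = "clang: error: unable to execute command: Killed"
PARTITION_REMARK = re.compile(r"Applying array_partition to '(\w+)'")


def diagnose_frontend_crash(hls_log_text, clang_err_text):
    """Return a short diagnosis string, or None if HLS 200-1715 is absent.

    Hypothetical helper: both arguments are the file contents (as strings)
    of vitis_hls.log and a.g.ld.0.bc.clang.err.log respectively.
    """
    if HLS_ERROR not in hls_log_text:
        return None
    if CLANG_KILLED not in clang_err_text:
        return "HLS 200-1715 reported, but clang was not killed"
    # The last array_partition remark before the kill message hints at
    # what the front end was working on when it was terminated.
    head = clang_err_text.split(CLANG_KILLED)[0]
    arrays = PARTITION_REMARK.findall(head)
    last = arrays[-1] if arrays else "unknown"
    return f"clang killed (likely OOM) after array_partition on '{last}'"
```

To use it, read both log files from the failing solution directory and pass their contents in; for the failing config above it would point at 'accu'.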

Curiously, this does not happen if I keep SIMD=5 and PE=80 but reduce MW and MH so that both folding factors (SF and NF) are 1:

@pytest.mark.parametrize("mem_mode", ["internal_decoupled"])
@pytest.mark.parametrize("act", [DataType["UINT7"]])
@pytest.mark.parametrize("wdt", [DataType["INT7"]])
@pytest.mark.parametrize("idt", [DataType["INT8"]])
@pytest.mark.parametrize("nf", [1])
@pytest.mark.parametrize("sf", [1])
@pytest.mark.parametrize("mw", [5])
@pytest.mark.parametrize("mh", [80])

Result: Test passes, RAM usage peaks at ~12 GB. The clang log contains only one additional line beyond the line it failed at previously:

remark: /home/felixj/WD/finn/deps/finn-hlslib/mvau.hpp:237:30: Applying array_partition to 'w': Complete partitioning on dimension 1.
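To make the comparison between the two configs explicit, the sketch below recomputes SIMD and PE from the test parameters. It assumes the test derives them as SIMD = MW / SF and PE = MH / NF; the helper name folding is hypothetical and not part of FINN.

```python
def folding(mw, mh, sf, nf):
    """Return (simd, pe), assuming SIMD = MW / SF and PE = MH / NF."""
    if mw % sf or mh % nf:
        raise ValueError("MW must be divisible by SF and MH by NF")
    return mw // sf, mh // nf


# Failing config: large matrix, folded 720x (SF) and 5x (NF).
failing = folding(mw=3600, mh=400, sf=720, nf=5)
# Passing config: same parallelism, but no folding at all.
passing = folding(mw=5, mh=80, sf=1, nf=1)

print(failing, passing)      # (5, 80) (5, 80)
print(3600 * 400, 5 * 80)    # weight matrix entries: 1440000 vs 400
```

Both configs instantiate the same SIMD and PE, so the only difference is the weight matrix size (1,440,000 vs. 400 entries) and the resulting folding.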

See also this AMD forum thread.

fpjentzsch added the bug label Oct 15, 2024