CodeChecker not parsing correctly windows escaped paths in compile_commands.json #4277

Open
UmbraMalison opened this issue Jun 25, 2024 · 12 comments · May be fixed by #4374
Labels
analyzer 📈 Related to the analyze commands (analysis driver) bug 🐛 platform-Windows 🖥

Comments

@UmbraMalison

UmbraMalison commented Jun 25, 2024

Describe the bug
Rightly or wrongly, the compile_commands.json file is being created with a mix of Unix-style forward slashes and escaped Windows backslashes.
If the bug turns out to be further upstream, because the database should be consistent in this regard, then I will need to dig into Zephyr RTOS and CMake. This seemed like the logical place to start my investigation.

This is an extract from the database as created by Zephyr RTOS/CMake:

{
  "directory": "A:/projects/AMPD/AE/board/firmware/build",
  "command": "A:\\tools\\zephyr\\zephyr-sdk-0.16.8\\arm-zephyr-eabi\\bin\\arm-zephyr-eabi-gcc.exe

As you can see, the directory is in Unix format, but the command is in Windows format, with the assumption of one layer of escaping.
Only the directory is in Unix format; the rest of the contents are formatted like the command shown above. (I'm missing the rest because, in poking around, I have now broken my build env, but I want to shoot this report off ASAP as it's a fundamental blocker for me right now.)

I'm developing on Windows with a mix of Windows-native tools and MinGW tools. (Could that be a factor?)

My build logs (for the codechecker stage, the main build is successful)

Failed to get and parse version of: A:toolszephyrzephyr-sdk-0.16.8arm-zephyr-eabibinarm-zephyr-eabi-gcc.exe
[WinError 2] The system cannot find the file specified

CodeChecker version

[INFO 2024-06-25 08:53] - CodeChecker analyzer version:
---------------------------------------------------------------
Kind                 | Version
---------------------------------------------------------------
Base package version | 6.23.1
Package build date   | 2023-12-14T14:38
Git commit ID (hash) | 2a8fa6e711a4ff591280a79fe8798dee2507d984
Git tag information  | 6.23.1
---------------------------------------------------------------

[INFO 2024-06-25 08:53] - CodeChecker web version:
------------------------------------------------------------------------------
Kind                                | Version
------------------------------------------------------------------------------
Base package version                | 6.23.1
Package build date                  | 2023-12-14T14:38
Git commit ID (hash)                | 2a8fa6e711a4ff591280a79fe8798dee2507d984
Git tag information                 | 6.23.1
Server supported Thrift API version | 6.54
Client Thrift API version           | 6.54
------------------------------------------------------------------------------

To Reproduce
Steps to reproduce the behaviour:

(On Windows)

  1. Follow the Zephyr getting started page to install tools: https://docs.zephyrproject.org/latest/develop/getting_started/index.html
  2. Follow the Zephyr CodeChecker page to install and demo codechecker: https://docs.zephyrproject.org/latest/develop/sca/codechecker.html
  3. Observe that the reported file path to gcc does not contain any path separators.

Expected behaviour
I expected it to parse the path to GCC correctly.

Desktop (please complete the following information)

  • OS: Windows
  • Browser: N/A
  • Version: 23H2 22631.3593

Additional context
Mix of tooling. CMake was installed via winget; make and ninja come from MSYS2.
This isn't as per the STR, which use Chocolatey; I'm in the process of verifying that aspect of the setup.

@UmbraMalison
Author

Also, I found these similar reports:
swiftlang/sourcekit-lsp#1028
https://gitlab.kitware.com/cmake/cmake/-/issues/25580

@whisperity whisperity added bug 🐛 analyzer 📈 Related to the analyze commands (analysis driver) tools 🛠️ Meta-tag for all the additional tools supplied with CodeChecker: plist2html, tu_collector, etc. ld-logger 📃 platform-Windows 🖥 labels Jun 25, 2024
@UmbraMalison
Author

I've validated the STR. I see no reason why one would not see the exact same error.

@whisperity
Contributor

N.b.: CodeChecker isn't officially supported on Windows, we don't have Windows users and Windows-based development infrastructure to develop or test. Is the tool/project using CodeChecker's logging mechanism to create the (faulty) build log, or is this the compilation database emitted by CMake? (If the former, what happens if you try to analyse using a CMake-provided compilation database?)

@UmbraMalison
Author

The tool (ecosystem) is Zephyr, which uses CMake. My Zephyr project has its own CMakeLists (that I own), but a Zephyr layer of CMake configuration also exists (somewhere) that must instruct CMake to output the compilation database (as I'm not required to stipulate such configuration). I don't believe Zephyr would create the DB itself (but I am not a Zephyr developer).

I did not pick up any cues that Windows was not officially supported, but are you saying unofficially it should work?
If I've understood CodeChecker (which at this stage seems unlikely), and if I've understood the marketplace and the options for what CodeChecker provides, I can't fathom why you would not have Windows users.

It seems like Zephyr did not identify this either, as Zephyr is cross-platform and makes no mention of CodeChecker being a non-Windows option.

Moving forward, I guess you are suggesting that you won't be adding support for windows paths in the CDB.
So my only option would be to find out who is responsible for the paths in the CDB, Zephyr or CMake (and hope that makes CodeChecker unofficially work).

Obviously it would be more robust if CodeChecker did manage this scenario, as apparently it can happen.

@pdgendt
Contributor

pdgendt commented Jun 26, 2024

The compilation database emitted from zephyr is CMake vanilla:

https://github.com/zephyrproject-rtos/zephyr/blob/7f8cc43a0b5e675569866b7364ab34ba8ebd4e0a/cmake/sca/codechecker/sca.cmake#L8-L9

# CodeChecker uses the compile_commands.json as input
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

@whisperity
Contributor

> Moving forward, I guess you are suggesting that you won't be adding support for windows paths in the CDB.

All I am saying is that we do not have the necessary environment for developing on Windows, i.e., even if we did develop something, we would not be able to test it meaningfully. There might be kidney shots down the line with socket handling or environment-related stuff. It also might not be a tremendous effort; we just don't know yet. The paths part can likely be done easily, because pathlib is available to us now that we've moved to, and perhaps even beyond, Python 3.8. I think we would be happy to collaborate with someone who does have a Windows ecosystem on getting this done. (See #555, #562, …).
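
To illustrate the pathlib remark, here is a minimal sketch (not CodeChecker code) of how `pathlib.PureWindowsPath` treats both separator styles as the same path. It uses the compiler path from the original report; the variable names are made up for the example.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only manipulates strings, so it parses Windows-style
# paths on any host OS, including Linux.
compiler = PureWindowsPath(
    r"A:\tools\zephyr\zephyr-sdk-0.16.8\arm-zephyr-eabi\bin\arm-zephyr-eabi-gcc.exe"
)

# The same path written with forward slashes.
same = PureWindowsPath(
    "A:/tools/zephyr/zephyr-sdk-0.16.8/arm-zephyr-eabi/bin/arm-zephyr-eabi-gcc.exe"
)

print(compiler == same)      # both spellings compare equal
print(compiler.as_posix())   # forward-slash rendering of the path
print(compiler.name)         # final path component
```

This only covers a single, already isolated path. It does not help with splitting a whole "command" string into arguments, which is a separate step.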

> The compilation database emitted from zephyr is CMake vanilla:

But then why is CMake creating two different styles of path handling (POSIX-y / in the directory, but properly escaped \\ in the command) in the output?

@UmbraMalison Could you please post a full compilation database entry (one full { ... }), including the file field, and potentially the -I include path compile options as passed to the compiler? If there is something proprietary in the directory structure, feel free to obfuscate it; what interests me right now is how many different styles of paths and escaping exist therein.

@whisperity whisperity removed tools 🛠️ Meta-tag for all the additional tools supplied with CodeChecker: plist2html, tu_collector, etc. ld-logger 📃 labels Jun 26, 2024
@UmbraMalison
Author

UmbraMalison commented Jun 28, 2024

@whisperity apologies for my delay.

But I have more news: I see this in every CDB I've looked at (on my work laptop).

So here is a non-Zephyr example; this is a CMake-based STM32 project:

{
  "directory": "A:/projects/div/grp/board/firmware/build/Debug",
  "command": "C:\\ST\\STM32CubeCLT\\1.15.1\\GNU-tools-for-STM32\\bin\\arm-none-eabi-gcc.exe -DDEBUG -DSTM32G484xx -DUSE_HAL_DRIVER -IA:/projects/div/grp/board/firmware/modules/board/inc -IA:/projects/div/grp/board/firmware/modules/tasks/inc -IA:/projects/div/grp/board/firmware/modules/ethercat/inc -IA:/projects/div/grp/board/firmware/modules/libs/inc -IA:/projects/div/grp/board/firmware/modules/libs/devices/inc -IA:/projects/div/grp/board/firmware/modules/libs/motors/devices/tmc5160/inc -IA:/projects/div/grp/board/firmware/modules/libs/motors/controllers/inc -IA:/projects/div/grp/board/firmware/modules/libs/misc/inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Core/Inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/STM32G4xx_HAL_Driver/Inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/STM32G4xx_HAL_Driver/Inc/Legacy -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/CMSIS/Device/ST/STM32G4xx/Include -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/CMSIS/Include  -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard  -Wall -Wextra -Wpedantic -fdata-sections -ffunction-sections -O0 -g3 -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard  -Wall -Wextra -Wpedantic -fdata-sections -ffunction-sections -O0 -g3 -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard  -Wall -Wextra -Wpedantic -fdata-sections -ffunction-sections -O0 -g3 -g -std=gnu11 -o CMakeFiles\\firmware.dir\\Core\\Src\\sysmem.c.obj -c A:\\projects\\div\\grp\\board\\firmware\\Core\\Src\\sysmem.c",
  "file": "A:\\projects\\div\\grp\\board\\firmware\\Core\\Src\\sysmem.c",
  "output": "CMakeFiles\\firmware.dir\\Core\\Src\\sysmem.c.obj"
},
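
To answer the "how many styles" question for a larger database, a rough sketch like the following could classify each field by the separators it contains. The entry below is a shortened, hypothetical stand-in modelled on the one above, and `separator_style` is a made-up helper name.

```python
import json

# Shortened, hypothetical entry modelled on the one above. Inside JSON,
# "\\" decodes to a single backslash.
raw = r'''
{
  "directory": "A:/projects/div/grp/board/firmware/build/Debug",
  "command": "C:\\ST\\bin\\arm-none-eabi-gcc.exe -IA:/projects/div/grp/board/firmware/Core/Inc -o CMakeFiles\\firmware.dir\\Core\\Src\\sysmem.c.obj -c A:\\projects\\div\\grp\\board\\firmware\\Core\\Src\\sysmem.c",
  "file": "A:\\projects\\div\\grp\\board\\firmware\\Core\\Src\\sysmem.c",
  "output": "CMakeFiles\\firmware.dir\\Core\\Src\\sysmem.c.obj"
}
'''


def separator_style(text):
    # Classify a string by which path separators appear in it.
    has_back = "\\" in text
    has_fwd = "/" in text
    if has_back and has_fwd:
        return "mixed"
    if has_back:
        return "backslash"
    if has_fwd:
        return "forward"
    return "none"


entry = json.loads(raw)
styles = {key: separator_style(value) for key, value in entry.items()}
for key, style in styles.items():
    print(f"{key}: {style}")
```

For this sample, only "directory" is purely forward-slash, "command" mixes both styles, and "file" and "output" use backslashes.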

What is common between this occurrence and the previous one:

  1. Windows
  2. CMake
  3. VSCode

What is different:

  1. GCC/Toolchain
  2. sources/libraries/flags

@UmbraMalison
Author

By replacing the Windows escaped paths with Unix-style ones, I was able to analyse the files.
In doing so, CodeChecker creates a CDB called compile_cmd.json, located inside ./.codechecker.

  {
    "directory": "A:/projects/div/grp/board/firmware/build/Debug",
    "command": "C:/ST/STM32CubeCLT/1.15.1/GNU-tools-for-STM32/bin/arm-none-eabi-gcc.exe -DDEBUG -DSTM32G484xx -DUSE_HAL_DRIVER -IA:/projects/div/grp/board/firmware/modules/board/inc -IA:/projects/div/grp/board/firmware/modules/tasks/inc -IA:/projects/div/grp/board/firmware/modules/ethercat/inc -IA:/projects/div/grp/board/firmware/modules/libs/inc -IA:/projects/div/grp/board/firmware/modules/libs/devices/inc -IA:/projects/div/grp/board/firmware/modules/libs/motors/devices/tmc5160/inc -IA:/projects/div/grp/board/firmware/modules/libs/motors/controllers/inc -IA:/projects/div/grp/board/firmware/modules/libs/misc/inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Core/Inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/STM32G4xx_HAL_Driver/Inc -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/STM32G4xx_HAL_Driver/Inc/Legacy -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/CMSIS/Device/ST/STM32G4xx/Include -IA:/projects/div/grp/board/firmware/cmake/stm32cubemx/../../Drivers/CMSIS/Include  -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard  -Wall -Wextra -Wpedantic -fdata-sections -ffunction-sections -O0 -g3 -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard  -Wall -Wextra -Wpedantic -fdata-sections -ffunction-sections -O0 -g3 -g -std=gnu11 -o CMakeFiles/firmware.dir/modules/board/board_comms.c.obj -c A:/projects/div/grp/board/firmware/modules/board/board_comms.c",
    "file": "A:\\projects\\div\\grp\\board\\firmware\\modules\\board\\board_comms.c",
    "output": "CMakeFiles/firmware.dir/modules/board/board_comms.c.obj"
  },

Just FYI, even your own CDB has mixed paths in it. 🤷
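
The manual workaround described in this comment (replacing backslashes with forward slashes) could be scripted roughly as follows. This is a blunt sketch, not part of CodeChecker: the function names are hypothetical, and it assumes that no backslash in the database is a genuine shell escape.

```python
import json

PATH_KEYS = ("directory", "command", "file", "output")


def to_forward_slashes(entry):
    # Return a copy of one CDB entry with every backslash replaced by "/".
    # Assumption: no backslash in the entry is a shell escape (for example
    # a backslash-escaped quote inside a -D define); such entries would be
    # corrupted by this blunt replacement.
    fixed = dict(entry)
    for key in PATH_KEYS:
        if key in fixed:
            fixed[key] = fixed[key].replace("\\", "/")
    if "arguments" in fixed:
        fixed["arguments"] = [arg.replace("\\", "/") for arg in fixed["arguments"]]
    return fixed


def normalize_database(src, dst):
    # Read the compilation database at src and write a forward-slash copy to dst.
    with open(src, encoding="utf-8") as handle:
        entries = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        json.dump([to_forward_slashes(e) for e in entries], handle, indent=2)


# Small in-memory demonstration with a made-up entry.
sample = {
    "command": "C:\\ST\\bin\\gcc.exe -c A:\\src\\sysmem.c",
    "file": "A:\\src\\sysmem.c",
}
print(json.dumps(to_forward_slashes(sample), indent=2))
```

Writing to a separate output file keeps the CMake-generated database untouched, so the rewrite can be repeated after every reconfigure.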

@UmbraMalison
Author

@whisperity any further thoughts on your side? Could this make its way into the backlog (to support Windows paths)?

@whisperity
Contributor

whisperity commented Jul 15, 2024

Unfortunately, that is not my decision to make. Although I do have a Windows computer currently (and unfortunately...), it is not a development-capable machine, i.e., I don't have admin rights, can't install Python, etc. I am only using it because they force it on me, but once Windows has booted, I immediately fullscreen a Linux VM. 🤣

I'll try to figure something out, perhaps we can request development machines with Windows, but it will require some forethought. Or someone from the team who does have installer rights on their Windows computers could do it. It won't be done in 6.24.1, as far as I can tell, we have a lot more critical bugs to fix (unfortunately from internal users who do not report on GitHub ☹️).

@marc-h38

marc-h38 commented Sep 4, 2024

> So here is a non-Zephyr example; this is a CMake-based STM32 project:

The Zephyr environment is quite big and a bit time-consuming to set up. If you have a lighter and faster example, you should edit the issue description and switch to it (as long as everything can still be downloaded for free and from reputable places).

> By replacing the Windows escaped paths with Unix-style ones, I was able to analyse the files.

The CMD.EXE exception aside, forward slashes have been working on Windows since practically forever. If you don't care about CMD.EXE, the quickest way to victory is: forward slashes everywhere.

@tuix tuix linked a pull request Oct 23, 2024 that will close this issue
@tuix

tuix commented Oct 23, 2024

I found the issue: it is the shlex.split() call in extend_compilation_database_entries() in analyzer/codechecker_analyzer/buildlog/log_parser.py. When the posix argument is not set, it defaults to True. It works on Windows with posix=False. I created a draft PR for further discussion and for testing by others who are affected.
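
The effect can be reproduced with the standard library alone. In POSIX mode, `shlex` treats an unquoted backslash as an escape character and drops it, which matches the separator-free path in the error message from the original report. The sketch below is not the CodeChecker code itself; the `-c main.c` arguments are hypothetical and only there to show the splitting.

```python
import shlex

# The decoded "command" value from the first report, followed by
# hypothetical arguments added for illustration.
command = (
    r"A:\tools\zephyr\zephyr-sdk-0.16.8\arm-zephyr-eabi\bin\arm-zephyr-eabi-gcc.exe"
    " -c main.c"
)

posix_args = shlex.split(command)                  # posix=True is the default
windows_args = shlex.split(command, posix=False)   # backslashes are kept

print(posix_args[0])
print(windows_args[0])
```

One thing to keep in mind when testing the draft PR: with posix=False, shlex also leaves enclosing quote characters in the resulting tokens, so arguments that were quoted in the command may need separate handling.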

5 participants