Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librhsp getInterfacePacketID errors with segmentation fault. #133

Open
felipetrentin opened this issue Jul 19, 2024 · 2 comments
Open

librhsp getInterfacePacketID errors with segmentation fault. #133

felipetrentin opened this issue Jul 19, 2024 · 2 comments

Comments

@felipetrentin
Copy link

My configuration:

  • REV Expansion Hub running Firmware 1.8.2 updated from REV Hardware Client
  • Intel x64 pc running Ubuntu 22.04
  • hub connected directly via USB. No child hubs connected.

What I'm trying to accomplish:

I'm trying to use a REV Expansion Hub to control a robot using ROS2 Humble by creating a custom ROS node in C++

I was able to connect to the hub and change the LED color, by using rhsp_setModuleLedColor by using the library tests as an example.

The problem:

rhsp_setServoConfiguration, rhsp_setMotorChannelMode, rhsp_getEncoderPosition and other commands give the same segmentation fault.

it appears as any command containing rhsp_getInterfacePacketID creates the same behavior, apparently caused by the strcmp in getInterfaceByName

debugging in GDB i get the following output:

Thread 1 "main" received signal SIGSEGV, Segmentation fault.
__strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
116	../sysdeps/x86_64/multiarch/strcmp-avx2.S: No such file or directory.

Thanks in advance.

@felipetrentin
Copy link
Author

update, I found the bug and was able to fix it

(but probably in not a very good way lol)

it turns out that ´rhsp_getInterfacePacketID´ does a transaction with the hub in the following order:

  1. test if requested interface is in the interfaces list already,
    1.1. if the interface is already in the computer memory, return it.
    1.2. if the interface is not in the computer, query it from the hub and store it on the memory
  2. test again if the requested interface is in the computer memory and return it.

the bug:

the code was failing in step 2 where it would check if there was an interface after querying.
more especifically, the code was failing in getInterfaceByName where there is a string comparison. This could mean a couple of things:

  • the logic of getInterfaceByName was wrong and was accessing non allocated memory;
  • some string was missing a null terminator, so string operators end up accessing prohibited memory;
  • some memory allocation function was not properly cleaned, resulting in a garbage value on some variable.
    these were the main hypothesis regarding the segmentation fault.

First of all, in getInterfaceByName there is a comparison between int and size_t that could yield an infinite loop. Thus, i changed the function to be:

static const RhspModuleInterface* getInterfaceByName(const RhspModuleInterfaceListInternal* list,
                                                     const char* interfaceName)
{
    for (size_t i = 0; i < list->numberOfInterfaces; i++)
    {
        if (strcmp(interfaceName, list->interfaces[i].name) == 0)
        {
            return &list->interfaces[i];
        }
    }
    return NULL;
}

take note that if the value of list->numberOfInterfaces is wrong, the loop will try accessing an illegal part of the memory.

as the bug was still not fixed, I focused in understanding what was happening inside rhsp_queryInterface, since a wrong value inside this function could lead to an invalid data structure that would in turn fail when processed by getInterfacePacketIdInternal

my first suspicion was that some string could be missing a null terminator, causing string operators to fail. thus, i added this line after the allocation of memory to intf_.name and memcpy

intf_.name[interfaceNameLength-1] = 0;

this also did not fix the issue, despite not breaking anything else.

Therefore, the only other suspicion was that some garbage value was being stored in list->numberOfInterfaces resulting in a out of bounds memory acces in getInterfaceByName. The question was "where is this value set and changed, and what could make it out of bounds?"

then I turned my focus to rhsp_queryInterface where this value probably was last changed before failing. Upon further inspection, I found that it called addInterface

as I had already confirmed that the values set before the calling of this function were right, it was the main culprit for the segmentation fault. It turns out, that if hub->interfaceList is a null pointer, the function allocates its memory but never changes the values. The code probably worked using other compilers, but running Colcon and, in the background, GCC, makes such that malloc only allocates memory and does not clean it.

so it was possible that numberOfInterfaces was not zero after allocation.

so, to test if that was the case i added the following line after the allocation:

((RhspModuleInterfaceListInternal*) hub->interfaceList)->numberOfInterfaces = 0;

And it worked!!
but the code is still unsafe. There are a lot of arguments that could result in big accidents like these. This is still not good enough for a competition-ready robot running ROS2.
I hope this brought some light to the issue and maybe helped someone. I would be interested in contributing to this repository. I would also suggest to make librhsp separate from this repository, since it has very different goals in mind.

@NoahAndrews
Copy link
Member

Wow! Thank you for reporting and investigating this bug. We're not actively working on this library at the moment, but we do plan to return to it, and I'm sure we'll want to dive into this then.

Moving librhsp to its own repo is something we want to do at some point (there's some historical reasons why that's not currently the case).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants