Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clang reports use after free when deleting all hashmap contents #128

Open
Dorregaray opened this issue Jul 7, 2017 · 6 comments · May be fixed by #180
Open

Clang reports use after free when deleting all hashmap contents #128

Dorregaray opened this issue Jul 7, 2017 · 6 comments · May be fixed by #180

Comments

@Dorregaray
Copy link

The clang reports "use after free" when analyzing the following piece of code:

#include <stdio.h>   /* gets */
#include <stdlib.h>  /* atoi, malloc */
#include <string.h>  /* strcpy */
#include "uthash.h"

struct my_struct {
    int id;                    /* key */
    char name[10];
    UT_hash_handle hh;         /* makes this structure hashable */
};

struct my_struct *users = NULL;

void add_user(int user_id, char *name) {
    struct my_struct *s;

    HASH_FIND_INT(users, &user_id, s);  /* id already in the hash? */
    if (s==NULL) {
      s = (struct my_struct*)malloc(sizeof(struct my_struct));
      s->id = user_id;
      HASH_ADD_INT( users, id, s );  /* id: name of key field */
    }
    strcpy(s->name, name);
}

void delete_all() {
  struct my_struct *current_user, *tmp;

  HASH_ITER(hh, users, current_user, tmp) {
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    free(current_user);             /* free it */
  }
}


int main(int argc, char *argv[]) {
    char in[10];
    int id=1, running=1;
    struct my_struct *s;
    unsigned num_users;

    add_user(1, "111111");
    add_user(2, "222222222");
    add_user(3, "3");
    
    delete_all();  /* free any structures */
    return 0;
}

Clang report:

uthash.c:30:5: warning: Use of memory after it is freed
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:421:5: note: expanded from macro 'HASH_DEL'
    HASH_DELETE(hh,head,delptr)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:369:5: note: expanded from macro 'HASH_DELETE'
    HASH_DELETE_HH(hh, head, &(delptr)->hh)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:376:17: note: expanded from macro 'HASH_DELETE_HH'
    uthash_free((head)->hh.tbl->buckets,                                         \
    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:89:34: note: expanded from macro 'uthash_free'
#define uthash_free(ptr,sz) free(ptr)     /* free fcn                        */
                                 ^
uthash.c:30:5: warning: Use of memory after it is freed
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:421:5: note: expanded from macro 'HASH_DEL'
    HASH_DELETE(hh,head,delptr)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:369:5: note: expanded from macro 'HASH_DELETE'
    HASH_DELETE_HH(hh, head, &(delptr)->hh)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:382:23: note: expanded from macro 'HASH_DELETE_HH'
    if (_hd_hh_del == (head)->hh.tbl->tail) {                                    \
                      ^~~~~~~~~~~~~~
2 warnings generated.
@Quuxplusone
Copy link
Collaborator

Seems similar to the false positive in #111. Can you find out for me whether this is a true positive or not? One way to do that would be to #define uthash_free(p, n) to something that overwrites all the freed memory with 0x00 before deallocating it, and then look for segfaults.

If this is a false positive in the static analyzer, then we can't really do anything about it — unless you have a patch that fixes it, and the patch doesn't hurt the performance or clarity of the code, and you don't mind it bit-rotting. I mean, Clang could start complaining about ten new things tomorrow; uthash has no control over that.

@jhi
Copy link

jhi commented Jul 8, 2017

... or compile with clang -fsanitize=address and run the code, and see if the sanitizer runtime catches the same problem. If it does, then uthash needs changing. If it does not, that is more ammunition for reporting the false positive to clang.

@Dorregaray
Copy link
Author

It is false positive. Valgrind and sanitizer doesn't see the problem. Redefining uthash_free also didn't segfault.

inye added a commit to inye/uthash that referenced this issue Jul 29, 2019
Fixes troydhanson#128. Clang static analyzer was thinking that it's possible to
free() list head without updating head pointer. This assert assures
static analyzer that branch without head pointer update could be taken
only when we are freeing non-head element.
@inye inye linked a pull request Jul 29, 2019 that will close this issue
@acbits
Copy link

acbits commented Aug 10, 2023

Hello,

I am a developer of gcc plugins and I came across uthash and wanted to test it. I do see use-after-free for the last element deleted.

My testcase was add 3 entries and delete them like this

 HASH_ITER(hh, ht, cur_elem, tmp){
        printf("---BEGIN LOOP BODY---\n");
        printf("Deleting entry:|%p|:key:%d, RC:%d\n", cur_elem, cur_elem->key, REFTRACK_COUNT(cur_elem));
        HASH_DEL(ht, cur_elem);
        my_struct_removeref(cur_elem);
        printf("---END LOOP BODY---\n");
    }

where ht is the hash table, REFTRACK_COUNT() returns the number of active references and my_struct_removeref() is code instrumentation that removes a reference to the given element. When there are no more references to a heap allocated object, the object would be deleted. The +1, -1 in the trace log below implies addition of a reference and removal of a reference.

The trace from the instrumented code is

 Allocated |0x0x23072b8| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:-1
 Allocated |0x0x2307988| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x2307988|:+1
my_struct:|0x0x2307988|:+1
my_struct:|0x0x2307988|:-1
 Allocated |0x0x23079e8| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x23079e8|:-1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x2307988|:+1
---BEGIN LOOP BODY---
Deleting entry:|0x23072b8|:key:1, RC:3
my_struct:|0x0x23072b8|:-1
---END LOOP BODY---
my_struct:|0x0x2307988|:+1
my_struct:|0x0x23072b8|:-1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
---BEGIN LOOP BODY---
Deleting entry:|0x2307988|:key:2, RC:2
my_struct:|0x0x2307988|:-1
---END LOOP BODY---
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
 releasing object |0x0x2307988| type |my_struct|
my_struct:|0x0x23079e8|:-1
---BEGIN LOOP BODY---
Deleting entry:|0x23079e8|:key:3, RC:2
my_struct:|0x0x23079e8|:-1
my_struct:|0x0x23079e8|:-1
 releasing object |0x0x23079e8| type |my_struct|
---END LOOP BODY---
 Invalid pointer/use-after-free |0x0x23079e8| to |my_struct_removeref|

Further, if I comment out the HASH_DEL() statement, there are no issues. I hope this info is helpful in tracking down the issue.

@Quuxplusone
Copy link
Collaborator

@acbits: That doesn't seem related to this issue about a Clang-static-analyzer false positive.
Please post a complete, compilable, minimal example demonstrating your problem, as a new issue. But I hope you have already noticed the historical pattern with reports of memory-safety bugs in C code: The mistake will turn out to be in your own code; the best you can hope for is that I (or someone) will be able to pinpoint your bug for you.

In fact, we can already tell that the output you pasted doesn't come from the loop you pasted. In the loop, you do printf("---END LOOP BODY---\n"); and then immediately printf("---BEGIN LOOP BODY---\n"); to begin the next iteration. In the output, you can see lines appearing in between iterations, like this:

---END LOOP BODY---
my_struct:|0x0x2307988|:+1
my_struct:|0x0x23072b8|:-1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
---BEGIN LOOP BODY---

Your program might secretly be doing something with multiple threads. Make sure every thread is done using the hash table before you try to destroy it.

@acbits
Copy link

acbits commented Aug 11, 2023

Please post a complete, compilable, minimal example demonstrating your problem, as a new issue. But I hope you have already noticed the historical pattern with reports of memory-safety bugs in C code: The mistake will turn out to be in your own code; the best you can hope for is that I (or someone) will be able to pinpoint your bug for you.

The reason I didn't post the code is because it uses a gcc attribute that is not available to public yet.(Might change in future)

#define REFTRACK_DEBUG
#define REFTRACK_TRACE

#include "hrcmm.h"
#include "uthash.h"
#include <assert.h>

typedef int key_t;

REFTRACK_STRUCT(my_struct){
    key_t key;
    float value;
    UT_hash_handle hh;
};
REFTRACK_EPILOG(my_struct);

typedef struct my_struct my_struct;

int main(int argc, char *argv[]){
    my_struct *ht = NULL;

    int entries = 3;
    atexit(print_mem_stats);
    
    for(int i = 0; i < entries; i++){
        my_struct *vo = my_struct_create();
        vo->key = i+1;
        vo->value = vo->key*0.5;
        // need to add reference manually as uthash doesn't preserve the type of my_struct internally.
        my_struct_addref(vo); 
        HASH_ADD_INT(ht, key, vo);
    }

    assert(HASH_COUNT(ht) == entries);
    
    my_struct *tmp = NULL;
    key_t search_key = entries;
    HASH_FIND_INT(ht, &search_key, tmp);

    if (tmp){
        assert(tmp->value == (entries)*0.5);
    }
    else{
        printf("Entry not found:%d\n", search_key);
    }

    // delete all entries

    my_struct *cur_elem = NULL;
    
    HASH_ITER(hh, ht, cur_elem, tmp){
        printf("---BEGIN LOOP BODY---\n");
        printf("Deleting entry:|%p|:key:%d, RC:%d\n", cur_elem, cur_elem->key, REFTRACK_COUNT(cur_elem));
        HASH_DEL(ht, cur_elem);
        my_struct_removeref(cur_elem);
        printf("---END LOOP BODY---\n");
    }
 
   
}

In fact, we can already tell that the output you pasted doesn't come from the loop you pasted. In the loop, you do printf("---END LOOP BODY---\n"); and then immediately printf("---BEGIN LOOP BODY---\n"); to begin the next iteration. In the output, you can see lines appearing in between iterations, like this:

That is correct. The log inbetween ---END LOOP BODY--- and ---BEGIN LOOP BODY--- is generated by instrumented code.

BTW, I don't use uthash. I just selected uthash as test case for the gcc plugin and encountered the same issue as reported here.

I provided the trace so that it would be helpful to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants