Description
Hello
The parser works nicely until the data being parsed gets too large. This is not a RAM problem, because I have plenty of it (~64 GB).
I'm parsing data like this:
SELECT {fields} FROM {table} WHERE "T" >= {t0} AND "T" <= {tf} ORDER BY "T" ASC
with {fields} being of types BIGINT, FLOAT, FLOAT.
I'm using the following schema:
test_schema = Schema("table", [
    num('T', int=True)  # ,
    # num('p'),
    # num('q')
])
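For reference, the configuration that triggers the crash described below is the same schema with the commented columns restored (same Schema/num API as above):

test_schema = Schema("table", [
    num('T', int=True),
    num('p'),
    num('q'),
])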
If I add 'p' and 'q', the parser still works, but when inspecting the dataframe, the code produces a segfault shortly after the data gets bigger than ~1.2 GB. So if I'm parsing only T, tf_ms - t0_ms can be bigger than if I'm parsing T, p and q.
I don't think this is a data corruption problem, because I'm able to reduce or shift [t0_ms; tf_ms] to parse the data (I could, for example, parse approximately 1 GB of data at a time and do it in several batches, as in the sketch below).
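A minimal sketch of that workaround, assuming a parse(t0, tf) callable that wraps the actual COPY call (both parse and window_ms are placeholders of mine, not part of the library):

def parse_in_chunks(parse, t0_ms, tf_ms, window_ms):
    """Split [t0_ms, tf_ms] into windows small enough that each
    COPY stays well under the ~1.2 GB threshold, then collect them."""
    frames = []
    start = t0_ms
    while start <= tf_ms:
        end = min(start + window_ms - 1, tf_ms)
        frames.append(parse(start, end))  # one COPY per window
        start = end + 1
    return frames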
When running under gdb, I find the segfault here:
start copy expert 1624877358.4578984
sql query copied to store 1624877401.6580575
start BINARY read
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffd2361a4e in __pyx_f_11psql_binary_read_reversed2 (__pyx_v_pos=<synthetic pointer>, __pyx_v_src=0x7ffed63a8030 "PGCOPY\n\377\r\n", __pyx_v_target=0x7fffffffd1c2 "\001") at psql_binary.c:3705
3705 __pyx_f_11psql_binary_read_reversed2(((char *)(&__pyx_v_column_count)), (&(*((char *) ( /* dim=0 */ (__pyx_v_f.data + __pyx_t_15 * __pyx_v_f.strides[0]) )))), (&__pyx_v_pos));
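The crash site suggests the 2-byte column count read (read_reversed2 presumably byte-swaps the big-endian value) is running past the end of the buffer. To check whether the dump itself is well formed, a pure-Python walk of the same PGCOPY BINARY layout could look like this (validate_pgcopy is my own helper, not part of the library; a struct.error on a truncated buffer would point at the stream, a clean walk at the parser):

import struct

SIGNATURE = b"PGCOPY\n\xff\r\n\x00"  # 11-byte PGCOPY binary signature

def validate_pgcopy(buf):
    """Walk a PGCOPY BINARY buffer and return the tuple count.
    Raises ValueError on a bad signature, struct.error if any
    offset runs past the end of the buffer."""
    if not buf.startswith(SIGNATURE):
        raise ValueError("bad PGCOPY signature")
    pos = len(SIGNATURE)
    flags, ext_len = struct.unpack_from(">ii", buf, pos)
    pos += 8 + ext_len                   # skip flags + header extension
    tuples = 0
    while True:
        (ncols,) = struct.unpack_from(">h", buf, pos)  # int16 column count
        pos += 2
        if ncols == -1:                  # file trailer
            return tuples
        for _ in range(ncols):
            (flen,) = struct.unpack_from(">i", buf, pos)  # int32 field length
            pos += 4
            if flen != -1:               # -1 marks a NULL field
                pos += flen
        tuples += 1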
Also, when compiling psql_binary.pyx, I get a warning here; not sure if it's related:
warning: psql_binary.pyx:289:47: Buffer unpacking not optimized away.
Please share your thoughts if you have an idea about the issue; this is such a great tool.
Thank you for your attention.