INSERT INTO hypertable using a PL/pgSQL function results in high RAM usage #7053

Closed
Kazmirchuk opened this issue Jun 20, 2024 · 8 comments · Fixed by #7088
@Kazmirchuk

What type of bug is this?

Performance issue

What subsystems and features are affected?

Data ingestion, Query executor, Query planner

What happened?

A query of the form

INSERT INTO hypertable (...)
SELECT some_plpgsql_function(column)
FROM postgres_table

performs poorly and causes a very visible spike in RAM usage that can end with the Linux OOM killer terminating Postgres.

The bug does not manifest if:

  • I use a pure SQL function instead,
  • I insert into a regular Postgres table,
  • or I select from generate_series().

(htop screenshots attached to the issue show memory usage when the system is idle and while the query is running.)

TimescaleDB version affected

2.15.2

PostgreSQL version used

14.8

What operating system did you use?

Red Hat 8.7

What installation method did you use?

RPM

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

1. Create a usual Postgres table:
CREATE TABLE import_param
(
    pk serial PRIMARY KEY,
    obt timestamptz NOT NULL,  -- obt=onboard time
    device_id int,
    raw_value text,
    eng_value text
);

2. Populate it with 700K rows of random values, so that roughly 50% of raw_value and eng_value are valid float8 values (as text) and the rest are plain text strings like 'foobar'.
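For example, something along these lines works (the exact data does not matter; the ~1 hour time spread keeps the chunk count from step 3 small):

-- Example data generator: about half of the text values parse as float8, the rest don't.
INSERT INTO import_param (obt, device_id, raw_value, eng_value)
SELECT now() - i * interval '5 milliseconds',
       (random() * 100)::int,
       CASE WHEN random() < 0.5 THEN (random() * 1000)::text ELSE 'foobar' END,
       CASE WHEN random() < 0.5 THEN (random() * 1000)::text ELSE 'foobar' END
FROM generate_series(1, 700000) AS i;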

3. Create a hypertable:
CREATE TABLE tm_param
(
    obt timestamptz NOT NULL,
    device_id int,
    raw_value float8,
    eng_value float8
);
SELECT * FROM create_hypertable('tm_param', 'obt', 
chunk_time_interval => INTERVAL '15m', 
create_default_indexes => FALSE);
Adjust the interval so that importing the data from import_param results in a reasonable number of chunks; I had 4 in my tests (the chunk count can be checked as shown below).
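To verify the chunk count once the data from step 5 has been imported:

-- show_chunks() lists the chunks of a hypertable.
SELECT count(*) FROM show_chunks('tm_param');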

4. Create a function to convert text to float8, because PostgreSQL doesn't have try_cast:
CREATE OR REPLACE FUNCTION to_double(_in text, INOUT _out double precision DEFAULT NULL)
    LANGUAGE plpgsql
    IMMUTABLE
    PARALLEL SAFE
AS $$
BEGIN
   SELECT CAST(_in AS double precision)
   INTO  _out;
EXCEPTION WHEN others THEN
   -- do nothing: _out keeps its default (NULL)
END;
$$;
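A quick sanity check of the helper (expected output: 3.14 and NULL):

SELECT to_double('3.14') AS ok, to_double('foobar') AS bad;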

5. Copy all data from import_param to tm_param:

INSERT INTO tm_param (obt, device_id, raw_value, eng_value)
SELECT obt, device_id,
       to_double(raw_value),
       to_double(eng_value)
FROM import_param;

6. Monitor RAM consumption while the query is running (for example with htop, or as sketched below).
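Besides htop, the per-context memory growth of the backend running the INSERT can be inspected from a second session with pg_log_backend_memory_contexts(), available since PostgreSQL 14; it writes the memory context breakdown to the server log:

-- Run from another session while the INSERT is executing
-- (typically requires superuser privileges).
SELECT pg_log_backend_memory_contexts(pid)
FROM pg_stat_activity
WHERE query LIKE 'INSERT INTO tm_param%';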
Kazmirchuk added the bug label Jun 20, 2024
@Kazmirchuk (Author)

btw, I'm not talking about a huge amount of data here:

SELECT pg_size_pretty(table_bytes) FROM hypertable_detailed_size('tm_param');

=> 39 MB

@Kazmirchuk (Author)

I have managed to solve my specific problem by rewriting my function in pure SQL (roughly as sketched below), but the general question of using PL/pgSQL in such a context remains. Unfortunately, adding "parallel safe" made no difference.
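For reference, a pure SQL variant along these lines sidesteps the problem. This is only a sketch of the approach, not the exact function I ended up with; unlike the PL/pgSQL version it only accepts plain numeric literals rather than everything the float8 cast can parse:

-- Returns NULL when the text does not look like a numeric literal,
-- instead of trapping the cast error in an exception handler.
CREATE OR REPLACE FUNCTION to_double_sql(_in text)
    RETURNS double precision
    LANGUAGE sql
    IMMUTABLE
    PARALLEL SAFE
AS $$
    SELECT CASE
               WHEN _in ~ '^[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?$'
               THEN _in::double precision
           END;
$$;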

@nikkhils (Contributor)

@Kazmirchuk not sure if this indicates a real problem. The Linux OS does memory cleanup as and when needed.

@Kazmirchuk (Author)

> @Kazmirchuk not sure if this indicates a real problem. The Linux OS does memory cleanup as and when needed.

PostgreSQL crashed every time. IMHO this is a real problem that does not occur when using standard PostgreSQL tables.

@nikkhils (Contributor) commented Jul 1, 2024

@Kazmirchuk thanks for the repro steps. I can confirm that there's a memory leak when plpgsql functions are involved in the INSERT queries. We will continue to investigate this.

nikkhils self-assigned this Jul 1, 2024
@nikkhils (Contributor) commented Jul 1, 2024

When I dump the memory contexts I can see multiple entries for CurTransactionContext, which is weird. And there's an additional entry every time the plpgsql function gets invoked for each tuple. So it's basically an 8 KB leak per invocation, which can quickly add up when a lot of function calls are involved (in this repro, roughly 700,000 rows × 2 calls × 8 KB ≈ 11 GB).

TopTransactionContext: 8192 total in 1 blocks; 6944 free (3 chunks); 1248 used
  CurTransactionContext: 8192 total in 1 blocks; 7928 free (1 chunks); 264 used
  SPI Exec: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
  SPI Proc: 8192 total in 1 blocks; 7216 free (0 chunks); 976 used
    PLpgSQL per-statement data: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
  CurTransactionContext: 8192 total in 1 blocks; 7928 free (1 chunks); 264 used
  CurTransactionContext: 8192 total in 1 blocks; 7928 free (1 chunks); 264 used
  ExecutorState: 8192 total in 1 blocks; 7392 free (3 chunks); 800 used
Grand total: 65536 bytes in 8 blocks; 61192 free (9 chunks); 4344 used

@nikkhils (Contributor) commented Jul 1, 2024

This happens in SPI_finish when we delete the "SPI Proc" memory context. It has nextchild and prevchild entries set up, and the CurTransactionContext referenced via prevchild gets chained back into TopTransactionContext, leading to the leak.

nikkhils added a commit that referenced this issue Jul 2, 2024
If plpgsql functions are used in DML queries then we were leaking 8KB
for every invocation of that function. This can quickly add up.

The issue was that the "CurTransactionContext" was not getting cleaned
up after every invocation. The reason was that we were inadvertently
allocating a temporary list in that context. Postgres then thought that
this CurTransactionContext needs to be re-used further and kept it
around. We now use a proper memory context to avoid this.

Fixes #7053
The same fix was subsequently pushed in further commits referencing this issue on Jul 2, 2024 by nikkhils, github-actions bot, and timescale-automation (cherry picked from commit ebbca2d).
@Kazmirchuk (Author)

many thanks for the quick fix! 👍
