Improve memory footprint of columnar deserialization using relevant filter #895

TonyXiang8787 · 2025-02-13T13:32:55Z

Background

When using PGM deserialization, the user can select a relevant filter for the columnar data format. The deserializer then returns only the relevant attributes for each component.

Internally, the Python routine still creates buffers for all attributes, and only afterwards discards the attributes that contain only NaN. This is an unwanted memory footprint.

Improvement

In general, we cannot know which attributes are relevant before parsing the whole dataset. However, if the condensed attribute list is given in the dataset header and there are no map-like elements in the data, we know for sure that the relevant attributes can only be those specified in the header. In that case, we can ask the Python routine to create only those buffers. This should reduce the memory footprint of the deserializer.
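
As a rough illustration of the idea (not the actual implementation in this PR), the selection rule could be sketched as follows. The attribute names, the helper names `attributes_to_allocate` and `create_buffers`, and the use of plain `array` columns of doubles are all assumptions made for this example; the real routine works with the library's own dataset metadata and buffer types.

```python
import math
from array import array

# Hypothetical attribute list for one component type (illustrative only).
ALL_ATTRIBUTES = ["id", "u_rated", "u_ref", "p_specified", "q_specified"]


def attributes_to_allocate(all_attributes, header_attributes, has_map):
    """Return the attributes that need a buffer.

    If the header lists the attributes and no element in the data is
    map-like, only the header attributes can carry data. In every other
    case we fall back to allocating all attributes.
    """
    if header_attributes is not None and not has_map:
        wanted = set(header_attributes)
        # Preserve the component's own attribute order.
        return [name for name in all_attributes if name in wanted]
    return list(all_attributes)


def create_buffers(attributes, n_elements):
    """Create one NaN-filled column per attribute (simplified to doubles)."""
    return {name: array("d", [math.nan] * n_elements) for name in attributes}


# Header lists two attributes and the data has no map-like elements:
selected = attributes_to_allocate(ALL_ATTRIBUTES, ["id", "u_rated"], has_map=False)
buffers = create_buffers(selected, n_elements=3)
```

With these assumptions, only two columns are allocated instead of five. As soon as a map-like element is present, or the header carries no attribute list, the sketch falls back to allocating every attribute, which corresponds to the existing behavior described under Background.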

Implementation

Signed-off-by: Tony Xiang <[email protected]>

mgovers · 2025-02-13T15:59:04Z

power_grid_model_c/power_grid_model/include/power_grid_model/auxiliary/dataset.hpp

@@ -336,6 +339,21 @@ template <dataset_type_tag dataset_type_> class Dataset {
        add_component_info_impl(component, elements_per_scenario, total_elements);
    }

+    void enable_atrribute_indications(std::string_view component)


typo

Suggested change

-    void enable_atrribute_indications(std::string_view component)
+    void enable_attribute_indications(std::string_view component)

mgovers · 2025-02-13T16:00:01Z

...d_model_c/power_grid_model/include/power_grid_model/auxiliary/serialization/deserializer.hpp

@@ -152,6 +152,14 @@ struct DefaultNullVisitor : msgpack::null_visitor {
    }
 };

+struct NullVisitorCheckMap : DefaultNullVisitor {


it's not really a null-visitor anymore, right?

Suggested change

-struct NullVisitorCheckMap : DefaultNullVisitor {
+struct CheckHasMap : DefaultNullVisitor {

mgovers · 2025-02-13T16:02:49Z

power_grid_model_c/power_grid_model_c/include/power_grid_model_c/dataset.h

+ * @brief Return if a component has attribute indications.
+ *
+ * Attribute indications are used to indicate the presence of meaningful attributes for a certain component in the
+ * dataset.


maybe it's good to also mention the behavior if it has both preamble attribute indications and in-data attributes

sonarqubecloud · 2025-02-20T12:43:50Z

Quality Gate passed

Issues
7 New issues
0 Accepted issues

Measures
0 Security Hotspots
85.7% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

TonyXiang8787 and others added 19 commits January 18, 2025 14:42

start visitor

b4b8eae

Signed-off-by: Tony Xiang <[email protected]>

parse has map in the first run

32b637b

Signed-off-by: Tony Xiang <[email protected]>

enable attribute indications

fd8ff8c

Signed-off-by: Tony Xiang <[email protected]>

set attribute indications to dataset

d983c77

Signed-off-by: Tony Xiang <[email protected]>

optimize process

cc6c422

Signed-off-by: Tony Xiang <[email protected]>

skip whole scenario

af44232

Signed-off-by: Tony Xiang <[email protected]>

skip whole scenario

926993b

Signed-off-by: Tony Xiang <[email protected]>

commments

9db08f9

Signed-off-by: Tony Xiang <[email protected]>

C-API definition

4d8e19e

Signed-off-by: Tony Xiang <[email protected]>

c api implementation

d12b5f5

Signed-off-by: Tony Xiang <[email protected]>

Merge branch 'main' into feature/columnar-deserializer-improvement

f41351c

fix bug on offset

d0887bc

Signed-off-by: Tony Xiang <[email protected]>

fix a bug on relevant filter

0060c9a

Signed-off-by: Tony Xiang <[email protected]>

down merge

f4e780c

Signed-off-by: Tony Xiang <[email protected]>

change name

159eae8

Signed-off-by: Tony Xiang <[email protected]>

set c binding

2cdd407

Signed-off-by: Tony Xiang <[email protected]>

format

34614ca

Signed-off-by: Tony Xiang <[email protected]>

get indications

81fa0a3

Signed-off-by: Tony Xiang <[email protected]>

python api passes all existing tests

474f468

Signed-off-by: Tony Xiang <[email protected]>

TonyXiang8787 marked this pull request as draft February 13, 2025 13:33

fix lint

96d7c79

Signed-off-by: Tony Xiang <[email protected]>

mgovers reviewed Feb 14, 2025


Merge branch 'main' into feature/columnar-deserializer-improvement

1dd9f1c
