Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: display export limits and optimize export query #155

Merged
merged 8 commits into from
Jan 25, 2024

Conversation

ClemDoum
Copy link
Collaborator

@ClemDoum ClemDoum commented Jan 24, 2024

PR description

Fixes #151
Fixes #152

Addressed both issues in the same PR. Optimizing the export query implied some API changes and hence a frontend update, I made the most of this update to also display the limit used when exporting:

trimmed.mov

Query optimization

Updated export queries from

MATCH (node)
OPTIONAL MATCH (node)-[r]-(other)
RETURN *
LIMIT 1000

to

MATCH (doc:Document)
WITH doc
LIMIT 1000
ORDER BY doc.path ASC
OPTIONAL MATCH (doc)-[rel:APPEARS_IN|SENT|RECEIVED]-(ne:NamedEntity)
RETURN apoc.coll.toSet(collect(doc) + collect(ne) + collect(rel)) AS values

Optimizations consisted in:

  • specifying nodes labels and relationships types to speedup path matching
  • applying on the number of match nodes (earlier in the query)
  • deduplicating exports nodes and relationships

Changes

src (Java)

Changed

  • refactored the DumpQuery to be less generic and reflect more closely the type of dump queries executed in practice. Previously the DumpQuery was very close the a Neo4jUtils.Query (match path + where + order by + limit), now the dump query is mean to initialized a single query restrincting the documents nodes to be dumped. The rest of the query (optional match of named entities, deduplication of the return values, applying limits) is computed in a rigid manner while calling asValidated(defaultLimit)
  • exposed the export document node limit to the frontend through GET /graphs/dump/node-limit
  • lazyly retrieve enterprise support as well export document node limit from the Python app

Removed

  • removed unsued sorted graph dump API

plugins/neo4j-graph-widget (frontend)

Added

  • added an overlay to explain graph limits and prevent user from closing download tab
  • updated the extension state to store the export document node limit (retrieved from the backend)

Changed

  • updated calls to the backend to the new API

neo4j-app (Python)

Added

  • added AppConfig.neo4j_app_max_dumped_documents and AppConfig.neo4j_export_batch_size configuration attributes to size exports
  • added a AppConfig.supports_neo4j_parallel_runtime attribute to indicate whether the neo4j instance supports parallel runtime

Changed

  • optimized default export query and export configuration used in dump_graph, as well as updated the function to use a document node limit and forwarded configuration export batch size
  • update the AppConfig to compute both enterprise and parallel runtime support in with_neo4j_support. Refactored the dependency codebase to compute this support when the neo4j driver is ready and updated the FastAPI app configuration with the updated configuration (see dependencies.py)

Fixed

  • fixed a bug when logging progress in an empty project

neo4j script

Changed

  • updated neo4j docker test image to the most recent 4.x (4.4.29)

) -> StreamingResponse:
with log_elapsed_time_cm(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log was remove on purpose here, the elapsed time here can't be correclty computed as we're streaming, this returns right away

@ClemDoum ClemDoum marked this pull request as ready for review January 24, 2024 10:25
@ClemDoum ClemDoum requested a review from a team as a code owner January 24, 2024 10:25
Copy link
Contributor

@caro3801 caro3801 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the UI : the comments are non blocking for the next steps. Looks nice!

plugins/neo4j-graph-widget/components/WidgetNeo4jGraph.vue Outdated Show resolved Hide resolved
plugins/neo4j-graph-widget/components/WidgetNeo4jGraph.vue Outdated Show resolved Hide resolved
plugins/neo4j-graph-widget/components/WidgetNeo4jGraph.vue Outdated Show resolved Hide resolved
@@ -120,17 +154,20 @@ export default {
{ value: GRAPHML, text: 'GraphML' },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In props, for object, a good practice is to put a default value if its not required:

 props: {
    widget: {
      type: Object,
      default:()=>{}
    }
},

if required:

 props: {
    widget: {
      type: Object,
      required:true
    }
},

@ClemDoum ClemDoum merged commit 42f7775 into main Jan 25, 2024
8 checks passed
@ClemDoum ClemDoum deleted the feature/display-limit branch January 25, 2024 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants