Project Impact Graph Performance Issue #2

wesley-weiming · 2023-12-18T02:40:47Z

Plan A

In the initial version of project-impact-graph.yaml, we used glob patterns to describe which files belong to a project, so looking up the project for a given path requires a glob library to do the pattern matching.

https://github.com/tiktok/project-impact-graph/tree/main

projects:
  "A":
    includedGlobs:
      - "projects/A/**"
    excludedGlobs:
      - "projects/A/README.md"
    dependentProjects:
      - "A"
      - "B"
      - "C"

During performance testing, we found that glob-based path matching carries a significant performance cost.
We tested two glob libraries, minimatch and micromatch. As the table below shows, once the number of paths reaches 1000, matching takes more than 20 seconds with either library.

Plan B

We tried using folder prefix strings to represent projects (the same approach rush.json uses), so that plain string matching can determine which project a path belongs to. This greatly improves performance when many paths are queried (see Performance Data).

https://github.com/tiktok/project-impact-graph/tree/feat/add-performance-report

projects:
  "A":
    includedPrefixs:
      - "projects/A/"
    excludedPrefixs:
      - "projects/A/README.md"
    dependentProjects:
      - "A"
      - "B"
      - "C"
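
For illustration, the prefix lookup in Plan B could look roughly like the sketch below. The `ProjectConfig` interface and the `findProjectsForPath` helper are hypothetical names that mirror the YAML above; they are not taken from the project-impact-graph code.

```typescript
// Hypothetical shape mirroring the Plan B YAML; not the library's actual types.
interface ProjectConfig {
  includedPrefixs: string[];
  excludedPrefixs: string[];
}

// Returns the names of all projects whose included prefixes match the path
// and whose excluded prefixes do not.
function findProjectsForPath(
  projects: Record<string, ProjectConfig>,
  filePath: string
): string[] {
  const matches: string[] = [];
  for (const name of Object.keys(projects)) {
    const config = projects[name];
    const included = config.includedPrefixs.some((p) => filePath.startsWith(p));
    const excluded = config.excludedPrefixs.some((p) => filePath.startsWith(p));
    if (included && !excluded) {
      matches.push(name);
    }
  }
  return matches;
}

// Example configuration taken from the YAML above.
const projects: Record<string, ProjectConfig> = {
  A: {
    includedPrefixs: ["projects/A/"],
    excludedPrefixs: ["projects/A/README.md"],
  },
};
```

Each check is a plain string comparison, so no pattern has to be parsed or compiled per path.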

Performance Data

nodeCount	edgeCount	pathCountA	pathCountB	String.startsWith (Plan B)	minimatch (Plan A)	micromatch (Plan A)
1000	5000	1	1	0.489s	0.359s	0.419s
1000	5000	10	10	0.425s	1.094s	0.475s
1000	5000	100	100	0.448s	9.324s	2.483s
1000	5000	1000	1000	0.845s	90.951s	20.477s
2000	10000	1	1	0.913s	0.609s	0.482s
2000	10000	10	10	0.936s	2.248s	0.832s
2000	10000	100	100	1.072s	19.119s	5.087s
2000	10000	1000	1000	1.776s	185.561s	48.235s
3000	100000	1	1	9.46s	1.855s	1.961s
3000	100000	10	10	10.533s	4.498s	2.173s
3000	100000	100	100	11.029s	29.82s	9.449s
3000	100000	1000	1000	11.984s	498.062s	75.94s


octogonz · 2023-12-18T20:33:55Z

Interesting! Seems like there's still room for optimization.

String.startsWith() might be too limiting, though. For example, suppose we wanted to exclude *.md or **/*.md.
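
To make the limitation concrete: a prefix rule cannot express "any Markdown file under this folder". One possible middle ground, shown only as a sketch and not something proposed in this thread, is to keep prefix matching for inclusion and add a suffix check for exclusion. The `excludedSuffixes` field and the `isPathInProject` helper are hypothetical, and a suffix such as ".md" only approximates `**/*.md`; it cannot distinguish it from a top-level-only `*.md`.

```typescript
// Hypothetical config shape: prefixes for inclusion, suffixes for exclusion.
interface HybridProjectConfig {
  includedPrefixes: string[];
  excludedSuffixes: string[]; // e.g. ".md" approximates the glob "**/*.md"
}

// True when the path is under an included prefix and has no excluded suffix.
function isPathInProject(config: HybridProjectConfig, filePath: string): boolean {
  const included = config.includedPrefixes.some((p) => filePath.startsWith(p));
  if (!included) {
    return false;
  }
  return !config.excludedSuffixes.some((s) => filePath.endsWith(s));
}

const projectA: HybridProjectConfig = {
  includedPrefixes: ["projects/A/"],
  excludedSuffixes: [".md"],
};
```

This keeps every check a cheap string comparison, at the cost of supporting far fewer patterns than a real glob library.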
