Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parallel tests #192

Merged
merged 6 commits into from
Oct 9, 2023
Merged

Conversation

ezra-varady
Copy link
Collaborator

@ezra-varady ezra-varady commented Oct 5, 2023

This PR adds support for a set of parallel tests to ensure that lantern behaves properly under a more realistic workload. At the moment they are not especially interesting, just a PoC, but this includes the changes to support the concept more generally. I have for the time being added these tests to make test to see how they'll interact with CI on this PR. They should eventually be restricted to make test-parallel @var77 if you have ideas about how best to include them in the github actions workflow any advice would be awesome.

Some notes:

  • To run the new tests alone run make test-parallel (currently they'll also run if you run make test but I'd like to change this before merging
  • The schedule for the new tests is test/parallel_schedule.txt
  • The tests themselves live in test/parallel they have the same layout as the existing regression tests
  • At the moment these tests only bring in sift10k do a few inserts and selects and check some obvious invariants at the end if anyone has suggestions for additional tests/invariants to add please let me know and I will include them

Fixes #177

Copy link
Collaborator

@var77 var77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good, great changes! Added some small comments. For the CI we may modify run-tests-linux.sh and run-tests-mac.sh files. to run the parallel tests as well

Copy link
Contributor

@Ngalstyan4 Ngalstyan4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple nits. otherwise ready to merge. Well done!

# if tests are parallel we only do this for the begin tests as we won't be dropping the database until the end
# begin will handle initialization specific to the tests but expects the database already exists
if [ "$PARALLEL" -eq 0 ]; then
psql "$@" -U ${DB_USER} -d postgres -v ECHO=none -q -c "DROP DATABASE IF EXISTS ${TEST_CASE_DB};" 2>/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this not happen in both branches of the if?

if [ "$PARALLEL" -eq 0 ]; then
psql "$@" -U ${DB_USER} -d postgres -v ECHO=none -q -c "DROP DATABASE IF EXISTS ${TEST_CASE_DB};" 2>/dev/null
db_init
psql "$@" -U ${DB_USER} -d ${TEST_CASE_DB} -v ECHO=none -q -f utils/common.sql 2>/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this not happen in both branches as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I'll rework this

else
TEST_FILES=$(cat schedule.txt | grep '^test:' | sed -e 's/^test://' | tr " " "\n" | sed -e '/^$/d')
if [[ "$pgvector_installed" == "1" ]]; then
TEST_FILES=$(cat $SCHEDULE | grep -E '^(test:|test_pgvector:)' | sed -E -e 's/^test:|test_pgvector://' | tr " " "\n" | sed -e '/^$/d')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR, but still important: Do we currently run pgvector compat tests anywhere in ci/cd or release pipeline?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we are running them in the pipeline, as we're installing the pgvector there

@@ -0,0 +1,11 @@
CREATE OR REPLACE FUNCTION random_int_array(dim integer, min integer, max integer) RETURNS integer[] AS $BODY$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this would be useful in regular tests as well. Should be in the utils for regular tests.

… standard tests, cleanup logic in test runner
@ezra-varady
Copy link
Collaborator Author

merged the two branches in the test runner and made copies of a couple utilities

@Ngalstyan4 Ngalstyan4 merged commit 9e3e17b into lanterndata:main Oct 9, 2023
26 checks passed
ezra-varady added a commit to ezra-varady/lanterndb that referenced this pull request Oct 11, 2023
Uses pg_regress  to run tests in parallel against the database.
Allows custom DB initialization and finalization which can be used to load relevant data in the beginning
and check relevant invariants in the end
Ngalstyan4 pushed a commit that referenced this pull request Oct 13, 2023
Uses pg_regress  to run tests in parallel against the database.
Allows custom DB initialization and finalization which can be used to load relevant data in the beginning
and check relevant invariants in the end
Ngalstyan4 pushed a commit that referenced this pull request Oct 13, 2023
Uses pg_regress  to run tests in parallel against the database.
Allows custom DB initialization and finalization which can be used to load relevant data in the beginning
and check relevant invariants in the end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add infrastructure for test suites that run parallel queries on the same table
3 participants