Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrated Boot into Proto-X #31

Merged
merged 29 commits into from
Apr 17, 2024
Merged

Conversation

wangpatrick57
Copy link
Member

@wangpatrick57 wangpatrick57 commented Apr 15, 2024

Summary: Integrated Boot to accelerate the HPO and tuning loops of Proto-X

Demo:
On TPC-H SF10, using Boot (image 1), Proto-X finds a configuration with a "Best Seen Metric" of 69.67s after ~1hr of tuning. Without Boot (image 2), Proto-X only finds a configuration with a "Best Seen Metric" of 103.10s after ~1hr of tuning. Note that "Best Seen Metric" refers to the total runtime of the last run of all 22 queries TPC-H. "Best Metric" is set to "Best Seen Metric" whenever a run has no timed-out queries. Both runs used the same hyperparameters and the same per-query timeout.
Screenshot 2024-04-15 at 10 52 17
Screenshot 2024-04-15 at 10 54 26

Details:

  • Boot is automatically installed when running postgres build.
  • Redis (needed by Boot) is automatically started when Boot is enabled. This Redis does not conflict with the Redis instance used by Ray.
  • Boot can be toggled on and off with the CLI arg --enable-boot-during-[hpo|tune]. Boot is enabled separately for HPO and for tuning.
  • Boot can be configured with a .yaml file passed in through the CLI.
  • Currently, the results from Boot being enabled show the estimated runtime of the TPC-H workload, not the actual runtime. Thus, the results should be taken with a grain of salt. A future PR will resolve this.
  • Boot could potentially be even more effective if the per-query timeout was also decreased to reflect how Boot accelerates queries. This may be investigated in the future.
  • Currently, Boot's entire intelligent cache is cleared every time the configuration changes. A future PR will implement logic to selectively clear parts of the cache when the configuration changes.

@lmwnshn lmwnshn self-requested a review April 17, 2024 15:18
Copy link
Collaborator

@lmwnshn lmwnshn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only comments are minor thoughts as I read it. Nice work independently integrating everything.

INTENDED_PGDATA_HARDWARE=ssd
PGDATA_PARENT_DPATH=/mnt/nvme1n1/phw2/dbgym_tmp/

# space for testing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this is temporary? Not sure if you wanted to remove this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to run specific commands within main.sh. It's inconvenient to copy them directly because they rely on ENV_VARS, so this is my way of "copying" them. I'll add a comment better explaining this, but I imagine it'll be permanent.

INTENDED_PGDATA_HARDWARE=ssd
PGDATA_PARENT_DPATH=/mnt/nvme1n1/phw2/dbgym_tmp/

# space for testing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment, but doesn't matter much.

@@ -208,12 +220,59 @@ def start_with_changes(
"Waiting for postgres to bootup but it is not..."
)

# Move the temporary over since we know the temporary can load.
# Set up Boot if we're told to do so
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaks abstractions between modules somewhat, (proto-x knowing about Boot), but from briefly thinking about it, it doesn't seem like fixing the abstraction leak is worth it.

I think in an ideal world, the connection handling logic would be independent of the tuner -- then stuff like Boot doesn't need to get pushed down into the tuner.

However, given that the tuner currently manages its own connections, this is fine. As long as there aren't more than 3-5 things that leak, it probably won't be too bad.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, that's something I didn't realize. I think there is an argument for why Proto-X should know about Boot though, because when you use Proto-X to tune, you need to indicate whether Boot was used during tuning or not to know whether the numbers are real runtimes or estimated. We can revisit this later though.

@wangpatrick57 wangpatrick57 merged commit 523ceae into cmu-db:main Apr 17, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants