Commit a9bad65: Add basic functionality + concurrency
1 parent: eac554c
Showing 20 changed files with 1,522 additions and 149 deletions.
@@ -16,3 +16,7 @@ rocksdb = "0.22.0"

```toml
pretty_assertions = "0.7"
select = "0.5"

[dev-dependencies]
tempfile = "3.2.0"
```
@@ -0,0 +1,49 @@

```javascript
// Install axios if you haven't already:
// npm install axios

const axios = require('axios');

const config = {
  method: 'get',
  maxBodyLength: Infinity,
  url: 'http://localhost:3000/namespaces',
  headers: {
    'Content-Type': 'application/json'
  }
};

// Function to make concurrent requests
async function sendConcurrentRequests() {
  const numRequests = 10;
  const requests = [];

  // Create an array of axios promises; resolve failures to null so a single
  // failed request does not reject the whole Promise.all below.
  for (let i = 0; i < numRequests; i++) {
    console.log('sending..');
    requests.push(
      axios.request(config).catch((error) => {
        console.error(`Request ${i + 1} failed:`, error.message);
        return null;
      })
    );
  }

  // Execute all requests concurrently and process the responses.
  const responses = await Promise.all(requests);
  responses.forEach((response, index) => {
    if (response) {
      console.log(`Request ${index + 1} status: ${response.status}`);
      console.log(JSON.stringify(response.data));
    }
  });
}

// Call the function to send concurrent requests
sendConcurrentRequests();
```
@@ -0,0 +1,11 @@

```json
{
  "name": "client",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "SimranMakhija7",
  "license": "ISC"
}
```
@@ -0,0 +1,162 @@

# Design Document

## Overview

Modern OLAP systems are designed to perform analytic queries, which often involve subqueries, joins, and aggregation. The query optimizer needs metadata and statistics, such as data distribution and indexes, to do a better job. To solve this problem, we build a catalog that serves as a “database of metadata for the database”: it stores statistics from the execution engine and data discovery from I/O services, and it provides metadata to the planner and data locations for scheduling queries.

## Architectural Design

![Architectural Diagram](https://github.com/teyenc/15721_catalog_private/assets/56297484/706e834b-9983-4de2-ad28-2c0e4acf3fd2)

#### Input/Output

We expose a REST API for interaction with other components. The inputs and outputs are documented in the API spec.

#### Components

Our architecture consists of a Rust service application and RocksDB as an embedded database.

##### Rust Application

A Rust application that exposes a REST API to modify the metadata of the OLAP database. The application consists of the following components (a sketch of the data model follows the list):

- A data model that defines the structure and meaning of the metadata we store, such as schemas, tables, columns, measures, dimensions, hierarchies, and levels. The data model is represented by Rust structs that can be serialized and deserialized using Substrait.
- A database layer that interacts with RocksDB. The database layer provides methods for storing and retrieving the database metadata as key-value pairs using the RocksDB crate.
- A service layer that contains the business logic for the REST API, such as validating inputs, checking permissions, and handling errors. The service layer depends on the database layer and uses the data model to manipulate the database metadata.
- A controller layer that exposes the service methods as RESTful endpoints using a web framework, such as warp or axum. The controller layer uses the web framework's features, such as filters, macros, and async functions, to parse request parameters and format responses.
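
A minimal sketch of what one such data-model struct might look like, assuming serde and serde_json for serialization (the document mentions Substrait; serde is used here purely for illustration, and the field names simply mirror the RocksDB schema below):

```rust
use serde::{Deserialize, Serialize};

/// Illustrative table-metadata entry; the fields follow the
/// "Table Data" column family described in the RocksDB schema below.
#[derive(Debug, Serialize, Deserialize)]
pub struct TableMetadata {
    pub name: String,
    pub num_columns: u64,
    pub read_properties: serde_json::Value,
    pub write_properties: serde_json::Value,
    pub file_urls: Vec<String>,
    pub primary_key_col: Option<String>,
}
```
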
##### Database for metadata

We choose RocksDB as the database in the catalog to store metadata. It is a fast and persistent key-value store that can be used as an embedded database for Rust applications.

##### RocksDB Schema

Column Families

We use column families for a logical separation and grouping of the metadata store (a sketch of how they might be opened follows the list):

- Table Data
  - Name - string (key)
  - Number of columns - u64
  - Read properties - json
  - Write properties - json
  - File URLs array - string array
  - Columns arrays, for each column
    - Aggregates - json object
    - Range of values - int pair
      - Lower bound
      - Upper bound
    - name - string
    - isStrongKey - boolean
    - isWeakKey - boolean
  - Primary Key col name
- Namespace Data
  - Name - string (key)
  - Properties - json object
- Operator statistics
  - Operator string - string (key)
  - Cardinality of prev result - u64
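
As a hedged illustration of this layout, the column families could be opened with the rocksdb crate roughly as follows (the column-family names are placeholders, not settled identifiers):

```rust
use rocksdb::{ColumnFamilyDescriptor, Options, DB};

/// Open the catalog store with one column family per metadata group.
/// Column-family names are illustrative.
fn open_catalog(path: &str) -> Result<DB, rocksdb::Error> {
    let cfs = vec![
        ColumnFamilyDescriptor::new("table_data", Options::default()),
        ColumnFamilyDescriptor::new("namespace_data", Options::default()),
        ColumnFamilyDescriptor::new("operator_stats", Options::default()),
    ];
    let mut db_opts = Options::default();
    db_opts.create_if_missing(true);
    db_opts.create_missing_column_families(true);
    DB::open_cf_descriptors(&db_opts, path, cfs)
}
```
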
#### Tuning/Configuration options

The catalog can be passed a configuration file at bootstrap with the following configuration options (a sketch of the corresponding config struct follows the list):

1. data-warehouse-location
2. client-pool-size
3. cache-enabled
4. cache-expiration-interval
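
A minimal sketch of how these options might be represented and loaded, assuming a TOML config file and the serde and toml crates (both the file format and the field names are assumptions that mirror the list above):

```rust
use serde::Deserialize;

/// Bootstrap configuration; fields mirror the options listed above.
#[derive(Debug, Deserialize)]
pub struct CatalogConfig {
    pub data_warehouse_location: String,
    pub client_pool_size: usize,
    pub cache_enabled: bool,
    /// Interval in seconds (the unit is an assumption).
    pub cache_expiration_interval: u64,
}

/// Load the configuration from a TOML file at `path`.
pub fn load_config(path: &str) -> Result<CatalogConfig, Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string(path)?;
    Ok(toml::from_str(&raw)?)
}
```
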
## Design Rationale

Most design decisions were made under the assumption that there are no schema updates and that writes are infrequent and arrive as bulk loads.

#### Database

We considered two embedded database candidates for the catalog service: SQLite and RocksDB. We chose RocksDB because:

1. Better concurrency control: SQLite locks the entire database for concurrent writes, whereas RocksDB supports snapshots.
2. Flexibility: RocksDB provides more configuration options.
3. Scalability: RocksDB stores data in different partitions, whereas SQLite stores data in one single file, which isn't ideal for scalability.
4. Storage: RocksDB uses key-value storage, which is good for intensive write operations. In short, RocksDB provides better performance and concurrency control for write-intensive workloads.

#### Why a key-value store?

1. Based on [1], the catalog for an OLAP system behaves a lot like an OLTP database. The authors describe how using a key-value store in the form of FoundationDB has proved beneficial for performance and scalability, including support for high-frequency reads and dynamic metadata storage.
2. [2] compares and benchmarks tabular storage against the hierarchical organization of metadata seen in Iceberg, and finds that Iceberg's single-node processing performs better for small tables but fails to scale. It concludes that metadata access patterns have a significant impact on the performance of distributed query processing.
3. Taking these factors into account, we have decided to go with a key-value store for the simplicity and flexibility it provides, along with the performance benefits.

#### Axum

After looking through several available options for building APIs, such as Hyper and Actix, we selected Axum (a sketch of an Axum endpoint follows the list).

- The Axum framework is built on top of Hyper and Tokio and abstracts away some of the low-level details.
- This, however, does not result in any significant performance overhead.
- Benchmarks for these frameworks are listed in [3].
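
As a rough sketch of the controller layer, a single endpoint in Axum might look like this (assuming axum 0.7, tokio, and serde_json as dependencies; the /namespaces route matches the client script above, while the handler body is purely illustrative):

```rust
use axum::{routing::get, Json, Router};
use serde_json::{json, Value};

/// Illustrative handler: list namespaces known to the catalog.
async fn list_namespaces() -> Json<Value> {
    // A real handler would delegate to the service and database layers.
    Json(json!({ "namespaces": ["default"] }))
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/namespaces", get(list_namespaces));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```
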
## Testing plan

### Correctness testing

#### Unit tests

For the correctness of the catalog, we plan to conduct unit tests and regression tests. In unit testing, we will test key components and operations such as metadata retrieval, metadata storage, updates, and snapshot isolation.

Basic unit tests for handler functions are underway.
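
A minimal sketch of such a unit test, exercising the storage layer against a RocksDB instance in a temporary directory (this relies on the tempfile dev-dependency added in this commit; the key and value are placeholders):

```rust
#[cfg(test)]
mod tests {
    use rocksdb::DB;
    use tempfile::TempDir;

    #[test]
    fn namespace_metadata_roundtrip() {
        // tempfile gives each test an isolated, automatically cleaned-up directory.
        let dir = TempDir::new().unwrap();
        let db = DB::open_default(dir.path()).unwrap();

        let key = b"namespace/default";
        let value = br#"{"properties":{}}"#.to_vec();

        db.put(key, &value).unwrap();
        let stored = db.get(key).unwrap().expect("value should exist");
        assert_eq!(stored, value);
    }
}
```

Using a fresh temporary directory per test keeps tests isolated and lets them run in parallel.
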
#### Regression tests

Currently, we plan to run regression tests on:

1. Concurrency and parallelism, to ensure data integrity.
2. Correctness of all the APIs in the API spec documentation.
3. Failure scenarios: test the behavior and performance under failures such as network errors, server failures, or other exceptional conditions.

### Performance testing

1. Concurrency Workloads: Test performance under concurrent access by simulating multiple clients performing various operations simultaneously (e.g., multiple clients creating tables, querying metadata, committing snapshots, etc.).
2. Large Schemas and Datasets: Evaluate performance with tables having a large number of columns (e.g., hundreds or thousands of columns) and large datasets with many partitions or snapshots.
3. Bulk Operations: Test the performance of bulk operations like importing or exporting table metadata, snapshots, or partitions.
4. Mixed Workloads: Combine different types of operations in a single workload to simulate a more realistic scenario where various operations run concurrently.
## Trade-offs and Potential Problems

The biggest trade-off in the current design is the absence of any optimization for updates. Updates to a table cause its stored metadata to become stale, and efficiently refreshing these values is a design challenge. We have not prioritized this, based on the assumption that updates in an OLAP system will be infrequent.

Organizing the namespaces and tables as key prefixes in the store may cause maintainability problems.

#### Database

We chose RocksDB to store metadata, whereas the Iceberg Catalog has its own metadata layer consisting of metadata files, manifest lists, and manifests. Using RocksDB is more straightforward than building that machinery from scratch, but Iceberg's components are optimized specifically for its catalog and could outperform RocksDB, which is not dedicated to catalog service.

## Support for Parallelism

Our catalog service is designed to support parallelism to enhance performance. This is achieved in the following ways (a sketch of concurrent access to the store follows the list):

1. **Concurrency Control in RocksDB**: RocksDB, our chosen database, supports concurrent reads and writes. This allows multiple threads to read and write to the database simultaneously, improving the throughput of our service.

2. **Asynchronous API**: The REST API exposed by our Rust application is asynchronous, meaning it can handle multiple requests at the same time without blocking. This is particularly useful for operations that are I/O-bound, such as reading from or writing to the database.

3. **Thread Pool**: We plan to use a thread pool for handling requests. This allows us to limit the number of threads used by our application, preventing thread thrashing and improving performance.
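
A hedged sketch of the first point: the rocksdb crate's DB handle is Send + Sync, so it can be shared across Tokio tasks behind an Arc and read concurrently (the path and key below are placeholders, and tokio is assumed as the async runtime):

```rust
use std::sync::Arc;
use rocksdb::DB;

#[tokio::main]
async fn main() {
    // Placeholder path; a real deployment would use the configured location.
    let db = Arc::new(DB::open_default("/tmp/catalog-demo").unwrap());
    db.put(b"namespace/default", b"{}").unwrap();

    // Spawn several tasks that read the same key concurrently.
    let mut handles = Vec::new();
    for i in 0..10 {
        let db = Arc::clone(&db);
        handles.push(tokio::spawn(async move {
            let value = db.get(b"namespace/default").unwrap();
            println!("reader {i}: found {} bytes", value.map_or(0, |v| v.len()));
        }));
    }
    for handle in handles {
        handle.await.unwrap();
    }
}
```
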
## Performance Tuning Plan

Performance tuning is crucial for the efficiency and speed of our catalog service. Our plan (a sketch of tuned RocksDB options follows the list):

1. **RocksDB Tuning**: We will tune RocksDB configuration based on our workload. For example, we can adjust the block cache size, write buffer size, and compaction style to optimize for read-heavy or write-heavy workloads. More details can be found in the [RocksDB Tuning Guide](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide).

2. **API Optimization**: We will monitor the performance of our API endpoints and optimize the slow ones, which will involve optimizing database access methods and refactoring code.

3. **Load Testing**: We will conduct load testing to understand how our service performs under heavy load. This will help us identify bottlenecks and areas for improvement.

4. **Monitoring and Metrics**: We will add monitoring and metrics to our service to track performance over time. This will help us identify performance regressions and understand the impact of our tuning efforts.
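
As a non-authoritative sketch of the first item, tuned options built with the rocksdb crate might look like this (the 64 MiB write buffer and 256 MiB block cache are placeholder values, not measured choices):

```rust
use rocksdb::{BlockBasedOptions, Cache, Options};

/// Build RocksDB options tuned for the catalog workload.
/// The sizes below are illustrative starting points.
fn tuned_options() -> Options {
    let mut opts = Options::default();
    opts.create_if_missing(true);

    // Larger memtable to absorb bulk metadata writes (assumed 64 MiB).
    opts.set_write_buffer_size(64 * 1024 * 1024);

    // Shared block cache for a read-heavy metadata workload (assumed 256 MiB).
    let mut block_opts = BlockBasedOptions::default();
    block_opts.set_block_cache(&Cache::new_lru_cache(256 * 1024 * 1024));
    opts.set_block_based_table_factory(&block_opts);

    opts
}
```
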
## Milestones

- 75%: Basic API support
- 100%: Support for parallelism and performance tuning
- 125%: Performance testing against Iceberg Catalog

### References

[1] https://www.snowflake.com/blog/how-foundationdb-powers-snowflake-metadata-forward/
[2] https://15721.courses.cs.cmu.edu/spring2024/papers/18-databricks/p92-jain.pdf
[3] https://github.com/programatik29/rust-web-benchmarks/blob/master/result/hello-world.md