Add DuckDB (Parquet) results #52

Merged on Nov 25, 2022 (7 commits)

Mytherin (Contributor) commented on Nov 15, 2022

This PR adds results for DuckDB running over Parquet files. I have used the same Parquet files as clickhouse-local.

Note that there are some issues with the Parquet file metadata (see #18). Specifically, the VARCHAR, DATE, and TIMESTAMP columns are not marked as such in the files. I have added the following workaround to the view definition so that the queries run:

CREATE VIEW hits AS
SELECT *
	REPLACE
	(epoch_ms(EventTime * 1000) AS EventTime,
	 DATE '1970-01-01' + INTERVAL (EventDate) DAYS AS EventDate)
FROM read_parquet('hits_*.parquet', binary_as_string=True);

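This constructs proper TIMESTAMP and DATE values from the raw integers stored in the Parquet files: EventTime is treated as seconds since the Unix epoch and EventDate as days since 1970-01-01. The binary_as_string=True option makes DuckDB read the unmarked binary columns as VARCHAR.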
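As a rough cross-check of what the two REPLACE expressions compute, the same arithmetic can be sketched in plain Python. This assumes, as the view implies, that EventTime holds whole seconds since the Unix epoch and EventDate holds whole days since 1970-01-01. The function names and sample values are hypothetical and are not taken from the dataset or from DuckDB.

```python
from datetime import date, datetime, timedelta

# Naive (zone-less) epoch values, to mirror a plain TIMESTAMP / DATE.
EPOCH_TIMESTAMP = datetime(1970, 1, 1)
EPOCH_DATE = date(1970, 1, 1)


def event_time_from_seconds(seconds: int) -> datetime:
    """Mirror epoch_ms(EventTime * 1000): seconds -> milliseconds -> timestamp."""
    millis = seconds * 1000
    return EPOCH_TIMESTAMP + timedelta(milliseconds=millis)


def event_date_from_days(days: int) -> date:
    """Mirror DATE '1970-01-01' + INTERVAL (EventDate) DAYS."""
    return EPOCH_DATE + timedelta(days=days)


if __name__ == "__main__":
    # Hypothetical sample values: one day after the epoch in both encodings.
    print(event_time_from_seconds(86400))  # 1970-01-02 00:00:00
    print(event_date_from_days(1))         # 1970-01-02
```

If both columns describe the same event, the date part of the converted EventTime should match the converted EventDate, which gives a quick way to sanity-check the workaround on a few rows.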

@alexey-milovidov alexey-milovidov self-assigned this Nov 25, 2022
@alexey-milovidov alexey-milovidov merged commit 965edca into ClickHouse:main Nov 25, 2022
alexey-milovidov (Member) commented:

Thank you! It turned out to be faster than clickhouse-local.

alexey-milovidov added a commit that referenced this pull request Nov 25, 2022