Release notes

1.2.2

Bug Fix:

  • Fix usage of technical columns when the load table is set to overwrite
  • Fix GCP log serialization
  • Fix tree validation
  • Fix intermediate format to ORC rules
  • Fix conversion from Spark schema to Starlake schema
  • Fix template resolution
  • Fix merge statements when used with quoted columns

1.2.1

Bug Fix:

  • Propagate final name and variant type for nested structs

1.2

New Feature:

  • Custom DDL mapping for JDBC data warehouses (Postgres, Snowflake, Redshift, Synapse, ...)
  • Unit tests for transforms
  • Excel export
  • Custom write strategies using Jinja2
  • The following commands now use the HDFS client to interact with files, enabling cloud storage support such as S3 or GCS:
    • extract-schema
    • extract-data
    • infer-schema
    • yml2xls
  • Add a stringPartitionFunc attribute to the table extraction definition
  • Support load unit tests
  • Add data and schema extraction from a SQL query for JDBC connections
  • Add a migrate command to ease migration between versions
  • The domain template used during schema extraction is now split into two files: a domain template and a table template. Table templates are prefixed with _table_ and domain templates may be prefixed with _domain_ in the config file name, e.g. _table_config.sl.yml (see the sketch after this list)
  • CURRENT_DATE, CURRENT_TIME and CURRENT_TIMESTAMP are now supported at a per-call level during tests
  • Make comet_input_file_name in BigQuery native load as precise as in Spark when files are grouped
  • Support VIEW materialization
  • Testing for load & transform
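
The split extract templates mentioned above can be pictured as follows. This is a minimal sketch: the file names follow the prefix convention described in the entry, but the attributes inside each file are hypothetical placeholders, not the actual template schema.

```yaml
# metadata/extract/_domain_config.sl.yml — domain-level template (contents hypothetical)
extract:
  connectionRef: "my_jdbc_connection"   # assumed attribute name

# metadata/extract/_table_config.sl.yml — table-level template, applied per table (contents hypothetical)
metadata:
  withHeader: true                      # assumed attribute name
```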

Bug Fix:

  • Fix Amazon Redshift write strategy templates
  • Add support for Int and Short results in stringHashFunc. Some databases don't support implicit casts.
  • Fix incremental next query
  • Fix restricted renamed columns and primary key renaming
  • Fix comet_input_file_name usage in native BigQuery load
  • Inherit properties from the default transform

Improvement:

  • Log progress roughly every 30 seconds during long extractions
  • Primary and foreign key extraction failures are no longer considered errors
  • Parallelize data extraction at the schema level

Miscellaneous:

  • Default log level switched from info to error. To restore the previous behavior, set SL_LOG_LEVEL to info.

Deprecations removal:

  • Attaching tables to the domain description or table description in the load stage is no longer authorized starting from 1.2.0, but a config migration is automatically applied to prior configs

1.1.1

Improvement:

  • Append the git hash (or a timestamp when no git info is available) to the printed SNAPSHOT version. Requires an SBT reload to pick up the new settings.

1.1.0

BREAKING CHANGE

  • Data extraction now fails when a table's extraction fails. To keep the previous behavior, use --ignoreExtractionFailure
  • The default data extraction output dir is now 'metadata/extract' instead of 'metadata/load'
  • Defining a YAML config file without explicitly specifying a single root container attribute is now prohibited
  • In load files, schemas can no longer be used. Use tables instead
  • The default timestamp pattern for data extraction is now the ISO format 'yyyy-MM-dd'T'HH:mm:ss.SSSXXX'. To restore the previous behavior, set the timestamp pattern to 'yyyy-MM-dd HH:mm:ss' (see the sketch after this list)
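
A minimal sketch of restoring the pre-1.1.0 pattern. The timestampPattern attribute name and its placement below are assumptions for illustration, not confirmed by this changelog:

```yaml
# metadata/extract/my_extract.sl.yml — illustrative only
extract:
  timestampPattern: "yyyy-MM-dd HH:mm:ss"  # hypothetical attribute restoring the previous default
```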

Bug Fix:

  • Concurrent schema extraction closed resources prematurely
  • Fix versions.sh for Linux
  • Update the Dockerfile to take environment variables into account
  • Throw the expected exception when no connection ref is found
  • Make missing additional columns optional in native BigQuery CSV ingestion
  • Fix quoting in data extraction when no partition is given. The failure occurred when the query didn't quote with '"'
  • Table metadata merge during schema extraction now takes the sink, ignore and directory attributes into account
  • Use the default load format during native ingestion
  • MySQL extraction could fetch the wrong table's information
  • The data extraction freshness check matched any success state; it now only considers successful extractions
  • Align inferred primitive types with the ones declared in types.sl.yml
  • Fix the Dockerfile for the latest Alpine by adding the bash package
  • Fix precedence of data extraction modes
  • Handle a null pointer exception in JSON schema inference
  • Infer schemas from complex JSON with arrays and structs

Improvement:

  • Add auditConnectionRef to JDBC extract schemas to be on par with connectionRef behavior
  • Warn when the requested Starlake version is not installed yet
  • Add support for MySQL extraction
  • Add the ability to rename columns during schema and data extraction
  • Improve the error message when a referenced type is not declared in types
  • Improve type inference in schema inference for DSV files

Feature:

  • Generic templating framework for DAG generation through a Python library for Starlake jobs
  • Load gzip-compressed files (.gz extension) into BigQuery
  • Add adaptive load with the following strategies: OVERWRITE, APPEND, UPSERT_BY_KEY, UPSERT_BY_KEY_AND_TIMESTAMP, OVERWRITE_BY_PARTITION (see the sketch below)
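
A sketch of declaring one of these strategies on a table, assuming the writeStrategy block found in Starlake table definitions (exact attribute names may differ by version):

```yaml
# load/my_domain/my_table.sl.yml — illustrative sketch
table:
  name: "my_table"
  metadata:
    writeStrategy:
      type: "UPSERT_BY_KEY_AND_TIMESTAMP"  # one of the strategies listed above
      key: [id]                            # assumed: merge key column(s)
      timestamp: updated_at                # assumed: column deciding which record wins
```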

1.0.0

  • BREAKING CHANGE
    • STAGE is no longer used in MetricsJob and Expectations. Existing metrics databases need to be updated.
    • Remove the deprecated metadata.partition property. Now part of Sink.
    • Remove the deprecated metadata.xml property. Now part of metadata.options.

Feature:

  • The CLI now supports multiple Starlake versions at once and uses the correct one based on sl_versions.sh/cmd in SL_ROOT
  • The CLI can now upgrade all components except the HADOOP extra elements on Windows
  • Support any JDBC compliant database
  • Add archive table support for BigQuery
  • Configure CSV data extraction output format
  • Amazon Redshift support
  • Expectations macro support

Improvement:

  • Count null partition rows as rejected with dynamic partition overwrite
  • BREAKING CHANGE: extract-schema sanitizes the domain name if sanitizeName is true, so the domain name and its folder share the same value. By default sanitizeName is false.
  • 'directory' is no longer mandatory in the extract template
  • Load only individual domains in extract-schema

Bug Fix:

  • Data extraction retrieved the last extraction datetime but didn't get the right one for partitioned tables

0.8.0

  • DEPRECATED

    • All date/time-related variables are now deprecated, e.g. sl_date, sl_year, ...
  • BREAKING CHANGE

    • The extract-schema command line option 'mapping' is replaced by 'config'
    • kafkaload now takes a connection ref parameter
    • application.conf is replaced with application.sl.yml or application.yml in the metadata folder
    • SL_FS is no longer used. Set SL_ROOT to an absolute path instead
    • SL_ENGINE is no longer used. The engine is derived from the connection
    • format renamed to sparkFormat in connections
    • COMET_* env vars are definitively replaced with SL_*
    • Sinks' "name" attribute renamed to "connectionRef" (see the sketch after this list)
    • extensions are no longer used in file detection. Table patterns are applied directly to detect the correct files
    • A default connection ref may be defined in the application.yml file
    • Sink names in XLS files are now translated to connection ref names
    • The "domains" and "jobs" folders are renamed to "load" and "transform" respectively
    • The "load" and "watch" commands are merged into a single command that watches for new files and loads them
    • globalJDBCSchema renamed to default
    • SL_DEFAULT_FORMAT renamed to SL_DEFAULT_WRITE_FORMAT
    • SINK_ACCEPTED and SINK_REJECTED durations are no longer logged. Only the full LOAD and TRANSFORM times are logged
    • Configuration files now have the .sl.yml extension. On Linux/macOS, you may have to run the following to migrate (on macOS, first install rename with brew install rename):

      ```bash
      find . -name "*.comet.yml" -exec rename 's/\.comet.yml$/.sl.yml/' '{}' +
      ```

      On Windows:

      ```powershell
      Get-ChildItem -Path . -Filter "*.comet.yml" -Recurse | Rename-Item -NewName { $_.name -replace '\.comet\.yml$','.sl.yml' }
      ```
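
To illustrate the sink "name" to "connectionRef" rename above, a minimal before/after sketch (only the renamed attribute is shown; any surrounding sink attributes are omitted):

```yaml
# before 0.8.0
sink:
  name: "bigquery"
# from 0.8.0 on
sink:
  connectionRef: "bigquery"
```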

Feature:

  • Databricks on Azure is now fully documented

  • Auto merge support added at the task level. MERGE INTO is used to merge data into the target table automatically.

  • Use Refs file to configure model references

  • Support native loading of data into BigQuery

  • Define JDBC connections and audit connections in metadata/connections.sl.yml (see the sketch after this list)

  • Schema extraction and features relying on it now benefit from parallel fetching

  • Use the load dataset path as the default output dir for schema inference when none is defined

  • Match Spark's file ingestion behavior in the BigQuery native loader. The loader follows the same limits as bq load and doesn't support the following ingestion phases:

    • line ignore filter
    • pre-sql
    • post-sql
    • detailed rejection
    • udf privacy
    • data validation
    • expectations
    • metrics
    • distinct on all lines
    • unique input file name with grouped ingestion
  • sink becomes optional in Spark jobs and falls back to the global connection ref settings

  • Add a dynamicPartitionOverwrite sink option, available for the BigQuery and file sinks. No need to set spark.sql.sources.partitionOverwriteMode
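
A minimal sketch of the metadata/connections.sl.yml file mentioned above. The JDBC attribute names below follow common Starlake examples but are assumptions as far as this changelog is concerned:

```yaml
# metadata/connections.sl.yml — illustrative sketch; audit connections live in the same file
connections:
  postgres:
    type: "jdbc"                                    # assumed attribute name
    options:
      url: "jdbc:postgresql://localhost:5432/mydb"
      user: "{{PG_USER}}"
      password: "{{PG_PASSWORD}}"
```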

Bug Fix:

  • BREAKING CHANGE: the new database and tenant fields should be added to the audit table. Run the following SQL to update your audit table on BigQuery:

    ```sql
    ALTER TABLE audit.audit ADD COLUMN IF NOT EXISTS database STRING;
    ALTER TABLE audit.audit ADD COLUMN IF NOT EXISTS tenant STRING;
    ```

  • forceDomainPattern renamed so that it can be overridden with an environment variable
  • The audit log was not in UTC when loaded from local

Feature:

  • extract-schema keeps original scripted fields and merges attribute parameters
  • extract-schema quotes catalog, schema and table names

Deprecated:

  • BREAKING CHANGE: views have their own section. Views inside jobs are now deprecated.

Fix:

  • Take extensions in the domain metadata attribute into account
  • Deserialization of the privacy level was 'null' instead of its default value PrivacyLevel.None
  • Log failures to audit when unexpected behavior occurs during an ingestion job
  • Run audit log and RLS queries as interactive to wait for job output, avoiding async exceptions and improving the accuracy of job output results
  • Escape string parameters when using native queries

0.7.4

Deprecated:

  • BREAKING CHANGE: env vars starting with COMET_ are now replaced with the SL_ prefix in starlake.cmd

Feature:

  • Add SL_PROJECT and SL_TENANT env vars to be used in audit table

Improvements:

  • Retry on retryable BigQuery exceptions: rateLimitExceeded and duplicate
  • Skip table description updates when the description didn't change
  • Skip table column description updates when they didn't change

Bug Fix:

  • Avoid swallowing exceptions in BigquerySparkJob

0.7.3

Feature:

  • BREAKING CHANGE: Rename "cleanOnExtract" to "clean" in ExtractData CLI
  • Imply SL_HIVE=true when running on Databricks

Bug Fix:

  • Report the correct count of accepted JSON lines in audit, with a unit test added
  • Resolve env vars in metadata/application.conf

0.7.2

Feature:

  • Allow full export for tables, using the partition column only to speed up extraction
  • Allow forcing a full export. Useful for re-init cases.
  • Allow changing the date and timestamp formats during data extraction
  • Allow skipping extraction when the last extraction is recent enough
  • Allow cleaning all of a table's files only when that table is extracted
  • Allow including/excluding schemas and/or tables from extraction
  • Allow filtering table data before sinking it (see the sketch after this list)
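
A sketch of these extraction options. The changelog doesn't spell out the YAML keys, so every attribute name below is a hypothetical placeholder:

```yaml
# metadata/extract/my_extract.sl.yml — attribute names are hypothetical
extract:
  jdbcSchemas:
    - schema: "public"
      include: ["orders", "customers"]         # hypothetical: tables to extract
      exclude: ["audit_*"]                     # hypothetical: tables to skip
      tables:
        - name: "orders"
          fullExport: true                     # hypothetical: force a full (non-delta) export
          filter: "order_date >= '2020-01-01'" # hypothetical: filter rows before sinking
```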

Bug Fix:

  • Use audit connection settings, including their column quotes, while fetching the last export
  • Fix division by zero when computing the progress bar
  • Fix an exception in human-readable output when the elapsed time is 0
  • BREAKING CHANGE: a table's fetchSize now has higher precedence than the one defined in jdbcSchema
  • Fix the list command for internal resources in yml2dag
  • BREAKING CHANGE: the default CSV writer configuration in extract-data now matches the Metadata defaults
  • Keep original args in starlake.sh when they contain spaces

0.7.1

Feature:

  • Add support for parallel fetch on String columns in some databases, with the ability to customize it
  • Add support for running as a standalone docker image on any cloud

Performance:

  • Disable fetching table remarks during data extraction
  • Add a parallelism option to data extraction: tables are fetched in parallel along with their partitions

Refactor:

  • Rely on a CSV library during data extraction

Bug Fix:

  • Use a different quote for the audit connection

0.7.0

Deprecated:

  • Env vars starting with COMET_ are now deprecated; replace the prefix with SL_

New Feature:

  • Add Project diffs and produce HTML report
  • Import existing BigQuery Datasets and Tables
  • Generate Dataset and Table statistics
  • Upsert table descriptions
  • Support freshness in command line mode (getting ready for dependency mode)
  • Extract BigQuery table info to the dataset_info and table_info tables
  • Import a BigQuery project into the external metadata folder
  • Automatically switch to ORC when ingesting complex structures (arrays of records) into BigQuery (GoogleCloudDataproc/spark-bigquery-connector#251)
  • Turn the schema extract pattern into a template
  • Add global JDBC schema attributes to set common attributes once
  • Always propagate the domain's metadata to tables
  • Order columns extracted by extract-schema according to the database column order
  • Set the domain description on BigQuery when given
  • Add the ability to consider an empty string a valid value for a required String
  • Apply trim to all numerics during schema extraction, with higher precedence than the trim defined in the template
  • BREAKING CHANGE: add a normalized_domain variable for schema extraction; domain keeps the original name (see the sketch after this list)
  • Keep user changes made in the domain's metadata or table information. The domain template and other rules that apply to it still take precedence
  • Generate domain DAGs from YML via the yml2dag command
  • Env vars should start with SL_; the COMET_ prefix is now deprecated
  • Extract database data using multithreading
  • Extract database data in delta mode
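
A sketch of the normalized_domain variable in a schema extraction template, assuming Jinja-style substitution (supported since 0.6.0); the surrounding attribute is illustrative:

```yaml
# domain template — illustrative
name: "{{ normalized_domain }}"  # sanitized name; {{ domain }} keeps the original
```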

Bug Fix:

  • BREAKING CHANGE: make directory mandatory only when a feature requires it. If you relied on an exception for a missing directory when generating YML files from XLS and vice versa, you'll have to adapt
  • Upsert table descriptions for nested fields
  • Restore the ability to override the intermediate BQ format
  • Exclude specific BQ partitions when applying a merge with a BQ table
  • Apply Spark options defined in the job description (sink) when saving to a file
  • Archived Spark versions may now be referenced in the Starlake CLI (@sabino)
  • Use gs if a full URI is given, even if the default fs is file
  • Remove lowercasing of various names in extract-script to make it consistent with extract-schema
  • Remove comment stripping
  • BREAKING CHANGE: support other projectIds for bqNativeJob
  • Close resources in schema extraction
  • Escape replacement values
  • BREAKING CHANGE: the domain template in extract-schema is no longer interpolated, since domains are meant to be used across multiple environments
  • BREAKING CHANGE: use the original domain and table names. Applies to the starlake folder and audit logs

0.6.3

New Feature:

  • Support task refs in job definition files
  • Support multiple buckets, both between domain files and between domain files and metadata

0.6.2

Bug Fix:

  • Allow BigQuery jobs to work on a dataset in another project, based on the dataset name combined with the projectId

New Feature:

  • Support BigQuery IAM Policy Tags

  • Support task refs in autojob file

  • Support materialized views in autojob

  • BREAKING CHANGE: XLS and YML readers renamed. This is a breaking change if you call them outside the Starlake command line

Bug Fix:

  • Fail gracefully when no SQL is defined for a transform task
  • Make the build work on Windows
  • Fix quickstart bootstrap

Build:

  • Add default SBT options and force test file encoding to UTF-8

Doc:

  • enhance quickstart guide
  • fix some typos

0.6.1

BREAKING CHANGE

  • Extract has been refactored into three different scripts: extract-schema, extract-data and extract-script

0.6.0

New Feature:

  • Support for Jinja templating everywhere
  • area property is now ignored in YAML files
  • Support for Amazon Redshift and Snowflake
  • Quickstart documentation upgraded
  • single command setup and run using starlake.sh / starlake.cmd
  • Updated quickstart with docker use
  • Infer schema now recognizes dates as date, not timestamp

0.5.2

New Feature:

  • Domain & job delivery in the REST API

0.5.1

Bug Fix:

  • Support dynamic values for comet metadata through the REST API

0.5.0

New Feature:

  • Add server mode

Bug Fix:

  • Extensions may be defined at the domain level

0.4.2

Bug Fix:

  • Use Spark Project Jetty shaded class to remove extra jetty dependency in Starlake server

0.4.1

New feature:

  • Added "serve --port 7070" to start starlake in server mode and wait for requests

0.4.0

New feature:

  • Support any source to any sink using kafkaload, including sources and sinks that are not Kafka. This was made possible at the cost of a breaking change
  • Support table and column remarks extraction on DB2 iSeries databases

CI:

  • Remove support for the GitHub registry
  • Remove Scala 2.11 support

0.3.26

New feature:

  • Support JINJA in autojob
  • Support external views defined using JINJA
  • The file splitter allows splitting a file based on the first column or a position in the line

0.3.25

New feature:

  • Add ACL Graph generation

0.3.24

Bug Fix:

  • Improve GraphViz Generation

0.3.23

Bug Fix:

  • Generate final names in the Graphviz diagram

0.3.22

New feature:

  • Improve CLI doc generation. Extra docs can be added in the docs/merge/cli folder
  • Prepare to deprecate the xml tag in the metadata section

Bug Fix:

  • Code improvement: JDBC is handled as a generic sink
  • Add extra parentheses in BQ queries only for SELECT and WITH statements

0.3.21

New feature:

  • Reduce assembly size
  • Update to sbt 1.7.1
  • Add interactive mode for transform with csv, json and table output formats
  • Improve FS Sink handling

Bug Fix:

  • Support empty env files

0.3.20

Bug Fix:

  • Keep backward compatibility with Scala 2.11

0.3.19

New feature:

  • Handle Mixed XSD / YML ingestion & validation
  • Support JSON / XML descriptions in XLS files
  • Support arrays in XLS files

Bug Fix:

  • Support file system sink options in autojob

0.3.18

New feature:

  • Enhance XLS support for escape characters
  • Support HTTP Stream Source
  • Support XSD Validation
  • Transform jobs now report on the number of affected rows.

Bug Fix:

  • Fix a regression in the return value of an autojob

0.3.17

New feature:

  • Support extra DSV options in the conf file
  • Support any option stored in metadata.options as a reader option
  • Support VSCode Development

0.3.16

New feature:

  • Upgrade Kafka libraries
  • Simplify removal of comments in autojob SQL statements

0.3.15

New feature:

  • Deprecate the use of schema and schemaRefs in domains, and dataset in autojobs. Prefer table and tableRefs

Bug Fix:

  • Fix a regression in merge mode without the timestamp option

0.3.14

Bug Fix:

  • Xls2Yml - Get a correct sheet name based on the schema name field

0.3.13

New feature:

  • Improve XLS support for long names
  • Handle rate limit exceeded by setting COMET_GROUPED_MAX to avoid HTTP 429 on some cloud providers.

0.3.12

Bug Fix: reorder transformations on attributes as follows:

  • rename columns
  • run script fields
  • apply transformations (privacy: "sql: ...")
  • remove ignored fields
  • remove the input filename column

0.3.11

Bug Fix:

  • Handle field relaxation in append mode when the table does not exist

0.3.9 / 0.3.10 / 0.3.11

Bug Fix:

  • Make fields in rejected table optional

0.3.8

New feature:

  • Roll back support for kafka.properties files. It is unnecessary since we already have server-options properties

0.3.7

New feature:

  • Improve XLS support for metadata

0.3.6

New feature:

  • Autoload kafka.properties file from metadata directory.

0.3.5

New feature:

  • Parallel copy of files when loading and archiving
  • Support renaming of domains and schemas in XLS

0.3.3 / 0.3.4

  • Fixing release process

0.3.2

New feature:

  • import step can be limited to one or more domains

0.3.1

New feature:

  • Update Kafka / BigQuery libraries
  • Add new preset env vars
  • Allow renaming of domains and schemas

0.3.0

New feature:

  • Vars in assertions are now substituted at load time
  • Support SQL statement in privacy phase
  • Support parameterized semantic types
  • Add support for generic sink
  • Allow use of custom deserializer on Kafka source

0.2.10

New feature:

  • Drop Java 1.8 prerequisite for compilation
  • Support custom database name for Hive compatible metastore
  • Support custom dataset name in BQ

0.2.9

New feature:

  • Drop support for Spark 2.3.X
  • Allow table renaming on write
  • Any Spark supported input is now allowed
  • Env vars in env.yml files

0.2.8

New feature:

  • Generate DDL from YML files with support for BigQuery, Snowflake, Synapse and Postgres #51 / #56
  • Improve XLS handling: Add support for presql / postsql, tags, primary and foreign keys #59
  • Add optional application of row & column level security
  • Databricks Support
  • Significant reduction in memory consumption
  • Support application.conf file in metadata folder (COMET_METADATA_FS and COMET_ROOT must still be passed as env variables)

Bug Fix:

  • Include env vars and options when running presql in ingestion mode #58

0.2.7

New feature:

  • Support merging dataset with updated schema
  • Support publishing to github packages
  • Reduce number of dependencies
  • Allow Audit sink name configuration from environment variable
  • Dropped support for elasticsearch 6

Bug Fix:

  • Support timestamps as longs in XML & JSON files

0.2.6

New feature:

  • Support XML Schema inference
  • Support the ability to reject the whole file on error
  • Improve error reporting
  • Support engine on task SQL (query pushdown to BigQuery)
  • Support last(n) partition on merge
  • Added a new env var to control partitioning: COMET_SPARK_SQL_SOURCES_PARTITION_OVERWRITE_MODE
  • Added env vars to control BigQuery materialization of pushdown queries: COMET_SPARK_BIGQUERY_MATERIALIZATION_PROJECT and COMET_SPARK_BIGQUERY_MATERIALIZATION_DATASET (defaults to materialization)
  • Added an env var to control the BigQuery read data format: COMET_SPARK_BIGQUERY_READ_DATA_FORMAT (defaults to AVRO)
  • When COMET_MERGE_OPTIMIZE_PARTITION_WRITE is set and dynamic partitioning is active, only write partitions containing new records or records to be deleted or updated for BQ (handled by Spark by default for FS)
  • Add a VALIDATE_ON_LOAD (comet-validate-on-load) property to raise an exception if any domain/job YML file is invalid. Defaults to false
  • Add a custom file extensions property for domain import: default-file-extensions, and the env var COMET_DEFAULT_FILE_EXTENSIONS

Bug Fix:
  • Loading empty files when the schema contains script fields
  • Applying default value for an attribute when value in the input data is null
  • Transformation job with BQ engine fails when no views block is defined
  • XLS2YML: remove non-breaking spaces from Excel file cells to avoid parsing errors
  • Fix merge using timestamp option
  • Json ingestion fails with complex array of objects
  • Remove duplicates from incoming data when existingDF does not exist or is empty
  • Parse Sink options correctly
  • Handle extreme cases where audit lock raise an exception on creation
  • Handle files without extension in the landing zone
  • Store audit log with batch priority on BigQuery

0.2.4 / 0.2.5

Bug Fix:

0.2.3

New feature:

  • Add ability to ignore some fields (only top level fields supported)
  • BREAKING CHANGE: Handle multiple schemas during extraction. Update your extract configurations before migrating to this version.
  • Improve InferSchemaJob
  • Include primary keys & foreign keys in JDBC2Yml

Bug Fix:

  • Handle rename in JSON / XML files
  • Handle timestamp fields in JSON / XML files
  • Do not partition rejected files
  • Add COMET_CSV_OUTPUT_EXT env var to customize filename extension after ingestion when CSV is active.

0.2.2

New feature:

  • Use the same variable for Lock timeout
  • Improve logging when locking file fails
  • The file sink, while still the default, is now controlled by the sink tag in the YAML file. The sink-to-file option is kept for testing purposes only
  • Allow a custom topic name for comet_offsets
  • Add the ability to coalesce(int) to the Kafka offloading feature
  • Attributes may now be declared as primary and/or foreign keys, even though no check is made
  • Export schemas and relations (PK/FK) as dot (Graphviz) files
  • Support saving comet offsets to the filesystem instead of Kafka using the new setting comet-offsets-mode = "STREAM"

Bug Fix:

  • Invalid YAML files now produce an error at startup instead of a warning

0.2.1

  • Version skipped

0.2.0

New feature:

  • Export all tables in JDBC2Yml generation
  • Include table & column names when encountering an unknown column type in a JDBC source schema
  • Better logging of forced conversions in JDBC2Yml
  • Compute Hive statistics on tables & partitions
  • DataGrip support with implementation of substitution for ${} in addition to {{}}
  • Improve logging
  • Add column types during database extraction
  • The name attribute inside a job file should reflect the filename. This attribute will soon be deprecated
  • Allow templating on jobs. Useful to generate Airflow / Oozie DAGs from job.sl.yml/job.sql code
  • Switch from readthedocs to docusaurus
  • Add local and bigquery samples
  • Custom var pattern through sql-pattern-parameter in reference.conf

Bug Fix:

  • Avoid computing statistics on struct fields
  • Make database-extractor optional in application.conf

0.1.36

New feature:

  • Parameterize with Domain & Schema metadata in JDBC2Yml generation

Bug Fix:

0.1.35

New feature:

  • Auto-compile with Scala 2.11 for Spark 2 and Scala 2.12 for Spark 3. [457]
  • Performance optimization when using Privacy Rules. [459]
  • Rejected area and audit logs support can have their own write format (default-rejected-write-format and default-audit-write-format properties)
  • Deep JSON & XML files are now validated against the schema
  • Privacy is applied on deep JSON & XML inputs [461]
  • Domains & jobs may be defined in subdirectories, allowing better organization of metadata files [462]
  • Substitute variables through CLI & env files in views, assertions, presql, main sql and post sql requests [462]
  • Semantic type Date supports dates with MMM month representation [463]
  • Split reference.conf into multiple files. [460]
  • Support kafka Source & Sink through Spark Streaming [460]
  • Add an alternative way of applying privacy on XML files. [466]
  • Generate Excel files from YML files
  • Generate YML file from Database Schema

Bug Fix:

  • Make Jackson lib provided. [457]
  • Support Spark 2.3 by not using DataFrame.isEmpty [457]
  • comet_input_file_name missing when ingesting Position files [466]
  • Apply postsql queries on the accepted DataFrame [466]
  • Check that scripted fields are defined at the end of the schema in the YML file [#384]

0.1.34

New feature:

  • Allow sink options to be defined in YML instead of Spark Submit. [#450] [#454]

Bug Fix:

  • Parse dates with yyyyMM format correctly [#451]
  • Fix error when saving a csv with an empty DataFrame [#451]
  • Keep column description in BQ tables when using Overwrite mode [#453]

0.1.29

Bug Fix:

  • Correctly support merge mode in BQ [#449]
  • Fix for sinking XML to BQ [#448]

0.1.27

New feature:

  • Kafka Support improved

0.1.26

New feature:

  • Optionally sink to file using property sink-to-file = ${?COMET_SINK_TO_FILE}

Bug Fix:

  • Sink name was ignored and always considered as None

0.1.23

New feature:

  • YML files are now renamed with the suffix .sl.yml
  • Comet Schema is now published on SchemaStore. This allows Intellisense in VSCode & Intellij
  • Assertions may now be executed as part of the Load and transform processes
  • Shared Assertions UDF may be defined and stored in COMET_ROOT/metadata/assertions
  • Views may also be defined and shared in COMET_ROOT/metadata/views
  • Views are accessible in the load and transform processes
  • Domains may now be prefixed by the "load" tag. Defining a domain without the "load" tag is now deprecated
  • AutoJobs may now be prefixed by the "transform" tag. Defining an autojob without the "transform" tag is now deprecated

Breaking Changes:

  • N.A.

Bug Fix:

  • Use Spark Application Id for JobID information to make auditing easier

0.1.22

New feature:

  • Expose a REST API to generate a Yaml Schema from an Excel file. [#387]
  • Support ingesting multiline complex JSON. [#391]
  • Support nested fields when generating schema for BigQuery tables. [#391]
  • Enhancements on Spark to BigQuery schema. [#395]
  • Support merging a part of a BigQuery Table, rather than all the Table. [#397]
  • Enable setting BigQuery intermediate format when sinking using ${?COMET_INTERMEDIATE_BQ_FORMAT}. [#398] [#400]
  • Enhancement on Merging mode: do not depend on parquet files when using BigQuery tables.

Dependencies:

  • Update sbt to 1.4.4 [#385]
  • Update scopt to 4.0.0 [#390]