Releases: dagster-io/dagster

1.8.7 (core) / 0.24.7 (libraries)

12 Sep 20:29

New

  • The AssetSpec constructor now raises an error if an invalid group name is provided, instead of an error being raised when constructing the Definitions object.
  • dagster/relation_identifier metadata is now automatically attached to assets which are stored using a DbIOManager.
  • [ui] Streamlined the code location list view.
  • [ui] The “group by” selection on the Timeline Overview page is now part of the query parameters, meaning it will be retained when linked to directly or when navigating between pages.
  • [dagster-dbt] When instantiating DbtCliResource, the project_dir argument will now override the DBT_PROJECT_DIR environment variable if it exists in the local environment (thanks, @marijncv!).
  • [dagster-embedded-elt] dlt assets now generate rows_loaded metadata (thanks, @kristianandre!).
  • Added support for pydantic version 1.9.0.
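
The precedence described for DbtCliResource above (an explicit project_dir wins over DBT_PROJECT_DIR) can be sketched in plain Python. This is only an illustration of the rule, not the dagster-dbt implementation; resolve_project_dir is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def resolve_project_dir(
    project_dir: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper: the explicit argument wins over DBT_PROJECT_DIR."""
    if project_dir is not None:
        return project_dir
    # Fall back to the environment only when no argument was passed.
    return env.get("DBT_PROJECT_DIR")
```

Passing the environment as a parameter keeps the rule easy to exercise without mutating os.environ.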

Bugfixes

  • Fixed a bug where setting asset_selection=[] on RunRequest objects yielded from sensors using asset_selection would select all assets instead of none.
  • Fixed bug where the tick status filter for batch-fetched graphql sensors was not being respected.
  • [examples] Fixed missing assets in assets_dbt_python example.
  • [dagster-airbyte] Updated the op names generated for Airbyte assets to include the full connection ID, avoiding name collisions.
  • [dagster-dbt] Fixed issue causing dagster-dbt to be unable to load dbt projects where the adapter did not have a database field set (thanks, @dargmuesli!)
  • [dagster-dbt] Removed a warning about not being able to load the dbt.adapters.duckdb module when loading dbt assets without that package installed.
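
The asset_selection=[] fix above concerns the difference between "no selection given" and "an empty selection". A common way this class of bug arises is a truthiness check that treats [] like None; the sketch below illustrates that pattern in general and is not a claim about the actual cause in Dagster. The asset names and both function names are hypothetical.

```python
from typing import List, Optional, Sequence

ALL_ASSETS = ["orders", "customers", "revenue"]  # hypothetical asset names


def resolve_selection_buggy(selection: Optional[Sequence[str]]) -> List[str]:
    # Truthiness check: [] is falsy, so an empty selection falls through to "all".
    return list(selection) if selection else list(ALL_ASSETS)


def resolve_selection_fixed(selection: Optional[Sequence[str]]) -> List[str]:
    # Identity check: only None means "no selection was given".
    return list(ALL_ASSETS) if selection is None else list(selection)
```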

Documentation

  • Fixed typo on the automation concepts page (thanks, @oedokumaci!)

Dagster Plus

  • You may now wipe specific asset partitions directly from the execution context in user code by calling DagsterInstance.wipe_asset_partitions.
  • Dagster+ users with a "Viewer" role can now create private catalog views.
  • Fixed an issue where the default IOManager used by Dagster+ Serverless did not respect setting allow_missing_partitions as metadata on a downstream asset.

1.8.6 (core) / 0.24.6 (libraries)

10 Sep 17:38

Bugfixes

  • Fixed an issue where runs in Dagster+ Serverless that materialized partitioned assets would sometimes fail with an object has no attribute '_base_path' error.
  • [dagster-graphql] Fixed an issue where the statuses filter argument to the sensorsOrError GraphQL field was sometimes ignored when querying GraphQL for multiple sensors at the same time.

1.8.5 (core) / 0.24.5 (libraries)

06 Sep 18:23

New

  • Updated multi-asset sensor definition to be less likely to timeout queries against the asset history storage.
  • Consolidated the CapturedLogManager and ComputeLogManager APIs into a single base class.
  • [ui] Added an option under user settings to clear client side indexeddb caches as an escape hatch for caching related bugs.
  • [dagster-aws, dagster-pipes] Added a new PipesECSClient to allow Dagster to interface with ECS tasks.
  • [dagster-dbt] Increased the default timeout when terminating a run that is running a dbt subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds.
  • [dagster-sdf] Increased the default timeout when terminating a run that is running an sdf subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds.
  • [dagster-sdf] Added support for caching and asset selection (Thanks, akbog!)
  • [dagster-dlt] Added support for AutomationCondition using DagsterDltTranslator.get_automation_condition() (Thanks, aksestok!)
  • [dagster-k8s] Added support for setting dagsterDaemon.runRetries.retryOnAssetOrOpFailure to False in the Dagster Helm chart to prevent op retries and run retries from simultaneously firing on the same failure.
  • [dagster-wandb] Removed usage of deprecated recursive parameter (Thanks, chrishiste!)
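
The longer termination timeout for dbt and sdf subprocesses noted above follows a familiar pattern: request a clean shutdown, wait a bounded time, then force-kill. A minimal standard-library sketch of that pattern, assuming a 25 second default to match the note; terminate_gracefully is a hypothetical name and this is not the Dagster code.

```python
import subprocess
import sys


def terminate_gracefully(proc: subprocess.Popen, timeout: float = 25.0) -> int:
    """Ask the process to exit, wait up to `timeout` seconds, then force-kill."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a long-running subprocess.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", terminate_gracefully(child, timeout=5.0))
```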

Bugfixes

  • [ui] Fixed a bug where in-progress runs from a backfill could not be terminated from the backfill UI.
  • [ui] Fixed a bug that caused an "Asset must be part of at least one job" error when clicking on an external asset in the asset graph UI
  • Fixed an issue where viewing run logs with the latest 5.0 release of the watchdog package raised an exception.
  • [ui] Fixed issue causing the “filter to group” action in the lineage graph to have no effect.
  • [ui] Fixed case sensitivity when searching for partitions in the launchpad.
  • [ui] Fixed a bug which would redirect to the events tab for an asset if you loaded the partitions tab directly.
  • [ui] Fixed issue causing runs to get skipped when paging through the runs list (Thanks, @HynekBlaha!)
  • [ui] Fixed a bug where the asset catalog list view for a particular group would show all assets.
  • [dagster-dbt] Fixed a bug where empty newlines in raw dbt logs were not being handled correctly.
  • [dagster-k8s, dagster-celery-k8s] Correctly set dagster/image label when image is provided from user_defined_k8s_config. (Thanks, @HynekBlaha!)
  • [dagster-duckdb] Fixed an issue for DuckDB versions older than 1.0.0 where an unsupported configuration option, custom_user_agent, was provided by default
  • [dagster-k8s] Fixed an issue where Kubernetes Pipes failed to create a pod if the op name contained capital or non-alphanumeric characters.
  • [dagster-embedded-elt] Fixed an issue where dbt assets downstream of Sling were skipped
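
The Kubernetes Pipes fix above relates to pod naming rules, which reject uppercase letters and most punctuation. One way to coerce an arbitrary op name into a valid name is sketched below. It targets the RFC 1123 label form (lowercase alphanumerics and hyphens, at most 63 characters), which is stricter than what pod names allow. sanitize_k8s_name and the "op" fallback are hypothetical, not what dagster-k8s does.

```python
import re


def sanitize_k8s_name(name: str, max_len: int = 63) -> str:
    """Hypothetical helper: coerce a string into an RFC 1123 label-style name."""
    lowered = name.lower()
    # Replace every run of characters outside [a-z0-9] with a single hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    # Truncate, then make sure the name starts and ends with an alphanumeric.
    trimmed = hyphenated[:max_len].strip("-")
    # Assumed fallback for names with no usable characters at all.
    return trimmed or "op"
```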

Deprecations

  • [dagster-aws] Direct AWS API arguments in PipesGlueClient.run have been deprecated and will be removed in 1.9.0. The new params argument should be used instead.

Dagster Plus

  • Fixed a bug that caused an error when loading the launchpad for a partition, when using Dagster+ with an agent with version below 1.8.2.
  • Fixed an issue where terminating a Dagster+ Serverless run wouldn’t forward the termination signal to the job to allow it to cleanly terminate.

1.8.4 (core) / 0.24.4 (libraries)

30 Aug 18:38

Bugfixes

  • Fixed an issue where viewing run logs with the latest 5.0 release of the watchdog package raised an exception.
  • Fixed a bug that caused an "Asset must be part of at least one job" error when clicking on an external asset in the asset graph UI

Dagster Plus

  • The default io_manager on Serverless now supports the allow_missing_partitions configuration option.
  • Fixed a bug that caused an error when loading the launchpad for a partition, when using Dagster+ with an agent with version below 1.8.2.

1.8.3 (core) / 0.24.3 (libraries)

23 Aug 16:16

New

  • When different assets within a code location have different PartitionsDefinitions, there will no longer be an implicit asset job __ASSET_JOB_... for each PartitionsDefinition; there will just be one with all the assets. This reduces the time it takes to load code locations with assets with many different PartitionsDefinitions.

1.8.2 (core) / 0.24.2 (libraries)

22 Aug 19:04

New

  • [ui] Improved performance of the Automation history view for partitioned assets
  • [ui] You can now delete dynamic partitions for an asset from the UI.
  • [dagster-sdf] Added support for quoted table identifiers (Thanks, @akbog!)
  • [dagster-openai] Add additional configuration options for the OpenAIResource (Thanks, @chasleslr!)
  • [dagster-fivetran] Fivetran assets now have relation identifier metadata.

Bugfixes

  • [ui] Fixed a collection of broken links pointing to renamed Declarative Automation pages.
  • [dagster-dbt] Fixed issue preventing usage of MultiPartitionMapping with @dbt_assets (Thanks, @arookieds!)
  • [dagster-azure] Fixed issue that would cause an error when configuring an AzureBlobComputeLogManager without a secret_key (Thanks, @ion-elgreco and @HynekBlaha!)

Documentation

  • Added API docs for AutomationCondition and associated static constructors.
  • [dagster-deltalake] Corrected some typos in the integration reference (Thanks, @dargmuesli!)
  • [dagster-aws] Added API docs for the new PipesCloudWatchMessageReader

1.8.1 (core) / 0.24.1 (libraries)

15 Aug 21:42

New

  • If the sensor daemon fails while submitting runs, it will now checkpoint its progress and attempt to submit the remaining runs on the next evaluation.
  • build_op_context and build_asset_context now accept a run_tags argument.
  • Nested partially configured resources can now be used outside of Definitions.
  • [ui] Replaced GraphQL Explorer with GraphiQL.
  • [ui] The run timeline can now be grouped by job or by automation.
  • [ui] For users in the experimental navigation flag, schedules and sensors are now in a single merged automations table.
  • [ui] Logs can now be filtered by metadata keys and values.
  • [ui] Logs for RUN_CANCELED events now display relevant error messages.
  • [dagster-aws] The new PipesCloudWatchMessageReader can consume logs from CloudWatch as pipes messages.
  • [dagster-aws] Glue jobs launched via pipes can be automatically canceled if Dagster receives a termination signal.
  • [dagster-azure] AzureBlobComputeLogManager now supports service principals, thanks @ion-elgreco!
  • [dagster-databricks] dagster-databricks now supports databricks-sdk<=0.17.0.
  • [dagster-datahub] dagster-datahub now allows pydantic versions below 3.0.0, thanks @kevin-longe-unmind!
  • [dagster-dbt] The DagsterDbtTranslator class now supports modifying the AutomationCondition for dbt models by overriding get_automation_condition.
  • [dagster-pandera] dagster-pandera now supports polars.
  • [dagster-sdf] Table and columns tests can now be used as asset checks.
  • [dagster-embedded-elt] Column metadata and lineage can be fetched on Sling assets by chaining the new replicate(...).fetch_column_metadata() method.
  • [dagster-embedded-elt] dlt resource docstrings will now be used to populate asset descriptions, by default.
  • [dagster-embedded-elt] dlt assets now generate column metadata.
  • [dagster-embedded-elt] dlt transformers now refer to the base resource as upstream asset.
  • [dagster-openai] OpenAIResource now supports organization, project and base_url for configuring the OpenAI client, thanks @chasleslr!
  • [dagster-pandas][dagster-pandera][dagster-wandb] These libraries no longer pin numpy<2, thanks @judahrand!
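
The sensor daemon change above (checkpointing progress when run submission fails partway) can be pictured as recording each run as soon as it is submitted, so a retry skips what already went through. The sketch below shows the general idea under that assumption. submit_with_checkpoint and its arguments are hypothetical and do not mirror the daemon's internals.

```python
from typing import Callable, Iterable, List, Set


def submit_with_checkpoint(
    run_keys: Iterable[str],
    submit: Callable[[str], None],
    submitted: Set[str],
) -> List[str]:
    """Submit each run key not already recorded; record each success immediately."""
    newly_submitted: List[str] = []
    for key in run_keys:
        if key in submitted:
            continue  # already submitted in an earlier, interrupted attempt
        submit(key)  # may raise; keys recorded so far stay in `submitted`
        submitted.add(key)
        newly_submitted.append(key)
    return newly_submitted
```

Because `submitted` is updated after every successful call, a failure on one key leaves the earlier keys recorded, and the next evaluation resumes with the remaining ones.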

Bugfixes

  • Fixed a bug where runs for job backfills using backfill policies that materialize multiple partitions in a single run would be launched multiple times.
  • Fixed an issue where runs would sometimes move into a FAILURE state rather than a CANCELED state if an error occurred after a run termination request was started.
  • [ui] Fixed a bug where an incorrect dialog was shown when canceling a backfill.
  • [ui] Fixed the asset page header breadcrumbs for assets with very long key path elements.
  • [ui] Fixed the run timeline time markers for users in timezones that have off-hour offsets.
  • [ui] Fixed bar chart tooltips to use correct timezone for timestamp display.
  • [ui] Fixed an issue introduced in the 1.8.0 release where some jobs created from graph-backed assets were missing the “View as Asset Graph” toggle in the Dagster UI.

Breaking Changes

  • [dagster-airbyte] AirbyteCloudResource now supports client_id and client_secret for authentication - the api_key approach is no longer supported. This is motivated by the deprecation of portal.airbyte.com on August 15, 2024.

Deprecations

  • [dagster-databricks] Removed deprecated authentication clients provided by databricks-cli and databricks_api
  • [dagster-embedded-elt] Removed deprecated Sling resources SlingSourceConnection, SlingTargetConnection
  • [dagster-embedded-elt] Removed deprecated Sling methods build_sling_assets, and sync

Documentation

  • The Integrating Snowflake & dbt with Dagster+ Insights guide no longer erroneously references BigQuery, thanks @dnxie12!

1.8.0 (core) / 0.24.0 (libraries)

08 Aug 18:17

Major changes since 1.7.0 (core) / 0.23.0 (libraries)

Core definition APIs

  • You can now pass AssetSpec objects to the assets argument of Definitions, to let Dagster know about assets without associated materialization functions. This replaces the experimental external_assets_from_specs API, as well as SourceAssets, which are now deprecated. Unlike SourceAssets, AssetSpecs can be used for non-materializable assets with dependencies on Dagster assets, such as BI dashboards that live downstream of warehouse tables that are orchestrated by Dagster. [docs].
  • [Experimental] You can now merge Definitions objects together into a single larger Definitions object, using the new Definitions.merge API (doc). This makes it easier to structure large Dagster projects, as you can construct a Definitions object for each sub-domain and then merge them together at the top level.

Partitions and backfills

  • BackfillPolicys assigned to assets are now respected for backfills launched from jobs that target those assets.
  • You can now wipe materializations for individual asset partitions.

Automation

  • [Experimental] You can now add AutomationConditions to your assets to have them automatically executed in response to specific conditions (docs). These serve as a drop-in replacement and improvement over the AutoMaterializePolicy system, which is being marked as deprecated.
  • [Experimental] Sensors and schedules can now directly target assets, via the new target parameter, instead of needing to construct a job.
  • [Experimental] The Timeline page can now be grouped by job or automation. When grouped by automation, all runs launched by a sensor responsible for evaluating automation conditions will get bucketed to that sensor in the timeline instead of the "Ad-hoc materializations" row. Enable this by opting in to the Experimental navigation feature flag in user settings.

Catalog

  • The Asset Details page now prominently displays row count and relation identifier (table name, schema, database), when corresponding asset metadata values are provided. For more information, see the metadata and tags docs.
  • Introduced code reference metadata which can be used to open local files in your editor, or files in source control in your browser. Dagster can automatically attach code references to your assets’ Python source. For more information, see the docs.

Data quality and reliability

  • [Experimental] Metadata bound checks – The new build_metadata_bounds_checks API [doc] enables easily defining asset checks that fail if a numeric asset metadata value falls outside given bounds.
  • [Experimental] Freshness checks from dbt config - Freshness checks can now be set on dbt assets, straight from dbt. Check out the API docs for build_freshness_checks_from_dbt_assets for more.
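
The metadata bounds checks described above fail when a numeric metadata value falls outside given bounds. The core comparison can be sketched in plain Python as follows. The function name, the inclusive bounds, and the choice to fail on missing or non-numeric values are all assumptions for illustration, not the behavior of build_metadata_bounds_checks.

```python
from typing import Any, Mapping, Optional


def check_metadata_bounds(
    metadata: Mapping[str, Any],
    key: str,
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> bool:
    """Return True if metadata[key] is numeric and inside the inclusive bounds."""
    value = metadata.get(key)
    # Assumption: a missing or non-numeric value fails the check.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if min_value is not None and value < min_value:
        return False
    if max_value is not None and value > max_value:
        return False
    return True
```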

Integrations

  • Dagster Pipes (PipesSubprocessClient) and its integrations with Lambda (PipesLambdaClient), Kubernetes (PipesK8sClient), and Databricks (PipesDatabricksClient) are no longer experimental.
  • The new DbtProject class (docs) makes it simpler to define dbt assets that can be constructed in both development and production. DbtProject.prepare_if_dev() eliminates boilerplate for local development, and the dagster-dbt project prepare-and-package CLI can help pull deps and generate the manifest at build time.
  • [Experimental] The dagster-looker package can be used to define a set of Dagster assets from a Looker project that is defined in LookML and is backed by git. See the GitHub discussion for more details.

Dagster Plus

  • Catalog views — In Dagster+, selections into the catalog can now be saved and shared across an organization as catalog views. Catalog views have a name and description, and can be applied to scope the catalog, asset health, and global asset lineage pages against the view’s saved selection.
  • Code location history — Dagster+ now stores a history of code location deploys, including the ability to revert to a previously deployed configuration.

Changes since 1.7.16 (core) / 0.23.16 (libraries)

New

  • The target of both schedules and sensors can now be set using an experimental target parameter that accepts an AssetSelection or list of assets. Any assets passed this way will also be included automatically in the assets list of the containing Definitions object.

  • ScheduleDefinition and SensorDefinition now have a target argument that can accept an AssetSelection.

  • You can now wipe materializations for individual asset partitions.

  • AssetSpec now has a partitions_def attribute. All the AssetSpecs provided to a @multi_asset must have the same partitions_def.

  • The assets argument on materialize now accepts AssetSpecs.

  • The assets argument on Definitions now accepts AssetSpecs.

  • The new merge method on Definitions enables combining multiple Definitions objects into a single larger Definitions object with their combined contents.

  • Runs requested through the Declarative Automation system now have a dagster/from_automation_condition: true tag applied to them.

  • Changed the run tags query to be more performant. Thanks @egordm!

  • Dagster Pipes and its integrations with Lambda, Kubernetes, and Databricks are no longer experimental.

  • The Definitions constructor will no longer raise errors when the provided definitions aren’t mutually resolvable (e.g. when there are conflicting definitions with the same name or unsatisfied resource dependencies). These errors will still be raised at code location load time. The new Definitions.validate_loadable static method also allows performing the validation steps that used to occur in the constructor.

  • AssetsDefinition objects provided to a Definitions object will now be deduped by reference equality. That is, the following will now work:

    from dagster import asset, Definitions
    
    @asset
    def my_asset(): ...
    
    defs = Definitions(assets=[my_asset, my_asset]) # Deduped into just one AssetsDefinition.
  • [dagster-embedded-elt] Added translator options for the dlt integration to override auto materialize policy, group name, owners, and tags.

  • [dagster-sdf] Introducing the dagster-sdf integration for data modeling and transformations powered by sdf.

  • [dagster-dbt] Added a new with_insights() method which can be used to more easily attach Dagster+ Insights metrics to dbt executions: dbt.cli(...).stream().with_insights()
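
"Deduped by reference equality", as mentioned above for AssetsDefinition objects, means the same object passed twice collapses to one entry, while two distinct objects that merely compare equal are both kept. A minimal sketch of that idea in plain Python; dedupe_by_identity is a hypothetical helper, not Dagster's implementation.

```python
from typing import List, Set, TypeVar

T = TypeVar("T")


def dedupe_by_identity(items: List[T]) -> List[T]:
    """Keep the first occurrence of each distinct object, preserving order."""
    seen_ids: Set[int] = set()
    unique: List[T] = []
    for item in items:
        # id() identifies the object itself, so equal-but-distinct items survive.
        if id(item) not in seen_ids:
            seen_ids.add(id(item))
            unique.append(item)
    return unique
```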

Bugfixes

  • Dagster now raises an error when an op yields an output corresponding to an unselected asset.
  • Fixed a bug that caused downstream ops within a graph-backed asset to be skipped when they were downstream of assets within the graph-backed assets that aren’t part of the selection for the current run.
  • Fixed a bug where code references did not work properly for self-hosted GitLab instances. Thanks @cooperellidge!
  • [ui] When engine events with errors appear in run logs, their metadata entries are now rendered correctly.
  • [ui] The asset catalog greeting now uses your first name from your identity provider.
  • [ui] The create alert modal now links to the alerting documentation, and links to the documentation have been updated.
  • [ui] Fixed an issue introduced in the 1.7.13 release where some asset jobs were only displaying their ops in the Dagster UI instead of their assets.
  • Fixed an issue where terminating a run while it was using the Snowflake python connector would sometimes move it into a FAILURE state instead of a CANCELED state.
  • Fixed an issue where backfills would sometimes move into a FAILURE state instead of a CANCELED state when the backfill was canceled.

Breaking Changes

  • The experimental and deprecated build_asset_with_blocking_check has been removed. Use the blocking argument on @asset_check instead.
  • Users with mypy and pydantic 1 may now experience a “metaclass conflict” error when using Config. Previously this would occur when using pydantic 2.
  • AutoMaterializeSensorDefinition has been renamed AutomationConditionSensorDefinition.
  • The deprecated methods of the ComputeLogManager have been removed. Custom ComputeLogManager implementations must also implement the CapturedLogManager interface. This will not affect any of the core implementations available in the core dagster package or the library packages.
  • By default, an AutomationConditionSensorDefinition with the name “default_automation_condition_sensor” will be constructed for each code location, and will handle evaluating and launching runs for all AutomationConditions and AutoMaterializePolicies within that code location. You can restore the previous behavior by setting:
    auto_materialize:
      use_sensors: False
    in your dagster.yaml file.
  • [dagster-dbt] Support for dbt-core==1.6.* has been removed because the version is now end-of-life.
  • [dagster-dbt] The following deprecated APIs have been removed:
    • KeyPrefixDagsterDbtTranslator has been removed. To modify the asset keys for a set of dbt assets, implement DagsterDbtTranslator.get_asset_key() instead.
    • Support for setting ...

1.7.16 (core) / 0.23.16 (libraries)

02 Aug 14:37

Experimental

  • [pipes] PipesGlueClient, an AWS Glue pipes client has been added to dagster_aws.

1.7.15 (core) / 0.23.15 (libraries)

25 Jul 20:18

New

  • [dagster-celery-k8s] Added a per_step_k8s_config configuration option to the celery_k8s_job_executor, allowing the k8s configuration of individual steps to be configured at run launch time. Thanks @alekseik1!
  • [dagster-dbt] Deprecated the log_column_level_metadata macro in favor of the new with_column_metadata API.
  • [dagster-airbyte] Deprecated load_assets_from_airbyte_project as the Octavia CLI has been deprecated.

Bugfixes

  • [ui] Fix global search to find matches on very long strings.
  • Fixed an issue introduced in the 1.7.14 release where multi-asset sensors would sometimes raise an error about fetching too many event records.
  • Fixed an issue introduced in 1.7.13 where type-checkers interpreted the return type of RunRequest(...) as None.
  • [dagster-aws] Fixed an issue where the EcsRunLauncher would sometimes fail to launch runs when the include_sidecars option was set to True.
  • [dagster-dbt] Fixed an issue where errors would not propagate through deferred metadata fetches.

Dagster Plus

  • On June 20, 2024, AWS changed the AWS CloudMap CreateService API to allow resource-level permissions. The Dagster+ ECS Agent uses this API to launch code locations. We’ve updated the Dagster+ ECS Agent CloudFormation template to accommodate this change for new users. Existing users have until October 14, 2024 to add the new permissions and should have already received similar communication directly from AWS.
  • Fixed a bug with BigQuery cost tracking in Dagster+ Insights, where some runs would fail if there were null values for either total_bytes_billed or total_slot_ms in the BigQuery INFORMATION_SCHEMA.JOBS table.
  • Fixed an issue where code locations that failed to load with extremely large error messages or stack traces would sometimes cause errors with agent heartbeats until the code location was redeployed.