Releases: evidentlyai/evidently

Metrics & Metric Presets

24 Oct 10:39
e311be7
Pre-release

Breaking Changes:

All Test Presets were renamed: the TestPreset suffix was appended to the original names:

  • NoTargetPerformance -> NoTargetPerformanceTestPreset
  • DataQuality -> DataQualityTestPreset
  • DataStability -> DataStabilityTestPreset
  • DataDrift -> DataDriftTestPreset
  • Regression -> RegressionTestPreset
  • MulticlassClassification -> MulticlassClassificationTestPreset
  • BinaryClassificationTopK -> BinaryClassificationTopKTestPreset
  • BinaryClassification -> BinaryClassificationTestPreset
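
The rename is purely mechanical: the TestPreset suffix is appended. For code migrating across this release, the mapping can be captured as a plain dictionary (a hypothetical helper, not part of the library):

```python
# Old -> new Test Preset names from this release (copied from the list above).
OLD_TO_NEW = {
    "NoTargetPerformance": "NoTargetPerformanceTestPreset",
    "DataQuality": "DataQualityTestPreset",
    "DataStability": "DataStabilityTestPreset",
    "DataDrift": "DataDriftTestPreset",
    "Regression": "RegressionTestPreset",
    "MulticlassClassification": "MulticlassClassificationTestPreset",
    "BinaryClassificationTopK": "BinaryClassificationTopKTestPreset",
    "BinaryClassification": "BinaryClassificationTestPreset",
}

def new_preset_name(old_name: str) -> str:
    """Map a pre-release preset name to its renamed equivalent (hypothetical helper)."""
    return OLD_TO_NEW.get(old_name, old_name)
```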

Updates:

Added DataDrift metrics:

  • DatasetDriftMetric
  • DataDriftTable
  • ColumnValuePlot
  • TargetByFeaturesTable

Added DataQuality metrics:

  • ColumnDistributionMetric
  • ColumnQuantileMetric
  • ColumnCorrelationsMetric
  • ColumnValueListMetric
  • ColumnValueRangeMetric
  • DatasetCorrelationsMetric

Added DataIntegrity metrics:

  • ColumnSummaryMetric
  • ColumnMissingValuesMetric
  • DatasetSummaryMetric
  • DatasetMissingValuesMetric

Added Classification metrics:

  • ClassificationQuality
  • ClassificationClassBalance
  • ClassificationConfusionMatrix
  • ClassificationQualityByClass
  • ClassificationClassSeparationPlot
  • ProbabilityDistribution
  • ClassificationRocCurve
  • ClassificationPRCurve
  • ClassificationPRTable
  • ClassificationQualityByFeatureTable

Added Regression metrics:

  • RegressionQualityMetric
  • RegressionPredictedVsActualScatter
  • RegressionPredictedVsActualPlot
  • RegressionErrorPlot
  • RegressionAbsPercentageErrorPlot
  • RegressionErrorDistribution
  • RegressionErrorNormality
  • RegressionTopErrorMetric
  • RegressionErrorBiasTable

Added MetricPresets:

  • DataDriftPreset
  • DataQualityPreset
  • RegressionPreset
  • ClassificationPreset

Added New Statistical Tests:

  • Anderson-Darling test for numerical features
  • Cramér-von Mises test for numerical features
  • Hellinger distance test for numerical and categorical features
  • Mann-Whitney U-rank test for numerical features
  • Cressie-Read power divergence test for categorical features
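
Of these, the Hellinger distance is the easiest to state. A minimal pure-Python sketch for two discrete distributions (an illustration of the formula only; the library's actual implementation, including how numerical features are binned, may differ):

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions, given as
    equal-length sequences of probabilities that each sum to 1.
    Ranges from 0 (identical) to 1 (disjoint support)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

hellinger([0.5, 0.5], [0.5, 0.5])  # → 0.0
hellinger([1.0, 0.0], [0.0, 1.0])  # → 1.0
```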

Fixes:
#334
#353
#361
#367

Metrics Generator & Code Checks

30 Sep 18:54
Pre-release

Updates:

  • Replaced BaseWidgetInfo with helpers: #326
  • Added metrics generator for column-based metrics: #323
  • Added black and isort: #322

Report Concept Draft

07 Sep 09:04
69e98d0
Pre-release

Updates:

  • Introduced Report, an object that unites Dashboard and Profile functionality
  • Introduced MetricPreset, an object that replaces Tab and ProfileSection
  • Implemented the following MetricPresets: DataDrift, DataQuality (limited content), CatTargetDrift, NumTargetDrift, RegressionPerformance, ClassificationPerformance

Automatic Tests Generation

16 Aug 19:52
Pre-release

Updates:

  • Implemented the function generate_column_tests() to automatically generate similar tests for many columns
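
The idea behind the generator can be sketched in a few lines (hypothetical stand-ins; the real generate_column_tests signature may differ):

```python
def generate_column_tests(test_class, columns):
    """Instantiate the same test for every column in the list (sketch of the idea)."""
    return [test_class(column_name=c) for c in columns]

class TestColumnShareOfNulls:  # stand-in for the real test class
    def __init__(self, column_name):
        self.column_name = column_name

tests = generate_column_tests(TestColumnShareOfNulls, ["age", "income", "city"])
```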

Dataset Null-related tests

  • Implemented TestNumberOfNulls to replace TestNumberOfNANs and TestNumberOfNullValues
  • Implemented TestShareOfNulls
  • Implemented TestShareOfColumnsWithNulls
  • Implemented TestShareOfRowsWithNulls
  • Implemented TestNumberOfDifferentNulls

Column Null-related tests

  • Implemented TestColumnNumberOfNulls to replace TestColumnNumberOfNullValues
  • Implemented TestColumnShareOfNulls to replace TestColumnNANShare

Fixes:

  • Fixed metric duplication to reduce the number of calculations when building TestSuites (identical metrics within one test suite are no longer recalculated multiple times)
  • Implemented NaN filtering for all dashboards so that each column is filtered separately
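
The deduplication fix can be illustrated with a tiny result cache, a sketch of the idea rather than the library's actual mechanism: identical metrics share a key and are computed once.

```python
class MetricResultCache:
    """Compute each metric at most once per test-suite run (illustrative sketch)."""

    def __init__(self):
        self._results = {}

    def get_or_compute(self, metric_key, compute):
        if metric_key not in self._results:
            self._results[metric_key] = compute()
        return self._results[metric_key]

calls = []
cache = MetricResultCache()

def expensive_metric():
    calls.append(1)  # track how many times the metric actually runs
    return {"share_of_nulls": 0.1}

first = cache.get_or_compute("nulls:age", expensive_metric)
second = cache.get_or_compute("nulls:age", expensive_metric)  # served from cache
```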

Binary Classification: TPR, TNR, FPR, FNR

05 Aug 15:42

Updates:

  • Added TPR, TNR, FPR, FNR tests for Binary Classification Model Performance
  • Renamed the status "No Group" to "Dataset-level tests" in the TestSuites filtering menu

Fixes:

  • #207
  • #265
  • #256
  • Fixed unit tests for different versions of Python and pandas

Tests Grouping in the UI

29 Jul 16:02
Pre-release

Updates:

  1. Updated the UI to let users group tests by the following properties:
  • All tests
  • By status
  • By feature
  • By test type
  • By test group
  2. New Tests:
  • Added tests for binary probabilistic classification models
  • Added tests for multiclass classification models
  • Added tests for multiclass probabilistic classification models
    The full list of tests will be available in the docs.
  3. New Test Presets:
  • Regression
  • MulticlassClassification
  • BinaryClassificationTopK
  • BinaryClassification
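
The grouping itself is a straightforward partition of test results by a chosen property; a minimal sketch (the result records here are hypothetical, not evidently's data model):

```python
from collections import defaultdict

def group_tests(results, prop):
    """Partition test-result records by one of their properties (status, feature, ...)."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result[prop]].append(result)
    return dict(grouped)

results = [
    {"name": "TestNumberOfRows", "status": "SUCCESS", "feature": None},
    {"name": "TestColumnShareOfNulls", "status": "FAIL", "feature": "age"},
    {"name": "TestMeanInNSigmas", "status": "FAIL", "feature": "income"},
]
by_status = group_tests(results, "status")
```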

Data Quality and Integrity Tests

19 Jul 13:29
Pre-release
  • Added default configurations for Data Quality Tests
  • Added default configurations for Data Integrity Tests
  • Added visualisation for Data Quality Tests
  • Added visualisation for Data Integrity Tests
  • Updated test descriptions (column names are highlighted)

Test Suites with Individual Tests and Test Presets

07 Jul 11:51
8d17bea

Implemented a new interface to test data and models in a batch: the Test Suite.

Implemented the following Individual tests:

  • TestNumberOfColumns()
  • TestNumberOfRows()
  • TestColumnNANShare()
  • TestShareOfOutRangeValues()
  • TestNumberOfOutListValues()
  • TestMeanInNSigmas()
  • TestMostCommonValueShare()
  • TestNumberOfConstantColumns()
  • TestNumberOfDuplicatedColumns()
  • TestNumberOfDuplicatedRows()
  • TestHighlyCorrelatedFeatures()
  • TestTargetFeaturesCorrelations()
  • TestShareOfDriftedFeatures()
  • TestValueDrift()
  • TestColumnsType()

Implemented the following test presets:

  • Data Quality. This preset focuses on data quality issues like duplicate rows or null values.
  • Data Stability. This preset identifies changes in the data or differences between batches.
  • Data Drift. This preset compares feature distributions using statistical tests and distance metrics.
  • NoTargetPerformance. This preset combines several checks to run when model predictions are available but actuals or ground truth labels are not, including prediction drift and some of the data quality and stability checks.

v0.1.51.dev0

31 May 20:25
Pre-release

Updates:

  • Updated DataDriftTab: added target and prediction rows in the DataDrift Table widget
  • Updated CatTargetDriftTab: added widgets for probabilistic cases in both binary and multiclass probabilistic classification, in particular a label drift widget and class probability distribution plots

Fixes:

  • #233
  • Fixed previews in the DataDrift Table widget. Histogram previews for reference and current data now share an x-axis, so the bin order is the same in both histograms, which makes visual comparison of the distributions easier.

Automatic Stattest Selection

19 May 08:33
Pre-release

Release scope:

  1. Stat test auto selection algorithm update: https://docs.evidentlyai.com/reports/data-drift#how-it-works

For small data with <= 1000 observations in the reference dataset:

  • For numerical features (n_unique > 5): two-sample Kolmogorov-Smirnov test.
  • For categorical features or numerical features with n_unique <= 5: chi-squared test.
  • For binary categorical features (n_unique <= 2): proportion difference test for independent samples based on Z-score.
    All tests use a 0.95 confidence level by default.

For larger data with > 1000 observations in the reference dataset:
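
The small-data branch of the selection logic above can be sketched as follows (a hypothetical helper; the large-data branch is not detailed in these notes, and the linked docs are authoritative):

```python
def choose_stattest_small_data(n_unique, is_categorical):
    """Stat test selection for reference datasets with <= 1000 observations,
    following the rules listed above (illustrative sketch only)."""
    if n_unique <= 2:
        return "z"          # proportion difference test based on Z-score
    if is_categorical or n_unique <= 5:
        return "chisquare"  # chi-squared test
    return "ks"             # two-sample Kolmogorov-Smirnov test
```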

  2. Added options for setting custom statistical test for Categorical and Numerical Target Drift Dashboard/Profile:
    cat_target_stattest_func: Defines a custom statistical test to detect target drift in CatTargetDrift.
    num_target_stattest_func: Defines a custom statistical test to detect target drift in NumTargetDrift.

  3. Added options for setting custom threshold for drift detection for Categorical and Numerical Target Drift Dashboard/Profile:
    cat_target_threshold: Optional[float] = None
    num_target_threshold: Optional[float] = None
    These thresholds depend strongly on the selected stattest; generally it is either a threshold for the p_value or a threshold for a distance.
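
The p-value vs. distance distinction matters because the comparison direction flips; a small sketch of the decision rule (an illustration, not the library's code):

```python
def drift_detected(score, threshold, score_is_p_value):
    """For p-value stattests, drift means score < threshold (e.g. p < 0.05);
    for distance stattests, drift means score >= threshold."""
    if score_is_p_value:
        return score < threshold
    return score >= threshold
```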

Fixes:
#207