Merge branch 'refs/heads/main' into feature/metrics-redesign
Liraim committed Dec 17, 2024
2 parents e35ef8e + 9817cc1 commit d451be5
Showing 77 changed files with 1,479 additions and 1,122 deletions.
4 changes: 2 additions & 2 deletions docs/book/examples/cookbook_llm_regression_testing.md
@@ -86,13 +86,13 @@ ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")
Create a Project:

```python
-project = ws.create_project("Regression testing example", team_id="YOUR_TEAM_ID")
+project = ws.create_project("Regression testing example", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```

{% hint style="info" %}
-**Need help?** Check how to find API key and [create a Team](../installation/cloud_account.md).
+**Need help?** Check how to find [API key](../installation/cloud_account.md).
{% endhint %}

# 3. Prepare the Dataset
6 changes: 3 additions & 3 deletions docs/book/examples/tutorial-llm.md
@@ -137,18 +137,18 @@ assistant_logs.head(3)
To be able to save and share results and get a live monitoring dashboard, create a Project in Evidently Cloud. Here's how to set it up:

* **Sign up**. If you do not have one yet, create a free [Evidently Cloud account](https://app.evidently.cloud/signup) and name your Organization.
-* **Add a Team**. Click **Teams** in the left menu. Create a Team, copy and save the Team ID. ([Team page](https://app.evidently.cloud/teams)).
+* **Create an Organization** when you log in for the first time. Get an ID of your organization. [Organizations page](https://app.evidently.cloud/organizations).
* **Get your API token**. Click the **Key** icon in the left menu to go. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).
* **Connect to Evidently Cloud**. Pass your API key to connect.

```python
ws = CloudWorkspace(token="YOUR_TOKEN",
url="https://app.evidently.cloud")
```
-* **Create a Project**. Create a new Project inside your Team, adding your title and description:
+* **Create a Project**. Create a new Project inside your Organization, adding your title and description:

```python
-project = ws.create_project("My project title", team_id="YOUR_TEAM_ID")
+project = ws.create_project("My project title", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```
7 changes: 3 additions & 4 deletions docs/book/get-started/cloud_quickstart_llm.md
@@ -12,8 +12,7 @@ Need help? Ask on [Discord](https://discord.com/invite/xZjKRaNp8b).

Set up your Evidently Cloud workspace:
* **Sign up** for a free [Evidently Cloud account](https://app.evidently.cloud/signup).
-* **Create an Organization** when you log in for the first time.
-* **Create a Team**. Click Teams in the left menu, create a Team, and save the Team ID ([Team page](https://app.evidently.cloud/teams)).
+* **Create an Organization** when you log in for the first time. Get an ID of your organization. [Organizations page](https://app.evidently.cloud/organizations).
* **Get your API token**. Click the **Key** icon in the left menu. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).

Now, switch to your Python environment.
@@ -49,10 +48,10 @@ Connect to Evidently Cloud using your API token:
ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")
```

-Create a Project within your Team:
+Create a Project within your Organization:

```python
-project = ws.create_project("My test project", team_id="YOUR_TEAM_ID")
+project = ws.create_project("My test project", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```
7 changes: 3 additions & 4 deletions docs/book/get-started/cloud_quickstart_tabular.md
@@ -6,8 +6,7 @@ description: ML Monitoring “Hello world.” From data to dashboard in a couple

Set up your Evidently Cloud workspace:
* **Sign up**. If you do not have one yet, sign up for a free [Evidently Cloud account](https://app.evidently.cloud/signup).
-* **Create an Organization**. When you log in the first time, create and name your Organization.
-* **Create a Team**. Click **Teams** in the left menu. Create a Team, copy and save the Team ID. ([Team page](https://app.evidently.cloud/teams)).
+* **Create an Organization** when you log in for the first time. Get an ID of your organization. [Organizations page](https://app.evidently.cloud/organizations).
* **Get your API token**. Click the **Key** icon in the left menu. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).

You can now go to your Python environment.
@@ -39,10 +38,10 @@ Connect to Evidently Cloud using your access token.
ws = CloudWorkspace(token="YOUR_TOKEN_HERE", url="https://app.evidently.cloud")
```

-Create a new Project inside your Team. Pass the `team_id`.
+Create a new Project inside your Organization. Pass the `org_id`.

```python
-project = ws.create_project("My test project", team_id="YOUR_TEAM_ID")
+project = ws.create_project("My test project", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```
2 changes: 1 addition & 1 deletion docs/book/get-started/cloud_quickstart_tracing.md
@@ -56,7 +56,7 @@ Initialize the OpenAI client. Pass the token as an environment variable:
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
```

-Set up tracing parameters. Copy the Team ID from the [Teams page](https://app.evidently.cloud/teams), and give a name to identify your tracing dataset.
+Set up tracing parameters. Give it a name to identify your tracing dataset.

```python
init_tracing(
10 changes: 1 addition & 9 deletions docs/book/installation/cloud_account.md
@@ -10,15 +10,7 @@ If not yet, [sign up for a free Evidently Cloud account](https://app.evidently.c

After logging in, create an **Organization** and name it.

-# 3. Create a Team
-
-Go to the **Teams** icon in the left menu, create a Team, and name it. ([Team page](https://app.evidently.cloud/teams)).
-
-{% hint style="info" %}
-**Do I always need a Team?** Yes. Every Project must be within a Team. Teams act as "folders" to organize your work, and you can create multiple Teams. If you work alone, simply create a Team without external users.
-{% endhint %}
-
-# 4. Connect from Python
+# 3. Connect from Python

You will need an access token to interact with Evidently Cloud from your Python environment.

8 changes: 4 additions & 4 deletions docs/book/presets/text-overview.md
@@ -1,6 +1,6 @@
**TL;DR:** You can explore and compare text datasets.

-* **Report**: for visual analysis or metrics export, use the `TextOverviewPreset`.
+* **Report**: for visual analysis or metrics export, use the `TextEvals`.

# Use case

@@ -14,13 +14,13 @@ You can evaluate and explore text data:

# Text Overview Report

-If you want to visually explore the text data, you can create a new Report object and use the `TextOverviewPreset`.
+If you want to visually explore the text data, you can create a new Report object and use the `TextEvals`.

## Code example

```python
text_overview_report = Report(metrics=[
-TextOverviewPreset(column_name="Review_Text")
+TextEvals(column_name="Review_Text")
])

text_overview_report.run(reference_data=ref, current_data=cur)
@@ -38,7 +38,7 @@ nltk.download('omw-1.4')

## How it works

-The `TextOverviewPreset` provides an overview and comparison of text datasets.
+The `TextEvals` provides an overview and comparison of text datasets.
* Generates a **descriptive summary** of the text columns in the dataset.
* Performs **data drift detection** to compare the two texts using the domain classifier approach.
* Shows distributions of the **text descriptors** in two datasets, and their **correlations** with other features.
12 changes: 3 additions & 9 deletions docs/book/projects/add_project.md
@@ -13,16 +13,10 @@ You can create a Project using the Python API or directly in the user interface.

## Add a new Project - API

-{% hint style="success" %}
-Team management is a Pro feature available in the `Evidently Cloud` and `Evidently Enterprise`.
-{% endhint %}
-
-In Evidently Cloud and Enterprise, you must create a Team before adding a Project. To get your Team ID, go to the [Teams page](https://app.evidently.cloud/teams), select your Team, and copy the ID from there.
-
-To create a Project inside a workspace `ws` and Team with a `team_id`, assign a name and description, and save the changes:
+To create a Project inside a workspace `ws` and Organization ([see organizations](https://app.evidently.cloud/organizations)) with an `org_id`, assign a name and description, and save the changes:

```
-project = ws.create_project("My test project", team_id="YOUR_TEAM_ID")
+project = ws.create_project("My test project", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```
@@ -37,7 +31,7 @@ project.save()

## Add a new Project - UI

-Click on the “plus” sign on the home page, create a Team if you do not have one yet and type your Project name and description.
+Click on the “plus” sign on the home page, type your Project name and description.

![](../.gitbook/assets/cloud/add_project_wide-min.png)

26 changes: 4 additions & 22 deletions docs/book/reference/all-metrics.md
@@ -147,28 +147,6 @@ How to set [data drift parameters](../customization/options-for-statistical-test

</details>

-<details>
-
-<summary>Text Overview Preset</summary>
-
-`TextOverviewPreset()` provides a summary for a single or multiple text columns. Text columns are required.
-
-**Composition**:
-* `ColumnSummaryMetric()` for text descriptors for all columns. Descriptors included:
-* `Sentiment()`
-* `SentenceCount()`
-* `OOV()`
-* `TextLength()`
-* `NonLetterCharacterPercentage()`
-* `SemanticSimilarity()` between each pair of text columns, if there is more than one.
-
-**Required parameters**:
-* `column_name` or `columns` list
-
-**Optional parameters**:
-* `descriptors` list
-
-</details>

<details>

@@ -187,6 +165,9 @@ How to set [data drift parameters](../customization/options-for-statistical-test
**Required parameters**:
* `column_name`

+**Optional parameters**:
+* `descriptors` list
+
</details>

<details>
@@ -282,6 +263,7 @@ Check for regular expression matches.
| **JSONMatch()** <ul><li>Compares two columns of a dataframe and checks whether the two objects in each row of the dataframe are matching JSONs or not. </li><li>Returns True/False for every input. </li></ul> Example use:<br> `JSONMatch(with_column="column_2")`| **Required:** <br> `with_column : str` <br><br>**Optional:**<ul><li>`display_name`</li> |
| **ContainsLink()** <ul><li>Checks if the text contains at least one valid URL. </li><li>Returns True/False for each row. </li></ul> | **Required:** n/a <br>**Optional:**<ul><li>`display_name`</li></ul> |
| **IsValidPython()** <ul><li>Checks if the text is valid Python code without syntax errors.</li><li>Returns True/False for every input. </li></ul>| **Required:** <br>n/a<br><br>**Optional:**<ul><li>`display_name`</li></ul> |
+| **IsValidSQL()** <ul><li>Checks if the text in a specified column is a valid SQL query without executing the query.</li><li>Returns True/False for every input. </li></ul>| **Required:** <br>n/a<br><br>**Optional:**<ul><li>`display_name`</li></ul> |

## Descriptors: Text stats

1 change: 1 addition & 0 deletions requirements.min.txt
@@ -31,4 +31,5 @@ openai==1.16.2
evaluate==0.4.1
transformers[torch]==4.39.3
sentence-transformers==2.7.0
+sqlvalidator==0.0.20
chromadb==0.4.0
3 changes: 3 additions & 0 deletions setup.cfg
@@ -102,6 +102,9 @@ ignore_missing_imports = True
[mypy-litellm.*]
ignore_missing_imports = True

+[mypy-sqlvalidator.*]
+ignore_missing_imports = True
+
[mypy-chromadb.*]
ignore_missing_imports = True

3 changes: 2 additions & 1 deletion setup.py
@@ -53,7 +53,7 @@
install_requires=[
"plotly>=5.10.0",
"statsmodels>=0.12.2",
-"scikit-learn>=1.0.1",
+"scikit-learn>=1.0.1,<1.6.0",
"pandas[parquet]>=1.3.5",
"numpy>=1.22.0,<2.1",
"nltk>=3.6.7",
@@ -103,6 +103,7 @@
"evaluate>=0.4.1",
"transformers[torch]>=4.39.3",
"sentence-transformers>=2.7.0",
+"sqlvalidator>=0.0.20",
"chromadb>=0.4.0",
],
"spark": ["pyspark>=3.4.0"],
3 changes: 3 additions & 0 deletions src/evidently/_pydantic_compat.py
@@ -12,6 +12,7 @@
from pydantic.v1 import PrivateAttr
from pydantic.v1 import SecretStr
from pydantic.v1 import ValidationError
+from pydantic.v1 import create_model
from pydantic.v1 import parse_obj_as
from pydantic.v1 import validator
from pydantic.v1.fields import SHAPE_DICT
@@ -37,6 +38,7 @@
from pydantic import PrivateAttr
from pydantic import SecretStr # type: ignore[assignment]
from pydantic import ValidationError # type: ignore[assignment]
+from pydantic import create_model # type: ignore[attr-defined,no-redef]
from pydantic import parse_obj_as
from pydantic import validator
from pydantic.fields import SHAPE_DICT # type: ignore[attr-defined,no-redef]
@@ -77,4 +79,5 @@
"DictStrAny",
"PrivateAttr",
"Extra",
+"create_model",
]
2 changes: 1 addition & 1 deletion src/evidently/_version.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python
# coding: utf-8

-version_info = (0, 4, 41)
+version_info = (0, 5, 1)
__version__ = ".".join(map(str, version_info))


2 changes: 2 additions & 0 deletions src/evidently/descriptors/__init__.py
@@ -8,6 +8,7 @@
from .hf_descriptor import HuggingFaceToxicityModel
from .is_valid_json_descriptor import IsValidJSON
from .is_valid_python_descriptor import IsValidPython
+from .is_valid_sql_descriptor import IsValidSQL
from .json_match_descriptor import JSONMatch
from .json_schema_match_descriptor import JSONSchemaMatch
from .llm_judges import BiasLLMEval
@@ -74,6 +75,7 @@
"WordMatch",
"WordNoMatch",
"IsValidJSON",
+"IsValidSQL",
"JSONSchemaMatch",
"IsValidPython",
"_registry",
3 changes: 3 additions & 0 deletions src/evidently/descriptors/_registry.py
@@ -127,6 +127,9 @@
"evidently.descriptors.custom_descriptor.CustomPairColumnEval",
"evidently:descriptor:CustomPairColumnEval",
)
+register_type_alias(
+FeatureDescriptor, "evidently.descriptors.is_valid_sql_descriptor.IsValidSQL", "evidently:descriptor:IsValidSQL"
+)
register_type_alias(
FeatureDescriptor,
"evidently.descriptors.json_match_descriptor.JSONMatch",
11 changes: 11 additions & 0 deletions src/evidently/descriptors/is_valid_sql_descriptor.py
@@ -0,0 +1,11 @@
from evidently.features import is_valid_sql_feature
from evidently.features.generated_features import FeatureDescriptor
from evidently.features.generated_features import GeneratedFeature


class IsValidSQL(FeatureDescriptor):
    class Config:
        type_alias = "evidently:descriptor:IsValidSQL"

    def feature(self, column_name: str) -> GeneratedFeature:
        return is_valid_sql_feature.IsValidSQL(column_name, self.display_name)
3 changes: 3 additions & 0 deletions src/evidently/features/_registry.py
@@ -104,6 +104,9 @@
register_type_alias(
GeneratedFeatures, "evidently.features.contains_link_feature.ContainsLink", "evidently:feature:ContainsLink"
)
+register_type_alias(
+GeneratedFeatures, "evidently.features.is_valid_sql_feature.IsValidSQL", "evidently:feature:IsValidSQL"
+)
register_type_alias(
GeneratedFeatures, "evidently.features.exact_match_feature.ExactMatchFeature", "evidently:feature:ExactMatchFeature"
)
43 changes: 43 additions & 0 deletions src/evidently/features/is_valid_sql_feature.py
@@ -0,0 +1,43 @@
from typing import Any
from typing import ClassVar
from typing import Optional

from evidently import ColumnType
from evidently.features.generated_features import ApplyColumnGeneratedFeature


class IsValidSQL(ApplyColumnGeneratedFeature):
    class Config:
        type_alias = "evidently:feature:IsValidSQL"

    __feature_type__: ClassVar = ColumnType.Categorical
    display_name_template: ClassVar = "SQL Validity Check for {column_name}"
    column_name: str

    def __init__(self, column_name: str, display_name: Optional[str] = None):
        self.column_name = column_name
        self.display_name = display_name
        super().__init__()

    def apply(self, value: Any):
        if value is None or not isinstance(value, str):
            return False

        return self.is_valid_sql(value)

    def is_valid_sql(self, query: str) -> bool:
        import sqlvalidator

        queries = query.strip().split(";")  # Split by semicolon

        for q in queries:
            q = q.strip()  # Remove extra whitespace
            if not q:  # Skip empty queries
                continue

            try:
                sqlvalidator.format_sql(q)  # Validate SQL syntax
            except Exception:
                return False  # Invalid SQL

        return True  # All queries are valid
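
Reviewer's sketch, not part of the commit: the control flow of `apply` and `is_valid_sql` in this file can be exercised without installing `sqlvalidator` by passing the validator in as a parameter. The `validate` parameter and the `toy_validate` helper are hypothetical stand-ins introduced only for illustration. The commit itself calls `sqlvalidator.format_sql`, and whether that call raises for a given malformed query is not something this sketch checks.

```python
from typing import Any, Callable


def is_valid_sql(query: str, validate: Callable[[str], object]) -> bool:
    """Same loop as IsValidSQL.is_valid_sql, with the validator injected.

    `validate` is an assumption of this sketch: it stands in for
    `sqlvalidator.format_sql` and is expected to raise on invalid input.
    """
    for statement in query.strip().split(";"):
        statement = statement.strip()
        if not statement:  # skip empty fragments, e.g. after a trailing ";"
            continue
        try:
            validate(statement)
        except Exception:
            return False  # any exception marks the whole value as invalid
    return True


def apply(value: Any, validate: Callable[[str], object]) -> bool:
    """Same guard as IsValidSQL.apply: non-string values are never valid."""
    if value is None or not isinstance(value, str):
        return False
    return is_valid_sql(value, validate)


def toy_validate(statement: str) -> None:
    """Hypothetical validator for the demo: accepts only SELECT statements."""
    if not statement.upper().startswith("SELECT"):
        raise ValueError(f"not a SELECT statement: {statement!r}")


print(apply("SELECT 1; SELECT 2;", toy_validate))  # True
print(apply("SELECT 1; DROP", toy_validate))       # False
print(apply(None, toy_validate))                   # False
print(apply("", toy_validate))                     # True
print(apply("SELECT 'a;b'", toy_validate))         # False
```

Two behaviours follow directly from this logic and may be worth a test in the real feature: an empty string contains no non-empty statements and is therefore reported as valid, and a semicolon inside a string literal is treated as a statement separator, so such a query is split into fragments before validation.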