-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cosmos Full Text Search support #35868
base: main
Are you sure you want to change the base?
Conversation
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Expressions/SqlFunctionExpression.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for full-text search in Cosmos by introducing new full-text index types, translators, model validations, and extensions.
- Introduces full-text indexing support in container creation and client wrappers.
- Adds a new CosmosFullTextSearchTranslator and corresponding DbFunctions extensions.
- Updates model validation and Fluent API extensions to include full-text search configuration.
Reviewed Changes
Copilot reviewed 23 out of 26 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
src/EFCore.Cosmos/Storage/Internal/CosmosClientWrapper.cs | Adds support for full-text indexing in container creation, but the new full-text index assignment may override vector index settings. |
src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs | Modifies method checking logic that may prematurely reject valid method calls by changing && to |
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs | Implements a new translator for full-text search functions. |
src/EFCore.Cosmos/Query/Internal/Expressions/SqlFunctionExpression.cs | Introduces an extra constructor parameter to flag scoring functions. |
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs | Adds a RankOrdering property and methods to apply rank-based ordering. |
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs | Adjusts ordering logic to support scoring functions and rank ordering. |
src/EFCore.Cosmos/Query/Internal/CosmosQuerySqlGenerator.cs | Updates SQL generation to include rank ordering handling. |
src/EFCore.Cosmos/Metadata/Internal/CosmosAnnotationNames.cs | Defines new annotations for full-text search configuration. |
src/EFCore.Cosmos/Infrastructure/Internal/CosmosModelValidator.cs | Validates full-text index configuration, ensuring only single-property indexes with language annotation are defined. |
src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs, CosmosPropertyBuilderExtensions.cs, CosmosIndexExtensions.cs, CosmosIndexBuilderExtensions.cs, CosmosDbFunctionsExtensions.cs | Adds Fluent API and extension methods for configuring full-text search on properties and indexes. |
Files not reviewed (3)
- Directory.Packages.props: Language not supported
- src/EFCore.Cosmos/EFCore.Cosmos.csproj: Language not supported
- src/EFCore.Cosmos/Properties/CosmosStrings.Designer.cs: Language not supported
Comments suppressed due to low confidence (2)
src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs:33
- Changing the logical operator from '&&' to '||' alters the intended logic and causes the translator to return null for methods declared on CosmosDbFunctionsExtensions that are not named 'VectorDistance'. Revisit the condition to ensure it only returns null when both conditions are unmet.
if (method.DeclaringType != typeof(CosmosDbFunctionsExtensions) || method.Name != nameof(CosmosDbFunctionsExtensions.VectorDistance))
src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs:137
- [nitpick] The Experimental attribute for full-text search methods is using 'CosmosVectorSearchExperimental' instead of the more appropriate 'CosmosFullTextSearchExperimental'. This may cause confusion regarding the feature's experiment status.
[Experimental(EFDiagnostics.CosmosVectorSearchExperimental)]
0f39e44
to
a836b63
Compare
src/EFCore.Cosmos/Query/Internal/Expressions/RankOrderingExpression.cs
Outdated
Show resolved
Hide resolved
a836b63
to
898cc5f
Compare
@maumar just to note that I'm holding back from reviewing since this is still in draft (you seemed to maybe want people to take a look in yesterday's meeting). Let me know if you want a review etc. |
898cc5f
to
cee74fd
Compare
@roji feel free to hold off on reviewing a draft. I'm pretty close to marking it for review I think |
a45f839
to
245066a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ended up doing a quick review even though the PR is still in draft... Here are some comments.
src/EFCore.Cosmos/Infrastructure/Internal/CosmosModelValidator.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
.../CosmosShapedQueryCompilingExpressionVisitor.InExpressionValuesExpandingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
73b0297
to
30e0bd9
Compare
30e0bd9
to
ebebcfc
Compare
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
ebebcfc
to
68ca715
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @AndriySvyryd in case you want to take a quick look for the metadata side and Cosmos container building part.
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs
Outdated
Show resolved
Hide resolved
.../CosmosShapedQueryCompilingExpressionVisitor.InExpressionValuesExpandingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Outdated
Show resolved
Hide resolved
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs
Outdated
Show resolved
Hide resolved
e7d281d
to
81f7758
Compare
@roji @AndriySvyryd the PR is now ready for review. |
src/EFCore.Cosmos/Extensions/CosmosPropertyBuilderExtensions.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maumar see batch of comments.
I think we should maybe discuss the FTS property vs. index question in design...
src/EFCore.Cosmos/Extensions/CosmosPropertyBuilderExtensions.cs
Outdated
Show resolved
Hide resolved
.../CosmosShapedQueryCompilingExpressionVisitor.InExpressionValuesExpandingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
.../CosmosShapedQueryCompilingExpressionVisitor.InExpressionValuesExpandingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
.../CosmosShapedQueryCompilingExpressionVisitor.InExpressionValuesExpandingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs
Show resolved
Hide resolved
64a2808
to
0933839
Compare
Fixes #35476 Fixes #35853 (need to fix this one, otherwise vector translator will try to translate full text methods and fail) Description This PR enables full-text search queries using EF Core 9 when targeting Azure Cosmos Db. This is one of the flagship new features for Cosmos and the aim here is to help with it's adoption. This is very limited port of the full feature we are adding in EF 10 Preview 4. We are only adding querying capabilities: function stubs for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF as well as logic translating these signatures to built-in Cosmos functions. No model building or data manipulation - containers need to be created outside EF (using Cosmos SDK or in the Data Explorer). Customer impact Customers will be able to use the upcoming full text search capabilities when working with Azure Cosmos Db without the need to upgrade to EF 10 preview. How found Partner team ask. Regression No Testing Extensively tested on EF 10, manual testing on EF9. End-to-end testing is not possible because we can't create containers programmatically (no support for it inside EF Core itself, and the Cosmos SDK which supports it is currently only available in beta, so we can't take dependency on it). Instead, we created containers and data using EF 10, ported all the query tests from EF 10 and ran them using the EF9 bits. Risk Low. Code here is purely additive and actually localized to only a handful of places in the code: validation in ApplyOrdering/AppendOrdering, FTS method translator, parameter inliner and sql generator. Feature is marked as experimental and quirks have been added.
- Adding model building API to configure property as full-text search enabled, as well as setup the index for it, - Adding model validation (e.g. FTS index not matching FTS property), - Adding EF.Functions stubs and translations for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF (for hybrid), - Adding logic in SelectExpression to produce ORDER BY RANK when necessary, - Adding validation when attempting to mix with ORDER BY RANK with regular ORDER BY, - Rewrite OFFSET/LIMIT from parameter to constant when ORDER BY RANK is present. Also fixed / added support for vector search on owned types (since it shares logic with FTS) and added some tests. outstanding work: - support for FTS Container building using Azure.ResourceManager.CosmosDb (currently blocked on updated package being released) - add model building support for default language (superfluous for now, since only one language is supported), Fixes #35476 Fixes #35853 Fixes #35867 Fixes #35852
d9fa4f3
to
b554e29
Compare
@roji @AndriySvyryd I updated the PR addressing the Property/Index API and fixing a few small errors. You can review just the second commit - all the changes are there.
|
…reparation for Cosmos supporting more languages than just en-US.
b554e29
to
65cc1e5
Compare
/// <param name="language">The language for the full-text search.</param> | ||
/// <returns>The same builder instance so that multiple calls can be chained.</returns> | ||
[Experimental(EFDiagnostics.CosmosFullTextSearchExperimental)] | ||
public static PropertyBuilder IsFullTextProperty( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find the discrepancy between builder (IsFullTextProperty) and metadata (SetFullTextSearchLanguage) names odd and unclear. For example, what does IsFullTextProperty() actually do when a language isn't passed? Since we just call SetFullTextSearchLanguage with null, it doesn't seem to actually do anything (apart erasing a previously-set language); if so, then this method doesn't actually make the property into a full-text property - it really just sets the language.
Here's another way to look at it. Since Cosmos allows defining a default language at the container level, I'm assuming it's possible to omit it from the property level (otherwise why have a default at all). If that's the case (should confirm), then there needs to be a way to way "this is a full text property" without setting a language. How would this work with the current scheme, i.e. how does you code know to emit a container policy specifying the property as a full-text property?
If the above is true on the Cosmos side - and it needs to be confirmed by simple experimentation - we simply need to have two separate metadata: one "IsFullTextSearchEnabled" (boolean), and one "FullTextSearchLanguage" (which is optional, if not specified the container default applies). We could still of course have a single convenience builder method (I'd call it EnableFullTextSearch), which sets both metadata. According to my understand of how Cosmos actually works - that the language specification on the property level is optional - this is what we need.
One last possibility here is to use a null language to mean "this is a full text property who's language hasn't been set by the user (and should default to the container default" (is this the current design @maumar?). If so, then there's no way to turn off a property being a full-text property (since passing a null language activate FTS).
On another note, an alternative possible naming for the builder API could simply be EnableFullTextSearch(), since the property still remains a regular property that can be used for other things (xmldoc can also be updated to reflect that). Constrast this with vector properties, which have no other possible use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last possibility here is to use a null language to mean "this is a full text property who's language hasn't been set by the user (and should default to the container default" (is this the current design @maumar?). If so, then there's no way to turn off a property being a full-text property (since passing a null language activate FTS).
you are right, this is the current design - will switch to 2 annotations, it is much cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood the Cosmos design correctly there is no such setting as "this is an FTS property", FTS can be used on anything by default, but to avoid specifying the language in the call it should be configured either on the container or the property. Then, if you want it to be read-efficient, you can also create an index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method should be called HasFullTextSearchLanguage and have a required parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And EnableFullTextSearch() is completely unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roji it is currently not allowed to set individual path's language to null or empty. SDK throws a validation error "Argument {0} can't be null or empty. (Parameter 'Language')". I'm following up with Cosmos folks to figure out exactly how it's supposed to work then, but it might be what @AndriySvyryd suggests (all properties are FTS enabled out of the box and you only specify path/language if the language is supposed to be different than the default). Documentation says something different: "For every text property you'd like to configure for full text search, you must declare both the path of the property with text and the language of the text.", so we'll see.
src/EFCore.Cosmos/Infrastructure/Internal/CosmosModelValidator.cs
Outdated
Show resolved
Hide resolved
c52fdaa
to
8d8771f
Compare
8d8771f
to
993bc56
Compare
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Cosmos Full Text Search support
Also fixed / added support for vector search on owned types (since it shares logic with FTS) and added some tests.
outstanding work:
Fixes #35476
Fixes #35851
Fixes #35852
Fixes #35853
Fixes #35867