You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When I run a targeted multi-partition query on a sub-partitioned container, the query becomes a fan-out cross-partition query if the following conditions are met:
The filter condition is always false.
The filter condition contains the first part of the partition key.
The query options do not contain a partition key.
To reproduce
I wrote a console application to reproduce the issue. Please download the solution, open the Program.cs file and specify a connection string to an Azure Cosmos DB for NoSQL account. The application creates a new database with a container Subpartitioned. It also upserts random data, runs a query (with and without a partition key in the request options) and displays relevant information. You can run the application several times; it will only create resources and upsert data once.
The container contains 100 documents (articles) with the following properties:
Id : string
CustomerId : Guid
Label : string
Title : string
The value of the Id property is a random GUID. The value of the CustomerId and Label properties is always the same. The value of the Title property is unique for each document.
The indexing policy is automatic and consistent. As for the partition key, it has two components, customerId and id, so there are 100 logical partitions with one document. The maximum throughput is 20,000 RU/s so the container has two physical partitions, one of which is empty.
The query is SELECT * FROM c WHERE c.customerId = @customerId AND c.label = @label AND c.label <> @label and the application runs it twice:
Request options contain a partition key with the customer id.
Request options do not contain a partition key.
You can also simplify the query to SELECT * FROM c WHERE c.customerId = @customerId AND false.
As the filter condition contains the first component of the partition key, the request charge should be the same and the SDK should only read data only from one physical partition.
Actual behavior
Without a partition key in the request options the query becomes a fan-out cross-partition query so the request charge is higher as the SDK reads data from all physical partitions.
Environment summary
SDK Version: 3.46.0
OS Version: Windows 11 Enterprise (10.0.22631 Build 22631)
Additional context
It happens in both Direct and Gateway mode.
The text was updated successfully, but these errors were encountered:
@alesk-kontent - this is the current limitation of the system. Note that it's not limited to hierarchical partitioned container, but also applies to single partition key container.
As you may be aware, due to the contradiction in the filter condition, query is not supposed to return any documents.
Here's what's common between both invocations:
Since the query filter is effectively false,
(WHERE c.customerId = @customerId AND c.label = @label AND c.label <> @label is same as c.customerId = @customerId AND false which is same as WHERE false), query plan returns empty partition range.
However, at the moment, client doesn't seem to distinguish between empty partition range (for e.g. due to the contradiction in the filter condition) v/s full range (because partition elimination could not be performed - either because filter condition doesn't contain partition key or it's too complex). Client instead interprets the range as all partitions in the collection. We are looking into this issue.
Here's what's different between both invocations:
The partition ranges returned by the query plan are subsequently intersected with those specified in query request options (if present), which restricts the execution to a single partition only in case of first query. The second query, due to lack of query request options fans out to all partitions.
Describe the bug
When I run a targeted multi-partition query on a sub-partitioned container, the query becomes a fan-out cross-partition query if the following conditions are met:
To reproduce
I wrote a console application to reproduce the issue. Please download the solution, open the
Program.cs
file and specify a connection string to an Azure Cosmos DB for NoSQL account. The application creates a new database with a containerSubpartitioned
. It also upserts random data, runs a query (with and without a partition key in the request options) and displays relevant information. You can run the application several times; it will only create resources and upsert data once.The container contains 100 documents (articles) with the following properties:
Id : string
CustomerId : Guid
Label : string
Title : string
The value of the
Id
property is a random GUID. The value of theCustomerId
andLabel
properties is always the same. The value of theTitle
property is unique for each document.The indexing policy is automatic and consistent. As for the partition key, it has two components,
customerId
andid
, so there are 100 logical partitions with one document. The maximum throughput is 20,000 RU/s so the container has two physical partitions, one of which is empty.The query is
SELECT * FROM c WHERE c.customerId = @customerId AND c.label = @label AND c.label <> @label
and the application runs it twice:You can also simplify the query to
SELECT * FROM c WHERE c.customerId = @customerId AND false
.I can see the following results:
Container: Subpartitioned
Partition key: ["df165b31-7641-4664-9549-37862ed806ee"]
Request charge: 2,25
Physical partitions: 1
Container: Subpartitioned
Partition key:
Request charge: 4,50
Physical partitions: 0, 1
Expected behavior
As the filter condition contains the first component of the partition key, the request charge should be the same and the SDK should only read data only from one physical partition.
Actual behavior
Without a partition key in the request options the query becomes a fan-out cross-partition query so the request charge is higher as the SDK reads data from all physical partitions.
Environment summary
SDK Version: 3.46.0
OS Version: Windows 11 Enterprise (10.0.22631 Build 22631)
Additional context
It happens in both
Direct
andGateway
mode.The text was updated successfully, but these errors were encountered: