Inconsistent querying on a partitioned CosmosDB collection

April 2019

I have a partitioned Cosmos DB collection, defined as unlimited with a throughput of 1000 RU/s. It has the following document structure:

{
    "Id": "b42129d2-5467-450c-9f7e-744f78dfe1e7", // Primary key
    "ArrayOfObjects": [
        {
            // other properties omitted for brevity
            "SubId": "ed2a49fb-51d4-45b4-9690-df0721d6a32f"
        },
        {
            "SubId": "35c87833-9bea-4151-86da-4d9c482ae1fe"
        }
    ],
    "PartitionKey": "b42"
}

The partition key is the first 3 letters of the primary key, which is a GUID. This gives me 32768 possible partitions with good cardinality. I am using the Cosmos DB .NET Core SDK. There are currently ~170 thousand documents across ~6 thousand partitions.

I have functionality where I need to retrieve a document from the collection via a "SubId" without knowing the primary key, which means I do not know the partition key either. Unfortunately, I cannot change this functionality to work with a primary key, as its dependency is a legacy system which cannot be modified.

What's happening is this: I successfully create a new document, then at some point I need to query that document using a "SubId". This is done in C# as follows:

public async Task<DocumentModel> GetBySubId(string subId)
{
    var collectionId = _cosmosClient.CollectionId;
    var query = $@"SELECT * FROM {collectionId} c
                   WHERE ARRAY_CONTAINS(c.ArrayOfObjects, {{'SubId': '{subId}'}}, true)";

    var feedOptions = new FeedOptions { EnableCrossPartitionQuery = true };

    var docQuery = _cosmosClient.Client.CreateDocumentQuery(
            _collectionUri,
            query,
            feedOptions)
            .AsDocumentQuery();

    var executedQuery = await docQuery.ExecuteNextAsync<DocumentModel>();

    if (executedQuery.Count == 0)
    {
        return null;
    }

    return executedQuery.FirstOrDefault();
}

Sometimes it queries successfully, sometimes it doesn't and I return null; then from my controller I return a 404.

What makes this so strange is that if I check the database and run that query directly, the document is there and isn't actually missing, but for some reason when I query from C# using the SDK it cannot find the document. I have other functionality that queries using both the primary key (which means I have the partition key) and the SubId, and that works fine. It's only when I query using the SubId by itself (without the partition key) that it cannot find the document.

Given the above I think it has something to do with querying without the partition key. Anything I'm missing when querying without a partition key?

What I've tried at the moment is setting the database consistency from Eventual to Strong. This doesn't seem to make any difference.

2 answers

1

What I've tried at the moment is setting the database consistency from Eventual to Strong. This doesn't seem to make any difference.

According to this document, the Strong consistency level guarantees that reads return the most recent committed version of an item. From your description, the environment you are testing does not involve highly concurrent read operations, so I think the problem has nothing to do with the consistency level.

Sometimes it queries successfully, sometimes it doesn't and I return null; then from my controller I return a 404.

In my experience, intermittent behavior like this is caused by throughput bottlenecks. A partition key normally has to be provided when you query a partitioned collection. Since you do not know the partition key, the query can only run with EnableCrossPartitionQuery = true, which your sample code already sets. The query then fans out across all partitions until the specific document is found. In addition, the ARRAY_CONTAINS operator adds to the cost of the query.

Cosmos DB queries are limited by the throughput settings and will not keep scanning the entire database without limit. Please refer to this document.

Since you can't adjust your partitioning scheme now, I suggest increasing the throughput setting to check whether the issue goes away.

2

I am from the CosmosDB engineering team.

The behavior you mentioned is most likely because, occasionally, the query isn't able to complete execution within a single continuation. Please ensure that the query finishes execution by draining continuations. You can find a sample here: https://docs.microsoft.com/en-us/azure/cosmos-db/performance-tips#throughput . Query execution is not considered complete until IDocumentQuery.HasMoreResults is false.
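
As a rough illustration of what draining continuations could look like with the SDK used in the question, here is a minimal sketch. The class name DocumentQueryExtensions and the method FirstOrDefaultDrainedAsync are hypothetical helpers made up for this example and are not part of the SDK; only IDocumentQuery<T>.HasMoreResults and ExecuteNextAsync<T>() come from the library.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Linq;

// Hypothetical helper class, not part of the Cosmos DB SDK.
public static class DocumentQueryExtensions
{
    // Requests pages until a match is found or the query reports that
    // there are no more results to fetch.
    public static async Task<T> FirstOrDefaultDrainedAsync<T>(this IDocumentQuery<T> query)
    {
        while (query.HasMoreResults)
        {
            // A single page of a cross-partition query can come back empty
            // even though a later page still contains the matching document.
            var page = await query.ExecuteNextAsync<T>();
            if (page.Count > 0)
            {
                return page.First();
            }
        }

        // Every continuation was consumed and nothing matched.
        return default(T);
    }
}
```

In GetBySubId, this would replace the single ExecuteNextAsync call, for example `return await docQuery.FirstOrDefaultDrainedAsync();`, assuming docQuery is built with the generic CreateDocumentQuery<DocumentModel>(...) overload so that it is an IDocumentQuery<DocumentModel>. Returning as soon as a page contains a result avoids paying for the remaining partitions, while a null result now means the whole query was drained rather than that the first page happened to be empty.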