This repository demonstrates how to integrate vector search capabilities into Azure Cosmos DB using various programming languages and APIs.
Azure Cosmos DB provides integrated vector search capabilities for AI-powered semantic search, Retrieval-Augmented Generation (RAG), and recommendation systems. This repository contains comprehensive code samples showing how to:
- Generate embeddings with Azure OpenAI
- Store vector embeddings in Cosmos DB
- Query with vector similarity search
- Use different vector indexing algorithms
- Implement managed identity authentication
- nosql-vector-search-typescript - TypeScript samples for Cosmos DB NoSQL API
- DiskANN, Flat, and QuantizedFlat indexing algorithms
- Managed identity authentication
- Comprehensive documentation and examples
This project demonstrates:
✅ Vector Embedding Generation - Using Azure OpenAI to generate embeddings
✅ Vector Storage - Storing embeddings directly in JSON documents
✅ Similarity Search - Querying with VectorDistance for nearest neighbors
✅ Multiple Algorithms - DiskANN, Flat, QuantizedFlat indexing
✅ Distance Metrics - Cosine, Euclidean (L2), and DotProduct
✅ Managed Identity - Passwordless authentication with Azure AD
✅ Production Ready - Enterprise-grade patterns with retry logic
- Azure Subscription - Create a free account
- Azure Cosmos DB Account - NoSQL API
- Azure OpenAI Service - With embedding model deployed
- Development Environment - Node.js, Python, .NET, or Go depending on sample
- Choose a sample from the repository structure above
- Navigate to the sample directory and follow its README
- Configure environment variables with your Azure resource information
- Run the sample to see vector search in action
# Clone the repository
git clone https://github.com/Azure-Samples/cosmos-db-vector-samples.git
# Provision Azure resources with Azure Developer CLI
azd auth login
azd up
# Work with TypeScript sample
cd cosmos-db-vector-samples/nosql-vector-search-typescript
# Install dependencies
npm install
# Set environment variables from provisioned infrstructure
azd env get-values > .env
# Build and run
npm run build
npm run start:diskannVector embeddings are numerical representations of text, images, or other data in high-dimensional space. Similar items have similar vector representations, enabling semantic search.
| Algorithm | Accuracy | Speed | Scale | Best For |
|---|---|---|---|---|
| Flat | 100% | Slow | Small | Dev/test, maximum accuracy |
| QuantizedFlat | ~100% | Fast | Large | Balanced performance |
| DiskANN | High | Very Fast | Massive | Enterprise scale, RAG, AI apps |
- Cosine Similarity - Measures angle between vectors (most common for text)
- Euclidean Distance (L2) - Straight-line distance in n-dimensional space
- Dot Product - Projection of one vector onto another
- Azure Cosmos DB Vector Search Overview
- Vector Search for NoSQL API
- DiskANN in Cosmos DB
- Azure OpenAI Embeddings
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project is licensed under the MIT License - see the LICENSE file for details.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.