Getting Started with Qdrant: A Complete Setup Guide
Learn how to set up the Qdrant vector database from scratch. This step-by-step guide covers installation, configuration, and your first vector operations.
Qdrant is a powerful open-source vector database designed for similarity search and AI applications. Whether you're building a semantic search engine, recommendation system, or RAG (Retrieval-Augmented Generation) application, Qdrant provides the foundation you need. In this guide, we'll walk through setting up Qdrant and performing your first vector operations.
What is Qdrant?
Qdrant (read: quadrant) is a vector similarity search engine that provides a production-ready service with a convenient API. It's written in Rust, making it fast and memory-efficient, and offers advanced filtering capabilities that set it apart from other vector databases.
Key Features
- High Performance: Built in Rust for speed and efficiency
- Rich Filtering: Advanced filtering capabilities with payload data
- Flexible Deployment: Docker, Kubernetes, or cloud-hosted
- Multiple Clients: Python, JavaScript, Go, and REST API
- Scalable: Distributed deployment support for production workloads
Installation Options
Qdrant offers several installation methods. We'll cover the three most common approaches.
Option 1: Docker (Recommended for Development)
The quickest way to get started is using Docker:
# Pull the latest Qdrant image
docker pull qdrant/qdrant
# Run Qdrant container
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
This command:
- Exposes port 6333 for the REST API
- Exposes port 6334 for the gRPC interface
- Mounts a local directory for data persistence
The -v flag mounts a host directory into the container so your vector data survives container restarts. Always mount a volume for any instance whose data you need to keep.
Option 2: Docker Compose
For a more permanent setup, create a docker-compose.yml file:
version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage:z
    environment:
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__SERVICE__GRPC_PORT=6334
    restart: unless-stopped
Start the service:
docker-compose up -d
Option 3: Qdrant Cloud
For production workloads, consider using Qdrant Cloud:
- Sign up at cloud.qdrant.io
- Create a new cluster
- Get your API key and cluster URL
- Connect using the provided credentials
Use Docker for development and testing, but consider Qdrant Cloud for production to benefit from managed infrastructure, automatic backups, and scaling.
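To connect from the Python client (installed in the next section), a minimal sketch looks like this; the URL and API key below are placeholders for the values shown in your cluster dashboard:

from qdrant_client import QdrantClient

# Placeholder credentials: copy the real values from your cluster dashboard
client = QdrantClient(
    url="https://your-cluster.qdrant.io",
    api_key="your-api-key",
)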
Installing Client Libraries
Now let's install the client library for your preferred programming language.
Python Client
pip install qdrant-client
JavaScript/TypeScript Client
npm install @qdrant/js-client-rest
# or
yarn add @qdrant/js-client-rest
Go Client
go get github.com/qdrant/go-client
Your First Qdrant Collection
Let's create a collection and perform basic operations. We'll use Python for these examples.
Connecting to Qdrant
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
# Connect to local Qdrant instance
client = QdrantClient(host="localhost", port=6333)
# For Qdrant Cloud, use:
# client = QdrantClient(
#     url="https://your-cluster.qdrant.io",
#     api_key="your-api-key",
# )
Creating a Collection
A collection is like a table in traditional databases. Each collection stores vectors of the same dimensionality.
# Create a collection for storing document embeddings
client.create_collection(
    collection_name="my_documents",
    vectors_config=VectorParams(
        size=384,  # Dimension of your vectors
        distance=Distance.COSINE,  # Similarity metric
    ),
)
Qdrant supports several distance metrics:
- COSINE: Best for text embeddings (most common)
- EUCLID: Euclidean distance, often used for image embeddings
- DOT: Dot product, useful for models trained with unnormalized embeddings
Newer Qdrant releases also support MANHATTAN; check the documentation for your version.
Inserting Vectors
Let's add some vectors to our collection:
from qdrant_client.models import PointStruct
import numpy as np
# Generate sample vectors (in production, use actual embeddings)
vectors = [
    np.random.rand(384).tolist() for _ in range(10)
]
# Prepare points with vectors and metadata
points = [
    PointStruct(
        id=idx,
        vector=vector,
        payload={
            "text": f"Document {idx}",
            "category": "tutorial" if idx % 2 == 0 else "guide",
            "created_at": "2025-01-20",
        },
    )
    for idx, vector in enumerate(vectors)
]
# Insert into collection
client.upsert(
    collection_name="my_documents",
    points=points,
)
print("Successfully inserted 10 vectors!")
Searching Vectors
Now for the exciting part: similarity search.
# Create a query vector
query_vector = np.random.rand(384).tolist()
# Search for similar vectors
results = client.search(
    collection_name="my_documents",
    query_vector=query_vector,
    limit=5,  # Return top 5 results
)
# Display results
for result in results:
    print(f"ID: {result.id}, Score: {result.score}")
    print(f"Payload: {result.payload}\n")
Filtering Results
Qdrant's powerful filtering lets you combine vector search with structured filters:
from qdrant_client.models import Filter, FieldCondition, MatchValue
# Search only within "tutorial" category
results = client.search(
    collection_name="my_documents",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="tutorial"),
            )
        ]
    ),
    limit=3,
)
print("Filtered search results:")
for result in results:
    print(f"Category: {result.payload['category']}")
Qdrant applies filters during the vector search itself rather than as a separate post-processing step, so filtered queries stay fast even in large collections. For best performance, create a payload index on fields you filter by frequently, as shown below.
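A minimal sketch of indexing the category field used above:

from qdrant_client.models import PayloadSchemaType

# Index the "category" field so filtered searches stay fast at scale
client.create_payload_index(
    collection_name="my_documents",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD,
)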
Working with Real Embeddings
In practice, you'll use actual embedding models. Here's an example with Sentence Transformers:
from sentence_transformers import SentenceTransformer
# Load an embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Your documents
documents = [
    "Qdrant is a vector database",
    "Vector search enables semantic similarity",
    "Machine learning models create embeddings",
]
# Generate embeddings
embeddings = model.encode(documents)
# Insert into Qdrant
points = [
    PointStruct(
        id=idx,
        vector=embedding.tolist(),
        payload={"text": doc},
    )
    for idx, (doc, embedding) in enumerate(zip(documents, embeddings))
]
client.upsert(
    collection_name="my_documents",
    points=points,
)
# Search with a query
query = "What is a vector database?"
query_embedding = model.encode(query)
results = client.search(
    collection_name="my_documents",
    query_vector=query_embedding.tolist(),
    limit=2,
)

print("\nSearch Results:")
for result in results:
    print(f"Score: {result.score:.4f} - {result.payload['text']}")
Configuration and Optimization
Memory Configuration
Optimize Qdrant for your workload by adjusting memory settings:
# config/production.yaml
storage:
  # Where to store data
  storage_path: ./storage

  # Performance tuning
  performance:
    max_search_threads: 0  # 0 = auto-detect
    max_optimization_threads: 1

service:
  # API settings
  http_port: 6333
  grpc_port: 6334
  # Enable CORS if needed
  enable_cors: true
HNSW Index Parameters
Fine-tune the HNSW (Hierarchical Navigable Small World) index:
from qdrant_client.models import VectorParams, HnswConfigDiff
client.create_collection(
    collection_name="optimized_collection",
    vectors_config=VectorParams(
        size=384,
        distance=Distance.COSINE,
    ),
    hnsw_config=HnswConfigDiff(
        m=16,  # Number of edges per node (higher = better accuracy)
        ef_construct=100,  # Construction-time accuracy (higher = better quality)
    ),
)
Higher m and ef_construct values improve search quality but increase memory usage and indexing time. Start with defaults and tune based on your needs.
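Recall can also be tuned per query at search time via the hnsw_ef search parameter. A sketch, reusing the query_vector from earlier:

from qdrant_client.models import SearchParams

# Higher hnsw_ef = more candidates explored per query (better recall, slower)
results = client.search(
    collection_name="optimized_collection",
    query_vector=query_vector,
    search_params=SearchParams(hnsw_ef=128),
    limit=5,
)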
Monitoring and Management
Check Collection Info
# Get collection information
info = client.get_collection(collection_name="my_documents")
print(f"Vectors count: {info.vectors_count}")
print(f"Indexed vectors: {info.indexed_vectors_count}")
print(f"Points count: {info.points_count}")
REST API Access
Qdrant also provides a REST API for language-agnostic access:
# Get collection info
curl http://localhost:6333/collections/my_documents
# Search vectors
curl -X POST http://localhost:6333/collections/my_documents/points/search \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, ...],
    "limit": 5
  }'
Web UI Dashboard
Qdrant includes a built-in web dashboard. Access it at:
http://localhost:6333/dashboard
The dashboard lets you:
- Browse collections
- Inspect points and payloads
- Run search queries
- Monitor performance metrics
Common Use Cases
Semantic Search
# Build a semantic search engine
def semantic_search(query: str, limit: int = 5):
    query_embedding = model.encode(query)
    results = client.search(
        collection_name="my_documents",
        query_vector=query_embedding.tolist(),
        limit=limit,
    )
    return [
        {
            "text": r.payload["text"],
            "score": r.score,
        }
        for r in results
    ]
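Calling it is then a one-liner:

# Example query against the documents inserted earlier
for hit in semantic_search("how does vector search work?", limit=3):
    print(f"{hit['score']:.4f}  {hit['text']}")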
Recommendation System
# Recommend similar items based on user preferences
def get_recommendations(item_id: int, limit: int = 5):
    # Get the vector for the item (vectors are not returned by default)
    item = client.retrieve(
        collection_name="my_documents",
        ids=[item_id],
        with_vectors=True,
    )[0]

    # Find similar items
    results = client.search(
        collection_name="my_documents",
        query_vector=item.vector,
        limit=limit + 1,  # +1 because the item itself will match
    )

    # Filter out the original item
    return [r for r in results if r.id != item_id][:limit]
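Usage mirrors the earlier search examples:

# Items most similar to point 0, excluding point 0 itself
for rec in get_recommendations(item_id=0, limit=3):
    print(rec.id, rec.score)

Qdrant also ships a dedicated recommendation API (client.recommend) that accepts positive and negative example IDs directly, which saves you from fetching vectors yourself.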
RAG (Retrieval-Augmented Generation)
def rag_query(question: str, llm_client):
    # 1. Retrieve relevant context
    query_embedding = model.encode(question)
    context_results = client.search(
        collection_name="my_documents",
        query_vector=query_embedding.tolist(),
        limit=3,
    )

    # 2. Build context for the LLM
    context = "\n".join(r.payload["text"] for r in context_results)

    # 3. Query the LLM with the context
    prompt = f"""Context: {context}

Question: {question}

Answer based on the context above:"""
    return llm_client.generate(prompt)
Troubleshooting
Connection Issues
If you can't connect to Qdrant:
# Test connection
try:
    collections = client.get_collections()
    print("Connection successful!")
    print(f"Collections: {[c.name for c in collections.collections]}")
except Exception as e:
    print(f"Connection failed: {e}")
    print("Check that Qdrant is running on localhost:6333")
Memory Issues
If you're running out of memory:
- Reduce HNSW parameters (m and ef_construct)
- Use quantization for large collections
- Move vectors and payloads to disk-backed (memory-mapped) storage
For example, to memory-map larger segments:
from qdrant_client.models import OptimizersConfigDiff

client.update_collection(
    collection_name="my_documents",
    optimizers_config=OptimizersConfigDiff(
        memmap_threshold=20000  # Segments above ~20 MB (value is in KB) are memory-mapped
    ),
)
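The quantization suggestion above can also be applied to an existing collection. A sketch using scalar int8 quantization, which roughly quarters the memory spent on stored vectors:

from qdrant_client.models import ScalarQuantization, ScalarQuantizationConfig, ScalarType

# Compress stored vectors to int8
client.update_collection(
    collection_name="my_documents",
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            always_ram=True,  # keep quantized vectors in RAM for speed
        )
    ),
)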
Next Steps
Now that you have Qdrant up and running, here are some next steps:
- Experiment with different embedding models - Try OpenAI embeddings, Cohere, or specialized domain models
- Implement filtering - Combine vector search with metadata filtering for more precise results
- Optimize for production - Tune HNSW parameters, set up monitoring, and implement proper error handling
- Explore advanced features - Look into sharding, replication, and distributed deployment
- Build a real application - Create a semantic search engine, chatbot, or recommendation system
Conclusion
Qdrant makes it straightforward to add vector search capabilities to your applications. With Docker for easy deployment, intuitive client libraries, and powerful filtering, you can quickly move from prototype to production.
The combination of high performance, flexible deployment options, and rich features makes Qdrant an excellent choice for AI applications. Whether you're building semantic search, RAG systems, or recommendation engines, Qdrant provides the foundation you need.