Getting Started with Qdrant: A Complete Setup Guide

Learn how to set up Qdrant vector database from scratch. This step-by-step guide covers installation, configuration, and your first vector operations.

Alex Thompson

Qdrant is a powerful open-source vector database designed for similarity search and AI applications. Whether you're building a semantic search engine, recommendation system, or RAG (Retrieval-Augmented Generation) application, Qdrant provides the foundation you need. In this guide, we'll walk through setting up Qdrant and performing your first vector operations.

What is Qdrant?

Qdrant (read: quadrant) is a vector similarity search engine that provides a production-ready service with a convenient API. It's written in Rust, making it fast and memory-efficient, and offers advanced filtering capabilities that set it apart from other vector databases.

Key Features

  • High Performance: Built in Rust for speed and efficiency
  • Rich Filtering: Advanced filtering capabilities with payload data
  • Flexible Deployment: Docker, Kubernetes, or cloud-hosted
  • Multiple Clients: Python, JavaScript, Go, and REST API
  • Scalable: Distributed deployment support for production workloads

Installation Options

Qdrant offers several installation methods. We'll cover the three most common approaches.

Option 1: Docker (Recommended for Development)

The quickest way to get started is using Docker:

# Pull the latest Qdrant image
docker pull qdrant/qdrant

# Run Qdrant container
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant

This command:

  • Exposes port 6333 for the REST API
  • Exposes port 6334 for the gRPC interface
  • Mounts a local directory for data persistence

Data Persistence

The -v flag mounts a host directory into the container, so your vector data survives even if the container is removed or recreated. Make sure to use a persistent volume in production!

Option 2: Docker Compose

For a more permanent setup, create a docker-compose.yml file:

version: '3.8'

services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage:z
    environment:
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__SERVICE__GRPC_PORT=6334
    restart: unless-stopped

Start the service:

docker-compose up -d

Option 3: Qdrant Cloud

For production workloads, consider using Qdrant Cloud:

  1. Sign up at cloud.qdrant.io
  2. Create a new cluster
  3. Get your API key and cluster URL
  4. Connect using the provided credentials

Best Practice

Use Docker for development and testing, but consider Qdrant Cloud for production to benefit from managed infrastructure, automatic backups, and scaling.

Installing Client Libraries

Now let's install the client library for your preferred programming language.

Python Client

pip install qdrant-client

JavaScript/TypeScript Client

npm install @qdrant/js-client-rest
# or
yarn add @qdrant/js-client-rest

Go Client

go get github.com/qdrant/go-client

Your First Qdrant Collection

Let's create a collection and perform basic operations. We'll use Python for these examples.

Connecting to Qdrant

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

# Connect to local Qdrant instance
client = QdrantClient(host="localhost", port=6333)

# For Qdrant Cloud, use:
# client = QdrantClient(
#     url="https://your-cluster.qdrant.io",
#     api_key="your-api-key"
# )

Creating a Collection

A collection is like a table in traditional databases. Each collection stores vectors of the same dimensionality.

# Create a collection for storing document embeddings
client.create_collection(
    collection_name="my_documents",
    vectors_config=VectorParams(
        size=384,  # Dimension of your vectors
        distance=Distance.COSINE  # Similarity metric
    )
)

Choosing Distance Metrics

Qdrant supports four distance metrics:

  • COSINE: Cosine similarity; the standard choice for text embeddings
  • EUCLID: Euclidean distance; often used for image embeddings
  • DOT: Dot product; appropriate when the embedding model was trained with dot-product similarity (equivalent to cosine for normalized vectors)
  • MANHATTAN: Manhattan (L1) distance, supported in recent Qdrant versions
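
If you choose DOT, it's common to L2-normalize embeddings first so the dot product behaves like cosine similarity. A quick NumPy sketch using the 384-dimension setup from above:

import numpy as np

# L2-normalize a vector; the dot product of two unit vectors
# equals their cosine similarity
vector = np.random.rand(384)
normalized = vector / np.linalg.norm(vector)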

Inserting Vectors

Let's add some vectors to our collection:

from qdrant_client.models import PointStruct
import numpy as np

# Generate sample vectors (in production, use actual embeddings)
vectors = [
    np.random.rand(384).tolist() for _ in range(10)
]

# Prepare points with vectors and metadata
points = [
    PointStruct(
        id=idx,
        vector=vector,
        payload={
            "text": f"Document {idx}",
            "category": "tutorial" if idx % 2 == 0 else "guide",
            "created_at": "2025-01-20"
        }
    )
    for idx, vector in enumerate(vectors)
]

# Insert into collection
client.upsert(
    collection_name="my_documents",
    points=points
)

print("Successfully inserted 10 vectors!")

Searching Vectors

Now for the exciting part, similarity search:

# Create a query vector
query_vector = np.random.rand(384).tolist()

# Search for similar vectors
results = client.search(
    collection_name="my_documents",
    query_vector=query_vector,
    limit=5  # Return top 5 results
)

# Display results
for result in results:
    print(f"ID: {result.id}, Score: {result.score}")
    print(f"Payload: {result.payload}\n")

Filtering Results

Qdrant's powerful filtering lets you combine vector search with structured filters:

from qdrant_client.models import Filter, FieldCondition, MatchValue

# Search only within "tutorial" category
results = client.search(
    collection_name="my_documents",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="tutorial")
            )
        ]
    ),
    limit=3
)

print("Filtered search results:")
for result in results:
    print(f"Category: {result.payload['category']}")

Pro Tip

Filters in Qdrant are applied during the vector search itself rather than as a post-processing step, so they genuinely narrow the search space. For large collections, add a payload index on the fields you filter by (see the sketch below) to keep filtered searches fast.
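
A minimal sketch of indexing the category field used in the examples above:

from qdrant_client.models import PayloadSchemaType

# Index the "category" payload field so filtered searches stay fast
client.create_payload_index(
    collection_name="my_documents",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD
)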

Working with Real Embeddings

In practice, you'll use actual embedding models. Here's an example with Sentence Transformers:

from sentence_transformers import SentenceTransformer

# Load an embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Your documents
documents = [
    "Qdrant is a vector database",
    "Vector search enables semantic similarity",
    "Machine learning models create embeddings"
]

# Generate embeddings
embeddings = model.encode(documents)

# Insert into Qdrant
points = [
    PointStruct(
        id=idx,
        vector=embedding.tolist(),
        payload={"text": doc}
    )
    for idx, (doc, embedding) in enumerate(zip(documents, embeddings))
]

client.upsert(
    collection_name="my_documents",
    points=points
)

# Search with a query
query = "What is a vector database?"
query_embedding = model.encode(query)

results = client.search(
    collection_name="my_documents",
    query_vector=query_embedding.tolist(),
    limit=2
)

print("\nSearch Results:")
for result in results:
    print(f"Score: {result.score:.4f} - {result.payload['text']}")

Configuration and Optimization

Memory Configuration

Optimize Qdrant for your workload by adjusting memory settings:

# config/production.yaml
storage:
  # Where to store data
  storage_path: ./storage

  # Performance tuning
  performance:
    max_search_threads: 0  # 0 = auto-detect
    max_optimization_threads: 1

service:
  # API settings
  http_port: 6333
  grpc_port: 6334
  
  # Enable CORS if needed
  enable_cors: true

HNSW Index Parameters

Fine-tune the HNSW (Hierarchical Navigable Small World) index:

from qdrant_client.models import VectorParams, HnswConfigDiff

client.create_collection(
    collection_name="optimized_collection",
    vectors_config=VectorParams(
        size=384,
        distance=Distance.COSINE
    ),
    hnsw_config=HnswConfigDiff(
        m=16,  # Number of edges per node (higher = better accuracy)
        ef_construct=100,  # Construction time accuracy (higher = better quality)
    )
)

Performance Trade-offs

Higher m and ef_construct values improve search quality but increase memory usage and indexing time. Start with defaults and tune based on your needs.
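
Search-time accuracy can also be tuned per query with hnsw_ef, without rebuilding the index. A sketch, assuming a query_vector as in the earlier examples:

from qdrant_client.models import SearchParams

# Trade latency for accuracy at query time
results = client.search(
    collection_name="optimized_collection",
    query_vector=query_vector,
    limit=5,
    search_params=SearchParams(hnsw_ef=128)  # higher = more accurate, slower
)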

Monitoring and Management

Check Collection Info

# Get collection information
info = client.get_collection(collection_name="my_documents")

print(f"Vectors count: {info.vectors_count}")
print(f"Indexed vectors: {info.indexed_vectors_count}")
print(f"Points count: {info.points_count}")

REST API Access

Qdrant also provides a REST API for language-agnostic access:

# Get collection info
curl http://localhost:6333/collections/my_documents

# Search vectors
curl -X POST http://localhost:6333/collections/my_documents/points/search \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, ...],
    "limit": 5
  }'

Web UI Dashboard

Qdrant includes a built-in web dashboard. Access it at:

http://localhost:6333/dashboard

The dashboard lets you:

  • Browse collections
  • Inspect points and payloads
  • Run search queries
  • Monitor performance metrics

Common Use Cases

Semantic Search

# Build a semantic search engine
def semantic_search(query: str, limit: int = 5):
    query_embedding = model.encode(query)
    
    results = client.search(
        collection_name="my_documents",
        query_vector=query_embedding.tolist(),
        limit=limit
    )
    
    return [
        {
            "text": r.payload["text"],
            "score": r.score
        }
        for r in results
    ]

Recommendation System

# Recommend similar items based on user preferences
def get_recommendations(item_id: int, limit: int = 5):
    # Get the vector for the item
    item = client.retrieve(
        collection_name="my_documents",
        ids=[item_id],
        with_vectors=True  # vectors are not returned by default
    )[0]
    
    # Find similar items
    results = client.search(
        collection_name="my_documents",
        query_vector=item.vector,
        limit=limit + 1  # +1 because the item itself will match first
    )
    
    # Filter out the original item
    return [r for r in results if r.id != item_id][:limit]
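
Qdrant also has a built-in recommendation endpoint that handles the exclusion for you: IDs passed as positive examples steer the search and are filtered from the results automatically. A sketch with item_id as the positive example:

# Native recommendation; the positive IDs are excluded from results
results = client.recommend(
    collection_name="my_documents",
    positive=[item_id],
    limit=5
)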

RAG (Retrieval-Augmented Generation)

def rag_query(question: str, llm_client):
    # 1. Retrieve relevant context
    query_embedding = model.encode(question)
    context_results = client.search(
        collection_name="my_documents",
        query_vector=query_embedding.tolist(),
        limit=3
    )
    
    # 2. Build context for LLM
    context = "\n".join([r.payload["text"] for r in context_results])
    
    # 3. Query LLM with context
    prompt = f"""Context: {context}
    
Question: {question}

Answer based on the context above:"""
    
    return llm_client.generate(prompt)

Troubleshooting

Connection Issues

If you can't connect to Qdrant:

# Test connection
try:
    collections = client.get_collections()
    print("Connection successful!")
    print(f"Collections: {[c.name for c in collections.collections]}")
except Exception as e:
    print(f"Connection failed: {e}")
    print("Check that Qdrant is running on localhost:6333")

Memory Issues

If you're running out of memory:

  1. Reduce HNSW parameters (m and ef_construct)
  2. Use quantization for large collections
  3. Move vectors and payloads to on-disk storage

For example, you can tell Qdrant to memory-map large segments to disk instead of keeping them fully in RAM:

from qdrant_client.models import OptimizersConfigDiff

client.update_collection(
    collection_name="my_documents",
    optimizers_config=OptimizersConfigDiff(
        memmap_threshold=20000  # memmap segments larger than 20k vectors
    )
)
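
For the quantization option, scalar quantization stores vectors as int8, cutting vector memory roughly fourfold at a small accuracy cost. A minimal sketch:

from qdrant_client.models import (
    ScalarQuantization,
    ScalarQuantizationConfig,
    ScalarType,
)

# Quantize stored vectors to int8 to reduce memory footprint
client.update_collection(
    collection_name="my_documents",
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            always_ram=True  # keep quantized vectors in RAM for speed
        )
    )
)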

Next Steps

Now that you have Qdrant up and running, here are some next steps:

  1. Experiment with different embedding models - Try OpenAI embeddings, Cohere, or specialized domain models
  2. Implement filtering - Combine vector search with metadata filtering for more precise results
  3. Optimize for production - Tune HNSW parameters, set up monitoring, and implement proper error handling
  4. Explore advanced features - Look into sharding, replication, and distributed deployment
  5. Build a real application - Create a semantic search engine, chatbot, or recommendation system

Conclusion

Qdrant makes it straightforward to add vector search capabilities to your applications. With Docker for easy deployment, intuitive client libraries, and powerful filtering, you can quickly move from prototype to production.

The combination of high performance, flexible deployment options, and rich features makes Qdrant an excellent choice for AI applications. Whether you're building semantic search, RAG systems, or recommendation engines, Qdrant provides the foundation you need.
