PostgreSQL 17 + pgvector 0.8 RAG 实战 2026
TL;DR:pgvector 0.8 在 2026 Q2 发布,HNSW 索引查询性能比 0.7 提升 2 倍,新增 Sparse Vectors 和混合检索(BM25 + 向量)。本文从 schema 设计、索引选型到 Go 集成,给出生产级 RAG 完整方案。
一、pgvector 0.8 重大升级
| 特性 | 0.7 | 0.8 |
|---|---|---|
| HNSW 构建速度 | 基线 | 2.1x |
| HNSW 查询 QPS | 基线 | 2.0x |
| 索引内存 | 100% | 65% |
| Sparse Vectors | ❌ | ✅ |
| 混合检索 | ❌ | ✅(BM25 + dense) |
| 量化位数 | int8, bit | + int4, int2 |
| 二进制量化 | ❌ | ✅(32x 压缩) |
二、Schema 设计
2.1 文档块表
sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- 全文检索
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
source TEXT NOT NULL, -- 来源:pdf/url/notion
title TEXT,
content TEXT NOT NULL,
metadata JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE chunks (
id BIGSERIAL PRIMARY KEY,
document_id BIGINT REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INT NOT NULL,
content TEXT NOT NULL,
-- 1024 维 dense embedding(OpenAI text-embedding-3-small)
embedding vector(1024),
-- sparse embedding(pgvector 0.8 新增,SPLADE v2)
sparse_embedding sparsevec(30522),
tsv tsvector GENERATED ALWAYS AS (to_tsvector('chinese', content)) STORED,
created_at TIMESTAMPTZ DEFAULT now()
);2.2 HNSW 索引
sql
-- dense 向量 HNSW
CREATE INDEX chunks_embedding_hnsw ON chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- sparse 向量 HNSW
CREATE INDEX chunks_sparse_hnsw ON chunks
USING hnsw (sparse_embedding sparse_ip_ops)
WITH (m = 16, ef_construction = 64);
-- 全文索引(BM25 模拟)
CREATE INDEX chunks_tsv_idx ON chunks USING GIN (tsv);参数说明:
- m = 16:每个节点连接 16 个邻居(默认 16)
- ef_construction = 64:构建时搜索深度(默认 64)
- ef = 40:查询时搜索深度(运行时可调)
三、写入数据(Go)
3.1 安装依赖
go get github.com/pgvector/pgvector-go go get github.com/jackc/pgx/v5
3.2 写入
go
import (
github.com/pgvector/pgvector-go
github.com/jackc/pgx/v5
)
func insertChunk(ctx context.Context, db *pgx.Conn, docID int64, content string, emb []float32) error {
_, err := db.Exec(ctx, `
INSERT INTO chunks (document_id, chunk_index, content, embedding)
VALUES ($1, $2, $3, $4)
`, docID, 0, content, pgvector.NewVector(emb))
return err
}四、检索
4.1 纯向量检索
sql
SET hnsw.ef_search = 100; -- 查询时精度
SELECT id, content, embedding <=> $1 AS distance
FROM chunks
ORDER BY embedding <=> $1
LIMIT 10;<=> 是 cosine distance 运算符。
4.2 混合检索(pgvector 0.8 新特性)
sql
WITH vector_search AS (
SELECT id, content, embedding <=> $1 AS vec_score
FROM chunks
ORDER BY embedding <=> $1
LIMIT 50
),
keyword_search AS (
SELECT id, content, ts_rank_cd(tsv, plainto_tsquery('chinese', $2)) AS kw_score
FROM chunks
WHERE tsv @@ plainto_tsquery('chinese', $2)
ORDER BY kw_score DESC
LIMIT 50
),
combined AS (
SELECT
id, content,
0.7 * COALESCE(vec_score, 1.0) + 0.3 * COALESCE(kw_score, 0.0) AS final_score
FROM vector_search
FULL OUTER JOIN keyword_search USING (id, content)
)
SELECT * FROM combined
ORDER BY final_score
LIMIT 10;权重经验值:纯语义查询 vec=0.9/kw=0.1,事实型查询 vec=0.5/kw=0.5。
4.3 重排序(Cross-Encoder)
RAG 经典流程:召回 → 重排 → 生成。
go
import github.com/knights-analytics/[email protected]
func rerank(ctx context.Context, query string, candidates []string) ([]string, error) {
r := knightsanalytics.NewReranker("your-model-path")
scores, _ := r.Rank(ctx, query, candidates)
// 按分数降序
sort.SliceStable(candidates, func(i, j int) bool {
return scores[i] > scores[j]
})
return candidates[:5], nil
}五、性能优化
5.1 索引参数调优
| 场景 | m | ef_construction | ef_search |
|---|---|---|---|
| 高精度(Recall > 0.95) | 32 | 128 | 200 |
| 平衡(Recall ~0.9) | 16 | 64 | 100 |
| 高吞吐(Recall ~0.85) | 8 | 32 | 40 |
5.2 二进制量化
sql
-- 创建量化后的副本表
CREATE TABLE chunks_quantized (
id BIGINT,
embedding binary_vector(1024) -- 1024 维压缩到 128 字节
);
-- 量化函数(用户自定义)
INSERT INTO chunks_quantized
SELECT id, binary_quantize(embedding) FROM chunks;32x 压缩比,Recall 损失 < 5%。
5.3 分区表
大表按时间分区:
sql
CREATE TABLE chunks (...) PARTITION BY RANGE (created_at);
CREATE TABLE chunks_2026q2 PARTITION OF chunks
FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');六、与专业向量数据库对比
| 维度 | pgvector 0.8 | Pinecone | Milvus |
|---|---|---|---|
| QPS(百万级) | 5,000 | 50,000 | 30,000 |
| 部署复杂度 | 低(已有 PG) | 中(SaaS) | 高 |
| 事务支持 | ✅ | ❌ | ❌ |
| 混合检索 | ✅ | ❌ | ⚠️ |
| 成本 | PG 实例成本 | $0.096/小时起 | 自建 |
选型建议:
- 100 万向量以下:pgvector(复用 PG 栈)
- 千万级:Milvus 集群
- 实时性要求极高:Pinecone/Qdrant
七、生产部署 Checklist
- PG 17 + pgvector 0.8 升级
- shared_buffers = RAM 25%
- work_mem = RAM 5% / max_connections
- hnsw.ef_search 写入连接池默认值
- 监控 pg_stat_user_indexes 看索引使用
- 定期 VACUUM ANALYZE chunks
- 备份:pg_dump + WAL 归档
八、参考
- pgvector 0.8 Release Notes
- PostgreSQL 17 性能白皮书
- NVIDIA RAG with pgvector 教程
系列导航:Claude Sonnet 4.6 → 本篇 → AIOps 实战

