Announcement

👇Official Account👇

Welcome to join the group & private message

Article first/tail QR code

Skip to content

PostgreSQL 17 + pgvector 0.8 RAG 实战 2026

TL;DR:pgvector 0.8 在 2026 Q2 发布,HNSW 索引查询性能比 0.7 提升 2 倍,新增 Sparse Vectors 和混合检索(BM25 + 向量)。本文从 schema 设计、索引选型到 Go 集成,给出生产级 RAG 完整方案。

一、pgvector 0.8 重大升级

特性0.70.8
HNSW 构建速度基线2.1x
HNSW 查询 QPS基线2.0x
索引内存100%65%
Sparse Vectors
混合检索✅(BM25 + dense)
量化位数int8, bit+ int4, int2
二进制量化✅(32x 压缩)

二、Schema 设计

2.1 文档块表

sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- 全文检索

CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    source TEXT NOT NULL,             -- 来源:pdf/url/notion
    title TEXT,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE chunks (
    id BIGSERIAL PRIMARY KEY,
    document_id BIGINT REFERENCES documents(id) ON DELETE CASCADE,
    chunk_index INT NOT NULL,
    content TEXT NOT NULL,
    -- 1024 维 dense embedding(OpenAI text-embedding-3-small)
    embedding vector(1024),
    -- sparse embedding(pgvector 0.8 新增,SPLADE v2)
    sparse_embedding sparsevec(30522),
    tsv tsvector GENERATED ALWAYS AS (to_tsvector('chinese', content)) STORED,
    created_at TIMESTAMPTZ DEFAULT now()
);

2.2 HNSW 索引

sql
-- dense 向量 HNSW
CREATE INDEX chunks_embedding_hnsw ON chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- sparse 向量 HNSW
CREATE INDEX chunks_sparse_hnsw ON chunks
USING hnsw (sparse_embedding sparse_ip_ops)
WITH (m = 16, ef_construction = 64);

-- 全文索引(BM25 模拟)
CREATE INDEX chunks_tsv_idx ON chunks USING GIN (tsv);

参数说明:

  • m = 16:每个节点连接 16 个邻居(默认 16)
  • ef_construction = 64:构建时搜索深度(默认 64)
  • ef = 40:查询时搜索深度(运行时可调)

三、写入数据(Go)

3.1 安装依赖

go get github.com/pgvector/pgvector-go go get github.com/jackc/pgx/v5

3.2 写入

go
import (
    github.com/pgvector/pgvector-go
    github.com/jackc/pgx/v5
)

func insertChunk(ctx context.Context, db *pgx.Conn, docID int64, content string, emb []float32) error {
    _, err := db.Exec(ctx, `
        INSERT INTO chunks (document_id, chunk_index, content, embedding)
        VALUES ($1, $2, $3, $4)
    `, docID, 0, content, pgvector.NewVector(emb))
    return err
}

四、检索

4.1 纯向量检索

sql
SET hnsw.ef_search = 100;  -- 查询时精度

SELECT id, content, embedding <=> $1 AS distance
FROM chunks
ORDER BY embedding <=> $1
LIMIT 10;

<=> 是 cosine distance 运算符。

4.2 混合检索(pgvector 0.8 新特性)

sql
WITH vector_search AS (
    SELECT id, content, embedding <=> $1 AS vec_score
    FROM chunks
    ORDER BY embedding <=> $1
    LIMIT 50
),
keyword_search AS (
    SELECT id, content, ts_rank_cd(tsv, plainto_tsquery('chinese', $2)) AS kw_score
    FROM chunks
    WHERE tsv @@ plainto_tsquery('chinese', $2)
    ORDER BY kw_score DESC
    LIMIT 50
),
combined AS (
    SELECT
        id, content,
        0.7 * COALESCE(vec_score, 1.0) + 0.3 * COALESCE(kw_score, 0.0) AS final_score
    FROM vector_search
    FULL OUTER JOIN keyword_search USING (id, content)
)
SELECT * FROM combined
ORDER BY final_score
LIMIT 10;

权重经验值:纯语义查询 vec=0.9/kw=0.1,事实型查询 vec=0.5/kw=0.5。

4.3 重排序(Cross-Encoder)

RAG 经典流程:召回 → 重排 → 生成。

go
import github.com/knights-analytics/[email protected]

func rerank(ctx context.Context, query string, candidates []string) ([]string, error) {
    r := knightsanalytics.NewReranker("your-model-path")
    scores, _ := r.Rank(ctx, query, candidates)
    // 按分数降序
    sort.SliceStable(candidates, func(i, j int) bool {
        return scores[i] > scores[j]
    })
    return candidates[:5], nil
}

五、性能优化

5.1 索引参数调优

场景mef_constructionef_search
高精度(Recall > 0.95)32128200
平衡(Recall ~0.9)1664100
高吞吐(Recall ~0.85)83240

5.2 二进制量化

sql
-- 创建量化后的副本表
CREATE TABLE chunks_quantized (
    id BIGINT,
    embedding binary_vector(1024)  -- 1024 维压缩到 128 字节
);

-- 量化函数(用户自定义)
INSERT INTO chunks_quantized
SELECT id, binary_quantize(embedding) FROM chunks;

32x 压缩比,Recall 损失 < 5%。

5.3 分区表

大表按时间分区:

sql
CREATE TABLE chunks (...) PARTITION BY RANGE (created_at);
CREATE TABLE chunks_2026q2 PARTITION OF chunks
    FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');

六、与专业向量数据库对比

维度pgvector 0.8PineconeMilvus
QPS(百万级)5,00050,00030,000
部署复杂度低(已有 PG)中(SaaS)
事务支持
混合检索⚠️
成本PG 实例成本$0.096/小时起自建

选型建议

  • 100 万向量以下:pgvector(复用 PG 栈)
  • 千万级:Milvus 集群
  • 实时性要求极高:Pinecone/Qdrant

七、生产部署 Checklist

  1. PG 17 + pgvector 0.8 升级
  2. shared_buffers = RAM 25%
  3. work_mem = RAM 5% / max_connections
  4. hnsw.ef_search 写入连接池默认值
  5. 监控 pg_stat_user_indexes 看索引使用
  6. 定期 VACUUM ANALYZE chunks
  7. 备份:pg_dump + WAL 归档

八、参考

  • pgvector 0.8 Release Notes
  • PostgreSQL 17 性能白皮书
  • NVIDIA RAG with pgvector 教程

系列导航:Claude Sonnet 4.6 → 本篇 → AIOps 实战

上次更新于: