#rag
8 notes
- Jun 23, 2026
Many asymmetric embedding models need task prefixes on inputs, and skipping them quietly degrades relevance. Each model has its own scheme: nomic (search_query:/search_document:), E5 (query:/passage:), BGE (a query instruction sentence, bare docs); they are not interchangeable. OpenAI text-embedding-3-* and all-MiniLM-L6-v2 need none. Whether you have to add the prefix yourself depends on the serving layer, not the model: raw endpoints (llama.cpp /v1/embeddings, HF TEI, Ollama) send bare text so it's on you, while sentence-transformers (prompt_name="query") and vendor SDKs inject it for you. Adding search_query:/search_document: to a nomic-v1.5 call lifted cosine similarity on a real query/doc pair from 0.54 to 0.60 at zero cost.
- Mar 24, 2026
pageindex generates a semantic, tree-like JSON index of a lengthy document, which allows reasoning-based RAG without the need for a vector database.
- Feb 15, 2026
For generating embeddings locally, nomic-embed-text is a long-context text encoder that surpasses OpenAI text-embedding-ada-002 and text-embedding-3-small on short- and long-context tasks. It balances speed, 8k context, and accuracy for English-centric apps. BGE-M3, Qwen3-Embedding and E5-Small are other alternatives.
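When nomic-embed-text is served through a raw endpoint, the task prefixes have to be added by hand. A minimal sketch of that step, assuming the prefix strings are followed by a single space; the `TASK_PREFIXES` table and `with_prefix` helper are hypothetical names, not part of any library, and BGE is left out because its query instruction is a full sentence:

```python
# Illustrative prefix table; the layout and names are assumptions for this sketch.
TASK_PREFIXES = {
    "nomic": {"query": "search_query: ", "document": "search_document: "},
    "e5": {"query": "query: ", "document": "passage: "},
    "none": {"query": "", "document": ""},  # e.g. all-MiniLM-L6-v2
}


def with_prefix(text: str, family: str, role: str) -> str:
    """Prepend the task prefix for `family` and `role` ("query" or "document")."""
    try:
        prefix = TASK_PREFIXES[family][role]
    except KeyError:
        raise ValueError(f"unknown family/role: {family!r}/{role!r}") from None
    # Skip texts that already carry the prefix, to avoid adding it twice.
    if prefix and text.startswith(prefix):
        return text
    return prefix + text


query_input = with_prefix("how do task prefixes work", "nomic", "query")
doc_input = with_prefix("Task prefixes tell the model the input's role.", "nomic", "document")
```

The prefixed strings are what would be sent to the embedding endpoint; with sentence-transformers or a vendor SDK that injects the prefix itself, this step should be skipped.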
- Feb 15, 2026
alibaba/zvec is an embedded vector database that supports both sparse and dense vectors, along with structured data. It can be considered the SQLite of vector databases.
- Feb 15, 2026
yichuan-w/LEANN is a RAG framework focused on efficient storage, with built-in chunking strategies, embedding model management, and an MCP server. gemini
- Feb 5, 2026
Vortex, a newer format supported by DuckDB, promises faster random access, zero-copy metadata, and better compression. Support across the board is still limited, but it is worth considering over Parquet for LLM/RAG use cases.
- Jan 11, 2026
Puzer published a GitHub recommender that uses semantic embeddings of a user's GitHub stars, all client-side. I found some great recommendations which I plan to use:
- pamburus/hl: A fast and powerful log viewer and processor that converts JSON logs or logfmt logs into a clear human-readable format. (⭐2657)
- samwho/spacer: CLI tool to insert spacers when command output stops (⭐1663)
- darrenburns/posting: The modern API client that lives in your terminal. (⭐11134)
- plutov/oq: Terminal OpenAPI Spec viewer (⭐943)
- wey-gu/py-pglite: PGlite wrapper in Python for testing. Test your app with Postgres just as lite as SQLite. (⭐577)
- Dec 19, 2025
VectorChord is a faster pgvector alternative.