#databases
11 notes
- Mar 24, 2026
pageindex generates a semantic, tree-like JSON index of a lengthy document, enabling reasoning-based RAG without a vector database.
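To make the idea concrete, here is a minimal sketch of what a tree-style index and a top-down lookup could look like. The node fields (`title`, `summary`, `pages`, `children`), the sample document, and the keyword-overlap scorer are all assumptions for illustration, not pageindex's actual schema or API; in a real setup an LLM would presumably decide which branch to follow at each level.

```python
import json

# Hypothetical index: field names are illustrative, not pageindex's real schema.
INDEX = {
    "title": "Annual Report",
    "summary": "company financial results and outlook",
    "pages": [1, 120],
    "children": [
        {"title": "Financial Statements",
         "summary": "revenue income balance sheet cash flow",
         "pages": [10, 60],
         "children": [
             {"title": "Revenue", "summary": "revenue by segment and region",
              "pages": [12, 20], "children": []},
             {"title": "Cash Flow", "summary": "operating investing cash flow",
              "pages": [40, 48], "children": []},
         ]},
        {"title": "Risk Factors",
         "summary": "market regulatory and operational risk",
         "pages": [61, 90], "children": []},
    ],
}

def score(query, node):
    """Stand-in for an LLM relevance judgment: count shared words."""
    words = set(query.lower().split())
    text = (node["title"] + " " + node["summary"]).lower().split()
    return len(words & set(text))

def locate(query, node):
    """Walk down the tree, following the best-scoring child at each level."""
    path = [node["title"]]
    while node["children"]:
        best = max(node["children"], key=lambda child: score(query, child))
        if score(query, best) == 0:
            break  # no child looks relevant; stop at the current section
        node = best
        path.append(node["title"])
    return path, node["pages"]

path, pages = locate("revenue by region", INDEX)
print(json.dumps({"path": path, "pages": pages}))
```

The point of the sketch is that retrieval becomes a walk over section summaries rather than a nearest-neighbour search over embeddings, so no vector store is involved.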
- Feb 19, 2026
slingdata-io/sling-cli is a promising tool to move/sync data between databases and files. It is especially helpful for local testing and CI/CD, and it can also apply stage- or SQL-based transformations.
- Feb 15, 2026
alibaba/zvec is an embedded vector database from Alibaba that supports both sparse and dense vectors, along with structured data. It can be considered the SQLite of vector databases.
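For reference, the difference between the two vector kinds can be sketched in plain Python. This is not zvec's API; the function names and sample values are made up for illustration, and a real engine would use indexed, vectorized implementations.

```python
import math

def dense_cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sparse_dot(a, b):
    """Dot product of sparse vectors stored as {dimension: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[dim] for dim, w in a.items() if dim in b)

# Dense: every dimension holds a value (e.g. an embedding).
doc_dense = [0.1, 0.7, 0.2]
query_dense = [0.1, 0.7, 0.2]

# Sparse: only non-zero dimensions are stored (e.g. term weights).
doc_sparse = {3: 1.5, 42: 0.5, 1007: 2.0}
query_sparse = {42: 2.0, 1007: 1.0, 99999: 4.0}

print(round(dense_cosine(doc_dense, query_dense), 6))
print(sparse_dot(doc_sparse, query_sparse))  # 0.5*2.0 + 2.0*1.0
```

Dense vectors suit semantic similarity, while sparse vectors suit keyword-style matching; supporting both in one store is what makes hybrid search possible.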
- Feb 13, 2026
zoocache is a semantic dependency-based cache manager that supports in-memory, LMDB, or Redis backends. The Django integration looks interesting.
- Feb 5, 2026
Vortex, a newer file format supported by DuckDB, promises faster random access, zero-copy metadata, and better compression. Support across the ecosystem is still limited, but it is worth considering over Parquet for LLM/RAG use cases.
- Feb 5, 2026
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann manages to be both a rigorous technical manual and something approaching a philosophical treatise on the nature of truth, consistency, and trust in distributed systems. [Claude][https://claude.ai/chat/0f58f2f6-bd56-41a1-a785-8267afa5a3d1]
- Foundation. He starts by asking "What do we actually want in our data systems?" and answers: reliability, scalability, and maintainability.
- Reliability. The question isn't whether failures will happen (they will) but whether the system can survive them.
- Scalability. It's not a binary question of whether a system is "scalable" or "not scalable". Ask: "What happens when a specific load parameter increases?"
- Maintainability. The most underappreciated of the three; he argues the majority of cost isn't in initial development but in ongoing maintenance.
- The data model wars - the skill isn't in choosing the "best" database but in understanding which tradeoffs matter for your specific problem.
- Storage engines. Two major approaches to read & write data from disk, neither is universally better.
- Log-structured storage (like LSM-trees) optimizes for writes: every write is appended and then merged/compacted later.
- Page-oriented storage (like B-trees) optimizes for reads: data is stored in fixed-size blocks that are updated in place, much like a filing cabinet in which each document has a designated slot.
- Instead of asking "how do I build this?" ask "what does it mean for this to work correctly?"
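The log-structured idea from the storage engines note can be sketched as a toy store. This is a simplified illustration, not the book's code: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "segments" are assumptions, and real engines add a write-ahead log, on-disk files, binary search, and Bloom filters.

```python
class TinyLSM:
    """Toy log-structured store: writes go to a memtable, which is flushed
    to immutable sorted segments; compaction merges segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value  # never rewrites an existing segment
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):  # newest segment first
            for k, v in segment:  # linear scan for brevity
                if k == key:
                    return v
        return None

    def compact(self):
        merged = {}
        for segment in self.segments:  # oldest first, so newer values win
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable full, flushed as the first segment
db.put("a", 10)
db.put("c", 3)   # flushed as the second segment
print(db.get("a"), len(db.segments))
db.compact()
print(db.segments)
```

Writes stay cheap because they only append, while a read may have to check the memtable and several segments. That is the tradeoff against the update-in-place approach of B-trees.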
- Jan 12, 2026
Distributed Key-value store -- https://etcd.io/
- Jan 11, 2026
Puzer published a GitHub recommender that uses semantic embeddings of a user's GitHub stars, all client-side. I found some great recommendations which I plan to use:
- pamburus/hl: A fast and powerful log viewer and processor that converts JSON logs or logfmt logs into a clear human-readable format. (⭐2657)
- samwho/spacer: CLI tool to insert spacers when command output stops (⭐1663)
- darrenburns/posting: The modern API client that lives in your terminal. (⭐11134)
- plutov/oq: Terminal OpenAPI Spec viewer (⭐943)
- wey-gu/py-pglite: PGlite wrapper in Python for testing. Test your app with Postgres just as lite as SQLite. (⭐577)
- Jan 7, 2026
Django Postgres Migration Tools - an add-on for safer and more scalable migrations in Django.
- Dec 22, 2025
Notes from Thoughtworks - Technology Radar vol 33
- text-to-SQL solutions aren't working as expected
- pnpm, LangGraph, and Pydantic recommended for adoption
- Dec 19, 2025
VectorChord is a faster pgvector alternative.