Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Or you can use Postgres to store the original text, the metadata and the embedding in a single table and then do hybrid queries against them (with pre- or post-filtering, optimized automatically).

Shameless plug:

BM25 search implemented in PL/pgSQL: https://github.com/jankovicsandras/plpgsql_bm25

faster BM25 search algorithms in Python: https://github.com/jankovicsandras/bm25opt



BM25 implemented in Postgres as a Postgres extension: https://github.com/paradedb/paradedb (disclaimer: I work for ParadeDB)


Yeah when I implemented a RAG myself I wondered why people were storing the text separately, it doesn't make any sense to me!

It's not that "vector databases are the wrong abstraction", it's that "vector data is not an abstraction at all". It's just a data type with some operators, you are responsible for architecting that tool into your system in a coherent way.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: