Content provided by Nicolay Gerold. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Nicolay Gerold or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://uk.player.fm/legal.

Local-First Search: How to Push Search To End-Devices | S2 E22

53:09
 
Alex Garcia is a developer focused on making vector search accessible and practical. As he puts it: "I'm a SQLite guy. I use SQLite for a lot of projects... I want an easier vector search thing that I don't have to install 10,000 dependencies to use."

Core Mantra: "Simple, Local, Scalable"

Why sqlite-vec?

"I didn't go along thinking, 'Oh, I want to build vector search, let me find a database for it.' It was much more like: I use SQLite for a lot of projects, I want something lightweight that works in my current workflow."

sqlite-vec uses row-oriented storage with some key design choices:

  • Vectors are stored in large chunks (megabytes) as blobs
  • Data is split across 4KB SQLite pages, which affects analytical performance
  • Currently uses brute force linear search without ANN indexing
  • Supports binary quantization for 32x size reduction
  • Handles tens to hundreds of thousands of vectors efficiently
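The design above can be sketched in plain Python: the snippet below (a minimal illustration, not sqlite-vec's actual implementation) stores float32 vectors as packed blobs in SQLite and answers queries with an exact, brute-force linear scan, which is the search strategy described here.

```python
# Sketch: float32 vectors stored as blobs in SQLite, searched with an
# exact brute-force L2 scan. Illustrative only; names are invented.
import sqlite3
import struct
import math

def pack(vec):
    """Serialize a list of floats into a compact float32 blob."""
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    """Decode a float32 blob back into a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, embedding BLOB)")
vectors = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [1.0, 1.0]}
con.executemany("INSERT INTO items VALUES (?, ?)",
                [(i, pack(v)) for i, v in vectors.items()])

def knn(query, k):
    """Exact linear scan: O(n * d), workable up to a few hundred thousand rows."""
    scored = []
    for row_id, blob in con.execute("SELECT id, embedding FROM items"):
        dist = math.dist(query, unpack(blob))
        scored.append((dist, row_id))
    return sorted(scored)[:k]

print(knn([0.0, 0.0], 2))  # nearest ids: 1, then 3
```

With no index to maintain, writes stay cheap; the cost is that every query touches every row, which is why the practical limits below matter.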

Practical limits:

  • 500ms search time for 500K vectors (768 dimensions)
  • Best performance under 100ms for user experience
  • Binary quantization enables scaling to ~1M vectors
  • Metadata filtering and partitioning coming soon

Key advantages:

  • Fast writes for transactional workloads
  • Simple single-file database
  • Easy integration with existing SQLite applications
  • Leverages SQLite's mature storage engine

Garcia's preferred tools for local AI:

  • Sentence Transformers models converted to GGUF format
  • Llama.cpp for inference
  • Small models (30MB) for basic embeddings
  • Larger models like Arctic Embed (hundreds of MB) for recent topics
  • sqlite-lembed extension for text embeddings
  • Transformers.js for browser-based implementations

1. Choose Your Storage

"There's two ways of storing vectors within sqlite-vec. One way is a manual way where you just store a JSON array... [second is] using a virtual table."

  • Traditional row storage: Simple, flexible, good for small vectors
  • Virtual table storage: Optimized chunks, better for large datasets
  • Performance sweet spot: Up to 500K vectors with 500ms search time
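The "manual" path Garcia mentions needs nothing beyond SQLite itself. A minimal sketch (schema and names are illustrative): keep each embedding as a JSON array in a TEXT column and decode rows in application code. Simple and flexible, but every query pays the JSON-decode cost.

```python
# Sketch of manual vector storage: one JSON array per row in a TEXT
# column, scored in application code. Illustrative schema.
import sqlite3
import json
import math

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)")
rows = [
    (1, "intro to sqlite", [0.9, 0.1]),
    (2, "vector search basics", [0.2, 0.8]),
]
con.executemany("INSERT INTO docs VALUES (?, ?, ?)",
                [(i, b, json.dumps(v)) for i, b, v in rows])

def nearest(query):
    """Return the id of the row whose embedding is closest to query."""
    best_id, best_dist = None, math.inf
    for doc_id, emb_json in con.execute("SELECT id, embedding FROM docs"):
        dist = math.dist(query, json.loads(emb_json))
        if dist < best_dist:
            best_id, best_dist = doc_id, dist
    return best_id

print(nearest([0.0, 1.0]))  # 2
```

The virtual-table path trades this flexibility for chunked, binary storage, which is why it holds up better as datasets grow.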

2. Optimize Performance

"With binary quantization it's 1/32 of the space... and holds up at 95 percent quality"

  • Binary quantization reduces storage 32x with 95% quality
  • Default page size is 4KB - plan your vector storage accordingly
  • Metadata filtering dramatically improves search speed
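The 32x figure follows from keeping one bit per dimension instead of a 32-bit float: a 768-dimension float32 vector (3,072 bytes) shrinks to 96 bytes. A hedged sketch of the idea, not sqlite-vec's internal code: quantize by sign and compare with Hamming distance.

```python
# Sketch of binary quantization: keep only the sign bit of each
# dimension, then compare packed vectors by Hamming distance.
def quantize(vec):
    """Pack the sign bits of vec into bytes (1 = positive dimension)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits.to_bytes((len(vec) + 7) // 8, "big")

def hamming(a, b):
    """Count differing bits between two packed bit-vectors."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")

q = quantize([0.3, -0.7, 0.1, 0.9, -0.2, 0.4, -0.6, 0.8])
d = quantize([0.2, -0.5, -0.3, 0.7, -0.1, 0.5, -0.4, 0.6])
print(len(q), hamming(q, d))  # 8 dims fit in 1 byte; these differ in 1 bit
```

A common refinement is to use the quantized vectors for a fast first pass, then re-rank the top candidates with the full-precision vectors to recover most of the lost quality.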

3. Integration Patterns

"It's a single file, right? So you can like copy and paste it if you want to make a backup."

  • Two storage approaches: manual columns or virtual tables
  • Easy backups: single file database
  • Cross-platform: desktop, mobile, IoT, browser (via WASM)
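The single-file property makes the backup quote literal: copy the file, or, if the database may be in use, take a consistent snapshot with SQLite's online backup API. A minimal sketch using Python's stdlib `sqlite3` support (filenames are illustrative):

```python
# Sketch: back up a live SQLite database with the online backup API,
# exposed in Python as Connection.backup(). Destination name is made up.
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, txt TEXT)")
src.execute("INSERT INTO notes VALUES (1, 'hello')")
src.commit()

dst = sqlite3.connect(":memory:")  # in practice: sqlite3.connect("backup.db")
src.backup(dst)                    # consistent snapshot of every table

print(dst.execute("SELECT txt FROM notes").fetchone()[0])  # hello
```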

4. Real-World Tips

"I typically choose the really small model... it's 30 megabytes. It quantizes very easily... I like it because it's very small, quick and easy."

  • Start with smaller, efficient models (30MB range)
  • Use binary quantization before trying complex solutions
  • Plan for partitioning when scaling beyond 100K vectors
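One way to read the partitioning tip: shard vectors by a metadata key (tenant, user, collection) so each query scans only its own slice of the corpus rather than everything. A hedged sketch with invented names, reusing the manual JSON storage from earlier:

```python
# Sketch of metadata partitioning: one table per tenant, so a query's
# brute-force scan only touches that tenant's vectors. Names invented.
import sqlite3
import json
import math

con = sqlite3.connect(":memory:")

def table_for(tenant):
    # Real code must validate tenant names before interpolating into SQL.
    name = f"vecs_{tenant}"
    con.execute(f"CREATE TABLE IF NOT EXISTS {name} "
                "(id INTEGER PRIMARY KEY, embedding TEXT)")
    return name

def insert(tenant, row_id, vec):
    con.execute(f"INSERT INTO {table_for(tenant)} VALUES (?, ?)",
                (row_id, json.dumps(vec)))

def search(tenant, query):
    """Brute-force nearest neighbor, scoped to one tenant's partition."""
    rows = con.execute(f"SELECT id, embedding FROM {table_for(tenant)}")
    return min(rows, key=lambda r: math.dist(query, json.loads(r[1])))[0]

insert("acme", 1, [1.0, 0.0])
insert("acme", 2, [0.0, 1.0])
insert("globex", 1, [0.5, 0.5])
print(search("acme", [0.9, 0.1]))  # 1
```

If partitions stay in the tens of thousands of vectors each, the linear scan stays comfortably inside the latency budget discussed above.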

Guest: Alex Garcia

Host: Nicolay Gerold
