Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams

  • Esteban Rodríguez-Betancourt
  • Edgar Casasola-Murillo

Abstract

With the growing use of vector embeddings in areas like natural language processing and recommendation systems, the need for effective storage and retrieval methods is increasingly important. However, deploying specialized databases for vector indexing can be challenging due to resource limitations or operational constraints. This paper introduces a novel approach that utilizes existing trigram indexes within SQL databases to efficiently manage vector embeddings. By adapting traditional relational databases to handle high-dimensional data, organizations can use their existing infrastructure without the need to invest in new database systems. This method reduces management complexity and costs associated with maintaining separate systems for vector data. We outline the process of converting vector embeddings for trigram indexing and evaluate the performance and recall through empirical analysis. This paper aims to offer a practical solution for researchers and practitioners seeking to integrate advanced vector-based queries into their current database systems, thereby enhancing the functionality and accessibility of vector embeddings in mainstream applications.

Published
2024-09-19
How to cite
Rodríguez-Betancourt, E., & Casasola-Murillo, E. (2024). Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams. Memorias De Las JAIIO, 10(1), 150-157. Retrieved from https://ojs.sadio.org.ar/index.php/JAIIO/article/view/1021
Section
ASAID - Simposio Argentino de Inteligencia Artificial y Ciencias de Datos