NSGIA Memory System

Free to use - Try Now | Home
Permanent Link Access
🔗 Permanent Link - This link never expires and can be shared freely
This is the permanent version of the temporary URLs the Memory System produces. Many AIs can directly ingest this format. If not, you can paste the JSON instead, or click on the Formatted view for easier reading — though AI tools generally work best with this version.
Some AIs cannot follow links. If that happens, click Copy JSON and paste it into the AI manually.
Vector Guide 2026-01-31 02:32:17

Full Content

TITLE: Vector Guide NSG Vector/Semantic Search Setup Guide Last Updated: 2026-01-31 Status: Verified working Overview Semantic search requires THREE tables with matching rowids: file_vectors - VSS virtual table holding embeddings file_lookup - Used by search queries (JOIN in search_vss) vector_map - Used by status/count queries (get_vector_status) Both file_lookup AND vector_map are required. Missing either one causes silent failures. Required Schema sql-- VSS for semantic search (384 dimensions for all-MiniLM-L6-v2) CREATE VIRTUAL TABLE file_vectors USING vss0(embedding(384)); -- Used by search_vss() for search results CREATE TABLE file_lookup ( rowid INTEGER PRIMARY KEY, collection TEXT, filepath TEXT ); -- Used by get_vector_status() for counting embeddings CREATE TABLE vector_map ( vector_rowid INTEGER PRIMARY KEY, filepath TEXT ); Critical: Rowid Matching All three tables must have matching rowids. The rowid comes from the FTS5 files table: python# Get rowid from FTS5 insert rowid = conn.execute('SELECT last_insert_rowid()').fetchone()[0] # Insert into ALL THREE tables with SAME rowid conn.execute("INSERT INTO file_vectors(rowid, embedding) VALUES (?, ?)", (rowid, emb)) conn.execute("INSERT INTO file_lookup(rowid, collection, filepath) VALUES (?, ?, ?)", (rowid, collection, filepath)) conn.execute("INSERT INTO vector_map(vector_rowid, filepath) VALUES (?, ?)", (rowid, filepath)) Complete Build Script Template python#!/usr/bin/env python3 """ Vector index builder - correct schema Requires: sentence_transformers, sqlite_vss """ import sqlite3 import sqlite_vss from sentence_transformers import SentenceTransformer DB_PATH = "indexes/YOUR_INDEX.db" COLLECTION = "your-index-id" # Must match registry id print("Loading model...") model = SentenceTransformer('all-MiniLM-L6-v2') conn = sqlite3.connect(DB_PATH) conn.enable_load_extension(True) sqlite_vss.load(conn) # Drop old tables (clean rebuild) print("Dropping old vector tables...") conn.execute("DROP TABLE IF EXISTS file_vectors") conn.execute("DROP TABLE IF EXISTS vector_map") conn.execute("DROP TABLE IF EXISTS file_lookup") # Create with correct schema print("Creating tables...") conn.execute("CREATE VIRTUAL TABLE file_vectors USING vss0(embedding(384))") conn.execute("CREATE TABLE file_lookup (rowid INTEGER PRIMARY KEY, collection TEXT, filepath TEXT)") conn.execute("CREATE TABLE vector_map (vector_rowid INTEGER PRIMARY KEY, filepath TEXT)") conn.commit() # Get all files from FTS table rows = conn.execute("SELECT rowid, filepath, filename FROM files").fetchall() print(f"Found {len(rows)} files") # Build with matching rowids for i, (rowid, filepath, filename) in enumerate(rows): # Embed filename + path (for filesystem indexes without text content) # For text indexes, use: content[:8000] instead text = f"{filename} {filepath}" emb = model.encode(text, convert_to_numpy=True) conn.execute("INSERT INTO file_vectors(rowid, embedding) VALUES (?, ?)", (rowid, emb.tobytes())) conn.execute("INSERT INTO file_lookup(rowid, collection, filepath) VALUES (?, ?, ?)", (rowid, COLLECTION, filepath)) conn.execute("INSERT INTO vector_map(vector_rowid, filepath) VALUES (?, ?)", (rowid, filepath)) if (i+1) % 100 == 0: print(f"{i+1}/{len(rows)}") conn.commit() conn.commit() conn.close() print(f"Done - {len(rows)} vectors") Registry Update After building vectors, update registry.db: sqlUPDATE index_registry SET vector_enabled = 1, vector_table = 'file_vectors', vector_type = 'vss' WHERE id = 'your-index-id'; Verification pythonimport sqlite3 import sqlite_vss conn = sqlite3.connect('indexes/YOUR_INDEX.db') conn.enable_load_extension(True) sqlite_vss.load(conn) # All three counts must match fv = conn.execute("SELECT COUNT(*) FROM file_vectors").fetchone()[0] fl = conn.execute("SELECT COUNT(*) FROM file_lookup").fetchone()[0] vm = conn.execute("SELECT COUNT(*) FROM vector_map").fetchone()[0] print(f"file_vectors: {fv}") print(f"file_lookup: {fl}") print(f"vector_map: {vm}") assert fv == fl == vm, "ROWID MISMATCH - rebuild required" print("✓ All tables match") How Search Works (Reference) get_vector_status() in index_vector.py: python# Counts from vector_map count = conn.execute("SELECT COUNT(*) FROM vector_map").fetchone()[0] search_vss() in index_vector.py: python# Joins file_vectors with file_lookup SELECT v.rowid, v.distance, l.filepath, l.collection FROM file_vectors v JOIN file_lookup l ON v.rowid = l.rowid WHERE vss_search(v.embedding, vss_search_params(?, ?)) Common Failures SymptomCauseFixStatus shows "empty"Missing vector_map tableRebuild with both tablesSearch returns 0 resultsRowids don't match between tablesRebuild using rowid from files table"No vector index available"Registry not updatedUPDATE registry SET vector_enabled=1 Source Files (Authoritative) /var/www/api.nsgia.com/index_api.py - Schema definition /var/www/api.nsgia.com/upload_index.py - Insert pattern /var/www/api.nsgia.com/index_vector.py - Search functions Search RAG for these files before making changes. Lesson from 2026-01-31: This schema was discovered by examining index_api.py which creates BOTH tables. Two hours were wasted guessing instead of reading existing working code.