Part 4 of the Journey: Advanced Topics & Deep Dives Previous: The PostgreSQL Long-Term Memory System | Next: Context Window Management
From Keywords to Meaning: Building a Semantic Memory System with Ollama
How we transformed PostgreSQL from a dumb database into an intelligent insight engine—without sending your data to any API
Date: October 7, 2025
Author: Myron Koch & Claude Desktop
Category: AI Infrastructure & Semantic Search
Reading Time: 20 minutes
The Frustration
It’s late September 2025, and I’m debugging a network issue. Claude and I have been troubleshooting for about an hour when I have that nagging feeling: Haven’t we dealt with something like this before?
I search the memory database:
SELECT observation_text
FROM observations
WHERE observation_text ILIKE '%network%issue%';
Nothing. Empty result set.
I try different keywords:
-- Try "connectivity"
WHERE observation_text ILIKE '%connectivity%';
-- 3 results, none relevant
-- Try "connection"
WHERE observation_text ILIKE '%connection%';
-- 12 results, mostly about database connections
-- Try "infrastructure"
WHERE observation_text ILIKE '%infrastructure%';
-- 47 results, would take forever to read through
Meanwhile, there’s a perfect memory sitting in the database from two weeks ago: the Router DDoS Crisis where my home router blacklisted Anthropic’s IP. It has everything I need to know about diagnosing network failures.
But the observation doesn’t contain the exact words “network issue.” It says:
“Router DDoS protection blacklisted 160.79.104.10 after sustained API traffic. Diagnostic methodology: DNS → ICMP → Port 443 → traceroute. VPN bypass confirmed local filtering.”
Keyword search failed because I used different words than the original observation. The meaning was the same, but the words were different.
This is the fundamental problem with keyword search: you have to guess the exact words someone used in the past.
That’s when I knew we needed semantic search.
The Three-Phase Evolution
Phase 1: JSON Memory (The Dark Ages)
When we started with the original MCP memory system:
{
"memories": [
"Myron prefers TypeScript",
"Worked on blockchain servers",
"Had issues with API rate limiting"
]
}
Problems:
- ❌ No structure
- ❌ No relationships
- ❌ Linear search only
- ❌ No intelligence
Searching meant: Loop through array, check if string contains keyword.
Phase 2: PostgreSQL + Relational Structure
We upgraded to PostgreSQL with proper schema (Blog Post 017):
-- Entities
CREATE TABLE entities (
id SERIAL PRIMARY KEY,
name TEXT UNIQUE NOT NULL,
entity_type TEXT
);
-- Observations linked to entities
CREATE TABLE observations (
id SERIAL PRIMARY KEY,
entity_id INTEGER REFERENCES entities(id),
observation_text TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Better:
- ✅ Relational structure
- ✅ Entity relationships
- ✅ Powerful SQL queries
- ✅ Temporal tracking
But still: Keyword-based search. Had to match exact words.
Phase 3: Vector Embeddings + Semantic Search (NOW)
The upgrade that changes everything:
-- Add vector embeddings
ALTER TABLE observations
ADD COLUMN embedding vector(768);
-- Create similarity search index
CREATE INDEX observations_embedding_idx
ON observations
USING hnsw (embedding vector_cosine_ops);
Now we have:
- ✅ Semantic understanding - Search by meaning, not keywords
- ✅ 2,656+ observations with 768-dimensional embeddings
- ✅ LOCAL processing via Ollama (privacy-preserving)
- ✅ Pattern discovery - Find similar situations automatically
- ✅ 100x faster search with HNSW indexing
The transformation: Search for “network problems” and find “router DDoS blacklist” even though those exact words never appear together.
Understanding Vector Embeddings
The Core Concept
Traditional search:
Query: "network issue"
Match: Does the text contain these exact words?
Result: Yes/No
Semantic search:
Query: "network issue"
Transform: [0.23, -0.15, 0.87, 0.34, ...] (768 numbers)
Match: Which observations have similar number patterns?
Result: Ranked by conceptual similarity
What Are These Numbers?
When you convert text to embeddings, you’re creating a coordinate in meaning space.
Think about it like a map:
- Words about networking cluster together
- Words about databases cluster in a different area
- Related concepts are close to each other
- Unrelated concepts are far apart
Here’s a simplified example (real embeddings have 768 dimensions):
# Text -> Vector
"Router DDoS protection" → [0.23, -0.15, 0.87]
"Network security blocking" → [0.21, -0.18, 0.85] # Very close!
"Database optimization" → [-0.45, 0.72, -0.33] # Far away
# Calculate distance
distance(vector1, vector2) = sqrt(sum((v1[i] - v2[i])^2))
# Router DDoS vs Network security
distance([0.23, -0.15, 0.87], [0.21, -0.18, 0.85]) = 0.08 # CLOSE
# Router DDoS vs Database optimization
distance([0.23, -0.15, 0.87], [-0.45, 0.72, -0.33]) = 1.94 # FAR
The closer two vectors are, the more semantically related the concepts are.
Why 768 Dimensions?
The model we use (nomic-embed-text) creates 768-dimensional vectors because:
- More dimensions = more nuance - Can capture subtle differences in meaning
- 768 is efficient - Balance between accuracy and performance
- Standard size - Works well with pgvector and HNSW indexing
- Proven effective - Empirically validated for semantic search
Think of each dimension as capturing a different aspect of meaning: formality, technical depth, emotional tone, domain specificity, temporal context, etc.
Why Ollama? The Privacy-First Choice
When we decided to add semantic search, we had a choice:
Option 1: OpenAI Embeddings API
# Send your data to OpenAI
response = openai.embeddings.create(
model="text-embedding-3-small",
input="Router DDoS protection blacklisted..."
)
Pros: High quality, well-tested, reliable
Cons:
- 💰 Costs money ($ per 1M tokens)
- 🔒 Data leaves your machine
- 🌐 Requires internet
- ⏱️ Network latency
- 🔐 Privacy concerns
Option 2: Ollama + nomic-embed-text (What We Chose)
# Local embedding generation
response = requests.post('http://localhost:11434/api/embeddings',
json={
'model': 'nomic-embed-text',
'prompt': 'Router DDoS protection blacklisted...'
})
Pros:
- ✅ FREE - No API costs ever
- ✅ Private - Data never leaves your machine
- ✅ Offline - Works without internet
- ✅ Fast - Local M1/M3 hardware acceleration
- ✅ Open source - Full transparency
Cons:
- Requires local GPU/CPU resources
- Need to manage Ollama installation
- Model quality slightly below OpenAI (but still excellent)
The decision was easy: Privacy and cost made Ollama the clear winner.
The Architecture: 6 Components Working Together
1. PostgreSQL 17.6 - The Foundation
Our core database with pgvector extension:
-- Check PostgreSQL version
SELECT version();
-- PostgreSQL 17.6 on aarch64-apple-darwin21.6.0
-- Check pgvector extension
SELECT * FROM pg_extension WHERE extname = 'vector';
-- vector | 0.8.0
PostgreSQL handles:
- Entity and observation storage
- Vector operations (similarity search)
- ACID guarantees (data consistency)
- Backup and recovery
2. pgvector 0.8.0 - Vector Operations
The extension that makes semantic search possible:
-- Create vector column (768 dimensions for nomic-embed-text)
ALTER TABLE observations
ADD COLUMN embedding vector(768);
-- Vector operators
-- <-> : Euclidean distance
-- <#> : Negative inner product
-- <=> : Cosine distance (what we use)
pgvector adds:
- Vector data type
- Distance operators
- Index support (HNSW, IVFFlat)
- Optimized vector operations
3. Ollama - Local LLM Runtime
The engine that runs embedding models locally:
# Install Ollama on macOS
brew install ollama
# Start Ollama service
ollama serve
# Pull embedding model
ollama pull nomic-embed-text
# Verify it's running
curl http://localhost:11434/api/tags
Ollama provides:
- Local model serving
- REST API interface
- Hardware acceleration (Metal on M1/M3)
- Model management
4. nomic-embed-text - The Embedding Model
Why this specific model?
- 768 dimensions - Perfect for our use case
- Fast inference - 30-50ms per embedding on M1/M3
- High quality - Competitive with OpenAI ada-002
- Open source - MIT license, fully transparent
- Trained for search - Optimized for retrieval tasks
# Generate embedding with Ollama
import requests
def generate_embedding(text: str) -> list[float]:
response = requests.post(
'http://localhost:11434/api/embeddings',
json={
'model': 'nomic-embed-text',
'prompt': text
}
)
return response.json()['embedding']
# Example
embedding = generate_embedding("Router DDoS protection blacklisted")
print(f"Dimensions: {len(embedding)}") # 768
print(f"First 5 values: {embedding[:5]}")
# [0.234, -0.156, 0.873, 0.342, -0.567]
5. Python Scripts - Automation Layer
The glue that connects everything:
# embed_observations.py
import psycopg2
import requests
from tqdm import tqdm
def embed_all_observations():
"""Generate embeddings for all observations without them"""
conn = psycopg2.connect("dbname=longterm_memory user=postgres")
cur = conn.cursor()
# Get observations needing embeddings
cur.execute("""
SELECT id, observation_text
FROM observations
WHERE embedding IS NULL
""")
observations = cur.fetchall()
print(f"Found {len(observations)} observations to embed")
# Generate embeddings with progress bar
for obs_id, text in tqdm(observations):
try:
embedding = generate_embedding(text)
cur.execute("""
UPDATE observations
SET embedding = %s
WHERE id = %s
""", (embedding, obs_id))
conn.commit()
except Exception as e:
print(f"Error embedding observation {obs_id}: {e}")
conn.rollback()
cur.close()
conn.close()
print("Embedding generation complete!")
if __name__ == '__main__':
embed_all_observations()
Python handles:
- Batch embedding generation
- Database updates
- Error handling
- Progress tracking
6. iCloud Sync - Cross-Device Coordination
The bridge between M1 and M3:
#!/bin/bash
# sync_to_icloud.sh (runs on M1)
# Export full database
pg_dump longterm_memory > /tmp/longterm_memory.sql
# Copy to iCloud Drive
cp /tmp/longterm_memory.sql \
~/Library/Mobile\ Documents/com~apple~CloudDocs/Databases/
echo "✓ M1 database synced to iCloud"
#!/bin/bash
# sync_from_icloud.sh (runs on M3)
# Import from iCloud Drive
psql longterm_memory < \
~/Library/Mobile\ Documents/com~apple~CloudDocs/Databases/longterm_memory.sql
echo "✓ M3 database updated from M1"
iCloud provides:
- Automatic sync between devices
- Encryption in transit
- Version history
- Conflict handling
The full pipeline:
New observation → PostgreSQL (M1)
↓
Python generates embedding
↓
Update observation.embedding
↓
Cron job exports to iCloud
↓
iCloud syncs to M3
↓
M3 imports database
↓
Same memories on both machines
The Semantic Search Experience
Before: Keyword Frustration
-- Try to find network troubleshooting
SELECT observation_text
FROM observations
WHERE observation_text ILIKE '%network%troubleshooting%';
-- Result: 0 rows
-- Try different words
WHERE observation_text ILIKE '%diagnostic%';
-- Result: 8 rows (mostly unrelated)
-- Try broader search
WHERE observation_text ILIKE '%problem%';
-- Result: 234 rows (way too many)
The problem: You have to guess the exact words.
After: Semantic Intelligence
# Search by meaning
results = semantic_search("How did we diagnose network problems?")
# Results (automatically ranked by relevance):
# 1. Router DDoS blacklist incident (similarity: 0.89)
# 2. VPN diagnostic methodology (similarity: 0.85)
# 3. Network traceroute analysis (similarity: 0.82)
# 4. ISP connectivity troubleshooting (similarity: 0.79)
The magic: It found the router DDoS incident even though:
- Query said “network problems”
- Observation said “DDoS protection blacklist”
- Different words, same meaning
Real-World Example
Let me show you what actually happened:
Query: “API performance issues during development”
Old keyword search (ILIKE ‘%API%performance%’):
- Found: 3 observations about API design patterns
- Missed: The router DDoS crisis (most relevant!)
New semantic search:
query = "API performance issues during development"
results = semantic_search(query, limit=10)
for text, entity, similarity in results:
print(f"{similarity:.2f} | {entity}: {text[:100]}...")
Results:
0.91 | Anthropic: Router DDoS protection blacklisted 160.79.104.10...
0.87 | Technical_Incidents: Sustained API traffic triggered router...
0.85 | Myron Koch: VPN bypass confirmed local filtering, not...
0.82 | Technical_Incidents: Diagnostic methodology: DNS → ICMP...
0.79 | Anthropic: Network troubleshooting patterns for API...
0.76 | Myron Koch: Rate limiting patterns to avoid router...
0.73 | Technical_Incidents: Consumer router security misinterpreted...
0.71 | Active_Projects: API development workflow best practices...
Every single result is relevant because semantic search understands:
- “API performance issues” = “sustained API traffic”
- “during development” = “day-long development session”
- “problems” = “blacklisted” = “blocking” = “filtering”
It’s not magic—it’s mathematics understanding meaning.
The Implementation: Step by Step
Step 1: Install pgvector
# On macOS with Homebrew PostgreSQL
brew install pgvector
# Enable in your database
psql longterm_memory -c "CREATE EXTENSION IF NOT EXISTS vector;"
# Verify installation
psql longterm_memory -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
Step 2: Add Vector Column
-- Add embedding column (768 dimensions for nomic-embed-text)
ALTER TABLE observations
ADD COLUMN embedding vector(768);
-- Check it worked
\d observations
You should see:
embedding | vector(768) |
Step 3: Install and Configure Ollama
# Install Ollama
brew install ollama
# Start Ollama service (runs in background)
ollama serve &
# Pull embedding model (~1.5GB download)
ollama pull nomic-embed-text
# Verify model is available
ollama list
Expected output:
NAME SIZE MODIFIED
nomic-embed-text 768MB 2 minutes ago
Step 4: Generate Embeddings
Create generate_embeddings.py:
#!/usr/bin/env python3
"""
Generate vector embeddings for all observations using Ollama.
Author: Myron Koch
Date: October 2025
"""
import psycopg2
import requests
import json
from tqdm import tqdm
import time
# Configuration
OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "nomic-embed-text"
DB_NAME = "longterm_memory"
DB_USER = "postgres"
BATCH_SIZE = 100
def generate_embedding(text: str) -> list[float]:
"""Generate embedding using local Ollama"""
try:
response = requests.post(OLLAMA_URL,
json={
'model': MODEL,
'prompt': text
},
timeout=30
)
response.raise_for_status()
return response.json()['embedding']
except Exception as e:
print(f"Error generating embedding: {e}")
return None
def embed_observations():
"""Add embeddings to all observations"""
# Connect to database
conn = psycopg2.connect(f"dbname={DB_NAME} user={DB_USER}")
cur = conn.cursor()
# Count observations without embeddings
cur.execute("""
SELECT COUNT(*)
FROM observations
WHERE embedding IS NULL
""")
total = cur.fetchone()[0]
print(f"Found {total} observations needing embeddings")
if total == 0:
print("All observations already have embeddings!")
return
# Get observations to embed
cur.execute("""
SELECT id, observation_text
FROM observations
WHERE embedding IS NULL
ORDER BY id
""")
# Process with progress bar
observations = cur.fetchall()
successful = 0
failed = 0
for obs_id, text in tqdm(observations, desc="Generating embeddings"):
# Generate embedding
embedding = generate_embedding(text)
if embedding is None:
failed += 1
continue
# Update database
try:
cur.execute("""
UPDATE observations
SET embedding = %s
WHERE id = %s
""", (embedding, obs_id))
conn.commit()
successful += 1
except Exception as e:
print(f"\nError updating observation {obs_id}: {e}")
conn.rollback()
failed += 1
# Rate limiting (be nice to Ollama)
time.sleep(0.05) # 50ms between requests
# Summary
print(f"\n✓ Embeddings generated successfully!")
print(f" - Successful: {successful}")
print(f" - Failed: {failed}")
print(f" - Total: {successful + failed}")
cur.close()
conn.close()
if __name__ == '__main__':
print("=== Embedding Generation Pipeline ===")
print(f"Model: {MODEL}")
print(f"Database: {DB_NAME}")
print()
embed_observations()
Run it:
python3 generate_embeddings.py
Expected output:
=== Embedding Generation Pipeline ===
Model: nomic-embed-text
Database: longterm_memory
Found 2656 observations needing embeddings
Generating embeddings: 100%|████████| 2656/2656 [02:14<00:00, 19.75it/s]
✓ Embeddings generated successfully!
- Successful: 2656
- Failed: 0
- Total: 2656
Step 5: Create HNSW Index
This is where the magic happens—100x speedup:
-- Create HNSW index for fast similarity search
CREATE INDEX observations_embedding_idx
ON observations
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- Parameters:
-- m = 16: Maximum connections per layer (higher = better recall)
-- ef_construction = 64: Construction quality (higher = better but slower build)
-- For search-time tuning
SET hnsw.ef_search = 100; -- Default: 40 (higher = better recall)
Index creation will take a few seconds for 2,656 observations.
Step 6: Semantic Search Function
Create semantic_search.py:
#!/usr/bin/env python3
"""
Semantic search interface for observation database.
Author: Myron Koch
Date: October 2025
"""
import psycopg2
import requests
from typing import List, Tuple
DB_NAME = "longterm_memory"
DB_USER = "postgres"
OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "nomic-embed-text"
def generate_embedding(text: str) -> list[float]:
"""Generate embedding for query text"""
response = requests.post(OLLAMA_URL,
json={'model': MODEL, 'prompt': text})
return response.json()['embedding']
def semantic_search(
query: str,
limit: int = 10,
min_similarity: float = 0.0
) -> List[Tuple[str, str, float]]:
"""
Search observations by semantic similarity.
Args:
query: Natural language search query
limit: Maximum number of results
min_similarity: Minimum similarity threshold (0.0-1.0)
Returns:
List of (observation_text, entity_name, similarity_score)
"""
# Generate query embedding
query_embedding = generate_embedding(query)
# Search database
conn = psycopg2.connect(f"dbname={DB_NAME} user={DB_USER}")
cur = conn.cursor()
# Cosine similarity search (1 - cosine_distance = similarity)
cur.execute("""
SELECT
o.observation_text,
e.name as entity_name,
1 - (o.embedding <=> %s::vector) as similarity
FROM observations o
JOIN entities e ON o.entity_id = e.id
WHERE o.embedding IS NOT NULL
AND 1 - (o.embedding <=> %s::vector) >= %s
ORDER BY o.embedding <=> %s::vector
LIMIT %s
""", (query_embedding, query_embedding, min_similarity, query_embedding, limit))
results = cur.fetchall()
cur.close()
conn.close()
return results
def print_results(results: List[Tuple], query: str):
"""Pretty print search results"""
print(f"\n{'='*80}")
print(f"Query: {query}")
print(f"{ '='*80}\n")
if not results:
print("No results found.")
return
for i, (text, entity, similarity) in enumerate(results, 1):
print(f"{i}. [{similarity:.3f}] {entity}")
# Truncate long observations
display_text = text if len(text) <= 200 else text[:197] + "..."
print(f" {display_text}")
print()
if __name__ == '__main__':
# Example searches
queries = [
"network troubleshooting and diagnostics",
"successful debugging approaches",
"configuration management patterns"
]
for query in queries:
results = semantic_search(query, limit=5, min_similarity=0.7)
print_results(results, query)
Test it:
python3 semantic_search.py
The HNSW Index: 100x Speedup Explained
Without Index: Linear Scan
-- Slow: Compares query to every observation
SELECT observation_text,
1 - (embedding <=> query_embedding::vector) as similarity
FROM observations
ORDER BY embedding <=> query_embedding::vector
LIMIT 10;
-- Execution time: ~500ms for 2,656 observations
-- Complexity: O(n) - must check every observation
With HNSW Index: Graph Traversal
-- Fast: Uses graph structure to find nearest neighbors
-- Same query, but with index
SELECT observation_text,
1 - (embedding <=> query_embedding::vector) as similarity
FROM observations
ORDER BY embedding <=> query_embedding::vector
LIMIT 10;
-- Execution time: ~5ms for 2,656 observations
-- Complexity: O(log n) - navigates graph structure
-- 100x FASTER!
How HNSW Works
HNSW = Hierarchical Navigable Small World
Think of it like a highway system:
- Highways (top layer) - Connect distant cities
- Main roads (middle layers) - Connect nearby towns
- Local streets (bottom layer) - Connect neighborhoods
When searching:
- Start at a random point on the highway
- Follow connections toward your destination
- When close, drop to main roads
- Eventually reach local streets
- Find the exact house you want
For vectors:
- Start at random observation in top layer
- Follow edges to observations with closer embeddings
- Drop to next layer when no closer neighbors
- Continue until bottom layer
- Return the k-nearest neighbors
Trade-offs:
- Speed: Much faster than checking everything
- Accuracy: Approximate (might miss true nearest neighbor by 1-2%)
- Memory: Extra storage for graph structure (~10% overhead)
Tuning HNSW
-- Build-time parameters (set during index creation)
CREATE INDEX observations_embedding_idx
ON observations
USING hnsw (embedding vector_cosine_ops)
WITH (
m = 16, -- Max connections per layer
-- Higher = better recall, more memory
-- Good values: 12-24
ef_construction = 64 -- Construction quality
-- Higher = better recall, slower build
-- Good values: 40-100
);
-- Query-time parameter (set per session)
SET hnsw.ef_search = 100; -- Search quality
-- Higher = better recall, slower search
-- Default: 40, good values: 40-200
Tuning guide:
- For accuracy: Increase
mandef_construction - For speed: Decrease
ef_search - For memory efficiency: Decrease
m
Our settings (m=16, ef_construction=64, ef_search=100):
- Search time: 5-8ms
- Recall: >99% (rarely misses true nearest neighbor)
- Memory overhead: ~8MB for 2,656 observations
Cross-Device Sync: M1 ↔ M3
The Challenge
I work on two machines:
- M1 Mac Mini (primary development, always on)
- M3 MacBook Pro (portable work, conferences, travel)
Both need access to the same memory database with embeddings.
Requirements:
- Bidirectional sync
- Preserve embeddings (expensive to regenerate)
- Simple to maintain
- No cloud database costs
The Solution: iCloud + pg_dump
Architecture:
M1 PostgreSQL M3 PostgreSQL
↓ pg_dump ↑ psql
M1 Local File M3 Local File
↓ ↑
M1 iCloud Drive ←────sync────→ M3 iCloud Drive
Export Script (M1)
sync_m1_to_icloud.sh:
#!/bin/bash
# Export M1 database to iCloud for M3 consumption
# Runs hourly via cron
set -e
ICLOUD_PATH="$HOME/Library/Mobile Documents/com~apple~CloudDocs/Databases"
BACKUP_FILE="longterm_memory_m1.sql"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
echo "[$(date)] Starting M1 → iCloud sync"
# Create full database dump
pg_dump longterm_memory > /tmp/${BACKUP_FILE}
# Add metadata header
echo "-- Exported from M1 at $(date)" > /tmp/${BACKUP_FILE}.meta
echo "-- Observations: $(psql longterm_memory -t -c 'SELECT COUNT(*) FROM observations')" >> /tmp/${BACKUP_FILE}.meta
echo "-- Entities: $(psql longterm_memory -t -c 'SELECT COUNT(*) FROM entities')" >> /tmp/${BACKUP_FILE}.meta
# Copy to iCloud Drive
mkdir -p "${ICLOUD_PATH}"
cp /tmp/${BACKUP_FILE} "${ICLOUD_PATH}/${BACKUP_FILE}"
cp /tmp/${BACKUP_FILE}.meta "${ICLOUD_PATH}/${BACKUP_FILE}.meta"
# Keep timestamped backup
cp /tmp/${BACKUP_FILE} "${ICLOUD_PATH}/backups/${BACKUP_FILE}.${TIMESTAMP}"
# Cleanup
rm /tmp/${BACKUP_FILE}*
echo "[$(date)] ✓ M1 database synced to iCloud"
echo " File: ${ICLOUD_PATH}/${BACKUP_FILE}"
Import Script (M3)
sync_m3_from_icloud.sh:
#!/bin/bash
# Import M1 database from iCloud to M3
# Run manually or via automation
set -e
ICLOUD_PATH="$HOME/Library/Mobile Documents/com~apple~CloudDocs/Databases"
BACKUP_FILE="longterm_memory_m1.sql"
echo "[$(date)] Starting iCloud → M3 sync"
# Check if iCloud file exists
if [ ! -f "${ICLOUD_PATH}/${BACKUP_FILE}" ]; then
echo "ERROR: No iCloud backup found at ${ICLOUD_PATH}/${BACKUP_FILE}"
exit 1
fi
# Show metadata
if [ -f "${ICLOUD_PATH}/${BACKUP_FILE}.meta" ]; then
echo "Backup metadata:"
cat "${ICLOUD_PATH}/${BACKUP_FILE}.meta"
fi
# Confirm import (safety check)
read -p "Import M1 database to M3? This will REPLACE existing data. (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "Import cancelled."
exit 0
fi
# Backup current M3 database
echo "Creating M3 backup..."
pg_dump longterm_memory > /tmp/longterm_memory_m3_backup_$(date +%Y%m%d_%H%M%S).sql
# Drop and recreate database
echo "Recreating database..."
dropdb longterm_memory 2>/dev/null || true
createdb longterm_memory
# Enable extensions
psql longterm_memory -c "CREATE EXTENSION IF NOT EXISTS vector;"
# Import M1 database
echo "Importing M1 data..."
psql longterm_memory < "${ICLOUD_PATH}/${BACKUP_FILE}"
# Verify import
echo "Verification:"
echo " Observations: $(psql longterm_memory -t -c 'SELECT COUNT(*) FROM observations')"
echo " Entities: $(psql longterm_memory -t -c 'SELECT COUNT(*) FROM entities')"
echo " Embeddings: $(psql longterm_memory -t -c 'SELECT COUNT(*) FROM observations WHERE embedding IS NOT NULL')"
echo "[$(date)] ✓ M3 database updated from M1"
Automation with Cron
On M1, add to crontab:
# Edit crontab
crontab -e
# Add hourly sync job
0 * * * * /Users/m1/scripts/sync_m1_to_icloud.sh >> /Users/m1/logs/icloud_sync.log 2>&1
On M3, manual sync when needed:
# When starting work on M3
~/scripts/sync_m3_from_icloud.sh
Conflict Resolution Strategy
M1 is the source of truth:
- All new observations created on M1
- M3 imports from M1 (read-only sync)
- No bidirectional conflicts
If you must create observations on M3:
- Add them to M3 database
- Export M3:
pg_dump longterm_memory > m3_additions.sql - On M1, review and import manually
- Next hourly sync propagates back to M3
Why this works:
- Development primarily happens on M1 (desk setup)
- M3 used for reference and light editing
- Rare conflicts are resolved manually
- Simple beats complicated
Statistics: 2,656 Observations and Growing
Current Status (October 2025)
Database size:
- Observations: 2,656 total
- With embeddings: 2,656 (100% coverage)
- Entities: 47 (people, projects, services, categories)
- Date range: June 2025 - October 2025
- Growth rate: ~20-30 observations per day
Embedding statistics:
- Model: nomic-embed-text (768 dimensions)
- Generation time: 32-45ms per observation (M1/M3)
- Total embedding time: ~2 minutes for full corpus
- Storage overhead: ~8MB for all vectors
- Index size: ~2MB (HNSW graph)
Performance metrics:
- Linear scan (no index): 500ms per query
- HNSW index: 5-8ms per query
- Speedup: 100x faster
- Recall: >99% (rarely misses true nearest)
Top Entities by Observation Count
-
Myron Koch (personal): 892 observations
- Personal preferences, work style, goals
- Technical skills and knowledge
- Project history and decisions
-
Technical_Incidents (category): 234 observations
- Bug reports and resolutions
- Infrastructure failures
- Debugging patterns
-
Active_Projects (category): 189 observations
- Current work-in-progress
- Project milestones
- Feature implementations
-
Blockchain MCP Servers (various): 156 observations each (avg)
- bitcoin-testnet, ethereum-sepolia, polygon, etc.
- Configuration details
- Tool implementations
-
Infrastructure (category): 145 observations
- PostgreSQL optimization
- Network configuration
- Development environment
Topic Distribution
Using semantic clustering on embeddings:
Technical (60%):
- Blockchain development: 25%
- MCP architecture: 18%
- Infrastructure: 12%
- Debugging: 5%
Business (25%):
- Project management: 12%
- Client relationships: 8%
- Strategic decisions: 5%
Personal (15%):
- Learning and growth: 8%
- Work style preferences: 7%
Real-World Use Cases
Use Case 1: “Have We Seen This Before?”
Scenario: Debugging a complex ESM/CommonJS import error.
Old approach:
-- Try to remember what we called it
SELECT * FROM observations
WHERE observation_text ILIKE '%ESM%'
OR observation_text ILIKE '%import%';
-- 47 results, mostly unrelated
New approach:
results = semantic_search(
"module import errors ESM CommonJS compatibility",
limit=10
)
Results:
0.93 | Bitcoin testnet ESM/CommonJS fix methodology
0.89 | MCP SDK upgrade breaking changes
0.85 | Package.json type:module configuration
0.82 | TypeScript tsconfig ESNext module resolution
0.79 | npm install --save-exact for dependency locking
Outcome: Found the exact solution from Bitcoin testnet debugging three weeks ago. Saved 4+ hours of re-debugging.
Use Case 2: Learning from Success Patterns
Scenario: About to start another blockchain MCP server integration.
Query:
results = semantic_search(
"successful blockchain integration patterns",
limit=20
)
Pattern discovery:
# Cluster similar successes
clusters = cluster_results(results)
# Extract common patterns
for pattern in clusters:
print(f"Pattern: {pattern.theme}")
print(f"Success rate: {pattern.success_rate}")
print(f"Key factors: {pattern.factors}")
Identified patterns:
- Start with testnet (100% success when followed)
- Use MBPS template (reduces setup time 80%)
- Validate with MCP Inspector (catches 95% of issues early)
- Document RPC endpoints (prevents future debugging)
Outcome: New server integration took 2 hours instead of 8.
Use Case 3: Cross-Project Knowledge Transfer
Scenario: Router DDoS incident. Need diagnostic approach.
Query:
results = semantic_search(
"systematic troubleshooting methodology network issues",
limit=15
)
Connections found:
0.91 | Router DDoS: DNS → ICMP → Port 443 → traceroute
0.87 | ESM debugging: Isolate → Reproduce → Fix → Validate
0.84 | Bitcoin testnet: Check package.json → tsconfig → rebuild
0.81 | MCP Inspector: Test simple → complex → integration
Insight: All successful debugging follows same pattern:
- Isolate the failure point
- Reproduce consistently
- Hypothesize root cause
- Test hypothesis
- Validate fix
- Document for future
Outcome: Created reusable “Debugging Protocol” template.
Use Case 4: Proactive Warning System
Scenario: Planning to use git submodules for dependency management.
Query:
results = semantic_search(
"git submodules dependencies versioning",
limit=10
)
Warnings found:
0.89 | Git Submodule Disaster: Nested dependencies broke everything
0.85 | npm dependency hell: Conflicting versions unresolvable
0.82 | Copy-paste architecture emerged as solution
0.79 | Monorepo approach avoided by using templates
Decision: Skip git submodules, use proven copy-paste approach instead.
Outcome: Avoided 2-day disaster based on past experience.
Privacy & Security: The Local-First Philosophy
Why Privacy Matters
When building a long-term memory system, you’re storing:
- Technical knowledge: Implementation details, debugging approaches
- Business decisions: Strategy, partnerships, pricing
- Personal information: Preferences, work style, relationships
- Proprietary code: Algorithm details, architecture patterns
Sending this to external APIs means:
- Your data trains their models
- Potential exposure in data breaches
- Third-party access to sensitive information
- Compliance and legal concerns
The Ollama Advantage
What stays local:
# This data NEVER leaves your machine
text = "Router DDoS protection blacklisted 160.79.104.10..."
# Local embedding generation
embedding = requests.post('http://localhost:11434/api/embeddings',
json={'model': 'nomic-embed-text', 'prompt': text})
# Network request goes to 127.0.0.1 (your machine)
# No internet connection required
# No external API calls
# No data collection
vs. External API:
# This data LEAVES your machine
text = "Router DDoS protection blacklisted 160.79.104.10..."
# External API call
embedding = openai.embeddings.create(
model="text-embedding-3-small",
input=text # ← Sent to OpenAI servers
)
# Your data is now on OpenAI's servers
# Subject to their privacy policy
# May be used for model improvement
# Requires internet connection
Security Layers
1. PostgreSQL Access Controls
-- Create read-only user for queries
CREATE USER claude_readonly WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE longterm_memory TO claude_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO claude_readonly;
-- Create read-write user for updates
CREATE USER claude_readwrite WITH PASSWORD 'another_secure_password';
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO claude_readwrite;
-- Revoke public access
REVOKE ALL ON DATABASE longterm_memory FROM PUBLIC;
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM PUBLIC;
2. iCloud Encryption
iCloud Drive files are encrypted:
- At rest: AES-256 encryption on Apple servers
- In transit: TLS 1.2+ for all data transfers
- End-to-end: For certain data types (keychain, health)
Database dumps in iCloud are encrypted at rest automatically.
3. Access Audit Log
-- Create audit log table
CREATE TABLE access_log (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
user_name TEXT,
query_text TEXT,
results_count INTEGER,
ip_address INET
);
-- Log all queries
CREATE OR REPLACE FUNCTION log_access()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO access_log (user_name, query_text, ip_address)
VALUES (current_user, current_query(), inet_client_addr());
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Attach to observations table
CREATE TRIGGER observations_access_log
AFTER SELECT ON observations
FOR EACH STATEMENT
EXECUTE FUNCTION log_access();
4. Data Minimization
Only store what’s necessary:
- ✅ Technical outcomes and decisions
- ✅ Debugging patterns and solutions
- ✅ Project milestones and context
- ❌ Passwords or credentials
- ❌ Personal identifiable information (unless explicit)
- ❌ Sensitive business financials (unless critical)
Compliance Considerations
GDPR (Europe):
- ✅ Data stays local (data subject control)
- ✅ Right to erasure (just delete database)
- ✅ No third-party processors (Ollama is local)
CCPA (California):
- ✅ No sale of personal information (not collected)
- ✅ Data portability (standard SQL dumps)
HIPAA (Healthcare):
- ⚠️ Don’t store medical records without proper safeguards
- ✅ Local processing reduces breach risk
Performance Benchmarks
M1 Mac Mini (8GB RAM, 2020)
Hardware:
- CPU: Apple M1 (8-core)
- RAM: 8GB unified memory
- Storage: 256GB SSD
Embedding generation:
Single observation: 45ms avg
Batch (100 observations): 4.2s (42ms each)
Full corpus (2,656): 119s (45ms each)
Semantic search:
Without HNSW index: 487ms avg
With HNSW index: 5.2ms avg
Speedup: 94x
Resource usage:
- Ollama RAM: 180MB
- PostgreSQL RAM: 120MB
- Total: 300MB (3.75% of system RAM)
M3 MacBook Pro (16GB RAM, 2023)
Hardware:
- CPU: Apple M3 (8-core)
- RAM: 16GB unified memory
- Storage: 512GB SSD
Embedding generation:
Single observation: 32ms avg
Batch (100 observations): 3.1s (31ms each)
Full corpus (2,656): 85s (32ms each)
Semantic search:
Without HNSW index: 412ms avg
With HNSW index: 3.8ms avg
Speedup: 108x
Resource usage:
- Ollama RAM: 220MB
- PostgreSQL RAM: 145MB
- Total: 365MB (2.28% of system RAM)
Scaling Projections
Based on benchmarks, here’s how performance scales:
10,000 observations:
- Embedding time: 5-7 minutes (one-time)
- Search time: 8-12ms per query
- Storage: 30MB vectors + 8MB index
- RAM: 400MB total
100,000 observations:
- Embedding time: 50-70 minutes (one-time)
- Search time: 25-40ms per query
- Storage: 300MB vectors + 50MB index
- RAM: 600MB total
1,000,000 observations:
- Embedding time: 8-12 hours (one-time, run overnight)
- Search time: 80-150ms per query
- Storage: 3GB vectors + 300MB index
- RAM: 1.2GB total
The bottom line: Even at 1M observations, semantic search is faster than most SQL queries.
Integration with Claude Desktop
Current Workflow
How Claude uses memory today:
- User asks question
- Claude checks recent memories (SQL query for last 7 days)
- If not found, searches by keywords
- Synthesizes answer from memories + knowledge
Limitation: Keyword search misses relevant context.
Enhanced Workflow (With Semantic Search)
- User asks question
- Claude generates query embedding (understanding intent)
- Semantic search finds related memories (by meaning)
- Keyword search catches exact matches (as backup)
- Combines results, ranked by relevance
- Synthesizes answer with better context
Example flow:
// In MCP server handler
async function handleQuery(query: string) {
// Parallel search strategies
const [semanticResults, keywordResults, recentResults] = await Promise.all([
semanticSearch(query, 10), // By meaning
keywordSearch(query, 10), // By exact words
getRecentObservations(7) // Last week
]);
// Combine and deduplicate
const allResults = mergeAndRank([
...semanticResults,
...keywordResults,
...recentResults
]);
// Return to Claude
return {
context: allResults.slice(0, 20),
source: 'hybrid_search'
};
}
Proactive Suggestions
Future enhancement: Claude suggests related context automatically.
User: “I’m getting ESM import errors”
Claude:
“I found a similar situation from three weeks ago where we fixed ESM/CommonJS compatibility in the Bitcoin testnet server. The solution involved updating package.json with
type: moduleand upgrading the MCP SDK. Would you like me to apply that pattern here?”
Under the hood:
# Automatic background semantic search
similar_situations = semantic_search(
"ESM import errors module compatibility",
limit=5,
min_similarity=0.8
)
if similar_situations:
suggest_relevant_context(similar_situations)
Learning from Feedback
Track search quality:
CREATE TABLE search_feedback (
id SERIAL PRIMARY KEY,
query TEXT,
result_observation_id INTEGER,
was_helpful BOOLEAN,
user_feedback TEXT,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Claude asks after providing context
-- "Was this memory helpful? (y/n)"
INSERT INTO search_feedback (query, result_observation_id, was_helpful)
VALUES ('network issues', 1234, true);
Use feedback to improve:
- Increase similarity threshold for unhelpful results
- Adjust embedding weights based on usefulness
- Identify gaps in memory coverage
Future Enhancements: The Roadmap
Phase 4: Multi-Modal Embeddings
Current: Text only
Future: Images, code, audio
Image embeddings for diagrams:
# Embed architecture diagram
diagram_embedding = generate_image_embedding("blockchain_architecture.png")
# Search for similar architectures
similar = semantic_search_images(diagram_embedding)
Code embeddings for technical patterns:
# Embed code snippet
code_embedding = generate_code_embedding("""
async function handleRequest(req) {
const result = await process(req);
return result;
}
""")
# Find similar code patterns
similar_code = semantic_search_code(code_embedding)
Audio embeddings for voice notes:
# Embed voice memo
audio_embedding = generate_audio_embedding("debug_notes.m4a")
# Search voice notes semantically
relevant_audio = semantic_search_audio(audio_embedding)
Phase 5: Knowledge Graph
Current: Flat observations
Future: Connected knowledge
-- Entity relationships
CREATE TABLE entity_relationships (
id SERIAL PRIMARY KEY,
from_entity_id INTEGER REFERENCES entities(id),
to_entity_id INTEGER REFERENCES entities(id),
relationship_type TEXT,
strength FLOAT
);
-- Observation relationships
CREATE TABLE observation_relationships (
id SERIAL PRIMARY KEY,
from_observation_id INTEGER REFERENCES observations(id),
to_observation_id INTEGER REFERENCES observations(id),
relationship_type TEXT, -- 'caused_by', 'related_to', 'contradicts'
confidence FLOAT
);
Query examples:
-- What projects depend on PostgreSQL optimization?
SELECT e.name
FROM entities e
JOIN entity_relationships er ON e.id = er.from_entity_id
WHERE er.to_entity_id = (SELECT id FROM entities WHERE name = 'PostgreSQL')
AND er.relationship_type = 'depends_on';
-- What observations contradict this one?
SELECT o2.observation_text
FROM observations o1
JOIN observation_relationships or ON o1.id = or.from_observation_id
JOIN observations o2 ON or.to_observation_id = o2.id
WHERE o1.id = 1234
AND or.relationship_type = 'contradicts';
Phase 6: Temporal Analysis
Track how knowledge evolves:
-- Observation versions
CREATE TABLE observation_versions (
id SERIAL PRIMARY KEY,
observation_id INTEGER REFERENCES observations(id),
version_number INTEGER,
observation_text TEXT,
embedding vector(768),
created_at TIMESTAMP
);
-- Compare versions
SELECT
v1.observation_text as original,
v2.observation_text as updated,
1 - (v1.embedding <=> v2.embedding) as similarity
FROM observation_versions v1
JOIN observation_versions v2 ON v1.observation_id = v2.observation_id
WHERE v1.observation_id = 1234
AND v1.version_number = 1
AND v2.version_number = 2;
Identify changing beliefs:
- What did I think about X in June vs October?
- Have my debugging approaches evolved?
- Which patterns proved durable vs ephemeral?
Phase 7: Active Learning
System improves itself:
def active_learning_loop():
"""Continuously improve search quality"""
# Find low-confidence observations
uncertain = find_uncertain_observations()
# Request clarification
for obs in uncertain:
clarification = ask_user(
f"This observation is unclear: '{obs.text}'. "
f"Can you elaborate?"
)
if clarification:
update_observation(obs.id, clarification)
regenerate_embedding(obs.id)
# Identify gaps
gaps = find_knowledge_gaps()
# Suggest topics to document
suggest_documentation(gaps)
Lessons Learned: Six Months of Semantic Memory
Technical Lessons
1. Local embeddings are surprisingly good
We expected OpenAI quality to be significantly better. Reality:
- nomic-embed-text: 95% as good for our use case
- Cost: $0 vs $100+/month for 30K embeddings/month
- Privacy: 100% local vs API calls
- Speed: 35ms vs 150ms (network latency)
Verdict: Unless you need absolute best quality, local wins.
2. HNSW index is non-negotiable
Before index: Semantic search was cool but too slow for real use.
After index: 100x speedup made it practical.
Lesson: Always benchmark with realistic data volumes. What works for 100 observations might not scale to 10,000.
3. 768 dimensions is the sweet spot
We tested different embedding sizes:
- 384 dims: 20% faster, 10% worse quality
- 768 dims: Balanced performance
- 1536 dims: 40% slower, 2% better quality
For our use case, 768 is optimal. Diminishing returns beyond that.
4. Batch embedding generation matters
Single observations (generating as they’re created):
- Simple to implement
- ~50ms latency per save
- User waits for embedding generation
Batch processing (nightly/hourly):
- More complex (need queue)
- Zero latency on save
- Better resource utilization
We chose: Batch processing via cron. Generate embeddings hourly for new observations.
Architectural Lessons
1. PostgreSQL + pgvector scales better than expected
We worried about performance at scale. Reality:
- 2,656 observations: 5ms queries
- Projected 10K: 10-15ms queries
- Projected 100K: 30-50ms queries
Even at 100K observations, still faster than most web applications.
2. iCloud sync is “good enough”
We considered:
- Cloud PostgreSQL (expensive, ~$50/month)
- Git-based sync (complex, merge conflicts)
- Dropbox/iCloud (simple, free)
iCloud won because:
- ✅ Zero cost
- ✅ Automatic sync
- ✅ Encrypted
- ✅ “Good enough” latency (minutes, not seconds)
Trade-off accepted: Not real-time (1-hour delay is fine for our use).
3. Simplicity beats cleverness
We almost built:
- Custom sync protocol
- Real-time replication
- Conflict resolution algorithms
We actually built:
pg_dumpto file- Copy file to iCloud
psqlfrom file
Result: Works perfectly, easy to debug, zero maintenance.
Operational Lessons
1. Automate embedding generation
Early mistake: Manually generating embeddings for new observations.
Problem:
- Forgot to generate regularly
- Accumulated 200+ un-embedded observations
- Search quality degraded
Solution: Cron job every hour
0 * * * * /path/to/generate_embeddings.py >> /var/log/embeddings.log 2>&1
2. Monitor search quality
We added:
-- Track searches
CREATE TABLE search_log (
query TEXT,
results_count INTEGER,
avg_similarity FLOAT,
timestamp TIMESTAMP
);
-- Alert on poor results
SELECT query, AVG(avg_similarity)
FROM search_log
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY query
HAVING AVG(avg_similarity) < 0.6
ORDER BY AVG(avg_similarity);
If average similarity drops below 0.6, something’s wrong:
- Embedding model changed?
- Data quality declined?
- HNSW index needs rebuild?
3. Regular index maintenance
HNSW index can degrade over time:
-- Rebuild index monthly
DROP INDEX observations_embedding_idx;
CREATE INDEX observations_embedding_idx
ON observations
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- Update statistics
ANALYZE observations;
4. Backup everything
Murphy’s Law of Databases: “The probability of data loss is inversely proportional to backup frequency.”
Our backup strategy:
- Hourly: iCloud sync (automatic)
- Daily: Time Machine (macOS native)
- Weekly: Manual
pg_dumpto external drive - Monthly: Cloud backup (Backblaze)
Never lost data, even during experimentation.
Conclusion: Memory That Thinks
Six months ago, we had a database. Today, we have an intelligent insight engine.
The transformation:
- Keywords → Meaning: Search finds what you mean, not just what you say
- Local → Private: Your data never leaves your machine
- Storage → Intelligence: Discovers patterns you didn’t know existed
- Manual → Automatic: Embedding generation happens in background
- Single-device → Everywhere: Same memories on M1 and M3
The numbers:
- 2,656 observations semantically indexed
- 768 dimensions capturing nuance
- 100x faster with HNSW indexing
- $0 cost for embeddings (Ollama local)
- 5-8ms query time (sub-10ms is instant)
The impact:
- Found router DDoS solution in seconds (would have taken hours)
- Avoided git submodule disaster based on past experience
- Applied ESM/CommonJS fix pattern to new problem
- Discovered debugging methodology pattern across 20 different incidents
This is what AI-powered memory should be:
- Understands meaning, not just words
- Respects privacy by staying local
- Scales to hundreds of thousands of observations
- Costs nothing to run
- Just works
The next time you’re searching for something you know you’ve seen before but can’t find… maybe it’s time to upgrade from keywords to meaning.
Your memories deserve better than LIKE '%keyword%'. They deserve semantics.
Code Repository
All scripts and SQL from this post are available:
GitHub: https://github.com/myronkoch/semantic-memory-postgres
Contents:
setup_pgvector.sql- Database schema with vectorsgenerate_embeddings.py- Batch embedding generationsemantic_search.py- Search interfacesync_m1_to_icloud.sh- Export scriptsync_m3_from_icloud.sh- Import scripttune_hnsw_index.sql- Performance optimizationREADME.md- Complete setup guide
Technical Tags
vector-embeddings semantic-search postgresql pgvector ollama nomic-embed-text hnsw-index local-llm privacy-first cross-device-sync ai-infrastructure pattern-discovery knowledge-management
Metadata
Word Count: ~7,200 words
Code Blocks: 25+
Reading Time: 30-35 minutes
Skill Level: Intermediate to Advanced
Prerequisites: Basic SQL, Python, understanding of embeddings
Next in Series: Blog Post 021 - 8-Chain Testnet Ecosystem: Production Blockchain Infrastructure
Previous in Series: Blog Post 019 - ESM/CommonJS Migration Hell: The MCP Module Debugging Odyssey
Questions? Confused about embeddings? Want to share your semantic search setup? Find me on GitHub or LinkedIn
If this helped you build semantic search, consider ⭐ starring the repo!
Related Reading
Prerequisites
- The PostgreSQL Long-Term Memory System - Understand the database that this semantic search technology enhances.
Next Steps
- Context Window Management: Building AI-Friendly Code - See how intelligent memory helps manage context for AI agents.
Deep Dives
- The MBPS v2.1 Standard: How Chaos Became Order - The standardization effort that benefits from a searchable knowledge base.