Session 5 — Semantic Signatures
What We Built
Real meaning-based signatures replace random number generation. Similar concepts now produce similar signatures automatically. The brain understands relationships between ideas for the first time.
The Problem With Random Signatures
Before Session 5, every node's signature was 64 random numbers. The architecture worked perfectly but had no understanding — "dog" and "puppy" were as unrelated as "dog" and "quantum physics."
How Semantic Signatures Work
- Input — a word or phrase like "firewall"
- Encode — sentence-transformers converts it to a 384-dimension embedding vector
- Project — a fixed projection matrix reduces 384 → 64 dimensions
- Normalize — unit length normalization for cosine similarity
- Store — the 64-number vector becomes the node's signature
Similar concepts produce similar vectors automatically — no manual labeling needed.
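The five steps above can be sketched in a few lines. This is a minimal sketch, not the project's actual `embeddings.py`: the model name (`all-MiniLM-L6-v2`), the 384 → 64 projection, and the seed 2025 come from the text, but the helper names and the exact projection scaling are illustrative assumptions.

```python
import numpy as np

EMBED_DIM = 384  # all-MiniLM-L6-v2 output size
SIG_DIM = 64     # signature size used by the brain

# Fixed projection matrix: same seed (2025) every run, so signatures are stable
# across sessions. The sqrt scaling is a common convention, assumed here.
_rng = np.random.default_rng(2025)
_PROJECTION = _rng.standard_normal((EMBED_DIM, SIG_DIM)) / np.sqrt(SIG_DIM)

def project_and_normalize(embedding: np.ndarray) -> np.ndarray:
    """Reduce 384 -> 64 dims, then unit-normalize so dot product = cosine."""
    sig = embedding @ _PROJECTION
    return sig / np.linalg.norm(sig)

def concept_to_signature(text: str) -> np.ndarray:
    """Full pipeline: encode -> project -> normalize -> 64-dim signature."""
    # Imported lazily so the NumPy helpers above work without the model installed.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("all-MiniLM-L6-v2")  # cached after first download
    embedding = model.encode(text)                   # 384-dim vector
    return project_and_normalize(embedding)
```

Because the projection matrix is fixed rather than regenerated, the same concept always maps to the same signature, and two similar embeddings land on nearby signatures.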
Key Concepts
Word Embeddings
Vectors where geometry encodes meaning. Similar meanings produce vectors that point in similar directions.
The Famous Example
king − man + woman ≈ queen
That's not magic — it's just geometry. Meaning becomes math.
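To see the geometry concretely, here is a toy 2-axis "meaning space" with invented coordinates (real embeddings have hundreds of dimensions, and the numbers below are made up purely for illustration):

```python
import numpy as np

# Toy axes: dimension 0 = royalty, dimension 1 = gender (+1 = female).
# All coordinates are invented for this illustration.
words = {
    "king":  np.array([1.0, -1.0]),
    "queen": np.array([1.0,  1.0]),
    "man":   np.array([0.0, -1.0]),
    "woman": np.array([0.0,  1.0]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# king - man + woman: remove maleness, add femaleness, keep royalty.
target = words["king"] - words["man"] + words["woman"]

# The nearest word to the result is "queen" -- pure vector arithmetic.
nearest = max(words, key=lambda w: cosine(words[w], target))
```

In this toy space `target` lands exactly on `queen`'s coordinates; in real embedding spaces the match is approximate but still the nearest neighbor.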
Analogy
A map of meaning. Every concept gets coordinates. Nearby coordinates = similar meaning.
Dimension Reduction
384 dimensions → 64 dimensions via a fixed random projection matrix (seed 2025). Preserves semantic relationships — similar concepts stay similar after projection.
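Why a random matrix preserves similarity: random Gaussian projections approximately preserve angles and distances (the Johnson–Lindenstrauss idea). A quick sanity check with synthetic vectors (the vectors here are invented; only the 384 → 64 shape and the seed 2025 come from the text):

```python
import numpy as np

rng = np.random.default_rng(2025)  # same fixed seed as the signature matrix
projection = rng.standard_normal((384, 64))

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two synthetic 384-dim vectors made deliberately similar via a shared component.
base = rng.standard_normal(384)
a = base + 0.5 * rng.standard_normal(384)
b = base + 0.5 * rng.standard_normal(384)

before = cosine(a, b)                            # similarity in 384 dims
after = cosine(a @ projection, b @ projection)   # similarity in 64 dims
# before and after stay close: similar concepts remain similar after projection.
```

The projection loses some precision (64 numbers can't carry everything 384 did), but the relative ordering of similarities survives, which is all the signature comparison needs.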
The Model
all-MiniLM-L6-v2 — small, fast, CPU-friendly. ~80MB download, cached locally after first use. No API key, no internet required after download.
Similarity Results (Verified)
| Pair | Score | Interpretation |
|---|---|---|
| dog vs puppy | 0.814 | Almost the same concept ✅ |
| dog vs cat | 0.773 | Similar — both animals ✅ |
| dog vs database | 0.278 | Barely related ✅ |
| networking vs firewall | 0.535 | Related domain ✅ |
| networking vs cooking | 0.439 | Less related ✅ |
New Methods Added
```python
# New file — embeddings.py
concept_to_signature("firewall")         # single concept → 64-dim vector
concepts_to_signatures(["DNS", "VPN"])   # batch → faster
similarity("dog", "puppy")               # test similarity between two concepts
```

```python
# New methods in KnowledgeLandscape
brain.learn("firewall")                        # teach one concept
brain.learn_many(["DNS", "VPN", "router"])     # batch teach
brain.query_concept("network security")        # query by name
brain.concept_label(node_id)                   # get human-readable label
```
TurboQuant Context
Google published TurboQuant in March 2026 — a compression algorithm working in the same problem space as NCI. The key difference:
- TurboQuant — compresses existing large models after the fact
- NCI — compression is native to the architecture from the start
TurboQuant researchers noted their approach is approaching its theoretical limit. NCI starts from a different foundation entirely.
Key Files
- embeddings.py — semantic signature generation
- index.py — learn(), learn_many(), query_concept(), concept_label()