Garry Tan Built an AI Brain in 12 Days—Now It's Yours
Your AI agent is brilliant in the moment and utterly useless tomorrow. Ask it about yesterday's meeting? Gone. That insight from last month's investor call? Evaporated. The connection between two people you met three years apart? Never existed as far as it's concerned.
This isn't a bug. It's the fundamental limitation that's holding back every AI agent deployment in production today. Your agent has no memory. It reasons beautifully over whatever you feed it right now, then discards everything. It's like hiring a genius consultant who suffers complete amnesia after every conversation.
Garry Tan—President and CEO of Y Combinator, the most powerful startup accelerator on Earth—faced this exact problem running his own AI agents. His solution wasn't another vector database wrapper or a RAG tutorial. He built GBrain, a full cognitive architecture that makes AI agents remember, connect, and grow smarter while you sleep. Seventeen thousand pages. Four thousand people. Seven hundred companies. Twenty-one cron jobs running autonomously. Built in twelve days.
And now he's open-sourced the entire system.
What is GBrain?
GBrain is Garry Tan's opinionated production brain for AI agents—specifically designed for OpenClaw and Hermes Agent deployments, though it works with any agent framework through MCP (Model Context Protocol). The repository lives at github.com/garrytan/gbrain and represents one of the most sophisticated personal knowledge base systems ever released as open source.
The name is deliberately evocative: this isn't a database, a search index, or a note-taking app. It's a brain—a system that ingests, enriches, connects, and consolidates information autonomously. The core insight is that raw information storage is worthless without structured relationships, compiled truth, and continuous maintenance.
GBrain's architecture reflects Tan's operational reality as a prolific investor and operator. His personal deployment ingests meetings, emails, tweets, voice calls, and original ideas. Every person and company encountered gets enriched with structured data. Citations fix themselves overnight. Memory consolidates automatically. The system implements what cognitive scientists call "consolidation"—the process by which short-term memories become long-term, structured knowledge.
The project emerged from Tan's own agent setup: he started with a simple markdown brain repo, one page per person, one page per company, with compiled truth on top and timeline below. Within a week, he had 10,000+ files, 3,000+ people, 13 years of calendar data, 280+ meeting transcripts, and 300+ captured ideas. What took 11 days to build by hand now ships as a mod you install in 30 minutes.
The system is benchmarked rigorously. On a 240-page Opus-generated rich-prose corpus, GBrain achieves P@5 49.1% and R@5 97.9%—beating its own graph-disabled variant by +31.4 points P@5 and ripgrep-BM25 plus vector-only RAG by similar margins. The graph layer and v0.12 extract quality together carry this gap. Full scorecards live in the sibling gbrain-evals repository.
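For readers unfamiliar with the metrics, here is a minimal sketch of how P@5 and R@5 are commonly computed for a single query. The function name and the sample slugs are illustrative assumptions, and the exact definitions used in gbrain-evals may differ (for example in how queries with fewer than five results are handled).

```python
def precision_recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> tuple[float, float]:
    """Return (P@k, R@k) for one query.

    P@k: share of the top-k results that are relevant (divides by k, a common convention).
    R@k: share of all relevant pages that appear in the top k.
    """
    top_k = retrieved[:k]
    hits = sum(1 for slug in top_k if slug in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query with two relevant pages, both found in the top 5.
p, r = precision_recall_at_k(
    ["people/alice", "people/bob", "companies/acme", "people/carol", "people/dan"],
    {"people/alice", "companies/acme"},
)
print(p, r)
```

When a query has only a couple of relevant pages, recall can be near-perfect while precision sits far below 100%, which is one way a high R@5 and a roughly 50% P@5 can coexist.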
Key Features That Separate GBrain From Everything Else
Hybrid Search Architecture. GBrain doesn't rely on vector search alone. It layers vector similarity (HNSW cosine over embeddings), keyword search (Postgres tsvector), Reciprocal Rank Fusion, multi-query expansion via Claude Haiku, cosine re-scoring, compiled-truth boosting, and backlink boosting. Each layer covers what the others miss. Vector search misses exact phrase matches; keyword search misses conceptual similarity. Together, they achieve what neither can alone.
Self-Wiring Knowledge Graph. Every page write triggers entity extraction with zero LLM calls. The system parses markdown links and bare slugs, infers relationship types through deterministic cascades (attended, works_at, invested_in, founded, advises), reconciles stale links on edits, and maintains multi-type constraints. Ask "who works at Acme AI?" or "what did Bob invest in this quarter?" and get answers that vector search alone cannot reach.
Compiled Truth + Timeline Pattern. Every page follows a strict structure: above the --- separator lives compiled truth (your current best understanding, rewritten when evidence changes); below lives an append-only timeline of evidence. This separation prevents the common failure mode where important assessments get buried in chronological noise.
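The page layout described above can be sketched in a few lines. This is an illustration only, not GBrain's parser: the function names are hypothetical, and it assumes the page has no YAML frontmatter, so the first line consisting of --- is the separator.

```python
def split_page(markdown: str) -> tuple[str, list[str]]:
    """Split a page into compiled truth (above the first '---' line) and timeline entries (below)."""
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "---":
            truth = "\n".join(lines[:i]).strip()
            timeline = [entry.strip() for entry in lines[i + 1:] if entry.strip()]
            return truth, timeline
    # No separator: treat the whole page as compiled truth with an empty timeline.
    return markdown.strip(), []


def append_evidence(markdown: str, entry: str) -> str:
    """Append-only update: add a timeline entry without touching compiled truth."""
    truth, timeline = split_page(markdown)
    timeline.append(entry)
    return truth + "\n\n---\n\n" + "\n".join(timeline) + "\n"


page = "# Alice\nCEO of Acme AI.\n\n---\n\n- 2024-01-05: Met at demo day.\n"
print(append_evidence(page, "- 2024-03-01: Follow-up call."))
```

Keeping the two halves separable is what lets a ranking step treat the assessment differently from the raw evidence beneath it.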
Deterministic Classification with Fail-Improve Loop. The intent classifier routes queries automatically (entity? temporal? event? general?) and improves over time by logging every LLM fallback and generating better regex patterns from failures. gbrain doctor shows the trajectory: "intent classifier: 87% deterministic, up from 40% in week 1."
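A rough sketch of the idea, assuming a simple ordered list of regex rules: queries that match a rule are classified with no model call, and every miss is logged so new patterns can be written later. The patterns, the class name, and the llm_fallback callable are all hypothetical; GBrain's actual rules and logging format are not shown in this article.

```python
import re
from typing import Callable

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    ("temporal", re.compile(r"\b(yesterday|today|last (week|month|quarter|year)|this (week|month|quarter|year))\b", re.I)),
    ("event", re.compile(r"\b(meeting|call|dinner|demo day)\b", re.I)),
    ("entity", re.compile(r"^(who|what) (is|are|was)\b|\bworks? at\b", re.I)),
]


class IntentClassifier:
    def __init__(self, llm_fallback: Callable[[str], str]):
        self.llm_fallback = llm_fallback            # only called when no rule matches
        self.fallback_log: list[tuple[str, str]] = []  # raw material for future regex rules
        self.total = 0

    def classify(self, query: str) -> str:
        self.total += 1
        for intent, pattern in RULES:
            if pattern.search(query):
                return intent
        intent = self.llm_fallback(query)
        self.fallback_log.append((query, intent))
        return intent

    def deterministic_rate(self) -> float:
        """Share of queries answered by rules alone."""
        if self.total == 0:
            return 0.0
        return 1 - len(self.fallback_log) / self.total


clf = IntentClassifier(llm_fallback=lambda q: "general")  # stand-in for a real model call
for q in ["who works at Acme AI?", "what themes show up across my notes?"]:
    print(q, "->", clf.classify(q))
print(f"{clf.deterministic_rate():.0%} deterministic")
```

The fallback log is the improvement loop: each logged miss is a candidate for a new rule, which is how the deterministic share can climb over time.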
34 Production Skills. From signal-detector (fires on every message, capturing ideas and entities in parallel) to brain-ops (brain-first lookup before any external API) to book-mirror (personalized two-column chapter-by-chapter book analysis mapping ideas to your life using your own words). Skills are "fat markdown documents" encoding entire workflows—when to fire, what to check, how to chain with other skills, what quality bar to enforce.
Minions: Durable Background Execution. A Postgres-native job queue that survives gateway restarts, streams progress, and can be paused, resumed, or steered mid-flight. For deterministic work (pull posts, parse JSON, write brain page, run sync), Minions finished in 753ms wall time at $0.00 in tokens with a 100% success rate. Under equivalent load, sub-agent spawn hit the >10,000ms gateway timeout at ~$0.03 per run with 0% success.
BrainBench-Real Evaluation. With GBRAIN_CONTRIBUTOR_MODE=1, every real query + search call gets captured (PII-scrubbed) for replay against code changes. Three numbers come back: mean Jaccard@k between captured and current retrieved slugs, top-1 stability, and latency delta. Off by default for production users.
14 Embedding Provider Recipes. OpenAI is the default, but GBrain ships configurations for Voyage, Google Gemini, Azure OpenAI, MiniMax, Alibaba DashScope, Zhipu, Ollama (local), llama.cpp llama-server (local), LiteLLM proxy (universal), and others.
Use Cases Where GBrain Transforms Agent Capabilities
Investor and Founder Relationship Intelligence
You're preparing for a meeting with a founder you last spoke to eight months ago. Your agent pulls their full dossier: every meeting transcript, every email exchange, every company they're connected to, every investment they've made that quarter, every person you both know. The graph traversal finds the connection through three degrees you forgot existed. "Prep me for my meeting with Jordan in 30 minutes" becomes a single command with comprehensive context.
Longitudinal Knowledge Synthesis
You've been thinking about a problem for years—scattered across thousands of notes, tweets, meeting transcripts, and voice memos. Ask "What have I said about the relationship between shame and founder performance?" and GBrain searches YOUR thinking, not the internet. The concept-synthesis skill deduplicates thousands of stubs into a tiered intellectual map (T1 Canon to T4 Riff), tracing how ideas evolved across years of notes.
Autonomous Content Ingestion and Enrichment
Your agent ingests meetings, emails, tweets, voice calls, and original ideas while you sleep. Every attendee gets an enriched person page. Every mentioned company gets a timeline entry. The dream cycle runs 11 phases overnight: lint, backlinks, sync, synthesize, extract, patterns, emotional-weight recompute, consolidate, embed, orphans, purge. You wake up and the brain is smarter than when you went to bed.
Multi-Agent Research with Verified Claims
The perplexity-research skill sends brain context to Perplexity so search focuses on what's NEW versus already-known. The academic-verify skill traces research claims through publication → methodology → raw data → independent replication, producing a verdict: verified, partial, unverifiable, misattributed, or retracted. No more hallucinated citations or confident-sounding falsehoods.
Step-by-Step Installation & Setup Guide
Prerequisites
- Bun runtime (required; Node.js insufficient for postinstall hooks)
- Git
- API keys for your chosen embedding provider (OpenAI default)
Method 1: Agent-Installed (Recommended)
GBrain is designed to be installed and operated by an AI agent. If you don't have one running:
- OpenClaw: Deploy AlphaClaw on Render (one click, 8GB+ RAM)
- Hermes Agent: Deploy on Railway (one click)
Paste this into your agent:
Retrieve and follow the instructions at:
https://raw.githubusercontent.com/garrytan/gbrain/master/INSTALL_FOR_AGENTS.md
The agent clones the repo, installs GBrain, sets up the brain, loads 34 skills, and configures recurring jobs. You answer questions about API keys. ~30 minutes total.
If your agent doesn't auto-read AGENTS.md, point it at:
- https://raw.githubusercontent.com/garrytan/gbrain/master/AGENTS.md (non-Claude agents)
- https://raw.githubusercontent.com/garrytan/gbrain/master/CLAUDE.md (Claude Code specifically)
Method 2: Standalone CLI
# Clone and install (CRITICAL: do NOT use bun install -g or npm install -g)
git clone https://github.com/garrytan/gbrain.git && cd gbrain && bun install && bun link
# Initialize local brain—database ready in 2 seconds (PGLite, no server)
gbrain init
# Picks a search mode: conservative / balanced / tokenmax
# Index your existing notes
gbrain import ~/notes/
# Start querying
gbrain query "what themes show up across my notes?"
# Check active search mode and per-knob attribution
gbrain search modes
# Monitor cache hit rate and intent mix after real usage
gbrain search stats
Critical installation warnings:
- Do NOT use bun install -g github:garrytan/gbrain. Bun blocks the top-level postinstall hook, schema migrations never run, and the CLI aborts with Aborted() on first PGLite open. See #218.
- Do NOT use bun add -g gbrain or npm install -g gbrain. The npm registry has an unrelated package squatting that name (gbrain@1.3.x). v0.28.5+ detects this and prints recovery on gbrain upgrade, but git clone + bun link is the only reliable path until @garrytan/gbrain publishes. See #658.
MCP Server Configuration
For Claude Code, Cursor, Windsurf:
{
"mcpServers": {
"gbrain": { "command": "gbrain", "args": ["serve"] }
}
}
Add to ~/.claude/server.json (Claude Code), Settings > MCP Servers (Cursor), or your client's MCP config.
Remote MCP with OAuth 2.1
# Start production-grade OAuth 2.1 server with embedded admin dashboard
gbrain serve --http --port 3131
# Open admin, paste bootstrap token, register client
open http://localhost:3131/admin
# Expose publicly (set --public-url so OAuth issuer matches)
ngrok http 3131 --url your-brain.ngrok.app
gbrain serve --http --port 3131 --public-url https://your-brain.ngrok.app
Register OAuth clients from /admin — click Register client, pick scopes, save credentials shown once in the reveal modal. Source-scoped clients (v0.34): gbrain auth register-client my-agent --source dept-x ties write authority to one source; --federated-read S1,S2,S3 adds orthogonal read-scope for shared brains.
REAL Code Examples from the Repository
Example 1: Basic Query and Search Operations
The CLI provides both keyword search (gbrain search) and hybrid semantic search (gbrain query). Here's the actual output format from a real query:
# Hybrid search with structured results
gbrain query "what themes show up across my notes?"
Output:
3 results (hybrid search, 0.12s):
1. concepts/do-things-that-dont-scale (score: 0.94)
PG's argument that unscalable effort teaches you what users want.
[Source: paulgraham.com, 2013-07-01]
2. originals/founder-mode-observation (score: 0.87)
Deep involvement isn't micromanagement if it expands the team's thinking.
3. concepts/build-something-people-want (score: 0.81)
The YC motto. Connected to 12 other brain pages.
What's happening here: The query passes through GBrain's full search pipeline: intent classification ("general"—not entity or temporal), multi-query expansion (Haiku rephrases 3 ways), vector search (HNSW cosine) plus keyword search (tsvector), RRF fusion scoring each result by sum(1/(60 + rank)), cosine re-scoring against the actual query embedding, compiled-truth boost (assessments outrank timeline noise), and backlink boost (well-connected entities rank higher). The 4-layer dedup guarantees one compiled-truth chunk per page. The result shows not just relevance scores but provenance (source, date) and graph connectivity ("Connected to 12 other brain pages")—critical signals for trust and exploration.
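The fusion step can be sketched directly from the formula quoted above, sum(1/(60 + rank)). This is a minimal illustration, not GBrain's code: the function name and sample slugs are assumptions, ranks are assumed to start at 1, and the later re-scoring and boosting stages are left out.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per result."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, slug in enumerate(ranking, start=1):  # rank 1 = best (assumed 1-based)
            scores[slug] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by slug for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


vector_hits = ["concepts/a", "concepts/b", "concepts/c"]   # hypothetical vector ranking
keyword_hits = ["concepts/b", "concepts/d", "concepts/a"]  # hypothetical keyword ranking
for slug, score in rrf_fuse([vector_hits, keyword_hits]):
    print(f"{slug}  {score:.4f}")
```

Because only ranks are used, the two retrievers never need comparable score scales: a page that both lists place near the top outranks a page that only one list likes.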
Example 2: Graph Traversal for Relationship Queries
Vector search fails completely on relational questions. GBrain's graph layer answers them directly:
# Find who Alice met with, transitively (2 degrees)
gbrain graph-query people/alice --type attended --depth 2
This executes a recursive CTE with cycle prevention, type-filtered edges, and depth capping (≤10 for remote MCP as DoS prevention). The graph was auto-wired on every page write with zero LLM calls—entity references extracted via regex, typed through deterministic inference cascade (FOUNDED → INVESTED → ADVISES → WORKS_AT), with page-role priors (partner-bio language → invested_in).
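To make the traversal semantics concrete, here is an in-memory breadth-first sketch of the same idea: type-filtered edges, a visited set for cycle prevention, and a hard depth cap. It is a stand-in for the recursive CTE, not a port of it. The function name and sample data are hypothetical, and treating edges as undirected is an assumption made so that two people who attended the same meeting are two hops apart.

```python
from collections import deque

Edge = tuple[str, str, str]  # (source_slug, link_type, target_slug)


def graph_query(edges: list[Edge], start: str, link_type: str,
                depth: int, max_depth: int = 10) -> dict[str, int]:
    """Return {slug: hops} for pages reachable from start over edges of one type."""
    depth = min(depth, max_depth)  # depth cap, mirroring the <=10 limit mentioned above
    adjacency: dict[str, list[str]] = {}
    for src, typ, dst in edges:
        if typ == link_type:  # type-filtered edges
            adjacency.setdefault(src, []).append(dst)
            adjacency.setdefault(dst, []).append(src)  # assumption: traverse both directions
    seen = {start}  # cycle prevention
    found: dict[str, int] = {}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found[neighbor] = hops + 1
                queue.append((neighbor, hops + 1))
    return found


edges = [
    ("people/alice", "attended", "meetings/kickoff"),
    ("people/bob", "attended", "meetings/kickoff"),
    ("people/bob", "attended", "meetings/review"),
    ("people/alice", "works_at", "companies/acme"),
]
print(graph_query(edges, "people/alice", "attended", depth=2))
```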
Backfill an existing brain in one command:
# Wire up existing pages (run once on legacy brains)
gbrain extract links --source db # extract typed relationships from all pages
gbrain extract timeline --source db # extract dated events from markdown timelines
After backfill, the same graph queries work, and search ranking improves due to backlink boost. The benchmark delta is massive: +31.4 points P@5 versus graph-disabled, with v0.11→v0.12 improvement from 22.1% to 49.1% P@5 on identical inputs—proving typed-link extract quality is load-bearing.
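The zero-LLM extraction step can be pictured as a regex pass followed by a first-match-wins cascade. The sketch below is an assumption-heavy illustration: the keyword patterns, the "mentions" fallback type, and the function name are invented for this example, and GBrain's real rules (bare slugs, page-role priors, stale-link reconciliation) are not reproduced.

```python
import re

# Markdown links such as [Acme AI](companies/acme-ai.md); a trailing .md is dropped.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+?)(?:\.md)?\)")

# Cascade order follows the article: FOUNDED -> INVESTED -> ADVISES -> WORKS_AT.
# The trigger phrases themselves are hypothetical.
CASCADE = [
    ("founded", re.compile(r"\b(co-?founded|founded|founder of)\b", re.I)),
    ("invested_in", re.compile(r"\b(invested in|led the round|investor in)\b", re.I)),
    ("advises", re.compile(r"\b(advises|advisor to)\b", re.I)),
    ("works_at", re.compile(r"\b(works at|joined|engineer at|ceo of)\b", re.I)),
]


def extract_links(page_slug: str, markdown: str) -> list[tuple[str, str, str]]:
    """Return (source, link_type, target) triples using regex only, no model calls."""
    links = []
    for line in markdown.splitlines():
        for match in LINK_RE.finditer(line):
            target = match.group(2)
            if "://" in target:  # skip external URLs; only brain pages become edges
                continue
            link_type = "mentions"  # hypothetical fallback when no rule fires
            for name, pattern in CASCADE:
                if pattern.search(line):
                    link_type = name
                    break  # first match in the cascade wins
            links.append((page_slug, link_type, target))
    return links


text = "Alice founded [Acme AI](companies/acme-ai.md) in 2021.\nShe later invested in [Beta Labs](companies/beta-labs)."
print(extract_links("people/alice", text))
```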
Example 3: Skillify Workflow—From Bug to Permanent Fix
The skillify command turns one-off fixes into durable, tested, maintained skills:
# 1. Scaffold all 5 stub files for a new skill in one shot
gbrain skillify scaffold webhook-verify \
--description "verify ngrok webhooks" \
--triggers "verify the webhook,check tunnel" \
--writes-pages --writes-to people/,companies/
# 2. Replace SKILLIFY_STUB sentinels with real logic + tests
$EDITOR skills/webhook-verify/scripts/webhook-verify.mjs
$EDITOR test/webhook-verify.test.ts
# 3. Run the 10-item audit: SKILL.md, script, unit + E2E tests,
# LLM evals, resolver entry, trigger eval, check-resolvable gate, brain filing
gbrain skillify check skills/webhook-verify/scripts/webhook-verify.mjs
# 4. Verify the whole tree: reachability, MECE overlap, DRY, routing gaps,
# filing audit, SKILLIFY_STUB sentinels (fails if any remain)
gbrain check-resolvable # warnings advisory
gbrain check-resolvable --strict # warnings block too (CI opt-in)
Why this matters: Hermes and similar frameworks auto-create skills as background behavior, producing opaque piles nobody has read, tested, or verified still works. GBrain keeps the human in the loop with explicit commands at every step. The scaffold completes in under 2 seconds; your real work (rules, scripts, tests) is where time goes. The routing-eval.jsonl fixture catches routing gaps your users actually hit—false positives, missed routes, tautological fixtures all surface as specific advisories with exact file:line.
Example 4: Minions Job Submission and Supervision
For deterministic background work that must survive restarts:
# Verify Minions installation
gbrain jobs smoke
# Submit a background job with parameters
gbrain jobs submit sync --params '{}'
# Start canonical auto-restarting worker (Postgres only)
gbrain jobs supervisor --concurrency 4
# Monitor health dashboard
gbrain jobs stats
# List, manage, and prune jobs
gbrain jobs list --status completed
gbrain jobs get 1247
gbrain jobs cancel 1247
gbrain jobs prune --older-than 30d
The supervisor subcommand keeps workers alive across crashes with exponential backoff, atomic PID locking, structured audit events at ~/.gbrain/audit/supervisor-*.jsonl, and start --detach / status --json / stop for agent automation. In containers it runs as PID 1; on systemd hosts it's the child of gbrain-worker.service.
The production numbers are stark: under 19-cron load, sub-agent spawn couldn't clear the 10-second gateway wall. Minions landed in 753ms for $0.00 in tokens; sub-agent spawn timed out at >10,000ms, cost ~$0.03 per run, and succeeded 0% of the time. For 19,240 posts across 36 months: Minions took ~15 minutes total at $0.00; sub-agents took ~9 minutes best case, with ~$1.08 in tokens and ~40% spawn failure. Durability test: SIGKILL mid-flight, 10/10 rescued.
Example 5: Named Search Modes with Cost Transparency
v0.32.3 introduces explicit cost/quality tradeoffs:
# Check active mode and attribution
gbrain search modes
# See cost after real usage
gbrain search stats
# Get data-driven recommendations
gbrain search tune
The cost spread depends on both mode and downstream model—25x corner-to-corner at 10K queries/month:
| Mode \ Downstream | Haiku 4.5 ($1/M) | Sonnet 4.6 ($3/M) | Opus 4.7 ($5/M) |
|---|---|---|---|
| conservative (~4K tokens) | $40/mo | $120/mo | $200/mo |
| balanced (~10K tokens) | $100/mo | $300/mo | $500/mo |
| tokenmax (~20K tokens) | $200/mo | $600/mo | $1,000/mo |
The mode is auto-suggested from the configured models.tier.subagent. Non-TTY installs auto-pick balanced and print a hint. Natural pairings span ~4x at realistic single-user volume, so your model choice matters as much as your search mode.
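The figures in the cost table follow from simple arithmetic: tokens per query, times queries per month, times the downstream price per million tokens. The sketch below reproduces them using only the numbers quoted in the table; the dictionary and function names are illustrative, and the calculation assumes every query uses the mode's approximate token budget.

```python
# Approximate tokens sent downstream per query, per search mode (from the table).
TOKENS_PER_QUERY = {"conservative": 4_000, "balanced": 10_000, "tokenmax": 20_000}
# Downstream model price in dollars per million tokens (from the table).
PRICE_PER_MILLION = {"Haiku 4.5": 1.0, "Sonnet 4.6": 3.0, "Opus 4.7": 5.0}


def monthly_cost(mode: str, model: str, queries_per_month: int = 10_000) -> float:
    """Estimated monthly downstream spend in dollars."""
    tokens = TOKENS_PER_QUERY[mode] * queries_per_month
    return tokens / 1_000_000 * PRICE_PER_MILLION[model]


cheapest = monthly_cost("conservative", "Haiku 4.5")
priciest = monthly_cost("tokenmax", "Opus 4.7")
print(f"${cheapest:,.0f}/mo to ${priciest:,.0f}/mo ({priciest / cheapest:.0f}x spread)")
```

Changing queries_per_month scales every cell linearly, so the spread between corners stays the same at any volume.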
Advanced Usage & Best Practices
Storage Tiering for Scale. When your brain crosses 100K files, declare which directories live in git versus database-only:
# gbrain.yml at brain repo root
storage:
db_tracked:
- people/
- companies/
- deals/
db_only:
- media/x/
- media/articles/
- meetings/transcripts/
gbrain sync auto-manages .gitignore for db_only paths. gbrain export --restore-only --repo . repopulates missing files from database.
Model Routing Optimization. The tier system lets you assign different models to different tasks without per-call configuration:
# Set defaults and per-task overrides
gbrain config set models.default opus
gbrain config set models.tier.deep opus
# Verify all configured models are reachable
gbrain models doctor
gbrain models doctor runs 1-token reachability probes for each chat/expansion model plus zero-token embedding_config probes—catching model_not_found and Voyage flexible-dim misconfigs before silent degradation.
Contributor Mode for Quality Assurance. Set GBRAIN_CONTRIBUTOR_MODE=1 in your shell to capture real queries for regression testing:
# Export captured queries as NDJSON
gbrain eval export
# Replay against current code, get Jaccard@k, top-1 stability, latency delta
gbrain eval replay --against my-branch.ndjson
# Run public LongMemEval benchmark
gbrain eval longmemeval dataset.jsonl
Capture is off by default—no surprise data accumulation for production users.
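Two of the replay numbers, mean Jaccard@k and top-1 stability, are easy to illustrate. The sketch below assumes each captured query is paired with the slugs retrieved at capture time and the slugs retrieved by the current code; the function names and data shapes are hypothetical, and latency delta is omitted because it needs timing data.

```python
def jaccard_at_k(captured: list[str], current: list[str], k: int) -> float:
    """Overlap between two top-k result sets: |A intersect B| / |A union B|."""
    a, b = set(captured[:k]), set(current[:k])
    if not a and not b:
        return 1.0  # two empty result sets are treated as identical
    return len(a & b) / len(a | b)


def replay_summary(pairs: list[tuple[list[str], list[str]]], k: int = 5) -> dict[str, float]:
    """Summarize how far current retrieval drifted from the captured results."""
    if not pairs:
        raise ValueError("no captured queries to replay")
    n = len(pairs)
    mean_jaccard = sum(jaccard_at_k(captured, current, k) for captured, current in pairs) / n
    # Top-1 stability: how often the first result is unchanged.
    top1 = sum(1 for captured, current in pairs if captured[:1] == current[:1]) / n
    return {"mean_jaccard_at_k": mean_jaccard, "top1_stability": top1}


pairs = [
    (["people/alice", "people/bob", "people/carol"], ["people/alice", "people/carol", "people/dan"]),
    (["companies/x", "companies/y"], ["companies/y", "companies/x"]),
]
print(replay_summary(pairs, k=3))
```

The second pair shows why both numbers are useful: the result set is identical (Jaccard of 1.0) even though the top hit changed.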
Hot Memory for Real-Time Context. v0.31's recall command surfaces cross-session facts queryable in real time:
gbrain recall <entity> # active facts, newest first
gbrain recall --since "1h ago" # recency-filtered
gbrain recall --today # markdown with kind icons
gbrain recall --as-context # prompt-injection-ready for headless agents
gbrain forget <fact-id> # soft delete
Comparison with Alternatives
| Capability | GBrain | Pinecone/Weaviate | Obsidian + Plugins | Custom RAG | Notion AI |
|---|---|---|---|---|---|
| Self-wiring graph | ✅ Native, zero LLM calls | ❌ Manual metadata | ⚠️ Community plugins | ❌ Build yourself | ❌ No |
| Hybrid search (vector + keyword + RRF) | ✅ Production-proven | ⚠️ Vector only, or build hybrid | ❌ Keyword mostly | ⚠️ Build yourself | ❌ Basic search |
| Agent-native design | ✅ MCP, OAuth 2.1, 34 skills | ❌ Database only | ❌ Human-centric | ⚠️ Build yourself | ⚠️ Limited API |
| Deterministic background jobs | ✅ Minions (Postgres-native) | ❌ None | ❌ None | ❌ Build yourself | ❌ None |
| Compiled truth + timeline | ✅ Enforced structure | ❌ Unstructured | ⚠️ Templates possible | ❌ Build yourself | ❌ Flat pages |
| Continuous auto-enrichment | ✅ 21 cron jobs, dream cycle | ❌ None | ❌ Manual | ❌ Build yourself | ❌ None |
| Evaluation framework | ✅ BrainBench, LongMemEval, replay | ❌ None | ❌ None | ❌ Build yourself | ❌ None |
| Embedding provider flexibility | ✅ 14 recipes, local options | ⚠️ Limited | ❌ None | ⚠️ Build yourself | ❌ Proprietary |
| Open source | ✅ MIT | ⚠️ Various | ❌ Proprietary | ✅ Your code | ❌ Proprietary |
| Setup time | ~30 minutes | Hours to days | Hours | Weeks | Minutes (limited) |
The fundamental difference: GBrain is not a database or search tool. It's a complete cognitive architecture. Vector databases store embeddings. GBrain stores understanding—structured, connected, maintained, and evaluable.
FAQ
Does GBrain require OpenAI? No. OpenAI is the default embedding provider, but GBrain ships with 14 recipes covering Voyage, Google Gemini, Azure OpenAI, MiniMax, Alibaba DashScope, Zhipu, Ollama (local), llama.cpp (local), LiteLLM proxy, and more. Run gbrain providers list or gbrain doctor to see alternatives your environment already supports.
Can I use GBrain without an AI agent? Yes, via standalone CLI. But it's designed for agent operation—the 34 skills expect an agent reader. For pure human use, you'd miss the autonomous ingestion, enrichment, and maintenance cycles.
How does GBrain handle PII and privacy? The soul-audit skill generates a 4-tier ACCESS_POLICY.md. BrainBench-Real capture is opt-in via GBRAIN_CONTRIBUTOR_MODE=1. OAuth 2.1 scopes every request. Source-scoped clients (v0.34) restrict write authority. PII scrubbing runs before any eval capture.
What's the maximum brain size? Tan's production brain: 45,000 pages in Supabase Postgres. PGLite (default, embedded, zero config) handles thousands of files comfortably. Migrate with gbrain migrate --to supabase when you outgrow local.
How does this differ from Mem0 or other "memory" layers? Mem0 stores conversation history. GBrain builds a knowledge graph with typed relationships, compiled truth, and continuous maintenance. Ask Mem0 "what did Bob invest in?"—it searches text. Ask GBrain—it traverses invested_in edges with precision. The benchmark delta (+31.4 P@5) quantifies this gap.
Can multiple agents share one brain? Yes, via federated sources. --federated-read S1,S2,S3 lets departments write to one canon while reading the union. OAuth 2.1 clients can be scoped per-source. See docs/architecture/topologies.md for multi-machine and multi-worktree setups.
What happens when the agent makes a mistake? The maintain skill runs periodic health checks: stale pages, orphans, dead links, citation audit, back-link enforcement, tag consistency. The citation-fixer scans and repairs malformed citations. The cross-modal-review quality gate routes through a second model with refusal switching. And gbrain doctor surfaces issues with exact fix commands.
Conclusion
The AI agent revolution is stuck on a critical bottleneck: memory. Every impressive demo forgets everything by morning. Every "autonomous" agent rebuilds context from scratch. Every "personalized" interaction is actually just a longer prompt.
GBrain solves this at the architectural level—not with a bigger vector database, but with a complete cognitive system: structured knowledge representation, deterministic relationship extraction, continuous autonomous maintenance, and rigorous evaluation. The benchmarks prove it works. The production deployment proves it scales. The open-source release proves it's real.
Garry Tan built this for himself—17,888 pages, 4,383 people, 723 companies, running 21 cron jobs autonomously—and now you can install it in 30 minutes. The skills improve as his personal agent improves. Your agent benefits from his operational refinements.
Stop building amnesiac agents. Give them a brain.
Clone github.com/garrytan/gbrain, run gbrain init, and start building agents that actually remember.