I Built Three Versions of My AI Clone. The Third One Works - But It's Slower.

I was re-explaining myself to Claude for the third time that morning.

Same voice guidelines. Same strategic frameworks. Same context about how I think through M&A deals versus how I talk through parenting decisions.

Every new chat session started from zero.

I wanted an AI that remembered how I think - the way I break down business strategy differently than I talk through fatherhood conversations. For me, that meant M&A frameworks versus parenting advice, but the pattern applies to anyone with distinct professional and personal contexts.

So I started building.

2 weeks later, the third version finally works. It's also 40% slower than the first one.

And I'm completely fine with that.

V1: Stuff Everything In

The first version had one core assumption: give the model enough context and it'll figure it out.

The system prompt loaded a full 45KB voice profile, a 6KB Clifton Strengths document, and the top 5 retrieved chunks from a single RAG query - all in one pass. Model routing was handled by a shouldEscalate() function that scanned for keywords like "strategy" or "analyse" and switched between Haiku and Sonnet accordingly.

It worked. It sounded like me.

But as I added more content and learnings to the knowledge base, it started to break down in a specific way: if it got the facts right, it would drop the voice. If it nailed the voice, it missed the detail.

The first time it dropped my voice mid-response, I felt that same re-explaining exhaustion I was trying to escape.

The context window was being asked to do too much at once - reconcile identity, personality, strength profiles, and retrieved content in a single pass. That's where the dilution started.

V2: Tune the Parameters

Rather than restructure, I tried to push the existing architecture further. Over a focused evening, I made a series of targeted fixes:

Bumped retrieval from 5 chunks to 10
Combined the last 3 user messages into a single search query so follow-ups retained topic context
Filtered the voice profile document down to the sections actually relevant to chat
Forced learnings to always surface at least 15 chunks, inserted before blog content
Dropped Haiku entirely - it couldn't follow complex voice and retrieval instructions reliably

Each parameter tweak felt like progress until it didn't.

Each of these was a genuine improvement. But together, they revealed the ceiling.

Node	Role
JobRouter (Haiku)	Classify intent, select Clifton lens, generate per-namespace queries
Search	Run decomposed queries - blogs, learnings, frameworks, experiences get separate targeted searches
Reranker (BGE)	Fix exact-term retrieval that cosine similarity misses
Reasoning (Sonnet)	Apply the Clifton lens to produce structured analysis
Voice synthesis (Sonnet, cached)	Convert structured analysis into my voice

AI Marcus - Beta

Ask me anything

How I Built, Broke, and Rebuilt My Own AI Three Times

I Built Three Versions of My AI Clone. The Third One Works - But It's Slower.

V1: Stuff Everything In

V2: Tune the Parameters

Ask AI Marcus about this post

Marcus Hahnheuser

Stay in the loop

Continue exploring

The Developer Built It While I Was Still Reading The Approval Docs

AI Made Us More Productive, So Why Are We Drowning in Reports?

You're Not Building an AI Advantage. You're Building Someone Else's.

V3: Separate the Concerns

What This Means If You're Building AI Systems

The Real Lesson