codebuddy.tech

building in public from Vancouver

Building My AI Content Pipeline: From Rambling to Published in 12 Languages

I finished this pipeline last night, and I'm genuinely proud of it. Three AI models (Claude, DeepSeek, and a locally hosted Whisper for speech recognition), a Telegram bot, and a translation layer—all to turn a shaky 10-minute phone monologue into a polished blog post published in 12 languages. Most people would call this over-engineered. I'd call it the only way I could make myself actually publish consistently.

The Problem: I'm Not a YouTube Personality Yet

My brain doesn't fire quickly on camera. I'm not one of those people who can just riff naturally for an audience. So instead, I film myself talking slowly for about 10 minutes on my phone—rambling, halting, thinking out loud. It's not pretty. But it's authentic, and that's the part that matters.

The rawness is where the real thinking lives. Unscripted, unpolished first thoughts often contain the most honest insights. The problem is nobody wants to read a transcript of someone thinking out loud. So I built a system that keeps the authenticity and removes the mess.

The Pipeline: What Actually Happens

I wrote a Python script that extracts audio from my phone video as an M4A file. That gets uploaded to a Telegram bot I built and host on my VPS. Telegram has become my automation hub—it receives the file, downloads it, and runs it through Whisper, OpenAI's open-source speech-to-text model, running locally. Out comes a raw transcript.
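The extract-then-transcribe step looks roughly like this. A minimal sketch, not my actual script: the function names are illustrative, and it assumes ffmpeg is installed and the openai-whisper package is available (the model size is also an assumption).

```python
import subprocess
from pathlib import Path


def extract_audio_cmd(video_path: str, audio_path: str) -> list[str]:
    """Build the ffmpeg command that copies the audio track out of a phone video."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",              # drop the video stream
        "-acodec", "copy",  # keep the original AAC audio, no re-encode
        audio_path,
    ]


def extract_audio(video_path: str) -> Path:
    """Write the M4A next to the source video and return its path."""
    audio_path = Path(video_path).with_suffix(".m4a")
    subprocess.run(extract_audio_cmd(video_path, str(audio_path)), check=True)
    return audio_path


def transcribe(audio_path: str) -> str:
    """Run local Whisper (openai-whisper package) over the extracted audio."""
    import whisper  # imported lazily so the extraction half works without it
    model = whisper.load_model("small")  # model size is a guess, not my config
    return model.transcribe(audio_path)["text"]
```

Copying the AAC stream instead of re-encoding keeps this step near-instant, since the phone already recorded the audio in a Whisper-friendly container.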

That transcript goes into a four-stage AI pipeline, each stage gated by my approval in Telegram:

Stage 1—Claude drafts. The raw transcript hits Claude first. The prompt asks it to write an engaging blog post in my voice—casual, honest, technically grounded. It adds structure and flow and pulls out the key ideas. I get a preview link and two buttons: Approve or Edit.

Stage 2—DeepSeek challenges. If I approve, DeepSeek gets the Claude draft and tears into it. The prompt asks for fact-checking, added technical depth, and a slightly contrarian analytical perspective—while keeping the personal story. DeepSeek is surprisingly good at this. It pushes back where Claude was too agreeable.

Stage 3—Claude synthesizes. Claude gets both drafts and merges them. Personal voice from the Claude version, technical depth from the DeepSeek version. This is the final post. One more approval gate before anything goes live.

Publish—DeepSeek translates. When I hit publish, DeepSeek translates the final HTML into 12 languages, preserving all formatting. Each translation gets its own page on the blog with a language selector, a subscriber footer, and affiliate links built in.

Why Three Models Instead of One

Because each model has a different failure mode. Whisper has exactly one job, turning speech into text, so the interesting trade-off is between the two writing models. Claude writes beautifully but can be too agreeable: it'll polish your bad idea until it sounds convincing. DeepSeek is more analytical and challenges assumptions, but it can lose the personal voice. Using them in sequence creates a system of checks and balances.

The real magic happens in Stage 3, where Claude synthesizes both perspectives. It's like having a good editor who knows when to keep the personality and when to add the rigor.

What Comes Out

A 10-minute rambling session becomes a 900-word post that reads like I spent hours on it. More importantly, it sounds like me—not like a generic AI blog post—because the source material is genuinely my unfiltered thinking and the prompts are tuned to preserve that voice.

The 12-language deployment means every post reaches audiences I'd never reach writing in English only, with zero extra effort on my part.
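The per-language fan-out described above is conceptually simple. A minimal sketch, with the language list and the `translate` callable as assumptions standing in for the DeepSeek call (which must preserve the HTML tags and rewrite only the text):

```python
from pathlib import Path
from typing import Callable

# Illustrative language codes; the real list is whatever the blog targets.
LANGUAGES = ["es", "fr", "de", "pt", "it", "ja", "ko", "zh", "ru", "ar", "hi", "nl"]


def publish_translations(
    post_html: str,
    slug: str,
    translate: Callable[[str, str], str],  # (html, lang) -> translated html
    out_dir: str = "site",
) -> list[Path]:
    """Fan the final post out to one page per language under out_dir/<lang>/."""
    pages = []
    for lang in LANGUAGES:
        page = Path(out_dir) / lang / f"{slug}.html"
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(translate(post_html, lang), encoding="utf-8")
        pages.append(page)
    return pages
```

Because the translator receives finished HTML, the language selector, subscriber footer, and affiliate links ride along for free in every version.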

The Real Point

This system optimizes for the right thing—it removes all friction from the part that actually matters, which is the thinking. When I know the pipeline handles everything downstream, I stop worrying about how I sound and just think out loud. That's the whole point.
