Developer Philosophy

Self-Encapsulated Agents: Building Autonomous CLI Tools in Rust

From mainframes to modern Rust CLIs—exploring how compile-time embeddings and offline-first architecture embody the principles of self-contained autonomous agents. A deep dive into building a blog distribution CLI that needs nothing but itself to run.

Starting Where I Started

I began programming in 2009 working with mainframes in the Israel Defense Forces’ J6 & Cyber Defense Directorate. Those systems had a particular elegance to them—self-contained, deterministic, with clear boundaries. Everything you needed was there. No npm install, no network requests, no version conflicts. Just a binary that did exactly what it promised.

Fast forward to 2025, and I found myself building a CLI tool in Rust that brought me back to that same principle: self-encapsulation. Not out of nostalgia, but because the problem demanded it.

What If Agents Were Distributed Like Software?

Here’s the thing: my blog is built with Astro, deployed as static HTML. Standard stuff. But I started wondering—what if knowledge bases weren’t just websites? What if you could compile them?

Not just the code. The knowledge itself.

What if you had a build pipeline that consolidates data, embeds it at compile time, tests it as a unit, and ships it as a single binary—like we do with operating system distributions? Download, install, run. No setup. No configuration. Just a purpose-built agent that knows what it needs to know.

That’s what I built. A CLI named kobi that’s a snapshot of my blog posts. But the real point isn’t the blog—it’s the pattern.

Self-Encapsulated Agents: A New Distribution Paradigm

Let me be direct about what this is:

A self-encapsulated agent is a purpose-built software entity where data, code, and capabilities are compiled together at build time, tested as a unit, and distributed as a single artifact that requires nothing but an OS to run.

This is different from general-purpose AI agents like ChatGPT or Claude. Those are flexible, cloud-based, and require constant connectivity. Self-encapsulated agents are the opposite:

  • Purpose-built - Designed for a specific domain with specific knowledge
  • Compile-time data - All knowledge embedded during the build
  • Offline-capable - Works without any network dependency
  • Tested as a unit - Data + code + capabilities verified together
  • Distributed like software - Download a binary, run it, done

Right now, kobi answers questions about my writing. It has the posts embedded, knows how to search them, can surface statistics. It’s a knowledge agent for a specific corpus.

But imagine this pattern applied broadly:

  • A legal research agent with case law and statutes embedded
  • A medical reference agent with current treatment guidelines
  • A codebase agent with your entire repository and its history
  • An operations agent with your infrastructure topology and runbooks

Each one: data + code + tools, compiled together, tested together, shipped together. Like building different Linux distributions for different purposes, but for agents.

The build pipeline becomes the key. You’re not just compiling code—you’re compiling knowledge into a deployable form.

The Architecture: Compile-Time Content Embedding

Here’s the core insight: instead of reading blog posts from the filesystem at runtime, we embed them at compile time using Rust’s build system.
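The embedded shape can be a plain struct of `&'static str` fields. A minimal sketch (the real field set is whatever the frontmatter carries; `EXAMPLE` is a made-up post for illustration):

```rust
// All fields borrow 'static data, so an entire slice of posts can live in
// a `const` and ship inside the binary's read-only data segment.
pub struct Post {
    pub slug: &'static str,
    pub title: &'static str,
    pub description: &'static str,
    pub date: &'static str,
    pub content: &'static str,
    pub tags: &'static [&'static str],
    pub category: &'static str,
}

// A hypothetical embedded post, constructed entirely at compile time
const EXAMPLE: Post = Post {
    slug: "hello",
    title: "Hello",
    description: "A demo post",
    date: "2025-01-01",
    content: "Hi there",
    tags: &["demo"],
    category: "misc",
};
```

Because nothing is heap-allocated, there is no deserialization step at startup; the data is simply mapped into memory with the rest of the binary.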

The build.rs Pattern

Rust’s build.rs is a powerful tool that runs before compilation. We use it to generate Rust code from our MDX blog posts:

// build.rs
use std::env;
use std::fs;
use std::path::Path;

fn main() {
    let manifest_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
    let content_dir = Path::new(&manifest_dir)
        .parent()
        .unwrap()
        .join("apps/site/src/content/blog");

    let mut posts = Vec::new();

    // Walk through blog directory
    for entry in fs::read_dir(&content_dir).unwrap() {
        let entry = entry.unwrap();
        let path = entry.path();

        if path.extension().and_then(|s| s.to_str()) == Some("mdx") {
            let content = fs::read_to_string(&path).unwrap();
            let metadata = extract_frontmatter(&content);

            posts.push(Post {
                slug: path.file_stem().unwrap().to_str().unwrap().to_string(),
                title: metadata.title,
                description: metadata.description,
                date: metadata.date,
                content: strip_frontmatter(&content),
                tags: metadata.tags,
                category: metadata.category,
            });
        }
    }

    // Generate Rust code with embedded posts
    generate_post_data(&posts);

    // Tell Cargo to re-run this script whenever any post changes;
    // without this, edits outside the crate directory go unnoticed
    println!("cargo:rerun-if-changed={}", content_dir.display());
}

This creates a compile-time snapshot of all blog content. The generated code looks like:

// Generated at compile time
pub const POSTS: &[Post] = &[
    Post {
        slug: "self-encapsulated-agents-rust-cli",
        title: "Self-Encapsulated Agents: Building Autonomous CLI Tools in Rust",
        content: include_str!("../generated/posts/self-encapsulated-agents-rust-cli.md"),
        // ... metadata
    },
    // ... all other posts
];
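The `generate_post_data` helper isn’t shown above. A minimal sketch of what it could do, writing a `posts.rs` into a caller-supplied directory (in the real build script that would be Cargo’s `OUT_DIR`; the two-field `PostSrc` struct here is a hypothetical simplification):

```rust
use std::fmt::Write as _; // for writeln! on a String
use std::fs;
use std::io;
use std::path::Path;

struct PostSrc {
    slug: String,
    title: String,
}

// Emit a `posts.rs` file that the crate can pull in with
// `include!(concat!(env!("OUT_DIR"), "/posts.rs"))`.
fn generate_post_data(posts: &[PostSrc], out_dir: &Path) -> io::Result<()> {
    let mut src = String::from("pub const POSTS: &[Post] = &[\n");
    for post in posts {
        // `{:?}` escapes quotes and newlines, so the generated file always parses
        writeln!(
            src,
            "    Post {{ slug: {:?}, title: {:?} }},",
            post.slug, post.title
        )
        .unwrap();
    }
    src.push_str("];\n");
    fs::write(out_dir.join("posts.rs"), src)
}
```

The key detail is the `{:?}` formatting: it produces valid Rust string literals even when a title contains quotes or newlines, so malformed content fails at code generation rather than silently corrupting the output.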

Why This Matters

The binary becomes a time capsule of knowledge—a snapshot of my thinking at the moment it was compiled. All content is in memory, searches are instant, and the binary and its data are atomically versioned together.

This is the foundation of the pattern: knowledge and code as a single deployable unit.

The CLI Interface: Clarity Through Constraints

With content embedded, we can focus on the interface. I used clap for argument parsing and designed around four core commands:

use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "kobi")]
#[command(about = "Kobi Kadosh's blog posts in your terminal")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// List all blog posts
    List {
        #[arg(short, long)]
        tag: Option<String>,

        #[arg(short, long)]
        category: Option<String>,
    },

    /// Read a specific post
    Read {
        /// Post slug or search term
        query: String,
    },

    /// Search through all posts
    Search {
        /// Search query
        query: String,

        #[arg(short, long)]
        content: bool,
    },

    /// Show blog statistics
    Stats,
}

Each command operates on the embedded data—everything it needs is compiled in.
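The dispatch itself can stay a plain `match` over the parsed command. A simplified sketch with the clap derive omitted and a tiny hard-coded `POSTS` slice standing in for the embedded data (titles and tags here are hypothetical):

```rust
// Each arm is a pure function of its arguments plus the embedded POSTS
// slice; returning Vec<String> instead of printing keeps it testable.
enum Commands {
    List { tag: Option<String> },
    Read { query: String },
    Stats,
}

struct Post {
    slug: &'static str,
    title: &'static str,
    tags: &'static [&'static str],
}

const POSTS: &[Post] = &[
    Post { slug: "a", title: "Alpha", tags: &["rust"] },
    Post { slug: "b", title: "Beta", tags: &["agents"] },
];

fn run(cmd: Commands) -> Vec<String> {
    match cmd {
        Commands::List { tag } => POSTS
            .iter()
            // No tag filter means every post passes
            .filter(|p| tag.as_deref().map_or(true, |t| p.tags.contains(&t)))
            .map(|p| p.title.to_string())
            .collect(),
        Commands::Read { query } => POSTS
            .iter()
            .filter(|p| p.slug == query)
            .map(|p| p.title.to_string())
            .collect(),
        Commands::Stats => vec![format!("{} posts", POSTS.len())],
    }
}
```

Keeping the command logic separate from argument parsing means the whole surface of the agent can be exercised in unit tests without spawning a terminal.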

Rendering: Terminal as a First-Class Interface

One challenge was rendering Markdown in the terminal. I wanted syntax highlighting, proper formatting, and a good reading experience. Enter termimad:

use termimad::{rgb, FmtText, MadSkin};

fn render_post(post: &Post) {
    let mut skin = MadSkin::default();

    // Custom styling for terminal
    skin.headers[0].set_fg(rgb(255, 200, 100));
    skin.headers[1].set_fg(rgb(200, 150, 100));
    skin.code_block.set_bg(rgb(30, 30, 40));
    skin.inline_code.set_bg(rgb(40, 40, 50));

    // Wrap to the detected terminal width, falling back to 80 columns
    let width = terminal_size::terminal_size()
        .map(|(w, _)| w.0 as usize)
        .unwrap_or(80);

    print!("{}", FmtText::from(&skin, &post.content, Some(width)));
}

This gives us beautiful terminal rendering with syntax highlighting, tables, and proper word wrapping, all produced locally with no network calls or runtime services.

The Agent Parallel: Autonomy Through Self-Knowledge

Here’s where it gets interesting. The Rust CLI embodies many principles I’ve written about regarding AI agents:

1. Self-Verification

Just like AI agents need to verify their outputs, our CLI verifies its embedded content at compile time:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn all_posts_have_valid_dates() {
        for post in POSTS.iter() {
            assert!(post.date.is_valid(),
                "Post '{}' has invalid date", post.title);
        }
    }

    #[test]
    fn no_duplicate_slugs() {
        let mut slugs = std::collections::HashSet::new();
        for post in POSTS.iter() {
            assert!(slugs.insert(&post.slug),
                "Duplicate slug: {}", post.slug);
        }
    }

    #[test]
    fn all_content_is_valid_markdown() {
        for post in POSTS.iter() {
            // Basic markdown validation
            assert!(!post.content.is_empty());
        }
    }
}

These tests run with `cargo test` as a gate in the build pipeline. If the blog content is malformed, no release ships. The agent verifies itself before it ever runs.

2. Bounded Autonomy

The CLI operates within clear boundaries:

  • Input: User queries and commands
  • Processing: Local search, filtering, rendering
  • Output: Terminal display
  • State: Read-only embedded data

Pure, deterministic transformations with no side effects.

3. Offline Intelligence

Search, filtering, and statistics all run locally against the embedded content:

impl Cli {
    fn search(&self, query: &str, search_content: bool) -> Vec<&Post> {
        let query_lower = query.to_lowercase();

        POSTS.iter()
            .filter(|post| {
                // Search in title and description always
                let in_metadata = post.title.to_lowercase().contains(&query_lower)
                    || post.description.to_lowercase().contains(&query_lower)
                    || post.tags.iter().any(|t|
                        t.to_lowercase().contains(&query_lower));

                // Optionally search in content
                let in_content = if search_content {
                    post.content.to_lowercase().contains(&query_lower)
                } else {
                    false
                };

                in_metadata || in_content
            })
            .collect()
    }
}

Knowledge and intelligence, both embedded. This is the essence of self-encapsulation.

Distribution: The Single Binary Dream

One of the most satisfying aspects is distribution. The entire build and release process is automated:

# .github/workflows/release.yml
name: Release

on:
  push:
    tags:
      - 'cli-v*'

jobs:
  build:
    strategy:
      matrix:
        include:
          - os: ubuntu-latest
            target: x86_64-unknown-linux-gnu
          - os: macos-latest
            target: x86_64-apple-darwin
          - os: macos-latest
            target: aarch64-apple-darwin
          - os: windows-latest
            target: x86_64-pc-windows-msvc

    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Build release binary
        run: |
          cd cli
          cargo build --release --target ${{ matrix.target }}

      - name: Create release archive
        run: |
          # Package binary for distribution
          tar czf kobi-${{ matrix.target }}.tar.gz \
            -C cli/target/${{ matrix.target }}/release \
            kobi

      - name: Upload to release
        uses: softprops/action-gh-release@v1
        with:
          files: kobi-${{ matrix.target }}.tar.gz

Users download a single binary for their platform. No runtime, no interpreter, no package manager. Just:

# Download
curl -L https://github.com/wildcard/kobi.kadosh.me/releases/latest/download/kobi-x86_64-unknown-linux-gnu.tar.gz | tar xz

# Run
./kobi read "feedback loops"

This is software distribution at its simplest.

The Binary Size Tradeoff

The compiled binary with all blog posts embedded is approximately 10MB. Compared to a Node.js app with dependencies (100MB+) or an Electron app (100MB+), this is lean. The HTML version is smaller (~5MB), but needs infrastructure to serve it.

The tradeoff is space for autonomy. 10MB buys you instant startup, offline access, and zero dependencies. For a self-encapsulated agent, that’s a bargain.

Practical Implementation Details

Content Processing

The frontmatter extraction is straightforward:

use serde::{Deserialize, Serialize};
use regex::Regex;

#[derive(Debug, Serialize, Deserialize)]
struct Frontmatter {
    title: String,
    description: String,
    date: String,
    draft: bool,
    #[serde(default)]
    tags: Vec<String>,
    #[serde(default)]
    category: Option<String>,
}

fn extract_frontmatter(content: &str) -> Frontmatter {
    let re = Regex::new(r"(?s)^---\n(.*?)\n---").unwrap();

    if let Some(caps) = re.captures(content) {
        let yaml = &caps[1];
        serde_yaml::from_str(yaml)
            .expect("Failed to parse frontmatter")
    } else {
        panic!("No frontmatter found");
    }
}

fn strip_frontmatter(content: &str) -> String {
    let re = Regex::new(r"(?s)^---\n.*?\n---\n").unwrap();
    re.replace(content, "").to_string()
}

Statistics Generation

One of my favorite features is the stats command:

use separator::Separatable; // `separated_string()` comes from the separator crate
use std::collections::HashMap;

fn generate_stats() {
    let total_posts = POSTS.len();
    let total_words: usize = POSTS.iter()
        .map(|p| p.content.split_whitespace().count())
        .sum();

    let mut tags_count: HashMap<&str, usize> = HashMap::new();
    for post in POSTS.iter() {
        for tag in &post.tags {
            *tags_count.entry(tag.as_str()).or_insert(0) += 1;
        }
    }

    // Guard against division by zero on an empty corpus
    let avg_words = total_words / total_posts.max(1);

    println!("📊 Blog Statistics");
    println!("─────────────────");
    println!("Total posts: {}", total_posts);
    println!("Total words: {}", total_words.separated_string());
    println!("Average words per post: {}", avg_words);
    println!();
    println!("Top tags:");

    let mut tags_vec: Vec<_> = tags_count.iter().collect();
    tags_vec.sort_by(|a, b| b.1.cmp(a.1));

    for (tag, count) in tags_vec.iter().take(10) {
        println!("  {} ({})", tag, count);
    }
}

Running kobi stats gives me instant insights into my writing patterns—no analytics service required.

Lessons: What Rust Taught Me About Agents

Building this CLI reinforced several insights about autonomous systems:

1. Constraints Enable Creativity

The constraint of “no network, no filesystem” forced creative solutions. Compile-time embedding isn’t just an optimization—it’s a feature that enables true offline operation.

2. Type Safety Prevents Runtime Surprises

Rust’s type system catches errors at compile time that would be runtime failures in dynamic languages:

// Parsing panics at runtime if a date is malformed, so a pipeline test
// that calls this on every embedded post catches bad frontmatter early
impl Post {
    fn published_date(&self) -> chrono::NaiveDate {
        chrono::NaiveDate::parse_from_str(&self.date, "%Y-%m-%d")
            .expect("Invalid date format")
    }
}

Paired with a test that calls published_date() on every embedded post, a release cannot ship with invalid data. This is self-verification built into the toolchain.

3. Binary Size is State Transparency

In AI agents, we worry about context windows and state management. In compiled software, binary size is a similar metric—it tells you how much “state” the agent carries.

A 10MB binary is honest about what it contains. There’s no hidden state, no remote configuration, no telemetry. What you download is what runs.

4. Determinism Builds Trust

Users trust the CLI because it’s deterministic. Same binary, same query → same result. Every time.

This is harder to achieve with AI agents, but the principle holds: predictability builds confidence.

The Meta Layer: AI-Assisted Rust Development

Here’s a confession: I built most of this CLI with AI assistance—specifically Claude Code. Not because I don’t know Rust (I do), but because the iteration speed is incredible.

The pattern was:

  1. Describe the feature in natural language
  2. Claude generates Rust code
  3. I review, understand, refine
  4. Compile and test
  5. Iterate

This is vibe engineering applied to systems programming. I orchestrated; the AI typed. The result is production-grade Rust code built in a fraction of the time.

The irony isn’t lost on me: I used an AI agent to build a tool about self-encapsulated agents. But that’s exactly the point—good tools enable better tool-making.

Future Directions: The Build Pipeline for Agents

This is where it gets interesting. Right now, kobi is a blog reader. But the pattern extends much further.

Embedding Models, Not Just Data

Imagine a build pipeline that embeds not just markdown files, but a pre-trained model optimized for the specific domain:

# Build pipeline consolidates data and model
$ cargo build --release --features=embed-model

# The build process:
# 1. Consolidates all blog posts
# 2. Generates vector embeddings
# 3. Embeds a small, domain-specific language model
# 4. Runs evaluation harness against test queries
# 5. Only builds if evaluations pass
# 6. Produces a single binary with data + model + tools

Now you have an agent that can:

  • Answer semantic queries about the content
  • Summarize across multiple posts
  • Find conceptual connections
  • Generate insights from the embedded knowledge

All offline. All in one binary.
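What that offline semantic lookup could look like, sketched with toy 3-dimensional vectors standing in for real embeddings generated at build time (the slugs and values here are hypothetical):

```rust
// A build-time index: one precomputed vector per post, baked into the binary.
struct Embedded {
    slug: &'static str,
    vec: [f32; 3],
}

const INDEX: &[Embedded] = &[
    Embedded { slug: "feedback-loops", vec: [0.9, 0.1, 0.0] },
    Embedded { slug: "vibe-engineering", vec: [0.1, 0.9, 0.0] },
];

// Cosine similarity: dot product over the product of magnitudes
fn cosine(a: &[f32; 3], b: &[f32; 3]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Return the slug whose embedding is closest to the query vector
fn nearest(query: &[f32; 3]) -> &'static str {
    INDEX
        .iter()
        .max_by(|a, b| {
            cosine(&a.vec, query)
                .partial_cmp(&cosine(&b.vec, query))
                .unwrap()
        })
        .unwrap()
        .slug
}
```

A real pipeline would embed the user’s query with the same small model that embedded the posts, but the retrieval step itself is just this: a linear scan over vectors that never left the binary.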

Purpose-Built Distributions

Like building Linux distributions for different use cases (Ubuntu for desktops, Alpine for containers), you build agent distributions for different domains:

# Legal research agent
kobi-legal-v2.3.1-x86_64-linux.tar.gz
 Embedded: Supreme Court opinions 2020-2025
 Model: Fine-tuned on legal reasoning
 Tools: Citation finder, precedent search
 Evaluated: 95% accuracy on test cases

# Medical reference agent
kobi-medical-v1.2.0-aarch64-darwin.tar.gz
 Embedded: Current treatment guidelines
 Model: Medical knowledge base
 Tools: Drug interaction checker, ICD-10 lookup
 Evaluated: Validated against peer-reviewed corpus

# Codebase agent
kobi-repo-myproject-v4.1.2-x86_64-windows.tar.gz
 Embedded: Full git history, documentation
 Model: Trained on your coding patterns
 Tools: Symbol search, dependency analyzer
 Evaluated: 90% accuracy on historical bug queries

Each distribution is:

  1. Built from a data pipeline that consolidates the relevant corpus
  2. Tested through an evaluation harness with domain-specific benchmarks
  3. Versioned with the data, model, and tools as a single unit
  4. Distributed as a binary you download and run

The Build Pipeline

The build.rs pattern extends to full ML pipelines:

// build.rs
fn main() {
    // 1. Consolidate data
    let corpus = consolidate_blog_posts();

    // 2. Generate embeddings
    let embeddings = generate_vector_db(&corpus);

    // 3. Select and quantize model
    let model = load_and_optimize_model("phi-3.5-mini");

    // 4. Run evaluation harness
    let eval_results = run_evaluations(&corpus, &model);
    assert!(eval_results.accuracy > 0.90, "Model failed eval");

    // 5. Embed everything
    embed_data(&corpus);
    embed_model(&model);
    embed_embeddings(&embeddings);

    println!("cargo:rerun-if-changed=content/");
    println!("cargo:rerun-if-changed=model/");
}

Now your build fails if the model doesn’t meet quality standards on the embedded data. The binary is the guarantee.

Why This Matters

This inverts the current paradigm:

Today: General-purpose agents in the cloud that know everything and nothing.
Tomorrow: Purpose-built agents on your machine that know one thing deeply.

Today, you ask ChatGPT about your codebase and it hallucinates. Tomorrow, you download kobi-myproject, and it knows your codebase because it was compiled with it.

The agent isn’t fetching data at runtime. It is the data, the model, and the tools—tested together, versioned together, distributed together.

Interactive Mode with Embedded Models

With an embedded model, the CLI becomes conversational:

$ kobi ask "What's the connection between agents and feedback loops?"

Analyzing embedded posts...

Based on your writing, you see agents as systems that require continuous
verification through feedback loops. In "Feedback Loops Part 3", you argue
that autonomous agents need self-verification mechanisms. This connects to
your broader theme of "write less, read more"—agents should spend more time
validating than generating.

Related posts:
- Feedback Loops Part 3: Agents
- Vibe Engineering: Orchestrating Instead of Typing
- Self-Encapsulated Agents (this post)

$ kobi ask "follow up: how does this relate to Rust's type system?"

Interesting connection. Rust's type system is itself a verification mechanism—a
compile-time feedback loop. You mentioned in this post that "type safety prevents
runtime surprises." That's the same principle as agent verification, but at a
different layer...

All of this runs offline. No API calls. No rate limits. Just your machine, your agent, your data.

Conclusion: Data and Code, Compiled Together

Building this Rust CLI brought me full circle—from mainframes that were self-contained by necessity, to modern web apps that depend on everything, back to binaries that depend on nothing.

But the real insight isn’t about blog posts in a CLI. It’s about a new distribution paradigm for agents.

We’ve spent decades perfecting how to build and distribute software: build pipelines, test harnesses, versioning, releases. We compile code, run tests, and ship binaries. It works.

Now we’re adding AI agents to our systems, and we’re treating them completely differently—cloud APIs, runtime data fetching, no versioning of knowledge with code, no evaluation gates before deployment.

What if we didn’t?

What if we treated agent knowledge like we treat code? Compile it. Test it. Version it. Distribute it. Run it locally.

Purpose-built agents with embedded knowledge, tested against evaluation criteria, shipped as binaries. Not general-purpose agents that know everything vaguely, but specialized agents that know one thing deeply.

The CLI named kobi is just one example. But imagine every codebase, every legal corpus, every medical knowledge base, every operations runbook—compiled into purpose-built agents you can download and run.

Data and code, hand in hand. Built together, tested together, distributed together.

That’s self-encapsulation. And it’s just getting started.


About this post: This essay began as a technical deep dive and evolved into a meditation on autonomy and self-sufficiency in software design. It was written collaboratively with Claude Code—an AI agent helping me write about building self-encapsulated agents. The recursion is intentional.

The complete source code for the CLI is available at github.com/wildcard/kobi.kadosh.me/tree/main/cli.

Want to try it?

# Download the binary for your platform (substitute your target triple)
curl -L https://github.com/wildcard/kobi.kadosh.me/releases/latest/download/kobi-x86_64-unknown-linux-gnu.tar.gz | tar xz

# Read this post in your terminal
./kobi read "self-encapsulated"

# Or search for other topics
./kobi search "agents" --content

Welcome to the future of agent distribution. Download, run, learn.