Everything you asked about AI agents — memory, skills, orchestration, security, models, and autonomous pipelines.
Start with three things: an AGENTS.md file that describes your project, a skills system for reusable procedures, and persistent memory so the agent remembers your preferences across sessions. These three patterns (Boot, Skills, Memory) are the foundation. I built 111 SPFx web parts and 5 backend services this way. The first 30 minutes of setup determine whether everything that follows works.
Chatbots respond. Agents act. An agent has tools (terminal, file system, web access), memory (persistent across sessions), and autonomy (it decides what to do next without asking). A chatbot waits for your next message. An agent deploys code, runs benchmarks, reviews PRs, and hands off work to other agents — all while you're not watching. Our platform runs 25+ agents doing exactly that, 24/7.
Yes, for code generation. But for agent tasks (tool calling, multi-turn reasoning, autonomous pipelines), local models under 5B parameters aren't reliable. SmolLM3-3B scores 93% on code quality but only 50% on agent readiness. For local code generation, it's the champion. For agent cron jobs, cloud models remain the only reliable option. We benchmark this daily — see the benchmarks page.
Infrastructure: under €15/month (single Hetzner VPS runs 25+ agents). Model costs: $1-2/night for our daily benchmark pipeline using cloud models. Free local models work for code generation but not for autonomous agent tasks. The real cost isn't infrastructure — it's the time you save. Our agents run benchmarks, audit infrastructure, scan for vulnerabilities, and deploy code while I sleep.
Through persistent memory — a knowledge store that survives session restarts. The agent writes facts, preferences, and corrections to durable storage (filesystem or database), and those facts get injected into every new session's context. Our system uses a Rust-backed knowledge store with H2 markdown format. No SQLite, no external service — just files the agent reads and writes. The key insight: memory is an index, not a database. Keep it compact.
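A minimal sketch of file-based memory in Python, assuming a single markdown file where each H2 heading is a domain and each bullet under it is one fact. The file layout, the function names (`load_memory`, `save_memory`, `remember`, `render_context`), and the 2,000-character budget are illustrative assumptions, not the Rust implementation described above.

```python
from pathlib import Path


def load_memory(path: Path) -> dict[str, list[str]]:
    """Parse a markdown file into {H2 heading: [bullet facts]}."""
    memory: dict[str, list[str]] = {}
    current = None
    if not path.exists():
        return memory
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            memory.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            memory[current].append(line[2:].strip())
    return memory


def save_memory(path: Path, memory: dict[str, list[str]]) -> None:
    """Write the memory back as H2 sections with one bullet per fact."""
    parts = []
    for domain, facts in memory.items():
        parts.append(f"## {domain}")
        parts.extend(f"- {fact}" for fact in facts)
        parts.append("")  # blank line between sections
    path.write_text("\n".join(parts), encoding="utf-8")


def remember(path: Path, domain: str, fact: str) -> None:
    """Add a fact under a domain, skipping exact duplicates."""
    memory = load_memory(path)
    facts = memory.setdefault(domain, [])
    if fact not in facts:
        facts.append(fact)
    save_memory(path, memory)


def render_context(path: Path, max_chars: int = 2000) -> str:
    """Return the text to inject into a new session, cut to a size budget."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    return text[:max_chars]
```

Calling `remember(Path("memory.md"), "python", "Use Python 3.12")` at the end of a session and `render_context(Path("memory.md"))` at the start of the next one is enough to carry a preference across restarts. The hard character cap is one simple way to keep the file compact.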
A knowledge store is the agent's long-term memory — facts, preferences, pitfalls, and workflows organized by domain. Without it, every session starts from zero. With it, the agent knows your Python version, your preferred tools, which bugs to avoid, and every workflow you've ever taught it. Ours uses a Rust binary with H2 markdown, OR/NOT search, auto-supersede for stale entries, and access tracking. All filesystem-based — zero external dependencies.
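One possible reading of OR/NOT search, auto-supersede, and access tracking, sketched in memory in Python. The semantics are assumptions: a query matches an entry containing any plain term and none of the terms prefixed with `-`, a new entry with the same key marks the older one as stale, and every search hit increments a counter. The `Entry` and `KnowledgeStore` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    text: str
    superseded: bool = False
    hits: int = 0  # access tracking: how often a search returned this entry


class KnowledgeStore:
    def __init__(self) -> None:
        self.entries: list[Entry] = []

    def add(self, key: str, text: str) -> None:
        """Store a fact; any live entry with the same key is marked stale."""
        for entry in self.entries:
            if entry.key == key and not entry.superseded:
                entry.superseded = True
        self.entries.append(Entry(key, text))

    def search(self, query: str) -> list[Entry]:
        """Match live entries with ANY plain term and NONE of the '-' terms."""
        terms = query.lower().split()
        wanted = [t for t in terms if not t.startswith("-")]
        banned = [t[1:] for t in terms if t.startswith("-") and len(t) > 1]
        results = []
        for entry in self.entries:
            if entry.superseded:
                continue
            haystack = f"{entry.key} {entry.text}".lower()
            if any(b in haystack for b in banned):
                continue
            if wanted and not any(w in haystack for w in wanted):
                continue
            entry.hits += 1
            results.append(entry)
        return results
```

Under these assumptions, `store.search("python ruff -linting")` returns entries that mention `python` or `ruff` but skips any that mention `linting`. Superseded entries stay in the list for auditing but never appear in results.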
A shared database of bugs and gotchas that one agent discovers and all others learn from. When Agent A hits a bug — say, 'uvicorn orphan process holding port 8500' — it records the pitfall with the tool, severity, source, and fix. Agent B encounters the same symptom and skips straight to the fix. We have 60+ pitfalls across SPFx, FastAPI, TypeScript, and deployment patterns. It's collective immune memory for your agent fleet.
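The record-then-reuse flow can be sketched in Python as below. The fields tool, severity, source, and fix come from the description above. The `symptom` field, the `Pitfall` and `PitfallRegistry` names, and the word-overlap matching rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pitfall:
    symptom: str
    tool: str
    severity: str
    source: str
    fix: str


class PitfallRegistry:
    def __init__(self) -> None:
        self._pitfalls: list[Pitfall] = []

    def record(self, pitfall: Pitfall) -> None:
        """Called by the agent that first hits the bug."""
        self._pitfalls.append(pitfall)

    def lookup(self, observed: str, tool: Optional[str] = None) -> list[Pitfall]:
        """Return pitfalls whose symptom shares at least two words with the observed error."""
        observed_words = set(observed.lower().split())
        matches = []
        for pitfall in self._pitfalls:
            if tool is not None and pitfall.tool != tool:
                continue
            overlap = observed_words & set(pitfall.symptom.lower().split())
            if len(overlap) >= 2:
                matches.append(pitfall)
        return matches
```

Agent A would call `record(...)` with the uvicorn symptom and whatever fix it found. Agent B would call `lookup("port 8500 held by orphan process")` before debugging from scratch. A real store would likely want better matching than shared words, but the shape of the data stays the same.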
Context is precious. Three strategies: (1) Skills loaded on-demand instead of everything at once, so the agent loads only what's relevant. (2) Session state files that compress completed work into a compact summary for the next turn. (3) Thin-memory pattern: keep the system prompt lean (~2K chars) and store everything else in a queryable knowledge store. The agent searches when it needs details, rather than carrying everything. Our memory usage dropped from 93% to 32% of capacity using this approach.
Skills are reusable procedural knowledge that agents load on-demand. Instead of putting everything in the context window, you create SKILL.md files with triggers, numbered steps, exact commands, and known pitfalls. The agent loads only the skills relevant to the current task. We run 153+ skills across 25+ autonomous agents. A skill is like a playbook — write it once, every agent benefits forever.
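On-demand loading can be sketched in Python as follows. The layout is an assumption: one directory per skill, each holding a SKILL.md that contains a line such as `triggers: deploy, release`. The actual SKILL.md format is not specified here, and the function names are hypothetical.

```python
from pathlib import Path


def read_triggers(skill_file: Path) -> list[str]:
    """Read trigger keywords from a 'triggers: a, b' line (assumed format)."""
    for line in skill_file.read_text(encoding="utf-8").splitlines():
        if line.lower().startswith("triggers:"):
            raw = line.split(":", 1)[1]
            return [t.strip().lower() for t in raw.split(",") if t.strip()]
    return []


def select_skills(skills_dir: Path, task: str) -> list[Path]:
    """Return only the SKILL.md files whose triggers appear in the task text."""
    task_lower = task.lower()
    selected = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        if any(trigger in task_lower for trigger in read_triggers(skill_file)):
            selected.append(skill_file)
    return selected


def build_context(skills_dir: Path, task: str) -> str:
    """Concatenate matching skills so only relevant procedures enter the context."""
    return "\n\n".join(
        path.read_text(encoding="utf-8") for path in select_skills(skills_dir, task)
    )
```

With a `deploy/SKILL.md` and a `lint/SKILL.md` on disk, `build_context(skills_dir, "Deploy the API to staging")` would return only the deploy playbook. Substring matching is the simplest possible trigger rule and is used here only to keep the sketch short.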
The right tool for each job — and only the tools needed. A coding agent needs terminal + file access. A research agent needs web search. A review agent needs lint and test tools. Giving every agent every tool is wasteful and dangerous. Our tool composition pattern: use write_file for new code (replaces 10+ subagent API calls), patch for targeted edits, terminal for verification. The difference between the right tool and the wrong one is 30 seconds vs 15 minutes.
Depends on the agent. Coding agents work with any language they're trained on — Python, TypeScript, Rust, Go, shell. Infrastructure agents use bash, Python, and systemd. Our platform uses TypeScript/TSX for the frontend, Python/FastAPI for the backend API, Rust for the knowledge store binary, and shell for deployment scripts. The agent picks the right language for the job, same as a human would.
MCP is an open protocol for AI models to discover and use external tools and data sources. Think of it as a universal USB port for AI — the model plugs into any MCP-compatible server and gains its capabilities. We run an MCP server at workswithagents.dev with 14 tools covering facts, skills, pitfalls, and handoff. Our Python package (pip install wwa-mcp) gives any MCP client access to the full knowledge platform.
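MCP messages are JSON-RPC 2.0, and a client discovers and invokes tools with the `tools/list` and `tools/call` methods. The Python sketch below only builds those two request bodies. It leaves out the transport and the initialization handshake a real client needs, and the tool name in the usage note is hypothetical, not one of the 14 tools mentioned above.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids, unique within this process


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request; params is omitted when not given."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools_request() -> str:
    """Ask the server which tools it offers."""
    return jsonrpc_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> str:
    """Invoke one tool by name with a dictionary of arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})
```

For example, `call_tool_request("search_pitfalls", {"query": "port 8500"})` produces a request for a made-up tool called `search_pitfalls`. In practice an MCP client library handles this framing for you.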
Multi-agent orchestration means decomposing complex tasks into parallel streams, each handled by a specialist agent with the right tools. An orchestrator agent breaks down the work, spawns subagents, and assembles results. The key is role-based tool access: a research agent gets web search, a coding agent gets terminal and files, a review agent gets lint and test tools. We run up to 3 parallel subagents, each in its own isolated context. Throughput tripled on complex multi-stream work.
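A minimal Python sketch of the orchestrator pattern, assuming subtasks are independent and that a thread pool capped at three workers stands in for three parallel subagents. `ROLE_TOOLS`, `Subtask`, and `run_subagent` are hypothetical names, and `run_subagent` is a placeholder that reports its allowed tools rather than calling a model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

# Role-based tool access: each role sees only the tools it needs.
ROLE_TOOLS = {
    "research": {"web_search"},
    "coding": {"terminal", "files"},
    "review": {"lint", "test"},
}


@dataclass
class Subtask:
    role: str
    description: str


def run_subagent(task: Subtask) -> dict:
    """Stand-in for a real subagent call: reports the tools its role may use."""
    tools = sorted(ROLE_TOOLS[task.role])
    return {"role": task.role, "task": task.description, "tools": tools}


def orchestrate(
    subtasks: list[Subtask],
    worker: Callable[[Subtask], dict] = run_subagent,
    max_parallel: int = 3,
) -> list[dict]:
    """Run independent subtasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(worker, subtasks))
```

Because each worker receives only its own `Subtask`, nothing is shared between streams. Passing a different `worker` is where a real model call would plug in.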
Single agent for focused, sequential work — debugging, code review, research. Multi-agent when the work has independent parallel streams. The test: can two parts of this task run simultaneously without sharing state? If yes, split them. If they need to share state, use a handoff protocol instead. Most of our work is single-agent with skills. Orchestration is for benchmark runs, site audits, and multi-repo changes.
Through structured handoff protocols. When one agent finishes a task, it writes a standardized YAML document with the task, decisions made, next steps, and open questions. The next agent picks up the handoff and continues. This is Layer 4 (Session) of the Agent OSI Model. We also proposed this as an MCP SEP (#2683) and as a Google A2A RFC (#1817). The goal: any agent can hand off to any other agent, regardless of framework.
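A handoff document with those four parts might be produced as in the Python sketch below. The key names (`task`, `decisions`, `next_steps`, `open_questions`) are assumptions, not the schema from the proposals. To stay within the standard library, the sketch writes YAML by hand and relies on JSON string syntax being accepted as YAML double-quoted scalars for ordinary text.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Handoff:
    task: str
    decisions: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def quote(value: str) -> str:
    """Quote a string so colons, hashes, and quotes cannot break the YAML."""
    return json.dumps(value, ensure_ascii=False)


def to_yaml(handoff: Handoff) -> str:
    """Render the handoff as a small YAML document."""
    lines = [f"task: {quote(handoff.task)}"]
    for key in ("decisions", "next_steps", "open_questions"):
        items = getattr(handoff, key)
        if not items:
            lines.append(f"{key}: []")  # explicit empty list
            continue
        lines.append(f"{key}:")
        lines.extend(f"  - {quote(item)}" for item in items)
    return "\n".join(lines) + "\n"
```

The finishing agent would write `to_yaml(handoff)` to a file, and the next agent would parse it with any YAML library before continuing. Keeping the document flat, with one scalar and three lists, makes it easy for agents built on different frameworks to read.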