The first essay in this series argued that when everyone can build software, the factory wins. That primitives — composable building blocks designed for agents — dissolve the old tradeoff between scalability and customization. That the bottleneck has shifted from writing code to designing architecture.
Several people asked the same question: how do you actually build one?
This is the blueprint.
The Harness
Before the factory, the tool.
A coding harness is an interface between a human and a language model that operates directly on a codebase. Coding harnesses like Claude Code and OpenCode are two examples. You open a terminal, describe what you want, and the model reads files, writes code, runs commands, creates commits. It operates inside your project like a very fast, very literal colleague who never forgets a function signature but needs clear instructions about where to put things.
That last part is the key. A coding harness is only as good as the environment it operates in. Give it a messy codebase with no conventions and it will produce messy code with no conventions, confidently. Give it a well-structured codebase with clear boundaries and explicit constraints, and it will produce code that fits. The model doesn't get better. The architecture does.
This means the traditional distinction between "writing code" and "designing systems" isn't just a seniority ladder anymore. It's a division of labor between human and machine. The human designs the constraints. The machine operates within them. Every hour you spend on architecture saves ten hours of agent output you'd otherwise have to fix.
The factory is what happens when you take this division seriously.
The Filesystem as Constitution
Most developers think of a project directory as a place where code lives. In a software factory, the directory is the architecture. Every folder, every configuration file, every naming convention is an instruction to the agent about how to behave.
Here is the structure, abstracted:
Three zones. Three audiences. One coherent system.
Three layers of instruction govern what the agent does.
Skills are constitutional. They encode design patterns that apply across the entire workspace. "Every module splits into three layers." "The host owns identity, never the primitive." "Single-shot reasoning stays in the backend; multi-turn reasoning moves to the intelligence service." These aren't suggestions. They're laws. An agent that reads the primitive-pattern skill before building a new module will produce code that follows the same structural guarantees as every other module in the system — without being told which module to imitate.
The root CLAUDE.md is the workspace map. It tells the agent which folders exist, what each one does, which commands to run, and how the pieces relate. This is the file that turns a collection of repositories into a coherent system. Without it, the agent treats each folder as an isolated project. With it, the agent understands that a frontend module calls a backend component which is coordinated by an intelligence service — and can make changes across all three in a single session.
Per-module CLAUDE.md files are the implementation contracts. They specify the stack, the API, the edge cases, the things that are true about this module but not necessarily about any other. The agent reads the constitution first, then the local contract. The combination produces code that is structurally consistent across the system and contextually correct within each module.
This hierarchy mirrors how good organizations work. A constitution doesn't tell you how to run a hospital. It tells you the principles that any institution must follow. The hospital's operating procedures are a separate document, written within those constraints. The agent operates the same way: principles first, then specifics.
The Three-Layer Primitive
The building block of the factory is the primitive: a module simple enough to be universal but powerful enough to be composed into complex systems. Every primitive follows the same three-layer split.
The absence of a line from frontend to intelligence is the point.
The frontend primitive is a package of React components and hooks that renders UI. It receives user identity from the host application through context — it never authenticates, never stores tokens, never knows which app it's running inside. This means the same email module renders inside an accounting platform, a law firm dashboard, or a medical practice portal. The UI is identical. The host decides who the user is.
The backend primitive is an isolated database component. Its own tables, its own queries, its own mutations, deployable independently and composable with others. It can verify that a request is authenticated, but it doesn't own the authentication system. Think of it as a microservice that happens to share the same database engine as the host — but with strict isolation guarantees that prevent coupling.
The intelligence service is where reasoning lives. Classification, knowledge building, multi-turn agent loops. It reads and writes business data to the caller's database, not its own. The only state it owns is infrastructure: API credentials, connection metadata, subscription tracking. This is the key to multi-tenancy. Each customer's data stays in their database. The intelligence service is a processor, not a store.
The rule for deciding which layer owns a piece of logic: if it's a single question with a single answer — generate a draft, summarize a thread — it can live in the backend as a lightweight action. If it requires iteration, tool calling, accumulated context, multiple model turns, it belongs in the intelligence service. This boundary is where most architectures fail. Pushing complex reasoning into the backend creates timeouts and maintenance nightmares. Pulling simple operations into a separate service creates latency for no gain.
Freedom and Constraint, Revisited
The first essay argued that the hard part isn't making agents more capable. It's finding the balance between freedom and constraint. The blueprint makes this concrete.
Maximum constraint, minimum specificity — then the inverse.
The skills directory is maximum constraint, minimum specificity. "Every module has three layers." The agent has zero freedom to violate this, but complete freedom in how it implements each layer. The root CLAUDE.md is medium constraint — it specifies relationships between modules but doesn't dictate internal structure. The per-module CLAUDE.md is minimum constraint, maximum specificity: it describes this module's stack, this module's edge cases. The agent has freedom to write any code that satisfies the contract.
This gradient is not accidental. It mirrors how constitutions work in political systems. The United States Constitution doesn't specify tax rates. It specifies that Congress has the power to levy taxes. The specifics are delegated to legislation, which is delegated to regulation. Each layer adds specificity and reduces universality. The software factory works the same way. And for the same reason: if the constitution specifies tax rates, you have to amend the constitution every time the economy changes. If a skill file specifies which database to use, you have to rewrite the skill every time you switch databases.
What the Factory Produces
The output is not a SaaS product. It's not a consultancy deliverable. It's custom software assembled from composable primitives.
A new customer signs up. They describe their practice: a Belgian accounting firm with twelve employees and a specialization in international tax structures. The system seeds their environment with that description. The same frontend primitives render, but the intelligence service adapts: it classifies their incoming communications differently than it would for a medical practice, builds knowledge structures around tax regulations rather than patient intake, generates responses in the formal register of accounting correspondence rather than clinical notes.
Nothing was configured. Nothing was hard-coded. The primitives are profession-agnostic. The intelligence adapts. The customer sees a product that feels built for them. The factory sees another assembly from the same components.
This is the tradeoff that dissolves. The output is bespoke. The process is industrial.
The Compound Effect
Every engagement enriches the factory. A new primitive built for one customer — say, a document management module — becomes available for every future assembly. An intelligence pattern discovered while processing one domain transfers to another with zero modification, because the classification agent reads the practice description, not a hardcoded list of roles.
The tenth deployment takes half the time of the first. The hundredth takes a fraction. Not because the agents got smarter, but because the component registry got deeper and the constraints got tighter.
This is the margin structure that neither SaaS nor consulting can replicate. SaaS amortizes development cost across identical copies. Consulting amortizes nothing — every engagement starts roughly from scratch. The factory amortizes architecture: the primitives, the patterns, the constraints. Each deployment is different, but each one is cheaper than the last.
Who Builds This
Not a team of fifty. Not yet. The leverage of coding harnesses means a single founder with the right architecture can maintain a system that would have required a department five years ago. The constraint is not engineering bandwidth. It's architectural judgment: knowing which abstractions to build, which constraints to enforce, which boundaries to draw.
This is the part that doesn't automate. An agent can write a module. It cannot decide that the module should exist, or where its boundaries should be, or which constitutional principle it should follow. That requires the kind of intuition that comes from building the same system many times, noticing which decisions aged well and which ones didn't, and encoding those observations into the architecture before the next build.
I have been building this way — solo, with a coding harness as my primary engineering process — for just over two years. The compound effect is real. What separates the factories from the prototypes is not the models. It's the filesystem.
The storm is here. The models are faster and cheaper every quarter. Build the constitution before the weather turns.
The first essay in this series, "The Software Factory," is available at /trial_and_essay/the_software_factory.
