System & Architecture
How smpl is built.
Persistent codebase intelligence is not a single product. It is four layers operating above your codebase as one integrated system — legibility, investigation, memory, and execution — each making the next more reliable.
This page is the architectural middle layer. The homepage gives the strategic frame. The technical paper goes deep into the optimization hierarchy. Here is what sits between them.
§ 01 · THE FOUR LAYERS
Four capabilities, stacked.
Persistent codebase intelligence breaks into four functions: making the system legible, evaluating work before execution, preserving what is learned, and acting on the result.
The order matters. Legibility makes investigation honest. Investigation makes memory accurate. Memory makes execution grounded. Each layer narrows the space of things the next layer can get wrong, which is why deployments that try to start from execution drift quickly and deployments that start from legibility tend to compound.
- Layer 01 · Legibility · Lux
The codebase, made readable as a system.
Lux combines semantic retrieval, structural code intelligence, relation-aware retrieval, and domain discovery into a continuously maintained structural view of your codebase — symbol-, type-, and reference-level truth that plain text search cannot provide.
The system is readable as a whole, not just searchable as text.
- Layer 02 · Investigation · Recon
Evaluating work before engineers lose time.
Recon evaluates each ticket through a pipeline of bounded-epistemology stages — reconnaissance, archetype classification, problem scoping, and archetype-specific investigation — and emits the targeted pushback questions a senior engineer would ask during planning, grounded in actual codebase analysis.
Ambiguity surfaces before, not after, engineering time is spent on it.
- Layer 03 · Memory · Corpus
Institutional knowledge that compounds.
Corpus is the structured, git-backed knowledge repository beneath the system — institutional memory as a queryable substrate, not a wiki. Investigations, architectural decisions, and resolved incidents are captured as structured findings; specialist faculty agents accumulate domain expertise across many tickets in the same service.
The 50th answer is better than the 5th — and the system can show you why.
- Layer 04 · Execution · WorkStream
Grounded in everything above.
WorkStream orchestrates execution as a DAG: tasks decomposed with explicit dependencies, parallel work identified, each task run in its own bounded context. Where the layers above are stable, the system drafts changes, scaffolds migrations, and opens pull requests with tests and evidence — all carrying the context of the layers beneath them.
Execution without understanding is a liability. The architecture refuses it by design.
Execution expands only where the layers above it are mature enough to support it on a given codebase. We do not run pull requests against systems we cannot read structurally.
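The DAG orchestration described under WorkStream can be sketched minimally: tasks with explicit dependencies, grouped into "waves" that can run in parallel. The task names and the wave-grouping helper below are illustrative assumptions, not smpl's actual API.

```python
# Hypothetical sketch of DAG-style task orchestration: each task declares
# its dependencies, and tasks whose dependencies are all satisfied form a
# wave of work that can run in parallel.
def parallel_waves(deps):
    """deps maps each task to the set of tasks it depends on."""
    remaining = {task: set(d) for task, d in deps.items()}
    waves = []
    while remaining:
        # Tasks with no unsatisfied dependencies are ready to run now.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cycle detected in task graph")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        # Completed tasks no longer block anything downstream.
        for d in remaining.values():
            d.difference_update(ready)
    return waves

# Illustrative task graph for a schema-change ticket.
deps = {
    "scaffold_migration": set(),
    "write_schema": set(),
    "update_service": {"write_schema"},
    "add_tests": {"scaffold_migration", "update_service"},
    "open_pr": {"add_tests"},
}
waves = parallel_waves(deps)
# First wave holds the two independent tasks; the PR opens last.
```

The point of the sketch is the shape, not the helper: explicit dependencies make parallelism a property of the graph rather than a judgment call made mid-execution.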
§ 02 · ARCHITECTURE
How smpl sits relative to your codebase.
A cross-section: your repository and its history are the substrate; smpl is the layered intelligence maintained above it.
§ 03 · OPERATIONAL MODEL
Read-only by default. Ephemeral by construction.
smpl operates in your environment, not on a third-party fork of your codebase. Investigation agents query through ephemeral, read-only access scoped per evaluation. Credentials are resolved per workspace and injected only for the duration of the work.
- In your environment
Deployed inside the boundary you already trust. We do not require source code to leave your infrastructure for any layer of the system to operate.
- Read-only by default
Lux, Recon, and Corpus operate without write access to the codebase. The only write surface in the architecture is execution-layer artifacts — patches, pull requests, tests — and those land as proposals subject to human approval.
- Bounded epistemology
Every investigation stage is a distinct agent with a single mandate, typed input and output, and a bounded epistemic domain. No stage sees the full investigation. No stage is permitted to reason outside its scope. The pipeline enforces the principle structurally.
- Human approval before merge
Even where execution is enabled, every change carries tests, type checks, linter compliance, and before/after evidence — and merges only after human review. Authority over what reaches production stays with humans.
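The bounded-epistemology principle above — one mandate per stage, typed input and output, no stage seeing the full investigation — can be illustrated with a minimal sketch. The stage names, types, and toy classification logic here are hypothetical, chosen only to show the structural constraint.

```python
from dataclasses import dataclass

# Hypothetical sketch: each stage is a function with a single mandate and
# a typed input/output boundary. A stage cannot reason outside its scope
# because its scope is all it receives.

@dataclass(frozen=True)
class Ticket:
    title: str
    body: str

@dataclass(frozen=True)
class Archetype:
    label: str  # e.g. "bug" or "feature"

@dataclass(frozen=True)
class Scope:
    archetype: Archetype
    modules: tuple  # modules the problem plausibly touches

def classify(ticket: Ticket) -> Archetype:
    # Mandate: classification only. Sees the ticket, nothing else.
    label = "bug" if "error" in ticket.body.lower() else "feature"
    return Archetype(label)

def scope_problem(archetype: Archetype) -> Scope:
    # Mandate: scoping only. Sees the archetype, not the raw ticket.
    modules = ("billing",) if archetype.label == "bug" else ()
    return Scope(archetype, modules)

ticket = Ticket("Checkout fails", "Users see an error at payment.")
result = scope_problem(classify(ticket))
```

Because the boundaries are types rather than conventions, a stage that tried to reach beyond its mandate would fail at the interface, not in review — the pipeline enforces the principle structurally.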
§ 04 · WHAT THIS IS NOT
Not a coding assistant. Not a chatbot with a repo URL.
The category most adjacent to smpl is the AI coding assistant — autocomplete, chat-with-your-code, agent-in-an-IDE. The work smpl does looks similar from the outside (tickets become pull requests), but it operates through a fundamentally different mechanism, and the difference compounds.
A coding assistant is a tool a developer wields. smpl is a system that holds the codebase as a peer member of the team — building a structural view, evaluating work before implementation, preserving the result of every investigation, and grounding execution in everything that came before.
The differences worth naming:
- Persistence vs. session. Coding assistants forget between sessions; smpl preserves and deepens context across every ticket the team handles.
- Investigation vs. completion. Coding assistants generate code from intent; smpl evaluates whether the intent is solid before any code is written.
- Whole-system vs. file-level. Coding assistants reason within open files; smpl reasons across the system as a structural whole.
- Compound memory vs. fresh context. Coding assistants restart from zero; smpl's institutional memory is the substrate every new investigation queries.
None of this is about replacing engineers. It is about removing the operational tax of repeatedly rediscovering things the system already knows — and grounding execution in that recovered understanding.
§ 05 · DEEPER READING
Where this thinking is articulated in full.
This page is the architectural summary. Two longer pieces sit underneath it for evaluators who want the full mechanism and the principles it rests on.
- What Is Neil?
The applied paper. Five layers of optimization, the seven-tier learning engine, the investigation pipeline, the four-level defense against context exhaustion, and the architectural reasons that "AI engineer" is not a thing you can buy.
Read the paper →
- Ontological Foundations
The underlying argument. What an agentic system has to be in order to operate inside a real codebase without losing it. Three layers, four modes, bounded epistemology, and the architectural reasons monolithic agents fail.
Read the paper →
This is the architecture. Your codebase is what makes it real.
Start with the Codebase Intelligence Review. We analyze one of your repositories — architecture, dependencies, domain structure, knowledge concentration — and deliver a written assessment that shows where the system is hardest to hold and what to do next.