AI systems and agent infrastructure
The harness is the product, not the prompt.
We build the system that survives production. State, recovery, observability, orchestration. Full stack, database to deployment.
Every business is moving onto AI, the way they once moved off paper. Anyone can ship a prompt and a demo. The hard part is the architecture that keeps an agent correct, cheap, and recoverable in the real world. That is harness design — where systems are won or lost.
- Paper
- Prompt + demo
- Harness
What separates a system from a feature
Reliability is a design decision, not a patch.
Prompt-first
Sub-agent runs
side effect lands — API call, row written
Sub-agent dies
mid-task, after the effect
Orchestrator retries from the top
knowing none of it
Duplicate charge · corrupted data
the agent still looks fine
The default for agents built prompt-first.
How the design holds
Checkpoint state
before any stateful operation
Sub-agent runs
idempotency token per call
skip duplicate
Sub-agent dies
failure still happens — by design
retry
Recovering agent resumes
reads checkpoint, does not repeat
resume
Failure is recoverable by construction.
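The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production harness: `Store`, `charge`, and `run_task` are hypothetical names, the in-memory dictionaries stand in for a durable database, and the downstream call is assumed to honor an idempotency token.

```python
import uuid


class Store:
    """In-memory stand-in for a durable store and a downstream API (assumption)."""

    def __init__(self):
        self.checkpoints = {}  # task_id -> checkpoint dict
        self.receipts = {}     # idempotency token -> receipt of the first call
        self.charges = []      # side effects that actually landed


def charge(store, token, amount):
    """Downstream call that honors the token: a repeat returns the first receipt."""
    if token in store.receipts:
        return store.receipts[token]  # skip duplicate
    store.charges.append(amount)
    receipt = f"receipt-{len(store.charges)}"
    store.receipts[token] = receipt
    return receipt


def run_task(store, task_id, amount, die_after_effect=False):
    # Checkpoint before the stateful operation; a retry reuses the existing one.
    cp = store.checkpoints.setdefault(
        task_id, {"token": str(uuid.uuid4()), "amount": amount, "done": False}
    )
    if cp["done"]:
        return cp["receipt"]
    receipt = charge(store, cp["token"], cp["amount"])
    if die_after_effect:
        # Simulated worker death after the side effect has landed.
        raise RuntimeError("worker failed after the side effect landed")
    cp["receipt"] = receipt
    cp["done"] = True
    return receipt


store = Store()
try:
    run_task(store, "task-1", 50, die_after_effect=True)  # effect lands, worker dies
except RuntimeError:
    pass
recovered = run_task(store, "task-1", 50)  # recovery reads the checkpoint
```

The retry reuses the token stored in the checkpoint, so the second `charge` call returns the original receipt and `store.charges` still holds a single entry.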
Execution trace
- 00.00 checkpoint written
- 00.01 tool call started
- 00.04 side effect landed
- 00.05 worker failed
- 00.06 recovery read checkpoint
- 00.06 duplicate avoided
- Idempotency tokens
- Circuit breakers
- Cost guards
idem_key = sha256(agent_id + step + payload)
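One way to make the key formula above concrete, as a sketch under stated assumptions: the function name and field types are hypothetical, and the JSON canonicalization is a choice made here, not something the formula specifies. Plain string concatenation can collide (agent `a1` at step `2` and agent `a` at step `12` both concatenate to `a12`), so this version serializes the three fields as a structured record before hashing.

```python
import hashlib
import json


def idem_key(agent_id: str, step: int, payload: dict) -> str:
    """Derive a deterministic idempotency key from agent, step, and payload."""
    # Sorted keys and fixed separators give the same bytes for equal payloads,
    # regardless of the order in which the dict was built.
    canonical = json.dumps(
        {"agent_id": agent_id, "step": step, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the key depends only on its inputs, a recovering agent that replays the same step with the same payload derives the same key, and the downstream service can recognize the call as a duplicate.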
MRHF-Codec vs. Descript Audio Codec (DAC)
A neural audio codec we built and measured. High-frequency reconstruction at 6 kHz and above, against the Descript Audio Codec (DAC) baseline.
| Metric | MRHF-Codec | Baseline (DAC) | Delta |
|---|---|---|---|
| HF-SI-SDR @ 6 kHz+ | +11.64 dB | −30.44 dB | +42 dB |
| Bitrate | 11.8% lower | baseline | −11.8% |
| Metrics won | 7 / 10 | — | 7 / 10 |
MRHF-Codec — 11.7M-param generator, trained on 275,527 files (187.4 hrs), 240+ tests across 9 modules. Source: research project benchmark.
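The page does not define HF-SI-SDR. A reasonable reading is scale-invariant signal-to-distortion ratio (SI-SDR) computed on content at 6 kHz and above. The sketch below shows the standard SI-SDR calculation in plain Python and assumes both signals were already high-passed at 6 kHz upstream. The function name and the list-of-floats input are illustrative, and this is not the benchmark's actual evaluation code.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal is silent")
    # Project the estimate onto the reference to find the scaled target.
    alpha = dot / ref_energy
    target = [alpha * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        return math.inf   # perfect reconstruction up to scale
    if target_energy == 0:
        return -math.inf  # estimate is orthogonal to the reference
    return 10 * math.log10(target_energy / noise_energy)
```

Under this definition a negative score means the residual carries more energy than the component matching the reference, which is why a score of −30.44 dB indicates that almost none of the high-frequency content was reconstructed.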
What we build
Eight things we do, backed by shipped work.
From systems design to a research-grade ML edge.
Evidence, not adjectives
Shipped systems and open-source infrastructure.
Design judgment shows up in what survived production.
- Production
- A multi-tool AI-agent SaaS, web and desktop. 12+ API integrations, hardened connectors.
- Ecosystem
- goalkeeper on the Claude Code marketplace, plus reaper-mcp and vst-bench. Servers others run.
- Day one
- HIPAA-adjacent. Audit logging, on-device ML, designed in.
- Research
- Neural codecs and diffusion models, benchmarked.
How we work together
Three ways to bring us in.
01
Advisory
02
Fractional CTO
03
Build
Your first move onto AI, or keeping a production system honest: that is the work we do.
Tell us where you are with AI. We will tell you plainly if we can help.