# TokenAuditor.ai Whitepaper v0.2

## Abstract

TokenAuditor.ai is an independent evidence layer for AI infrastructure.

It turns supplier claims about AI API gateways, model routes, agent tool calls,
model identity, degradation, token efficiency, and retrieval systems into
repeatable evidence.

The v0 wedge is intentionally narrow:

> TokenAuditor.ai starts with MCP-based router and gateway integrity audits.

We do not try to prove from a single test that a supplier is permanently safe.
We help users, developers, researchers, and suppliers build a shared evidence
path: what was claimed, what was observed, what was sampled, what was redacted,
what policy decision was made, and what can be reproduced.

Our operating principle is simple:

> Transparent to users. Fair to suppliers. We only stand with evidence.

## 1. The Trust Break In AI Infrastructure

AI infrastructure buyers increasingly depend on systems they cannot directly
verify:

- API gateways that claim to route to a specific upstream model
- aggregators that can silently fall back to or substitute models
- agents that receive tool-call instructions through intermediary responses
- model providers that can change endpoint behavior without a clear version
  boundary
- retrieval systems that claim low hallucination without reproducible evidence
- dashboards that report token usage, cost, latency, and throughput using
  supplier-controlled measurements

The issue is not that every supplier is dishonest. The issue is that the modern
AI stack contains trust boundaries where honest behavior is difficult to prove.

TokenAuditor.ai exists to make those claims measurable.

## 2. Research Foundation

The first TokenAuditor wedge is informed by the paper "Your Agent Is Mine:
Measuring Malicious Intermediary Attacks on the LLM Supply Chain"
(`https://arxiv.org/abs/2604.08407`).

The key product lesson is that LLM API routers, gateways, and aggregators are
not just billing or routing utilities. They can become application-layer
intermediaries between the user's agent and the upstream model. In that position
they may observe, delay, rewrite, inject into, or substitute plaintext agent
payloads.

Relevant risk classes include:

- response-side payload injection
- passive secret exposure through requests, responses, logs, or tool arguments
- dependency-targeted command or package manipulation
- conditional payload delivery triggered by warm-up calls, project signals, tool
  names, or high-agency sessions
- silent fallback, model substitution, and material degradation without
  disclosure

This research provides an empirical foundation for treating AI API
intermediaries as a new supply-chain security boundary.

TokenAuditor.ai aims to operationalize that insight.

## 3. Why MCP First

MCP is the right first audit surface because it sits close to local agent
workflows and tool-call decisions.

TokenAuditor does not begin as another opaque production proxy. The v0 approach
is a local-first MCP auditor that helps a user or agent inspect routes, redacted
traces, tool calls, policy decisions, and local evidence reports without
uploading raw prompts or secrets.

This gives three early advantages:

- **Trust boundary clarity:** the MCP server runs locally over stdio.
- **Secret-blind defaults:** route discovery can inspect config structure and
  environment variable names without reading API key values.
- **Actionable policy gates:** suspicious tool calls can be blocked or escalated
  before the user executes them.
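The secret-blind default can be made concrete with a short sketch. The helper below is illustrative, not the actual `discover_routes` implementation: it records environment variable names and a salted hash of each value, so route audits can detect that a credential changed without ever storing the credential itself.

```python
import hashlib
import os

# Name markers that suggest a variable holds a credential (illustrative list).
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")

def redacted_env_inventory(environ=None, salt="local-salt"):
    """Return env var names plus salted value hashes; never raw values."""
    environ = dict(os.environ if environ is None else environ)
    inventory = []
    for name, value in sorted(environ.items()):
        sensitive = any(marker in name.upper() for marker in SENSITIVE_MARKERS)
        # A truncated salted hash lets later audits detect value changes
        # without the auditor ever persisting the secret itself.
        digest = hashlib.sha256((salt + value).encode()).hexdigest()[:12]
        inventory.append({
            "name": name,
            "sensitive": sensitive,
            "value_hash": digest,
        })
    return inventory
```

The same hash-not-value pattern applies anywhere the auditor touches config files that embed credentials inline.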

MCP alone does not observe every LLM request. The long-term architecture should
pair the MCP server with optional local SDK integrations or a local sidecar that
produces redacted, hash-based evidence. The v0 priority is to prove the evidence
workflow before adding broader capture surfaces.

## 4. Product Wedge: Verified Gateway

The first product surface is TokenAuditor Verified Gateway.

It focuses on AI API gateways, routers, and aggregators. The audit asks:

1. What route and model does this project claim to use?
2. Do redacted traces show routing, fallback, token, latency, schema, or
   tool-call anomalies?
3. Does a route appear consistent with the claimed model identity?
4. Does a current quality window suggest material degradation from baseline?
5. Should the workflow stay in `Watch`, move to `Sample`, or require a `Probe`
   plan?
6. Should a proposed tool call be allowed, warned, blocked, or escalated for
   human approval?

The public result should be an evidence memo, not a casual accusation.
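The decision space in question 6 can be pictured as a small gate. This is a hedged sketch: the signal names and weights below are illustrative placeholders, not the spec's actual schema or policy.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"

# Illustrative signal weights; a real policy would be configurable.
SIGNAL_WEIGHTS = {
    "tool_schema_drift": 1,
    "undisclosed_fallback": 2,
    "secret_in_arguments": 4,
    "dependency_manipulation": 4,
}

def policy_gate_check(risk_signals):
    """Map observed risk signals on a proposed tool call to a decision."""
    score = sum(SIGNAL_WEIGHTS.get(s, 0) for s in risk_signals)
    if score >= 4:
        return Decision.BLOCK
    if score >= 2:
        return Decision.REQUIRE_APPROVAL
    if score >= 1:
        return Decision.WARN
    return Decision.ALLOW
```

The design choice worth noting is that high-severity signals map directly to `block` rather than accumulating: a single secret-exfiltration signal should never be outvoted by otherwise clean traffic.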

## 5. MCP Architecture

TokenAuditor MCP v0.1 has two conceptual layers:

- **MCP server:** local stdio server that exposes audit tools to clients such as
  Codex, Claude Desktop, Cursor, and other agent hosts.
- **Local evidence layer:** append-only, redacted, hash-based events and local
  reports stored on the user's machine by default.

Current v0.1 tools:

- `discover_routes`
- `audit_redacted_trace`
- `screen_tool_call`
- `compare_model_identity`
- `evaluate_degradation_window`
- `generate_probe_plan`
- `policy_gate_check`
- `append_transparency_event`
- `generate_local_report`

The engineering spec is maintained in `MCP_SPEC.md`.
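Independent of any particular MCP SDK, the v0.1 tool surface can be sketched as a local dispatch table. The handler bodies below are stubs, not the real implementations, which are defined in `MCP_SPEC.md`; the point is only that every tool is named, routed locally, and fails loudly when unknown.

```python
def _stub(tool_name):
    """Produce a placeholder handler that echoes the tool name and arg names."""
    def handler(**kwargs):
        return {"tool": tool_name, "args": sorted(kwargs)}
    return handler

# The nine v0.1 tools, registered under their spec names.
TOOLS = {
    name: _stub(name)
    for name in (
        "discover_routes",
        "audit_redacted_trace",
        "screen_tool_call",
        "compare_model_identity",
        "evaluate_degradation_window",
        "generate_probe_plan",
        "policy_gate_check",
        "append_transparency_event",
        "generate_local_report",
    )
}

def call_tool(name, **kwargs):
    """Route a tool call by name; unknown tools fail loudly."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```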

## 6. Fair Audit Protocol

TokenAuditor should protect users without turning suppliers into targets.

The protocol defaults:

```text
100% lightweight metadata visibility
1% deterministic baseline deep-audit sampling for low-risk events
5% maximum deep-audit ceiling for non-suspicious events
0% secret collection
0 active probes without explicit user consent
```
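These defaults imply a deterministic sampling function: hash the event id into a stable bucket, deep-audit roughly 1% of low-risk events, never exceed the 5% ceiling for non-suspicious traffic, and always bypass the sampler for suspicious events. A sketch under those assumptions (field names and rates are illustrative):

```python
import hashlib

BASELINE_RATE = 0.01   # deterministic deep-audit sampling for low-risk events
CEILING_RATE = 0.05    # hard ceiling for non-suspicious events

def deep_audit_decision(event_id, suspicious=False, boost_rate=0.0):
    """Decide whether an event gets a deep audit. Deterministic per event id."""
    if suspicious:
        return True  # suspicious bypass: always deep-audit
    rate = min(BASELINE_RATE + boost_rate, CEILING_RATE)
    # Map the event id to a stable value in [0, 1) so the same event
    # always receives the same sampling decision across replays.
    digest = hashlib.sha256(event_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate
```

Determinism matters here for supplier fairness: a disputed sampling decision can be replayed exactly from the event id, rather than attributed to a random draw.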

Rules:

- **Secret-blind:** do not collect API keys, session tokens, private keys,
  passwords, private IDs, or sensitive personal data.
- **Local-first:** keep raw evidence on the user's machine unless the user opts
  in to sharing.
- **Metadata-first:** observe route, model, token, latency, retry, fallback,
  finish reason, and tool schema signals before any content-level review.
- **Roadside sampling:** most traffic stays in lightweight observation.
- **Suspicious bypass:** high-risk signals, `Probe` states, `block`, and
  `require_approval` decisions receive deeper local review.
- **Consent-based probes:** show target, cost, sample size, and risk before any
  active probe.
- **Evidence before accusation:** do not publish claims from a single anomaly.
- **Paid audit neutrality:** supplier payment buys audit work, not passing
  results.
- **Ranking separation:** ratings, ads, recommendations, affiliate routing, and
  certification must remain labeled and separable.

## 7. Evidence Bundle Standard

Every meaningful audit should produce an evidence bundle.

Minimum fields:

- audit scope
- supplier or route label
- claimed model or route
- observed model, fingerprint, or drift signals
- redacted trace summary
- sampling policy and sampling decision
- policy decision
- evidence classes
- sample size
- time window
- confidence
- limitations
- reproduction steps
- redaction statement
- supplier response status when applicable

Evidence classes:

- `config_signal`
- `trace_signal`
- `billing_signal`
- `latency_signal`
- `usage_signal`
- `tool_call_signal`
- `redaction_signal`
- `policy_signal`
- `baseline_drift_signal`

The purpose is to make disagreements concrete. A supplier should be able to
inspect the scope, reproduce the test where possible, and respond to the
evidence.
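The minimum fields can be sketched as a typed record. Field names below follow the list above but are a sketch, not the published schema:

```python
from dataclasses import dataclass, asdict

ALLOWED_EVIDENCE_CLASSES = {
    "config_signal", "trace_signal", "billing_signal", "latency_signal",
    "usage_signal", "tool_call_signal", "redaction_signal",
    "policy_signal", "baseline_drift_signal",
}

@dataclass
class EvidenceBundle:
    """Minimal evidence bundle; field names follow the list above."""
    audit_scope: str
    supplier_label: str
    claimed_route: str
    observed_signals: list
    trace_summary: str
    sampling_policy: str
    policy_decision: str
    evidence_classes: list
    sample_size: int
    time_window: str
    confidence: str
    limitations: str
    reproduction_steps: str
    redaction_statement: str
    supplier_response_status: str = "not_yet_contacted"

    def validate(self):
        """Reject unknown evidence classes and empty samples; return a dict."""
        unknown = set(self.evidence_classes) - ALLOWED_EVIDENCE_CLASSES
        if unknown:
            raise ValueError(f"unknown evidence classes: {sorted(unknown)}")
        if self.sample_size < 1:
            raise ValueError("sample_size must be at least 1")
        return asdict(self)
```

Keeping `limitations` and `redaction_statement` as required fields, rather than optional ones, encodes the fairness rule in the schema itself.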

## 8. Supplier Fairness

TokenAuditor's credibility depends on being fair to suppliers.

Public-facing audit reports should follow these constraints:

- no definitive fraud claim from one sample
- no public accusation without scope, sample size, time window, confidence, and
  limitations
- no conflation of latency drift, disclosed fallback, undisclosed fallback, and
  malicious substitution
- no passing result guaranteed by payment
- no hidden affiliate influence in scores
- supplier response or remediation status should be included when available

This is not softness. It is how the evidence layer stays trusted.

## 9. Business Model

TokenAuditor does not need to charge individual developers for basic truth.

The near-term business model should stay close to evidence production:

- private MCP or daemon monitoring for enterprise agent workflows
- supplier certification and continuous audit licensing
- procurement, due-diligence, and risk reports for enterprise buyers
- savings-share advisory only when cost optimization is backed by reproducible
  quality and route evidence
- trusted routing referrals only with clear conflict disclosure

Later-stage governance, insurance, compliance, data APIs, and community
incentives may become important, but they should not distract from the v0 wedge.

## 10. Roadmap

### Phase 0: Evidence Wedge

- publish the MCP-first open-source project
- document the fair audit protocol
- collect redacted route-audit stories from early users
- publish a manual router-integrity report with responsible disclosure

### Phase 1: Verified Gateway

- improve route discovery for OpenRouter, LiteLLM, Vercel AI SDK, custom
  OpenAI-compatible gateways, and shared-key routers
- define the public evidence bundle schema
- add reproducible demo fixtures based on local simulations
- begin private enterprise monitoring pilots

### Phase 2: Continuous Monitoring

- add optional SDK or sidecar integrations for local metadata capture
- build longitudinal route drift and degradation windows
- create supplier response and remediation workflows
- launch certification only after evidence standards mature

### Phase 3: Broader Audit Surface

- extend from router integrity into model capability tags, degradation audits,
  token efficiency, and RAG purity
- publish procurement-grade risk intelligence
- grow a contributor network around probe design, evidence review, and
  responsible disclosure

## 11. Open Questions

- Which users feel the most immediate pain from untrusted AI routers: solo
  developers, agent builders, enterprises, or procurement teams?
- What minimum evidence bundle is strong enough for supplier fairness?
- Which attack classes are best suited for continuous monitoring instead of
  one-off tests?
- What false-positive rate is acceptable before public reports?
- Which local integration path creates enough evidence without becoming a
  privacy risk?

## 12. Closing Thesis

AI infrastructure trust will not be solved by self-reported benchmarks alone.

TokenAuditor.ai starts with a small, defensible wedge: local-first MCP auditing
for router and gateway integrity. From there, it can grow into a broader
independent trust layer for model identity, degradation, token efficiency, and
retrieval quality.

The mission is not to be loud.

The mission is to make honesty verifiable.
