Deep Tech · Seed · Patent Pending

Hypernym

Runtime compression and provenance infrastructure for the AI stack. Hypernym compresses context at the semantic level—making models faster, cheaper, and provably grounded without changing your workflow.

AI inference is expensive.
Context is why.

Every AI agent—coding, research, legal, medical—needs to read massive amounts of context before it can act. This consumes tokens, burns compute, and scales cost linearly. The more capable the model, the worse the problem gets.

$3.54
Cost Per Task
Average agentic coding task on Devin without compression
388s
Time Per Task
Wall-clock SWE-bench task completion without optimization
0%
Baseline Resolution
Control group pass rate on hard SWE-bench problems (0/5)
Stack Depth
Hardware → Models → Context → Orchestrators → Apps. Every layer guesses.

Semantic compression that makes AI structurally smarter.

Hypernym operates at the layer between your data and your models. It compresses context while preserving—and proving—semantic fidelity.

01  Compression

87% context reduction through word-level semantic compression. A 3B model outperforms an 8B on Hypernym’s engine—same weights, denser context. 90%+ token reduction with controllable fidelity.

02  Provenance

Every output carries a mechanical source chain. Where a fact came from, how it was derived, whether it survives verification. Your provenance—portable, auditable, not platform-owned.

03  Compounding

What one agent figures out, the next one builds on. Intelligence compounds across agents and sessions without data crossing boundaries. The algorithms improve for everyone.

Mercury: Semantic Field Constriction

Based on our patent-pending research (Forrester & Sulea, 2025). A novel word-level semantic compression scheme that achieves 90%+ token reduction while preserving semantic similarity to source text.

Phase 1
P-Span Sweep
20 compression levels scanned to find optimal parameters for the input domain and content type.
Phase 2
Stochastic Trials
60 independent compression trials across high-dimensional slices. Each trial compresses via hypernym substitution independently.
Phase 3
Coherence Analysis
Pairwise cosine similarity scoring across all 60 trials. Pareto-optimal frontier identified for compression vs. fidelity.
Phase 4
Semantic Assembly
Single-linkage clustering at 0.85 similarity threshold. Facts ranked by cross-trial frequency. 90%+ = highly stable.
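The consensus step in Phases 2–4 can be sketched in a few dozen lines. This is an illustrative toy, not the actual Mercury implementation: `embed` is a caller-supplied embedding function, and the helper names are hypothetical. It runs independent trials' extracted facts through pairwise cosine similarity, single-linkage clustering at the 0.85 threshold, and cross-trial frequency ranking.

```python
# Illustrative sketch of stochastic-trial consensus (Phases 2-4).
# Helper names and data shapes are assumptions, not Mercury's API.
import itertools
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def single_linkage_clusters(vectors, threshold=0.85):
    """Union-find single-linkage clustering: merge any pair of
    vectors whose cosine similarity meets the threshold."""
    parent = list(range(len(vectors)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for i, j in itertools.combinations(range(len(vectors)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            parent[find(i)] = find(j)
    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def stable_facts(trial_facts, embed, n_trials, threshold=0.85, min_freq=0.9):
    """Cluster fact embeddings across trials; keep clusters that
    appear in >= min_freq of trials (90%+ = highly stable)."""
    facts = [f for trial in trial_facts for f in trial]
    owners = [t for t, trial in enumerate(trial_facts) for _ in trial]
    vecs = [embed(f) for f in facts]
    stable = []
    for cluster in single_linkage_clusters(vecs, threshold):
        trials_seen = {owners[i] for i in cluster}
        if len(trials_seen) / n_trials >= min_freq:
            stable.append(facts[cluster[0]])
    return stable
```

With 60 trials and `min_freq=0.9`, a fact survives only if it recurs (up to 0.85-similarity paraphrase) in at least 54 of them, which is what makes the retained set stable rather than a single trial's guess.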
Core Mechanism
Semantic Field Identification

Analyzes input text to map lexical items to semantic hierarchies. Identifies which specific terms share broader categorical relationships.

Hypernym Substitution

Related words are replaced with their higher-order semantic category (hypernym). Positional markers and metadata preserve reconstruction context.

Lossless Reconstruction

Sufficient contextual information is retained to recover original semantic precision. Controllable fidelity dial—from aggressive compression to fully lossless.
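The three-step mechanism above can be illustrated with a toy substitution pass. The two-entry hypernym table is a stand-in for a real lexical hierarchy such as WordNet, and the function names are hypothetical; the point is only how positional markers make the substitution reversible.

```python
# Toy hypernym substitution with positional markers.
# HYPERNYMS is a hand-coded stand-in for a real semantic hierarchy.
HYPERNYMS = {
    "sparrow": "bird", "robin": "bird",
    "oak": "tree", "maple": "tree",
}

def compress(tokens):
    """Replace words with their hypernym; record (position, original)
    markers so the text can be reconstructed losslessly."""
    out, markers = [], []
    for i, tok in enumerate(tokens):
        hyper = HYPERNYMS.get(tok)
        if hyper is not None:
            out.append(hyper)
            markers.append((i, tok))
        else:
            out.append(tok)
    return out, markers

def reconstruct(tokens, markers):
    """Undo the substitution using the positional markers."""
    restored = list(tokens)
    for i, original in markers:
        restored[i] = original
    return restored

text = "the sparrow and the robin nested in the oak".split()
compressed, markers = compress(text)
# compressed reads: "the bird and the bird nested in the tree"
assert reconstruct(compressed, markers) == text
```

In this toy the token count is unchanged; real savings come when many specific terms collapse into shared categories and the marker metadata is itself compact. Dropping the markers gives aggressive compression; keeping them all is the lossless end of the fidelity dial.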

Compression vs. Semantic Fidelity (Pareto Frontier)
SECTORS: Code · Prose · Legal · Medical · Academic · Financial
PAPER: arxiv.org/abs/2505.08058

Measured results, not projections.

Results span SWE-bench Verified, a 1,200-sample model-compression benchmark, and mechanical verification across 800 samples. All reproducible.

SWE-Bench Speed
388s → 192s task completion.
30× more context in the same window.
10× throughput per dollar.
3B > 8B
Model Compression
Granite 3B outperforms Llama 3.1 8B.
1.36× faster. 2.3× smaller.
1,200 sample benchmark.
100%
Grounding Rate
Cross-model hybrid architecture.
Zero fabrication. Every claim traced.
800 sample mechanical verification.
Compression Performance Across Benchmarks
Cost & Time Reduction

Agent Benchmark: Devin on SWE-bench Verified

Measuring wall-clock time reduction and task resolution when AI coding agents receive Hypernym-compressed codebase context. Controlled experiment, reproducible results.

Experiment Design
test: pytest-dev__pytest-5787
difficulty: 1–4 hours
external_solves: 31
control_runs: 5
treatment_runs: 5
benchmark: SWE-bench Verified (Opus 4.6)
Control — Without Hypernym
0 / 5
Treatment — With Hypernym
4 / 5 — 80%
→ Full interactive benchmark at mark2.hypernym.ai
87%
Context Compressed
Semantic fidelity preserved through stochastic trial consensus
96.4%
Size Reduction
Raw token count reduction while maintaining all critical facts
$3.54
$0.35
Per-Task Cost
388s
192s
Wall-Clock Time
At Scale: Annual Savings Estimate
$11,500
Per Developer / Year
730 hrs
Dev Time Saved / Year
30–60%
Routine Task Speedup

Compression metrics across all evaluations.

| Metric | Value | Method | Notes |
|---|---|---|---|
| Context Compression | 87% | 60-trial stochastic consensus | Measured on SWE-bench Verified codebases |
| Semantic Similarity | 87.2% | Cosine similarity, source vs. compressed | Cross-genre: code, prose, scientific text |
| Token Reduction | 90%+ | Mercury hypernym substitution | Controllable fidelity dial (aggressive to lossless) |
| Compression Ratio | 9.7× | Input tokens / output tokens | Demonstrated on Wizard of Oz Chapter 1 (1,840 → 191 tokens) |
| Grounding Rate | 100% | Mechanical verification, not LLM | 800-sample benchmark, zero fabrication |
| Task Resolution Lift | 0% → 80% | SWE-bench Verified (Opus 4.6) | pytest-5787, 1–4h difficulty |
| Cost Per Task | $3.54 → $0.35 | Token cost at inference | 10× reduction, Devin benchmark |
| Wall-Clock Speedup | ~2× | End-to-end task time | 388s → 192s on SWE-bench |

One engine. Every domain that runs inference.

Hypernym’s compression is domain-aware and sector-configurable. The same engine adapts to code, scientific research, legal text, and medical literature. Partners define their domains—Hypercore is the engine they run on.

Models & Inference

Model providers use Hypernym to make smaller models competitive with larger ones. Compression delivers quality gains that model training alone is years away from matching.

Llama · Granite · Devin · Command · Claude
Live · Pilot · Pipeline

Code & Developer Tools

AI coding agents get compressed codebase context via a simple API. GitHub OAuth integration runs on every session, all day. 30–60% raw speedup on tasks.

Hypercode · Windsurf · Claude Code · Cline
Live · Pipeline

Memory & Knowledge

Memory platforms use Hypernym to compress and cross-reference persistent context. What one session learns carries forward to the next.

Amble · Wordware · Boardy · Halo
Live · Pilot · Pipeline

Biomedical & Legal

Domain-specialized compression for research literature, clinical data, patent filings, and legal documents. 597M+ records indexed across 26 databases.

Osmium (biomedical) · TrustFoundry (legal)
Live · Pilot

Deploys into your stack. Invisible to your users.

Cloud API

Managed service. One API call. GitHub OAuth for repo access. Cerebras-powered inference for rapid response.
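A single call could look roughly like the sketch below. The endpoint, field names, and parameters are illustrative assumptions, not Hypernym's documented API; it only shows the shape of a one-call integration.

```python
# Hypothetical request body for the managed Cloud API.
# Field names ("input", "fidelity", "sector") are assumptions.
import json

def build_compress_request(text, fidelity=0.85, sector="code"):
    """Assemble a JSON body for a single compression call."""
    return json.dumps({
        "input": text,
        "fidelity": fidelity,   # assumed dial: 0 = aggressive, 1 = lossless
        "sector": sector,       # assumed domain hint (code, legal, medical...)
    })

body = build_compress_request("def add(a, b):\n    return a + b")
```

The compressed context comes back in place of the raw text, so agents downstream see a smaller window with the same facts.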

Private VPC

Deploy to AWS VPC. Use your own inference providers. Data completely insulated, end to end. Containerized.

Compatible With

Claude Code · Codex · Devin · MCP-compatible tools. Composes with existing optimizations.

Meta AWS Cohere Intel NVIDIA