Subquadratic, a Miami-based AI startup, released results from third-party testing claiming its SubQ model runs far faster and far cheaper than typical large language models while handling context windows up to 12 million tokens.
Appen, an independent testing firm, measured SubQ as 56 times faster than models using FlashAttention in a theoretical speed test. The model scored 89.7% on LiveCodeBench for coding and 98% on a long-context retrieval test at 6 million and 12 million token windows.
The cost comparison is striking. Running Nvidia's RULER 128 benchmark on Anthropic's Opus 4.6 costs $2,600. Subquadratic says the same test costs $8 on SubQ. If accurate, that is not an incremental improvement. That is a different category of efficiency.
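The size of that gap is simple arithmetic. A quick sketch using only the two dollar figures quoted above (the variable names are illustrative, and neither figure has been independently verified):

```python
# Dollar figures as quoted in the article; treated here as claims, not measurements.
opus_cost_usd = 2600.0
subq_cost_usd = 8.0

# How many times more expensive the Opus 4.6 run is than the SubQ run.
cost_ratio = opus_cost_usd / subq_cost_usd

# Fraction of the Opus 4.6 cost that would be saved by switching.
savings_fraction = 1 - subq_cost_usd / opus_cost_usd

print(f"Opus 4.6 run costs {cost_ratio:.0f}x the SubQ run")
print(f"Claimed saving: {savings_fraction:.1%}")
```

On the quoted numbers, that works out to a 325-fold difference, or a saving of about 99.7%.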
CEO Justin Dangel said the company is "kicking off a new age of efficiency." The claim rests on replacing the transformer's dense attention mechanism with dynamic sparse attention, which computes attention only for selected token pairs rather than across all pairs. In theory, that solves the quadratic compute bottleneck that has constrained long-context processing.
In theory.
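To see why the quadratic bottleneck matters at these context lengths, it helps to count token pairs. The sketch below is not Subquadratic's method, which is undisclosed. It assumes, purely for illustration, a sparse scheme in which each query token attends to a fixed budget of keys. The function names and the 4,096-key budget are hypothetical.

```python
def dense_pairs(n_tokens: int) -> int:
    """Token pairs scored by dense attention: every token against every token."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Token pairs scored if each query attends to at most a fixed number of keys.

    The fixed per-query budget is an assumption for illustration only; it ignores
    whatever work is needed to choose which keys to keep.
    """
    return n_tokens * min(keys_per_query, n_tokens)


KEY_BUDGET = 4096  # hypothetical budget, not a figure from Subquadratic

for n in (128_000, 1_000_000, 12_000_000):
    dense = dense_pairs(n)
    sparse = sparse_pairs(n, KEY_BUDGET)
    print(f"{n:>10,} tokens: dense={dense:.2e} sparse={sparse:.2e} ratio={dense / sparse:,.0f}x")
```

Under this assumption the gap grows linearly with context length, which is why sparse approaches look most attractive at millions of tokens. The count leaves out the cost of the selection step itself, which is exactly the part the company has not described.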
Opacity problem
Subquadratic declines to disclose the exact selection algorithm used for its dynamic sparse attention. That is a notable choice. If the company has solved a fundamental problem in AI efficiency, publishing the algorithm would establish credibility, attract talent and potentially command a premium valuation.
The silence suggests either that the solution is incremental rather than fundamental, or that the company believes it can maintain competitive advantage through secrecy. Neither is reassuring.
The company also bootstrapped SubQ from weights of an open-source Qwen model rather than training wholly from scratch. That means SubQ is an optimized version of an existing model, not a new architecture proving the approach works from first principles.
Access bottleneck
Subquadratic has kept access severely limited. The company says tens of thousands have joined a waitlist and more than 500 enterprises have signed up for early access. But very few people have live access to test the claims.
That is the real red flag. If SubQ delivers what the benchmarks claim, the only thing holding back adoption would be capacity, not the lack of a proof of concept, and companies would be fighting to get in. The fact that Subquadratic is managing access so carefully suggests the company is controlling the narrative around what SubQ can actually do.
What remains to be proven
The third-party testing is real. Appen is a credible firm. The benchmarks are impressive on paper. But benchmarks are not products. A model that is 56 times faster in theoretical tests may not be 56 times faster in production workloads with real data and real constraints.
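One standard way to reason about that gap is Amdahl's law: if only part of a workload gets faster, the rest caps the overall gain. The sketch below assumes, for illustration only, that the 56x figure applies to one portion of total runtime. The fractions are invented, not measurements of SubQ.

```python
def end_to_end_speedup(accelerated_fraction: float, component_speedup: float) -> float:
    """Amdahl's law: overall speedup when only part of the runtime is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / component_speedup)


CLAIMED_SPEEDUP = 56.0  # figure quoted from the Appen test

# Hypothetical shares of runtime spent in the accelerated portion.
for fraction in (0.5, 0.9, 0.99, 1.0):
    overall = end_to_end_speedup(fraction, CLAIMED_SPEEDUP)
    print(f"{fraction:.0%} of runtime accelerated -> {overall:.1f}x overall")
```

With these made-up fractions, a workload that spends half its time in the accelerated portion ends up just under 2x faster overall, and one that spends 90% there reaches roughly 8.6x. Only a workload that is entirely attention-bound would see the full 56x.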
Subquadratic will eventually have to grant access to enough users that independent verification becomes possible. Until then, the efficiency gains remain claims, not proven capability.