Subquadratic, a Miami startup that emerged from stealth in May with $29mn in seed funding, says its SubQ language model can bypass the quadratic attention bottleneck that makes large language models slow, costly, and power-hungry. The company says SubQ can read up to 12 times as much text at once by replacing dense attention with a dynamic sparse attention system that focuses only on the most relevant word pairs.
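To make the "quadratic bottleneck" concrete, the sketch below counts how many word pairs an attention layer scores as the input grows. It is an illustration of the general idea only, not Subquadratic's method: the function names and the fixed budget of `k` relevant words per token are assumptions made for the example, and the sketch ignores the cost of deciding which pairs are relevant.

```python
def dense_pairs(n_tokens: int) -> int:
    """Dense attention scores every token against every token: n * n pairs."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, k: int) -> int:
    """Hypothetical sparse scheme: each token scores at most k other tokens."""
    return n_tokens * min(k, n_tokens)


if __name__ == "__main__":
    k = 64  # assumed per-token budget, chosen only for illustration
    for n in (1_000, 2_000, 4_000):
        print(n, dense_pairs(n), sparse_pairs(n, k))
```

Under these assumptions, doubling the input quadruples the dense pair count but only doubles the sparse one, which is why a sparse scheme could in principle handle much longer inputs for the same compute.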
Independent testing by Appen has strengthened the company's case after early skepticism and comparisons with Theranos. In those tests, SubQ ran 56 times faster than FlashAttention in a raw speed test and scored 89.7 per cent on a difficult coding benchmark. Subquadratic also says one long-context test that costs about $2,600 on Anthropic’s top model cost $8 on SubQ.
Significant questions remain. SubQ is not broadly available: tens of thousands of people are on a waitlist, and access has been limited so far. The model was adapted from an existing open-weight model rather than trained from scratch, and an independent researcher said the public evidence does not yet prove that Subquadratic has solved the quadratic attention bottleneck.
