Large language models have access to broader knowledge than any single human, yet they have not produced notable new ideas in the way that human polymaths sometimes do. Critics argue this limitation is structural: because language models are fundamentally designed to output the most likely next token based on massive training data, they are inherently tied to recombining pre-existing information rather than inventing something new. However, the author contests this, suggesting that human creativity is also about reshuffling ideas within the boundaries set by personal experience, which parallels the way language models function.
Examining whether language models have had new ideas, the article notes that while models can provide novel feedback or solve new programming problems presented by users, these instances do not typically fit what people mean by a "creative breakthrough." Academics increasingly use language models for data gathering and analysis, but the real test is whether these tools can independently generate the kind of groundbreaking "Eureka moments" that drive progress. There are scattered examples, such as GPT-4 discovering results in combinatorics or studies suggesting LLM-generated ideas are more novel, but these are critiqued as either serendipitous or lacking genuine promise, often seeming too random or reliant on external verification.
The article explores possible reasons for these creative constraints. One is the absence of real-world experimentation: many human breakthroughs emerge from physical interaction with the environment, which language models cannot directly replicate. Mathematics is noted as an exception, since it does not depend on physical experiments, yet even there models have not produced substantial results. Another likely reason is the current lack of engineered frameworks for eliciting new ideas from language models in a purposeful way. The author speculates that better scaffolding, akin to systems designed to help with code generation, may eventually unlock this creative potential. Until such systematic tools exist, genuinely original ideas from language models will likely remain rare or unconvincing.