Developers debate large language model coding in complex production codebases

Hacker News users shared detailed experiences using large language models inside messy, established codebases, from fully agent-driven workflows to strict bans on generated code. The discussion highlights productivity gains, testing strategies, and persistent limits around context, integration testing, and trust.

The original poster describes a startup that has deeply integrated large language models into everyday development across a monorepo that includes scheduled Python data workflows, two Next.js apps, Temporal workers and a Node worker. Each engineer receives Cursor Pro with Bugbot, Gemini Pro, OpenAI Pro, and optionally Claude Pro, and the poster estimates that large language models are worth about 1.5 excellent junior/mid-level engineers per engineer, which they argue easily justifies paying for multiple models. Heavy use of pre-commit hooks, type checkers, tests and auto-formatting lets models focus on producing types and tests, while coding standards and conventions are encoded in .cursor/rules and AGENT.md-style files to steer agents away from raw SQL and toward specific schema files.
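
The post does not name which formatter, type checker or test runner the team uses, so the following is only a sketch: a small Python gate script of the kind a pre-commit hook could invoke, with ruff, mypy and pytest as assumed stand-ins for the auto-formatting, type-checking and test steps described above.

```python
"""Local quality gate an agent-authored change must pass before commit.

Sketch only: the post mentions pre-commit hooks, type checkers, tests and
auto-formatting but not specific tools; ruff, mypy and pytest are assumed
stand-ins here.
"""
import subprocess
import sys

CHECKS = [
    ["ruff", "format", "--check", "."],  # auto-formatting
    ["ruff", "check", "."],              # lint rules / conventions
    ["mypy", "."],                       # type checking
    ["pytest", "-q"],                    # tests
]


def main() -> int:
    for cmd in CHECKS:
        print("→", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Fail fast, exactly as a pre-commit hook would block the commit.
            return result.returncode
    return 0


if __name__ == "__main__":
    sys.exit(main())
```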

The team leans on GitHub Enterprise primarily for its Copilot issue assignment feature: their rule is that anyone who opens an issue must assign it to Copilot, which then opens a pull request. Roughly 25% of these “open issue → Copilot PR” results are mergeable as-is, rising to about 50% after a few review comments. Overall, for roughly ?k/month, the poster reiterates that they are getting the equivalent of 1.5 additional junior/mid-level engineers per engineer, with these “large language model engineers” consistently writing tests, following standards, producing good commit messages and working 24/7. However, they also report pain points: Copilot’s model choice cannot be controlled for issues or reviews, agents in worktrees are fragile, and verifying changes often requires spinning up Temporal, two Next.js apps, several Python workers, a Node worker, and a browser, which makes integration testing slow and difficult to automate.
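
The integration-testing complaint suggests one partial mitigation: an automated smoke check that confirms the whole local stack is up before any manual verification in a browser. The sketch below uses hypothetical ports and health endpoints; the post only names the services, not how each is exposed locally.

```python
"""Smoke check for the locally running stack described in the post.

Sketch only: URLs and ports are assumptions; the post lists Temporal,
two Next.js apps, Python workers, a Node worker and a browser, but not
how each service is exposed.
"""
import sys
import urllib.error
import urllib.request

SERVICES = {
    "next-app-a": "http://localhost:3000/api/health",  # hypothetical
    "next-app-b": "http://localhost:3001/api/health",  # hypothetical
    "node-worker": "http://localhost:4000/healthz",    # hypothetical
    "temporal-ui": "http://localhost:8233/",           # Temporal dev-server UI default
}


def is_up(url: str) -> bool:
    """Return True if the URL answers with a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError/HTTPError, timeouts, refused connections
        return False


if __name__ == "__main__":
    results = {name: is_up(url) for name, url in SERVICES.items()}
    for name, ok in results.items():
        print(f"{'OK  ' if ok else 'FAIL'} {name}: {SERVICES[name]}")
    sys.exit(0 if all(results.values()) else 1)
```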

Other commenters report a wide spectrum of experience and caution. Some developers find large language models highly effective for boilerplate, unit and integration test generation, one-off scripts, and refactoring in smaller or well-structured areas, treating tools such as Claude Code, Copilot or Cursor as a junior pair programmer and insisting on small, incremental changes with plans and tests first. Several teams describe elaborate guardrails: dockerized dev containers without production credentials, CONTRIBUTING.md or Claude.md files encoding rules, custom linting and test pipelines, feature or roadmap markdown files that act as persistent memory, and staged, stacked pull requests with multiple automated review agents. Others emphasize that context window limits, legacy code complexity and long-range architectural concerns still defeat current models, arguing that they cannot replace a human mental model of a large, messy codebase and that they tend to duplicate code, miss subtle concurrency bugs or fail on giant legacy files. At the far end, one open source maintainer states that their project has banned all large language model generated code after repeated experiments produced plausible but fundamentally wrong suggestions, reflecting ongoing skepticism about relying on these tools in critical, long-lived systems.
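
One guardrail mentioned above, dev containers that exclude production credentials, can be reinforced with a startup check. The commenters describe the practice but no implementation, so the following is a hypothetical sketch: it refuses to launch an agent sandbox if any environment variable name matches a pattern that looks production-related.

```python
"""Hypothetical guardrail: abort if the environment looks like it
contains production credentials.

The commenters only say their dockerized dev containers exclude
production credentials; the patterns below are illustrative guesses.
"""
import os
import re
import sys

# Variable-name patterns that suggest production secrets (assumed examples).
SUSPECT_PATTERNS = [
    re.compile(r"PROD", re.IGNORECASE),
    re.compile(r"^AWS_SECRET_ACCESS_KEY$"),
    re.compile(r"SERVICE_ACCOUNT|API_KEY|_TOKEN$"),
]


def suspicious_env_vars() -> list[str]:
    """Return environment variable names that match any suspect pattern."""
    return [
        name
        for name in os.environ
        if any(pattern.search(name) for pattern in SUSPECT_PATTERNS)
    ]


if __name__ == "__main__":
    found = suspicious_env_vars()
    if found:
        print("Refusing to start agent sandbox; suspicious variables:",
              ", ".join(sorted(found)))
        sys.exit(1)
    print("Environment looks clean; starting agent sandbox.")
```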

Impact Score: 55
