Developers debate large language model coding in complex production codebases

Hacker News users shared detailed experiences using large language models inside messy, established codebases, from fully agent-driven workflows to strict bans on generated code. The discussion highlights productivity gains, testing strategies, and persistent limits around context, integration testing, and trust.

The original poster describes a startup that has deeply integrated large language models into everyday development across a monorepo containing scheduled Python data workflows, two Next.js apps, Temporal workers, and a Node worker. Each engineer receives Cursor Pro with Bugbot, Gemini Pro, OpenAI Pro, and optionally Claude Pro; the poster estimates that large language models are worth about 1.5 excellent junior/mid-level engineers per engineer, which they argue easily justifies paying for multiple models. Heavy use of pre-commit hooks, type checkers, tests, and auto-formatting lets models focus on producing types and tests, while coding standards and conventions are encoded in .cursor/rules and AGENT.md-style files to steer agents away from raw SQL and toward specific schema files.
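As a concrete illustration of the kind of pre-commit pipeline described (this is a hypothetical sketch, not the poster's actual configuration; hook versions and choices of linter, formatter, and type checker are assumptions), a Python-side setup might look like:

```yaml
# Hypothetical .pre-commit-config.yaml — illustrative only.
# Runs lint, auto-format, and static type checks before every commit,
# so agent-generated code is forced through the same gates as human code.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff            # lint Python, auto-fixing trivial issues
        args: [--fix]
      - id: ruff-format     # auto-format Python
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.2
    hooks:
      - id: mypy            # static type checking
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
```

The point of such a setup, as the poster describes it, is that mechanical quality checks are automated away, leaving models (and reviewers) to concentrate on types, tests, and logic.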

The team leans on GitHub Enterprise primarily for its Copilot issue-assignment feature: the rule is that whoever opens an issue must assign it to Copilot, which then opens a pull request. Roughly 25% of these “open issue → Copilot PR” results are mergeable as-is, rising to about 50% after a few review comments. The poster argues that for roughly ?k/month this delivers the claimed 1.5 extra junior/mid-level engineers per engineer, with these “large language model engineers” consistently writing tests, following standards, producing good commit messages, and working 24/7. They also report pain points: Copilot’s model choice cannot be controlled for issues or reviews, agents in worktrees are fragile, and verifying changes often requires spinning up Temporal, two Next.js apps, several Python workers, a Node worker, and a browser, which makes integration testing slow and hard to automate.

Other commenters report a wide spectrum of experience and caution. Some find large language models highly effective for boilerplate, unit and integration test generation, one-off scripts, and refactoring in smaller or well-structured areas, treating tools such as Claude Code, Copilot, or Cursor as a junior pair programmer and insisting on small, incremental changes with plans and tests written first. Several teams describe elaborate guardrails: dockerized dev containers without production credentials, CONTRIBUTING.md or CLAUDE.md files encoding rules, custom linting and test pipelines, feature or roadmap markdown files that act as persistent memory, and staged, stacked pull requests reviewed by multiple automated agents. Others emphasize that context-window limits, legacy-code complexity, and long-range architectural concerns still defeat current models, arguing that they cannot replace a human mental model of a large, messy codebase and tend to duplicate code, miss subtle concurrency bugs, or fail on giant legacy files. At the far end, one open source maintainer says their project has banned all large-language-model-generated code after repeated experiments produced plausible but fundamentally wrong suggestions, reflecting ongoing skepticism about relying on these tools in critical, long-lived systems.
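One guardrail mentioned above, a dockerized dev container without production credentials, could be sketched roughly as follows (service name, base image, and env file are assumptions for illustration, not details from the thread):

```yaml
# Hypothetical docker-compose.yml for an agent sandbox — illustrative only.
# The container sees the repo checkout and a dev-only env file, but never
# the host's real credential stores (~/.aws, ~/.ssh, production .env files).
services:
  agent-sandbox:
    image: node:20-bookworm        # illustrative base image
    working_dir: /workspace
    volumes:
      - ./:/workspace              # mount only the repository
    env_file:
      - dev-only.env               # sandbox keys only; no prod secrets
    command: bash
    stdin_open: true
    tty: true
```

The design intent is simple: even if an agent misbehaves, the blast radius is limited to a disposable container holding nothing it could leak or destroy in production.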

Impact Score: 55

MEPs back delay for parts of Artificial Intelligence Act

European Parliament committees have endorsed targeted delays to parts of the Artificial Intelligence Act while adding a proposed ban on certain non-consensual image manipulation tools. The changes aim to give companies clearer deadlines, reduce overlap with other EU rules, and extend support to small mid-cap enterprises.

Publisher alliance seeks leverage over Artificial Intelligence web access

A new publisher coalition is trying to reshape how Artificial Intelligence companies access journalism by combining collective bargaining with tougher technical controls. The effort reflects growing pressure on Artificial Intelligence firms to pay for content used in training, search, and user-facing responses.

Military advantage in the age of algorithmic diffusion

American leadership in Artificial Intelligence research and infrastructure may not translate into lasting military advantage. Rapid diffusion of algorithms is shifting the contest toward compute, talent, and the speed of military adoption.

Artificial Intelligence adoption rises among small businesses

Small businesses are increasingly using Artificial Intelligence and reporting strong gains in efficiency, productivity, and expected revenue. Many still face practical barriers and want more training, resources, and policy support to move from experimentation to full implementation.

Corporate legal teams in 2026

In-house legal teams are being pushed beyond traditional advisory roles into strategic business functions spanning contracts, compliance, governance, and risk. Artificial Intelligence is increasingly central to that shift, especially in high-volume workflows such as contract review, due diligence, and regulatory monitoring.
