In the mid-20th century, B.F. Skinner ran a now-famous secret project that seems almost absurd today: he tried to train birds to steer missiles. The work, called "Project Pigeon," began after experiments with crows proved difficult and ordinary pigeons from a feed shop turned out to be unusually tractable learners. Skinner rewarded pecks at targets on photographs and concluded that associative, trial-and-error learning could reliably shape behavior. He called the method operant conditioning, and he treated the pigeon as a practical instrument for studying the mechanics of learning.
Those behavioral experiments quietly seeded ideas that computer scientists later adapted. Richard Sutton and Andrew Barto recast instrumental learning as reinforcement learning, a formal framework in which an agent is given incentives to explore and to remember which actions lead to reward. Their work, synthesized in the book "Reinforcement Learning: An Introduction," underpins systems that improved at games, control tasks, and more as computing power grew. AlphaGo Zero, trained by self-play with a simple reward scheme, is a dramatic example: it discovered deep strategies through millions of trial-and-error games rather than by imitating human play.
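To make the mechanics concrete, here is a minimal sketch of the kind of incremental, reward-driven value learning that opens Sutton and Barto's book (the simplest bandit case, not the full algorithm behind any particular system). The toy "peck one of two keys" environment and its payoff probabilities are hypothetical, chosen only to echo the operant-conditioning setup.

```python
import random

# Hypothetical toy environment for illustration: one situation, two keys to peck.
# Key 1 pays off most of the time; key 0 rarely does.
REWARD_PROB = {0: 0.2, 1: 0.8}

def step(action):
    """Return a reward of 1 with the chosen key's payoff probability, else 0."""
    return 1.0 if random.random() < REWARD_PROB[action] else 0.0

# One estimated value per action, learned purely from trial and error.
value = {0: 0.0, 1: 0.0}
alpha = 0.1      # learning rate: how fast new experience overwrites old estimates
epsilon = 0.1    # exploration rate: how often to try a key at random

for trial in range(5000):
    # Explore occasionally; otherwise exploit the current best estimate.
    if random.random() < epsilon:
        action = random.choice([0, 1])
    else:
        action = max(value, key=value.get)
    reward = step(action)
    # Incremental update: nudge the estimate toward the reward just observed.
    value[action] += alpha * (reward - value[action])

print(value)  # the estimate for key 1 should settle near 0.8, key 0 near 0.2
```

Nothing in this loop resembles deliberation: the agent simply repeats what has been rewarded, which is the point of the analogy to Skinner's birds.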
Today, reinforcement methods are layered into many products, from game agents to large language models fine-tuned via reinforcement learning from human feedback. Some companies describe these systems as "reasoning" models; critics, and pioneers such as Sutton himself, push back, arguing that what these models do is associative search and memory, not humanlike reasoning. The distinction matters: anthropomorphizing model behavior misleads users and researchers about what the systems represent and what they can feel. A pigeon learns by association and can suffer; a chatbot does not.
Recent biological research has, in turn, looped back toward the machines. Comparative psychologists such as Ed Wasserman and biologists such as Johan Lind argue that associative learning can produce far more complex behavior in animals than it has been credited with. Experiments show pigeons discriminating medical scans and other complex visual categories, sometimes matching or exceeding novice human performance. Those findings challenge the neat separation between "simple" learning and "cognitive" abilities and invite a reassessment of animal intelligence in light of machine results.
The article thus makes a double claim. Historically, pigeons and behaviorist experiments helped inspire a dominant paradigm in machine learning. Conceptually, modern successes force scientists to reconsider associative learning as a powerful engine of intelligence across species. The pigeon is both a literal participant in the labs that birthed reinforcement learning and a useful metaphor for how many of our most capable systems actually learn: slowly, iteratively, and by reward.