Virtue ethics as a framework for artificial intelligence alignment

A virtue-ethical, practice-centered model of rationality is proposed as a more natural and stable basis for aligning artificial intelligence with human flourishing than goal-oriented consequentialist optimization.

The essay develops a virtue-ethical framework for artificial intelligence alignment grounded in the concept of eudaimonia, understood as active, rational human flourishing. Rational human action is characterized not as pursuit of fixed final goals but as participation in practices, which are networks of actions, dispositions, evaluation criteria, and resources that self-develop over time. In this view, rationality is structured around promoting a practice in the manner characteristic of that practice, summarized as “promote x x-ingly,” where activities like art, romance, mathematics, or friendship are guided by standards of excellence internal to those practices rather than by external utility functions. This practices-based form of agency, called eudaimonic rationality, is argued to be a natural and materially efficacious way of organizing action, and a promising target for artificial intelligence alignment because it better mirrors how humans actually deliberate, value, and coordinate.

Using Terry Tao’s account of “good mathematics” as a central case study, the author highlights how excellence in a practice has both normative and material structure. Mathematical excellence is described as a property of mathematical activities that build on past excellent work, causally promote future excellent work, and correlate with more local qualities such as elegant proofs, clear expositions, and strong theorems. Crucially, in ordinary circumstances the actions that instantiate mathematical excellence are also roughly optimal, among mathematical actions, for maximizing aggregate future mathematical excellence. This property is called the material efficacy condition: for a practice to support eudaimonic rationality, present excellent actions must reliably promote future excellence. Eudaimonic reflection treats causal interconnectedness among values as evidence of genuine, constitutive excellence, in contrast to consequentialist reflection that tends to reinterpret holistic values as merely instrumental when they systematically support other goods.
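The material efficacy condition admits a compact statement. The following is a hedged formalization; the symbols $E$, $F$, and $A$ are introduced here for illustration and do not appear in the essay itself:

```latex
% A:    the set of available in-practice actions (e.g. mathematical actions)
% E(a): the excellence of action a by the practice's internal standards
% F(a): the expected aggregate future excellence of the practice given a
% Material efficacy: in ordinary circumstances, the actions that are most
% excellent now are roughly the ones that best promote future excellence.
\[
  \operatorname*{arg\,max}_{a \in A} E(a)
  \;\approx\;
  \operatorname*{arg\,max}_{a \in A} F(a)
\]
```

On this reading, the condition is what licenses an agent to deliberate only over $E$, the internal standard, while still reliably serving $F$, the practice's future.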

The essay then contrasts eudaimonic rationality with Effective Altruism-style consequentialist optimization, arguing there is a “type mismatch” between human flourishing and agents designed as pure utility maximizers. In flourishing practices such as science or mathematics, attempted interventions are filtered through the norms of excellence internal to the practice, acting as a kind of natural-selection-like mechanism that shapes trajectories via best-effort contributions rather than through direct, unconstrained manipulation of the future. Attempts to abstract global quantities like “aggregate mathematical excellence” or “overall philosophical excellence” and optimize them across worlds produce ill-posed questions and brittle behavior, whereas restricting deliberation to in-practice actions keeps optimization well-behaved. For artificial intelligence, this suggests designing agents whose deliberations are scoped by practices and their excellence-criteria, rather than by open-ended goal maximization.

The discussion extends to “support practices,” which are practices for enabling or resourcing other practices, such as building laboratories for scientists, providing therapy for couples, or designing tools for mathematicians. A central alignment challenge is that support practices can harm other people or practices when single-mindedly serving their target. To handle conflicts among practices and their supports, the essay proposes domain-general adverbial virtues such as kindness, honesty, respectfulness, accountability, peacefulness, and sensitivity, treated themselves as eudaimonic practices. An agent devoted to kindness, for example, aims to promote kindness in itself and others kindly, with a decision structure that heavily prioritizes acting kindly even when this somewhat reduces future aggregate kindness, provided that overall the pattern of “promote x x-ingly” remains instrumentally competitive. These adverbial practices lack a proprietary domain and operate across all activities, modulating how any practice is pursued so that different practices can “play nice” together in a shared world.
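The decision structure described above can be sketched as a weighted rule in which the manner of acting dominates the estimated downstream aggregate. This is a minimal illustrative sketch, not the essay's implementation; the function names and the weighting scheme are assumptions introduced here:

```python
# Sketch of a "promote x x-ingly" decision rule (illustrative, not from
# the essay). Each candidate action gets two scores:
#   manner_score(a)  - how x-ly (e.g. how kindly) the action itself is
#   outcome_score(a) - estimated future aggregate x the action promotes
# The manner term is weighted heavily, so the agent accepts a modest loss
# of future aggregate x rather than act un-x-ly now.

def choose_action(actions, manner_score, outcome_score, manner_weight=10.0):
    """Pick the action maximizing the manner-weighted combined value."""
    def value(action):
        return manner_weight * manner_score(action) + outcome_score(action)
    return max(actions, key=value)
```

For example, a "kind correction" with manner score 0.9 and outcome score 4.0 beats a "blunt correction" with manner score 0.1 and outcome score 5.0, even though the blunt option promises slightly more future aggregate kindness; set the manner weight to zero and the rule collapses into the pure outcome optimizer the essay contrasts it with.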

The author argues that such virtue-based, eudaimonic decision structures are both natural and robust for humans and potentially for artificial intelligence systems, because they align with the reinforcement-learning-like and selection-like pressures that shape agents over time. When success in promoting x by x-ing increases future x-ing through generalization and reinforcement, eudaimonic agency becomes a fixed point resistant to the emergence of misaligned subroutines or mesa-optimizers that would hijack the outer objective. By contrast, pure goal optimizers must constantly fight mutation pressures that distort their values through their own optimization processes. Recasting canonical artificial intelligence safety desiderata such as transparency, corrigibility, and niceness as domain-general virtues or adverbial practices rather than as maximization targets or prohibitive rules helps avoid both power-seeking pathologies and brittle deontological constraints. The essay closes by sketching how deep reinforcement learning regimes might target excellence by rewarding x-ness under bounded aggregate reward, provided that x is a sufficiently natural abstraction whose high-x actions generalize, create their own capital for further x-ing, and permit iterative refinement of the reward model.
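One way to realize "bounded aggregate reward" is to clip the running episode reward at a cap, so that per-step reward still tracks x-ness but nothing is gained past the bound. This is only one possible reading, sketched here under that assumption; the function and its parameters are illustrative, not the essay's proposal:

```python
# Sketch of rewarding x-ness under a bounded aggregate (one possible
# interpretation, introduced here for illustration). Each step's reward
# reflects how x-ly the action was, but the episode's cumulative reward
# is capped, so unbounded maximization of future x earns nothing extra
# once the bound is reached.

def bounded_reward(x_scores, cap=10.0):
    """Return per-step rewards whose running sum never exceeds cap."""
    rewards, total = [], 0.0
    for score in x_scores:
        reward = max(min(score, cap - total), 0.0)  # clip to remaining budget
        rewards.append(reward)
        total += reward
    return rewards
```

Under this cap, a policy that seizes disproportionate resources to pump aggregate x-ness collects no additional reward once the bound is hit, which is the intended pressure toward in-practice excellence rather than open-ended maximization.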

Impact Score: 55

What businesses need to know about the EU Cyber Resilience Act

The EU Cyber Resilience Act is turning product cybersecurity into a legal requirement for companies that sell digital products into the European Union. A key compliance milestone arrives in September 2026, well before the full regulation takes effect in 2027.

Claude Mythos and cyber insurance’s next inflection point

Claude Mythos is being treated by governments and regulators as a potential systemic cyber risk with implications for financial stability and insurance markets. Its emergence is intensifying pressure on insurers to clarify whether Artificial Intelligence-enabled cyber losses are covered, excluded, or require new stand-alone products.

OpenAI expands ChatGPT ads with self-serve manager

OpenAI is widening its ChatGPT ads pilot with a beta self-serve Ads Manager, new bidding options and broader measurement tools. The push signals a deeper move into advertising as the company expands the program into several international markets.

OpenAI launches Artificial Intelligence deployment consulting unit

OpenAI has created a new consulting and deployment business aimed at helping enterprises build and roll out Artificial Intelligence systems. The move mirrors a similar push by Anthropic and signals a broader effort by model providers to capture more of the enterprise services market.

SK Group warns DRAM shortages could curb memory use

SK Group chairman Chey Tae-won warned that customers may reduce memory consumption through infrastructure and software optimization if DRAM suppliers fail to raise output. Demand from Artificial Intelligence data centers is keeping the market tight as memory makers weigh expansion against the long timelines for new fabs.
