Problems
- Current AI alignment approaches are largely preferentist: they center on satisfying human preferences, and this framing fails to capture the complexity of human values and decision-making.
- Rational choice and expected utility theories assume that preferences can be fully quantified, but they do not account for incommensurable values (values that cannot be meaningfully compared or measured on a common scale) or for more complex decision-making processes (see the expected-utility note after this list).
- Aggregating preferences from many individuals is hard both computationally and politically: conflicting preferences often cannot be combined into a coherent whole at all, as the Condorcet-cycle sketch after this list illustrates.
- Different kinds of preferences are often conflated: self-regarding preferences (individual interests), all-things-considered preferences (which fold in moral and social values), and elicited preferences (what alignment pipelines actually measure). Running these together makes aggregation even more complicated.
- Majority-based aggregation can entrench harmful biases: when minority concerns, for instance around discrimination, are under-represented, the outcome amounts to a form of epistemic injustice.
- The vision of a single AI system optimizing humanity’s collective preferences is unrealistic: it is computationally intractable, politically impracticable, and risks concentrating power in a few hands.
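For concreteness, the formal assumption under attack in the second point is the textbook expected-utility formulation (a standard gloss, not notation from the paper): an agent ranks a lottery $L$ over outcomes $x_1, \dots, x_n$ with probabilities $p_1, \dots, p_n$ by

$$U(L) = \sum_{i=1}^{n} p_i \, u(x_i),$$

which presupposes a single complete utility function $u$, i.e., any two outcomes can be compared by it. Incommensurable values deny exactly this completeness, so for them "maximize $U$" is not even well-defined.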
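A minimal, runnable illustration of the aggregation point is the classic Condorcet cycle from social choice theory (a standard textbook example, not code from the paper): three voters with perfectly transitive individual rankings yield an intransitive majority ranking.

```python
# Condorcet cycle: each voter's ranking is transitive, but pairwise
# majority voting over the group produces a cycle.
rankings = [
    ["A", "B", "C"],  # voter 1: A > B > C
    ["B", "C", "A"],  # voter 2: B > C > A
    ["C", "A", "B"],  # voter 3: C > A > B
]

def majority_prefers(x, y, rankings):
    """True if a strict majority ranks x above y."""
    wins = sum(r.index(x) < r.index(y) for r in rankings)
    return wins > len(rankings) / 2

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    winner, loser = (x, y) if majority_prefers(x, y, rankings) else (y, x)
    print(f"majority prefers {winner} over {loser}")

# Output:
#   majority prefers A over B
#   majority prefers B over C
#   majority prefers C over A   <- no coherent collective ranking exists
```

Arrow's impossibility theorem generalizes the point: no aggregation rule over three or more options can satisfy a small set of minimal fairness conditions simultaneously. And even where majority aggregation is coherent, it can simply outvote strongly held minority concerns, which is the epistemic-injustice worry in the bullet above.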
Solutions Suggested by the Authors
- Norm-based AI alignment: rather than aligning AI systems with individual preferences, align them with collectively negotiated norms and principles (social, legal, and moral) that respect the complexity of human values.
- Adopt a contractualist model of alignment, in which AI systems follow norms that the affected stakeholders could reasonably agree to. This respects individual differences and is more feasible both politically and practically.
- Embrace a pluralistic approach in which multiple AI systems are aligned with different norms tailored to specific contexts and roles, avoiding one-size-fits-all preference aggregation.
- Attend to the incentives of the different actors involved (developers, deployers, other stakeholders), and align AI in ways that promote cooperation and minimize conflict.
- Focus on aligning AI with high-level norms rather than trying to optimize preferences. This is more practical: it reduces computational complexity and makes political agreement more achievable (a toy contrast is sketched after this list).
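To make that contrast concrete, here is a toy sketch of the structural difference between optimizing an aggregated preference score and treating negotiated norms as hard constraints on the action space. It is an illustration under assumed names (the actions, scores, and privacy norm are all hypothetical), not an implementation from the paper.

```python
def preference_optimizer(actions, score):
    """Preferentist agent: pick whatever maximizes an aggregated preference score."""
    return max(actions, key=score)

def norm_constrained_chooser(actions, score, norms):
    """Norm-based agent: first drop actions that violate any collectively
    negotiated norm, then optimize only over what remains."""
    permitted = [a for a in actions if all(norm(a) for norm in norms)]
    if not permitted:
        raise RuntimeError("no norm-compliant action; defer to human oversight")
    return max(permitted, key=score)

# Toy scenario: the highest-scoring action violates a negotiated privacy norm.
actions = ["share_user_data", "show_generic_ad", "do_nothing"]
score = {"share_user_data": 10.0, "show_generic_ad": 4.0, "do_nothing": 0.0}.get
norms = [lambda a: a != "share_user_data"]

print(preference_optimizer(actions, score))             # -> share_user_data
print(norm_constrained_chooser(actions, score, norms))  # -> show_generic_ad
```

The structural point is that norms act as side constraints agreed on in advance, so the system never needs a globally coherent preference ordering over everything it might do.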
To summarize, the limitations of preferentist approaches to AI alignment can be addressed by a shift toward norm-based, contractualist, and pluralistic frameworks that better account for the diversity of human values and for political realities.
For more details, see the original paper: Beyond Preferences in AI Alignment (arXiv:2408.16984)