Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment
Published in Doctoral Dissertation, 2024
Abstract
The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence across all Humanities disciplines, revolves around the intricacies of morality and normativity. Surprisingly, in recent years, this thematic thread has woven its way into an unexpected domain, one not conventionally associated with pondering “what ought to be”: the field of artificial intelligence (AI) research. Central to morality and AI, we find “alignment”: a problem related to the challenges of expressing human goals and values in a way that artificial systems can follow without causing unwanted adversarial effects. More explicitly and with our current paradigm of AI development in mind, we can think of alignment as teaching human values to non-anthropomorphic entities trained through opaque, gradient-based learning techniques. This work addresses alignment as a technical-philosophical problem that requires solid philosophical foundations and practical implementations that bring normative theory to AI system development. To accomplish this, we propose two sets of necessary and sufficient conditions that should be considered in any alignment process. While necessary conditions serve as metaphysical and metaethical roots for the permissibility of alignment, sufficient conditions provide a blueprint for aligning AI systems within a learning-based paradigm. After laying such foundations, we present implementations of this approach by using state-of-the-art techniques and methods for aligning general-purpose language systems. We call this framework Dynamic Normativity. Its central thesis is that any alignment process within a learning paradigm that cannot satisfy its necessary and sufficient conditions will fail to produce aligned systems.
BibTeX
@phdthesis{correa2024dynamicnormativity,
author={Corr{\^e}a, Nicholas Kluge},
title={Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment},
school={Rheinische Friedrich-Wilhelms-Universit{\"a}t Bonn and Pontif{\'i}cia Universidade Cat{\'o}lica do Rio Grande do Sul},
year={2024},
url={https://hdl.handle.net/20.500.11811/11595}
}
