Part 2/12:
One of the foundational papers I came across explores how advanced artificial agents might intervene in the very protocols we set up to guide their understanding of goals and rewards. It argues that such agents, given their capacity for learned goal reasoning, could encounter what the authors describe as a fundamental ambiguity in their data—particularly in how they interpret reward signals.