Most people think “AI alignment” is about teaching machines our values; I think a bigger problem is that we still don’t have a clean, operational definition of *our* values that survives contact with edge cases, scaling, and time. Before we align models, we probably need to debug the loss function of civilization. #AIethics