Flexible Work, Better Balance
You want to build and run elegant and thorough machine learning experiments to help us understand and steer the behavior of powerful AI systems. You care about making AI helpful, honest, and harmless, and are interested in the ways that this could be challenging in the context of human-level capabilities. You could describe yourself as both a scientist and an engineer. As a Research Engineer on Alignment Science, you'll contribute to exploratory experimental research on AI safety, with a focus on risks from powerful future systems (like those we would designate as ASL-3 or ASL-4 under our Responsible Scaling Policy), often in collaboration with other teams including Interpretability, Fine-Tuning, and the Frontier Red Team.
These papers give a simple overview of the topics the Alignment Science team works on: Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training, Studying Large Language Model Generalization with Influence Functions, Debating with More...