Application deadline: We are actively conducting interviews and aim to fill this role as soon as we find a suitable candidate.
ABOUT THE OPPORTUNITY

We develop and run evaluations that help assess the risks posed by scheming AIs. You will work with frontier labs like OpenAI, Anthropic, and Google DeepMind and be amongst the first to interact with new models before public release. The ideal candidate loves rigorously testing frontier AI models and enjoys building efficient, automated pipelines.
YOU WILL HAVE THE OPPORTUNITY TO

We want to emphasize that people who feel they don't fulfill all of these characteristics, but nonetheless think they would be a good fit for the position, are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine. We don't require a formal background or industry experience and welcome self-taught candidates.
BENEFITS

The rapid rise in AI capabilities offers tremendous opportunities but also presents significant risks. At Apollo Research, we're primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than, e.g., humans misusing the AI. We're particularly concerned with deceptive alignment / scheming, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight.
We work on the detection of scheming (e.g. building evaluations and novel evaluation techniques), the science of scheming (e.g. model organisms and the study of scaling trends), and scheming mitigations (e.g. control). We work closely with multiple frontier AI companies, e.g. to test their models before deployment and to collaborate on fundamental research.
At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful. If you're interested in more details about what it's like working at Apollo, you can find more information here.
ABOUT THE TEAM

The current evals team consists of Jérémy Scheurer, Alex Meinke, Bronson Schoen, Felix Höfstäter, Axel Højmark, Teun van der Weij, Alex Lloyd and Mia Hopman. Alex Meinke coordinates the research agenda with guidance from Marius Hobbhahn, though team members lead individual projects. You will mostly work with the evals team as well as our team of software engineers, but you will likely sometimes interact with the governance team to translate technical knowledge into concrete recommendations. You can find our full team here.
Equality Statement

Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.
How to apply

Please complete the application form with your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.
About the interview process

Our multi-stage process includes a screening interview, a take-home test (approx. 2.5 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews are closely related to tasks the candidate would do on the job. There are no LeetCode-style general coding interviews. If you want to prepare for the interviews, we suggest working on hands-on LLM evals projects (e.g. as suggested in our starter guide), such as building LM agent evaluations in Inspect.
Your Privacy and Fairness in Our Recruitment Process

We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process. To enhance hiring efficiency, we use AI-powered tools to assist with tasks such as resume screening. These tools are designed and deployed in compliance with internationally recognized AI governance frameworks. Your personal data is handled securely and transparently. We adopt a human-centred approach: all resumes are screened by a human, and final hiring decisions are made by our team. If you have questions about how your data is processed or wish to report concerns about fairness, please contact us at .