What does this job signal mean?

Anthropic opened a posting for the Anthropic Fellows Program, AI Safety (London, UK; Ontario, CAN; Remote-Friendly, United States; San Francisco, CA). This hiring signal is evidence of demand: it shows which teams, locations, and technical bets are being staffed. High-signal details: locations London, UK; Ontario, CAN; Remote-Friendly, United States; San Francisco, CA · routine job posting at a notable AI lab. onlylabs links this event to 1 captured evidence page and 6 related job signals. It also maps to Safety and policy in the data-business radar.

Anthropic Job: Anthropic Fellows Program, AI Safety

Captured source

source ↗

job-boards.greenhouse.io/anthropic/jobs

published May 20, 2026 · seen Jun 5 · captured Jun 9 · http 200 · method plain

Job Application for Anthropic Fellows Program, AI Safety at Anthropic

Anthropic Fellows Program, AI Safety London, UK; Ontario, CAN; Remote-Friendly, United States; San Francisco, CA

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Apply using this link. We are accepting applications on a rolling basis for the next cohort of Anthropic Fellows, which is expected to start in late September. In some circumstances, we can accommodate fellows starting outside the usual cohort timelines; please note in your application if the September start date doesn't work for you.

This page is specific to one of the Anthropic Fellows workstreams; see also the main Anthropic Fellows posting.

Anthropic Fellows Program overview

The Anthropic Fellows Program is designed to foster AI research and engineering talent. We provide funding and mentorship to promising technical talent, regardless of previous experience.

Fellows will primarily use external infrastructure (e.g. open-source models, public APIs) to work on an empirical project aligned with our research priorities, with the goal of producing a public output (e.g. a paper submission). In one of our earlier cohorts, over 80% of fellows produced papers.

We run multiple cohorts of Fellows each year and review applications on a rolling basis. This application is for cohorts starting in July 2026 and beyond.

What to expect

4 months of full-time research

Direct mentorship from Anthropic researchers

Access to a shared workspace (in either Berkeley, California or London, UK)

Connection to the broader AI safety and security research community

Weekly stipend of 3,850 USD / 2,310 GBP / 4,300 CAD + benefits (these vary by country)

Funding for compute (~$15k/month) and other research expenses

Interview process

The interview process will include an initial application & reference check, technical assessments & interviews, and a research discussion.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation

The expected base stipend for this role is 3,850 USD / 2,310 GBP / 4,300 CAD per week, with an expectation of 40 hours per week for 4 months (with possible extension).
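As a rough illustration of what these figures imply, the sketch below derives an hourly rate and an approximate program total from the weekly stipend. The weekly amounts and the 40-hour week come from the posting; the 17-week length is an assumption standing in for "4 months", and the function name `stipend_summary` is hypothetical. Benefits, taxes, and any extension are not modeled.

```python
# Weekly stipend per currency and hours per week, as stated in the posting.
WEEKLY_STIPEND = {"USD": 3850, "GBP": 2310, "CAD": 4300}
HOURS_PER_WEEK = 40

# Assumption: "4 months" is approximated as 17 weeks; the posting gives no week count.
ASSUMED_WEEKS = 17


def stipend_summary(weekly, hours_per_week=HOURS_PER_WEEK, weeks=ASSUMED_WEEKS):
    """Return the implied hourly rate and approximate total per currency."""
    return {
        currency: {
            "hourly": amount / hours_per_week,
            "total": amount * weeks,
        }
        for currency, amount in weekly.items()
    }


summary = stipend_summary(WEEKLY_STIPEND)
for currency, values in summary.items():
    print(
        f"{currency}: {values['hourly']:.2f} per hour, "
        f"about {values['total']:,} over {ASSUMED_WEEKS} weeks"
    )
```

Changing `ASSUMED_WEEKS` shows how sensitive the total is to how "4 months" is counted; the hourly figure depends only on the numbers the posting states.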

Fellows workstreams

Due to the success of the Anthropic Fellows for AI Safety Research program, we are now expanding it across teams at Anthropic. We expect there to be significant overlap in the types of skills and responsibilities across the roles and will by default consider candidates for all the workstreams.

Some of the workstreams may include unique assessment steps; we therefore ask you for workstream preferences in the application. You can see an overview of the current workstreams below:

AI Safety Fellows

AI Security Fellows

ML Systems & Performance Fellows

Reinforcement Learning Fellows

Economics & Societal Impacts Fellows

Across the workstreams, you may be a good fit if you:

Are motivated by making sure AI is safe and beneficial for society as a whole

Are excited to transition into empirical AI research and would be interested in a full-time role at Anthropic

Have a strong technical background in computer science, mathematics, or physics

Thrive in fast-paced, collaborative environments

Can implement ideas quickly and communicate clearly

Strong candidates may also have:

Strong background in a discipline relevant to a specific Fellows workstream (e.g. economics, social sciences, or cybersecurity)

Experience in areas of research or engineering related to their workstream

Candidates must be:

Fluent in Python programming

Available to work full-time on the Fellows program

AI Safety Fellows

Mentors, research areas, & past projects

Fellows will undergo a project selection & mentor matching process. Potential mentors include:

Sam Bowman

Sara Price

Alex Tamkin

Nina Panickssery

Trenton Bricken

Logan Graham

Jascha Sohl-Dickstein

Joe Benton

Collin Burns

Fabien Roger

Samuel Marks

Kyle Fish

Ethan Perez

Our mentors will lead projects in select AI safety research areas, such as:

Scalable Oversight: Developing techniques to keep highly capable models helpful and honest, even as they surpass human-level intelligence in various domains.

Adversarial Robustness and AI Control: Creating methods to ensure advanced AI systems remain safe and harmless in unfamiliar or adversarial scenarios.

Model Organisms: Creating model organisms of misalignment to improve our empirical understanding of how alignment failures might arise.

Model Internals / Mechanistic Interpretability: Advancing our understanding of the internal workings of large language models to enable more targeted interventions and safety measures.

AI Welfare: Improving our understanding of potential AI welfare and developing related evaluations and mitigations.

On our Alignment Science and Frontier Red Team blogs, you can read about past projects, including:

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data: Alex Cloud and Minh Le, et al., with mentors including Samuel Marks and Owain Evans

Open-source circuits: Michael Hanna and Mateusz Piotrowski with mentorship from Emmanuel Ameisen and Jack Lindsey

For a full list of representative projects for each area, please see these blog posts: Introducing the Anthropic Fellows Program for AI Safety Research, and Recommendations for Technical AI Safety Research Directions.

Unique candidate criteria

You might be a particularly great fit for this workstream if you:

Are motivated by reducing...

Excerpt shown — open the source for the full document.

Notability

notability 3.0/10

Routine job posting at notable AI lab