Governance & Trust – Safety Specialist

We believe the foundation of AI safety is high-quality human data. Models can’t evaluate themselves — they need humans who can apply structured judgment to complex, nuanced outputs.

We’re building a flexible pod of Safety specialists: contributors from both technical and non-technical backgrounds who will serve as expert data annotators. This pod will annotate and evaluate AI model behaviors to help ensure those systems are safe.

No prior annotation experience is required — instead, we’re looking for people with the ability to make careful, consistent decisions in ambiguous situations.

This role may include reviewing AI outputs that touch on sensitive topics such as bias, misinformation, or harmful behaviors. All work is text-based, and participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.

What You’ll Do

Produce high-quality human data by annotating AI outputs against safety criteria (e.g., bias, misinformation, disallowed content, unsafe reasoning).

Apply harm taxonomies and guidelines consistently, even when tasks are ambiguous.

Document your reasoning to help improve guidelines.

Collaborate to provide the human data that powers AI safety research, model improvements, and risk audits.

Who You Are

You have a background in trust & safety, governance, or policy-to-product frameworks.

You’ve worked with harm taxonomies, safety-by-design principles, or regulatory frameworks (e.g., the EU AI Act, NIST AI RMF).

You’re skilled at translating abstract policies into concrete evaluation criteria.

You’re motivated by reducing user harm and ensuring systems are safe, ethical, and compliant.

Examples of past titles: Trust & Safety Analyst, Online Safety Specialist, Policy Researcher, Governance Specialist, UX Researcher, Risk & Policy Associate, Regulatory Affairs Analyst, Safety Policy Manager, Ethics & Compliance Coordinator.

What Success Looks Like

Your annotations are accurate, high-quality, and consistent, even across ambiguous cases.

You help surface risks early, including those that automated tools miss.

Guidelines and taxonomies improve based on your feedback.

The data you produce directly strengthens AI model safety and compliance.