


Global

Contract-based

Offered salary
$80 - $140 per hour

Closing date
Closing soon
Qualification
PhD / Postdoc or Professor
Hiring location
Global
Experience
Research-level physics expertise
Responsibilities
• Solve research-level physics challenges end-to-end with verifiable derivations, code, and peer-reviewed references
• Break problems into standalone checkpoint sub-problems requiring genuine physical reasoning
• Author Python answer templates with auto-grading functions for symbolic or numerical answers
• Audit submitted solutions for correctness, scope, and methodological soundness
• Adjudicate between parallel solver attempts to determine the golden reference solution
• Document reasoning, error tolerances, equivalent symbolic forms, and verification test cases
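As a rough illustration of what an auto-grading function with documented error tolerances and equivalent-form checks might look like, here is a minimal sketch. It is not the actual CritPt template format, which this posting does not describe. The function names `grade_numeric` and `grade_expression`, the tolerance defaults, and the sampling interval are all hypothetical. It uses only the Python standard library and checks equivalence by numerical sampling, a simpler stand-in for a symbolic comparison in SymPy.

```python
import math
import random


def grade_numeric(submitted, reference, rel_tol=1e-6, abs_tol=0.0):
    """Return True if a numerical answer matches the reference within tolerance.

    Hypothetical helper: the tolerance defaults are illustrative assumptions.
    """
    return math.isclose(submitted, reference, rel_tol=rel_tol, abs_tol=abs_tol)


def grade_expression(submitted, reference, n_points=20, lo=0.1, hi=2.0,
                     rel_tol=1e-9, seed=0):
    """Compare two single-variable callables at random sample points.

    Agreement at every sampled point is evidence of equivalence, not a proof.
    Hypothetical helper: the interval [lo, hi] and point count are assumptions.
    """
    rng = random.Random(seed)  # fixed seed keeps grading reproducible
    for _ in range(n_points):
        x = rng.uniform(lo, hi)
        if not math.isclose(submitted(x), reference(x),
                            rel_tol=rel_tol, abs_tol=1e-12):
            return False
    return True


# Verification test cases: one accepted and one rejected answer of each kind.
def reference_form(x):
    return math.sin(x) ** 2


def equivalent_form(x):
    # Same function written via the double-angle identity.
    return (1.0 - math.cos(2.0 * x)) / 2.0


def wrong_form(x):
    # Sign error: this is cos(x)**2, not sin(x)**2.
    return (1.0 + math.cos(2.0 * x)) / 2.0


numeric_ok = grade_numeric(1.0000001, 1.0)
numeric_bad = grade_numeric(1.01, 1.0)
symbolic_ok = grade_expression(equivalent_form, reference_form)
symbolic_bad = grade_expression(wrong_form, reference_form)
```

Pairing each grader with at least one accepted and one rejected case, as above, is one way to document the verification test cases alongside the tolerance that was chosen.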
Requirements
• PhD or postdoc in a relevant physics subfield; seniority requirements vary by solver, auditor, and adjudicator role
• Expertise in areas such as high energy physics, mathematical physics, condensed matter, AMO, cosmology, quantum information, optical materials, magnetic materials, or related subfields
• Hands-on familiarity with at least two canonical methods in the target subfield
• 3–5 representative publications, ideally from the last 5 years and in the target area
• Working proficiency with LaTeX, Python, Jupyter, and SymPy
• Strong written English
How to Apply
Click "Apply" to be taken to the Mercor website. Compensation varies based on role and demonstrated expertise, and work is performed asynchronously on a contractor basis. Applying through our link supports WFH Bulletin as a referral partner, but you are welcome to apply directly if you prefer.
Mercor
Physics Expert
Overview
Mercor is recruiting physics researchers to author and verify golden reference solutions for CritPt, a frontier research-level physics benchmark. Depending on your role, you will solve advanced physics problems, audit other experts’ solutions, or adjudicate between parallel solution attempts. The result is fully human-verified reference data for evaluating large language models. This is a remote contractor role with asynchronous work and an expected commitment of about 10 hours per week over an 8–10 week window.



