[Remote] Founding Machine Learning Engineer - Evaluation
Note: This is a remote position open to candidates in the USA. Established Search builds evaluation and evidence infrastructure for safety-critical AI systems, particularly in diagnostic medical imaging. The role involves investigating how AI systems behave in practice, designing evaluations, and generating the evidence needed for deployment and regulatory decisions.
Responsibilities
- Design and execute evaluations for medical imaging AI systems
- Investigate model failure modes, robustness, and generalization gaps
- Analyze behavior across populations, scanners, imaging protocols, and clinical settings
- Determine what evidence is sufficient for stakeholders making deployment or regulatory decisions
- Translate technical findings into actionable recommendations for customers and clinical stakeholders
- Build reusable evaluation pipelines, evidence schemas, and model assessment frameworks
- Work with messy, incomplete, and noisy real-world clinical data
- Help shape how evaluation investigations are conducted across the organization
Skills
- Strong experience in machine learning for medical imaging (radiology, pathology, cardiology imaging, or related domains)
- Experience evaluating or validating real-world ML systems, not just training models
- Deep understanding of model robustness, distribution shift, uncertainty, failure analysis, and real-world deployment behavior
- Strong Python skills across the full investigation workflow: data analysis, experimentation, evaluation, and reporting
- Experience working with noisy or imperfect clinical datasets
- Ability to communicate technical findings clearly to both technical and non-technical stakeholders
- High tolerance for ambiguity and open-ended investigative work
- Experience with FDA-regulated AI/ML systems or medical device submissions (510(k), De Novo, SaMD, etc.)
- Experience with medical imaging deployment evaluation or clinical validation
- Experience with interpretability, post-deployment monitoring, uncertainty estimation, or model auditing
- Experience designing reproducible evaluation frameworks or benchmarking systems
- Background in healthcare AI or other safety-critical ML domains
- Customer-facing or cross-functional technical leadership experience
- PhD or equivalent research depth in ML, medical imaging, computer vision, or related areas