[Remote] Lead, Machine Learning Ops Engineer
Note: This is a remote position open to candidates in the USA. The Planet Group is seeking a Lead Machine Learning Ops Engineer to drive their enterprise AI/ML platform strategy. This hands-on leadership role involves guiding a team and ensuring AI systems are built and governed with reliability and efficiency, while also engaging in advisory and planning activities.
Responsibilities
- Drive the enterprise AI/ML platform strategy as a player/coach
- Lead and guide a team of Machine Learning Operations Engineers while shaping end-to-end execution
- Spend 70% of the time on advisory/planning work and 30% on hands-on keyboard and special-initiative project work
- Ensure AI systems are built, operated, and governed with reliability, scalability, security, compliance, and cost efficiency—all aligned to business goals
Skills
- Lead experience delivering MLOps platform strategy, execution, and roadmap guidance
- Advanced proficiency in Python
- Object-oriented architectural mastery across dynamically typed languages
- Experience integrating and governing multi-language systems, including Python and JavaScript/TypeScript (experience with enterprise platforms such as .NET is helpful)
- Leadership-level expertise in AI/ML platform engineering, including MLOps, LLMOps, and AIOps
- Ability to define and enforce enterprise standards for AI model lifecycle management, including monitoring and reliability, governance, and cost control
- Deep understanding of AI system observability, including drift detection, evaluation frameworks, and incident response
- Strong experience with cloud architecture, security, compliance, and enterprise-scale deployments
- Proven ability to guide technical decision-making and platform strategy
- 6+ years of relevant experience
- Experience in MLOps, DevOps, or related fields with a focus on enterprise-level solutions
- Supervisory experience highly preferred
- Experience specifically strengthening enterprise practices across generative AI and LLM-based systems
- Demonstrated ability to translate observability and operational learnings into continuous platform improvements