[Remote] Architect - Platform Engineer
Note: This is a remote position open to candidates in the USA. Quantiphi is an award-winning, AI-First global digital engineering company that helps leading Fortune 1000 organizations transform bold ideas into measurable business impact. The company is seeking a highly skilled Architect - Platform Engineer to design, optimize, and scale infrastructure for GenAI and LLM workloads, collaborating with cross-functional teams to bring cutting-edge AI solutions to life.
Responsibilities
- Design and implement scalable infrastructure for LLM and GenAI workloads across multi-GPU environments
- Perform GPU profiling, benchmarking, and performance optimization for distributed training workloads
- Manage and schedule compute-intensive jobs using Slurm-based clusters and OpenShift/Kubernetes environments
- Enable and optimize the NVIDIA GPU stack (CUDA, cuDNN, NCCL, Triton, RAPIDS, etc.)
- Collaborate with cross-functional teams to deploy models in research and production environments
- Build and support GenAI pipelines (fine-tuning, RAG, multi-modal inferencing, LLMOps)
- Develop reusable infrastructure templates using tools like Terraform and Helm
- Contribute to internal innovation (PoCs, workshops) and support client-facing delivery engagements
- Develop and deliver automation software for building and improving the functionality, reliability, availability, and manageability of applications and cloud platforms
- Champion and drive the adoption of Infrastructure as Code (IaC) practices and mindset
- Design, architect, and build self-service, self-healing, synthetic monitoring, and alerting platforms and tools
- Automate development and testing processes through CI/CD pipelines (Git, Jenkins, SonarQube, Artifactory, Docker containers)
- Build a container hosting platform using Kubernetes
- Introduce new cloud technologies, tools, and processes to keep innovating in the commerce area and drive greater business value
- Lead technical discussions with clients on architecture design and troubleshooting, and proactively provide solutions as required
Skills
- Strong experience with Slurm and distributed training environments
- Hands-on expertise with Red Hat OpenShift and/or Kubernetes
- Deep knowledge of the NVIDIA GPU ecosystem (CUDA, cuDNN, NCCL, Nsight, Triton/TensorRT)
- Strong foundation in Linux systems, performance tuning, and multi-GPU optimization
- Experience deploying GenAI workloads (LLM fine-tuning, RAG pipelines, multi-modal systems)
- Familiarity with Infrastructure-as-Code tools (Terraform, Ansible)
- Experience with cloud GPU environments (GCP, Azure, AWS, OCI) and/or on-prem GPU clusters
- Ability to mentor or guide senior resources and team leads
- Ability to lead technical discussions on architecture design
- Experience with NVIDIA NIMs, DGX systems, or GPU-accelerated containers
- Knowledge of LLMOps frameworks and MLOps integration
- Familiarity with vector databases and retrieval systems for RAG architectures
- Comfortable working in client-facing environments and collaborating with AI solution teams
- Experience working with FHIR R4, HL7 v2, or SMART on FHIR
- Experience integrating with EHR systems (e.g., Epic)
- Understanding of HIPAA compliance and healthcare data privacy
- Exposure to clinical workflows, CDS Hooks, or patient-facing applications
- Experience building clinical decision support systems or healthcare interoperability solutions