Deploy and optimize scalable, full-stack ML serving applications across multiple frameworks (PyTorch, TensorFlow, vLLM), using technologies such as FastAPI and NVIDIA Triton Inference Server. Build MLOps from the ground up, including data annotation, monitoring, and evaluation. Fine-tune, optimize, and deploy multi-modal models, integrating LLMs with computer vision systems. Integrate computer vision and other ML model results with the rest of the tech stack, including front-end web and mobile interfaces and back-end database services.

Requires 5+ years of professional software engineering experience, proficiency in Python, extensive experience with DevOps tools and practices, and experience with AWS or a comparable cloud platform.