Lead AI Infrastructure Engineer

Zyoin Group
Location 📍 India, India
Posted 📅 June 03, 2026
Work Type ⏰ Full-time

Position Overview

Inference Optimization

  • Drive time to first token (TTFT) below 400 ms for multi-step agent pipelines
  • Streaming optimization: deliver the first token to the user while sub-agents are still running
  • KV cache strategy, prompt compression, dynamic context window management
  • Multi-provider routing: model selection by latency, cost, and task type across OpenAI, Anthropic, Gemini, and open-weight models

Agent Architecture

  • Design and implement Plan-Execute-Synthesize pipelines that run sub-agents in parallel DAGs, not sequential chains
  • Build reliable orchestration on top of Temporal: retries, timeouts, partial failure recovery, idempotency
  • Structured output enforcement: JSON schema validation, retry loops on malformed LLM output, graceful degradation
  • Tool call design: schemas that LLMs follow reliably across providers

Evaluation & Harness

  • Own the eval framework en...

Job Details

Employment Type ⏰ Full-time
Category 📊 Computer Occupations
Work Arrangement 🏠 On-site
Location 📍 India, India