Proven experience designing and building agentic system architectures using Amazon Bedrock AgentCore and agent frameworks (e.g., LangChain, LangGraph, Strands Agents).
Strong expertise in orchestrating multi-step reasoning, tool invocation, state management, and workflow automation for AI agents.
Deep hands-on knowledge of training and deploying models with PyTorch and TensorFlow.
Experience defining model strategy, including architecture selection, fine-tuning approaches, inference patterns, and cost/performance trade-offs.
Containerization and orchestration skills with Docker and Kubernetes for scalable, fault-tolerant ML/GenAI deployments.
Solid understanding of networking for ML workloads, including VPC design, ingress/egress, private and internet-facing communication patterns, and low-latency design.