Flexible Work, Better Balance
CoreWeave is the essential cloud for AI, delivering technology, tools, and teams that enable innovators to build and scale AI with confidence. Founded in 2017 and publicly traded (Nasdaq: CRWV) in March 2025, CoreWeave is trusted by leading AI labs, startups, and global enterprises.
The Monolith Data Science team is building a layered reliability platform that shifts CoreWeave from reactive troubleshooting to proactive reliability engineering. The platform spans telemetry ingestion, feature engineering, anomaly detection, failure prediction, distributed straggler detection, and agentic root cause analysis. You will partner closely with Fleet, Infrastructure, and AI Platform teams to improve cluster reliability, increase effective utilization (MFU), reduce MTTR, and protect uptime and revenue.
As a Data Science Researcher, you will develop advanced statistical models and machine learning methodologies to optimize GP...