We're building a serverless LLM inference network that taps underutilized capacity in GPU data centers. Our product is a scheduler that runs LLM inference workloads on GPUs distributed around the world. The network currently comprises over 6,000 GPUs and is growing quickly.

We're looking for full-stack software engineers to join our team. Responsibilities include improving our core scheduling algorithms, optimizing the vLLM inference runtime, strengthening our logging and observability stack, and building our user-facing dashboard and APIs. We are a well-funded company with a clear path to profitability, and we offer significant equity.