Cumulus Labs Promises 12-Second GPU Cold Starts, and That Changes the Math on Serverless Inference
Cumulus Labs is a serverless GPU inference provider that claims 12.5-second cold starts and scale-to-zero pricing. It positions itself as faster than Modal and cheaper than RunPod for teams running AI models in production.