Tuna — Deploy and Serve LLMs on GPU Infrastructure
Tuna is a hybrid GPU inference orchestrator. It lets you deploy, serve, and manage LLMs (Llama, Qwen, Mistral, DeepSeek, Gemma, and any Hugging Face model) on serverless GPUs from **Modal, RunPod, Cerebrium, Google Cloud Run, Baseten, or Azure Container Apps**, with optional **spot-instance fallback on AWS** via SkyPilot. Every deployment exposes an **OpenAI-compatible `/v1/chat/completions` endpoint**.
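Because each deployment speaks the OpenAI chat-completions wire format, any standard HTTP client can talk to it. A minimal sketch, assuming a hypothetical deployment URL and model name (substitute your own; Tuna's actual auth scheme, if any, is not specified here):

```python
import json
import urllib.request

# Hypothetical deployment URL -- replace with your Tuna endpoint.
BASE_URL = "https://my-tuna-deployment.example.com"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, model: str = "llama-3") -> str:
    """POST to the OpenAI-compatible endpoint and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the text under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

The same shape works with the official `openai` Python client by pointing its `base_url` at the deployment, which is the usual benefit of exposing this interface.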
Resource Info
- Data source: bigquery-gharchive
- Category: development
- Created: 2026/3/9
- Updated: 2026/3/14