Tuna — Deploy and Serve LLM Models on GPU Infrastructure


Tuna is a hybrid GPU inference orchestrator. It lets you deploy, serve, and manage LLMs (Llama, Qwen, Mistral, DeepSeek, Gemma, and any HuggingFace model) on serverless GPUs from **Modal, RunPod, Cerebrium, Google Cloud Run, Baseten, or Azure Container Apps**, with optional **spot-instance fallback on AWS** via SkyPilot. Every deployment gets an **OpenAI-compatible `/v1/chat/completions` endpoint**.
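Because each deployment exposes an OpenAI-compatible `/v1/chat/completions` endpoint, any OpenAI-style HTTP client can talk to it. A minimal sketch of constructing such a request, assuming a hypothetical base URL and model ID (both placeholders, not values Tuna itself defines):

```python
import json

# Hypothetical deployment URL; Tuna assigns the real one per backend.
BASE_URL = "https://my-tuna-deployment.example.com"
ENDPOINT = f"{BASE_URL}/v1/chat/completions"

# Standard OpenAI chat-completions payload shape.
payload = {
    "model": "llama-3.1-8b-instruct",  # placeholder model ID
    "messages": [
        {"role": "user", "content": "Summarize what Tuna does in one sentence."}
    ],
    "max_tokens": 64,
}

# Serialize as the JSON body an HTTP POST to ENDPOINT would carry.
body = json.dumps(payload)
print(ENDPOINT)
```

With a live deployment, POSTing `body` to `ENDPOINT` (with an `Authorization: Bearer <key>` header if the backend requires one) returns a standard chat-completion response, so existing OpenAI SDKs work by overriding the base URL.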

Resource information

Data source
bigquery-gharchive
Category
development
Created
2026/3/9
Updated
2026/3/14
