LLM deployment
LLM-TPU project introduction
LLM-TPU contains porting and deployment examples for a range of open-source generative AI models, mainly LLMs, along with Stable Diffusion (AI image generation).
Project repository link: LLM-TPU.
Examples list
| Model | INT4 | INT8 | FP16/BF16 | Huggingface Link |
|---|---|---|---|---|
| Baichuan2-7B | ✔ | | | LINK |
| ChatGLM3-6B | ✔ | ✔ | ✔ | LINK |
| CodeFuse-7B | ✔ | ✔ | | LINK |
| DeepSeek-6.7B | ✔ | ✔ | | LINK |
| Falcon-40B | ✔ | ✔ | | LINK |
| Phi-3-mini-4k | ✔ | ✔ | ✔ | LINK |
| Qwen-7B | ✔ | ✔ | ✔ | LINK |
| Qwen-14B | ✔ | ✔ | ✔ | LINK |
| Qwen-72B | ✔ | | | LINK |
| Qwen1.5-0.5B | ✔ | ✔ | ✔ | LINK |
| Qwen1.5-1.8B | ✔ | ✔ | ✔ | LINK |
| Llama2-7B | ✔ | ✔ | ✔ | LINK |
| Llama2-13B | ✔ | ✔ | ✔ | LINK |
| LWM-Text-Chat | ✔ | ✔ | ✔ | LINK |
| Mistral-7B-Instruct | ✔ | ✔ | | LINK |
| Stable Diffusion | ✔ | | | LINK |
| Stable Diffusion XL | ✔ | | | LINK |
| WizardCoder-15B | ✔ | | | LINK |
| Yi-6B-chat | ✔ | ✔ | | LINK |
| Yi-34B-chat | ✔ | ✔ | | LINK |