LLM deployment
LLM-TPU project introduction
LLM-TPU contains porting and deployment examples for a range of open-source generative AI models, mainly large language models (LLMs), along with Stable Diffusion for AI image generation.
Project repository link: LLM-TPU.
Examples list
Model | INT4 | INT8 | FP16/BF16 | Hugging Face link |
---|---|---|---|---|
Baichuan2-7B | ✔ | LINK | ||
ChatGLM3-6B | ✔ | ✔ | ✔ | LINK |
CodeFuse-7B | ✔ | ✔ | LINK | |
DeepSeek-6.7B | ✔ | ✔ | LINK | |
Falcon-40B | ✔ | ✔ | LINK | |
Phi-3-mini-4k | ✔ | ✔ | ✔ | LINK |
Qwen-7B | ✔ | ✔ | ✔ | LINK |
Qwen-14B | ✔ | ✔ | ✔ | LINK |
Qwen-72B | ✔ | LINK | ||
Qwen1.5-0.5B | ✔ | ✔ | ✔ | LINK |
Qwen1.5-1.8B | ✔ | ✔ | ✔ | LINK |
Llama2-7B | ✔ | ✔ | ✔ | LINK |
Llama2-13B | ✔ | ✔ | ✔ | LINK |
LWM-Text-Chat | ✔ | ✔ | ✔ | LINK |
Mistral-7B-Instruct | ✔ | ✔ | LINK | |
Stable Diffusion | ✔ | LINK | ||
Stable Diffusion XL | ✔ | LINK | ||
WizardCoder-15B | ✔ | LINK | ||
Yi-6B-chat | ✔ | ✔ | LINK | |
Yi-34B-chat | ✔ | ✔ | LINK |