SOPHGO TPU MLIR

TPU-MLIR is an open-source machine-learning compiler for TPUs, built on MLIR. The project provides a complete toolchain that converts pre-trained neural networks from different frameworks into bmodel binary files that run efficiently on TPUs. The supported deep-learning frameworks are currently PyTorch, ONNX, TFLite, and Caffe; models from other frameworks must first be converted to ONNX. Compiling HuggingFace LLM models is also supported: the Qwen2 and LLaMA series work today, with more LLM families to be added in the future.

Learn More

TPU LLM

TPU LLM enables the deployment of open-source generative AI models, primarily LLM/VLM models, on Sophgo BM1684X and BM1688 (CV186X) chips. The TPU-MLIR compiler converts these models into the bmodel format; deployment then uses the tpu-runtime inference-engine interfaces from Python or C++ code in PCIe or SoC environments. To compile models, the TPU-MLIR environment must be configured, which involves installing Docker and building the source code. Alternatively, precompiled bmodel files from the available demos can be used directly.

Learn More
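The conversion flow described above is typically driven in two steps: `model_transform.py` lowers a framework model to MLIR, and `model_deploy.py` lowers that MLIR to a bmodel for a target chip. As a minimal sketch, assuming the flag names from the TPU-MLIR user guide (the model name, input shape, precision, and chip below are illustrative placeholders), a small Python helper could assemble those command lines:

```python
# Sketch of driving the TPU-MLIR toolchain from Python.
# Tool and flag names are assumed from the TPU-MLIR user guide; the
# model name, input shape, precision, and chip are illustrative only.
import subprocess


def transform_cmd(name, model_def, input_shapes, mlir_out):
    """Build the model_transform.py command: framework model -> MLIR."""
    return [
        "model_transform.py",
        "--model_name", name,
        "--model_def", model_def,
        "--input_shapes", input_shapes,
        "--mlir", mlir_out,
    ]


def deploy_cmd(mlir_in, quantize, chip, bmodel_out):
    """Build the model_deploy.py command: MLIR -> bmodel for one chip."""
    return [
        "model_deploy.py",
        "--mlir", mlir_in,
        "--quantize", quantize,
        "--chip", chip,
        "--model", bmodel_out,
    ]


# Example command lines; actually running them requires the configured
# TPU-MLIR Docker environment, so the sketch only builds the lists here.
cmds = [
    transform_cmd("resnet18", "resnet18.onnx", "[[1,3,224,224]]",
                  "resnet18.mlir"),
    deploy_cmd("resnet18.mlir", "F16", "bm1684x",
               "resnet18.bm1684x.bmodel"),
]
```

Inside the Docker environment each list would be passed to `subprocess.run(cmd, check=True)`; keeping the two steps separate mirrors the toolchain's own split between framework import and chip-specific code generation.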
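On the deployment side, a Python demo script must hand the compiled LLM bmodel its input in the model's chat template before tokenization; for the supported Qwen2 series that template is ChatML. A minimal sketch of building such a prompt string (the helper name is ours; the real demos would normally obtain this from the HuggingFace tokenizer's chat template instead):

```python
# Sketch: assemble a Qwen2-style ChatML prompt for an LLM bmodel demo.
# The helper name is hypothetical; in practice the HuggingFace tokenizer
# can render the same template from a list of role/content messages.

def build_qwen2_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts in order."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_qwen2_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

The resulting string would then be tokenized and fed, token by token, to the bmodel through the tpu-runtime inference interfaces in either the PCIe or SoC environment.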