Self-developed Algorithm
"CAM-CRV1126S2U/CAM-CRV1109S2U " NPU must use "RKNN" model for model
inference. If users use the algorithm model developed by themselves,
they need to be familiar with the RKNN development process and convert
their own model to RKNN before using it. This chapter explains the use
of RKNN development tools for such users.
RKNN Introduction
RKNN is the model format used by the Rockchip NPU platform; model files
end with the ".rknn" suffix. Rockchip provides a complete Python model
conversion tool that makes it easy to convert self-developed algorithm
models into RKNN models. Rockchip also provides "C/C++" and "Python" API
interfaces.
RKNN-Toolkit
Tool Introduction
"RKNN-Toolkit " is a development kit that provides users with model
conversion, reasoning and performance evaluation on PC and Rockchip
NPU platforms. Users can easily complete the following functions
through the Python interface provided by this tool:
Model Conversion: Supports converting Caffe, TensorFlow,
TensorFlow Lite, ONNX, Darknet, PyTorch, and MXNet models into RKNN
models, and supports RKNN model import and export, so that models can
later be loaded and used on the Rockchip NPU platform. Multi-input
models are supported since version 1.2.0; PyTorch and MXNet are
supported since version 1.3.0.
Quantization Function: Supports converting floating-point
models to quantized models. The currently supported quantization
methods are asymmetric quantization (asymmetric_quantized-u8) and
dynamic fixed-point quantization (dynamic_fixed_point-8 and
dynamic_fixed_point-16). Since version 1.0.0, RKNN-Toolkit also
supports hybrid quantization.
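The asymmetric u8 scheme above maps a float range onto the uint8 range [0, 255] with a scale and a zero point. The following is a minimal plain-Python sketch of that affine mapping, not the RKNN-Toolkit implementation (which folds quantization into the converted model); the function names are illustrative. For a roughly [-1, 1] input range this yields a scale of 2/255 ≈ 0.007843, consistent with the tensor attributes printed by the demo later in this chapter.

```python
def asymmetric_u8_params(f_min, f_max):
    """Derive scale and zero point for an asymmetric uint8 mapping.

    The float range is stretched to include 0 so that zero is exactly
    representable, then mapped onto [0, 255].
    """
    f_min = min(f_min, 0.0)
    f_max = max(f_max, 0.0)
    scale = (f_max - f_min) / 255.0
    zero_point = int(round(-f_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point):
    q = int(round(x / scale)) + zero_point
    return max(0, min(255, q))              # clamp to the uint8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

scale, zp = asymmetric_u8_params(-1.0, 1.0)
q = quantize(0.5, scale, zp)
x = dequantize(q, scale, zp)                # recovers 0.5 within one step
```

The quantize/dequantize round trip loses at most half a quantization step per value; the accuracy analysis function described below measures how these per-layer errors accumulate through a real network.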
Model Inference: Can simulate the Rockchip NPU on the PC to
run the RKNN model and obtain inference results; can also distribute
the RKNN model to a designated NPU device for inference.
Performance Evaluation: Can simulate the Rockchip NPU on the
PC to run the RKNN model and evaluate model performance (including
total time and per-layer time); can also distribute the RKNN model to
a designated NPU device to evaluate how the model performs on the
actual device.
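The two performance figures the evaluation reports are directly related: FPS is the reciprocal of the total inference time. Using the numbers from the Quick Start run later in this chapter:

```python
total_time_us = 5573                 # "Total Time(us)" from the evaluation output
fps = 1_000_000 / total_time_us      # frames per second = 1 / (total time in seconds)
print(round(fps, 2))                 # 179.44, matching the reported FPS
```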
Memory Evaluation: Evaluates system and NPU memory
consumption while the model is running. To use this function, the RKNN
model must be distributed to an NPU device to run, and the relevant
interface must be called to obtain memory usage information. Supported
since version 0.9.9.
Model Precompilation: RKNN models generated with
precompilation technology load faster on the hardware platform, and
for some models the model size is also reduced. However, a precompiled
RKNN model can only run on NPU devices. Currently, only the x86_64
Ubuntu platform supports generating a precompiled RKNN model directly
from the original model. RKNN-Toolkit supports model precompilation
since version 0.9.5 and upgraded the precompilation method in 1.0.0.
Since version 1.4.0, a normal RKNN model can also be converted into a
precompiled RKNN model through the NPU device.
Model Segmentation: This function is used when multiple
models run at the same time. A single model can be divided into
multiple segments executed on the NPU, so that the NPU execution time
occupied by each model can be balanced, and one model taking too much
execution time no longer prevents other models from being executed in
time. RKNN-Toolkit supports this function since version 1.2.0. It must
be used on hardware with a Rockchip NPU, and the NPU driver version
must be greater than 0.9.8.
Custom Operator Function: If a model contains an operator
that RKNN-Toolkit does not support, conversion fails at the model
conversion stage. In that case, the custom operator function can be
used to add the unsupported operators so that the model can be
converted and run normally. RKNN-Toolkit supports this function since
version 1.2.0. For the use and development of custom operators, refer
to the "Rockchip_Developer_Guide_RKNN_Toolkit_Custom_OP_EN" document.
Quantization Accuracy Analysis Function: Gives the Euclidean
or cosine distance between each layer's inference results before and
after quantization, to analyze where quantization error arises and to
provide ideas for improving the accuracy of quantized models.
Supported since version 1.3.0. Version 1.4.0 adds a layer-by-layer
accuracy analysis sub-function: the input of each layer is set to the
correct floating-point values to eliminate accumulated error, which
more accurately reflects the influence of quantization on each
individual layer.
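The two distance metrics used by this analysis can be sketched in a few lines of plain Python. The layer outputs below are hypothetical stand-ins for one layer's results before and after quantization; the toolkit computes these distances internally:

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, larger means disagreement."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

float_out = [0.12, -0.80, 0.33, 0.05]    # hypothetical float-model layer output
quant_out = [0.125, -0.79, 0.32, 0.06]   # same layer after quantize/dequantize
print(euclidean_distance(float_out, quant_out))
print(cosine_distance(float_out, quant_out))   # near 0: the layer is barely affected
```

A layer whose distance is much larger than its neighbors' is a natural candidate for hybrid quantization, i.e. keeping that layer in floating point.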
Visualization Function: Presents the functions of
RKNN-Toolkit in a graphical interface, simplifying operation: users
can complete model conversion, inference, and other tasks by filling
in forms and clicking buttons instead of writing scripts by hand. For
the specific usage of the visualization function, refer to the
"Rockchip_User_Guide_RKNN_Toolkit_Visualization_EN" document.
Supported since version 1.3.0. Version 1.4.0 improves support for
multi-input models and supports new Rockchip NPU devices such as
RK1806, RV1109, and RV1126.
Model Optimization Level Function: RKNN-Toolkit optimizes
the model during conversion, and the default optimization options may
have some impact on model accuracy. Setting the optimization level
lets you turn off some or all optimizations. For the specific usage of
the optimization level, refer to the description of the
optimization_level parameter of the config interface. Supported since
version 1.3.0.
Environment Dependence
System Support: Ubuntu 16.04 x64 or above, Windows 7 x64 or
above, Mac OS X 10.13.5 x64 or above, Debian 9.8 x64 or above
Python Version: 3.5/3.6/3.7
Python Dependence:
'numpy == 1.16.3'
'scipy == 1.3.0'
'Pillow == 5.3.0'
'h5py == 2.8.0'
'lmdb == 0.93'
'networkx == 1.11'
'flatbuffers == 1.10'
'protobuf == 3.6.1'
'onnx == 1.4.1'
'onnx-tf == 1.2.1'
'flask == 1.0.2'
'tensorflow == 1.11.0' or 'tensorflow-gpu'
'dill == 0.2.8.2'
'ruamel.yaml == 0.15.81'
'psutil == 5.6.2'
'ply == 3.11'
'requests == 3.11'
'pytorch == 1.2.0'
'mxnet == 1.5.0'
Note:
Windows only provides the installation package for Python 3.6.
MacOS provides installation packages for Python 3.6 and Python 3.7.
The ARM64 platform (running Debian 9 or 10) provides installation
packages for Python 3.5 (Debian 9) and Python 3.7 (Debian 10).
Except on MacOS, the scipy dependency on other platforms is >= 1.1.0.
Quick Start
The test environment uses an "Ubuntu 16.04 x86_64 PC" host. Other
platforms can refer to sdk/external/rknn-
toolkit/doc/Rockchip_Quick_Start_RKNN_Toolkit_V1.4.0_XX.pdf.
RKNN-Toolkit installation
# Install python 3.5
sudo apt-get install python3.5
# Install pip3
sudo apt-get install python3-pip
# Obtain the RKNN-Toolkit installation package, then perform the following steps
cp -rf sdk/external/rknn-toolkit ./
cd rknn-toolkit/package/
pip3 install tensorflow==1.11.0
pip3 install mxnet==1.5.0
pip3 install torch==1.2.0 torchvision==0.4.0
pip3 install opencv-python
pip3 install gluoncv
# Install RKNN-Toolkit
sudo pip3 install rknn_toolkit-1.4.0-cp35-cp35m-linux_x86_64.whl
# Check if the installation is successful, import rknn library
rk@rk:~/rknn-toolkit-v1.4.0/package$ python3
>>> from rknn.api import RKNN
>>>
Connect the host and the device via Type-C, then run the demo:
cd examples/tflite/mobilenet_v1/
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$ python3.6 ./test.py
--> config model
done
--> Loading model
done
--> Building model
W The channel_mean_value filed will not be used in the future!
done
--> Export RKNN model
done
--> Init runtime environment
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:30)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:15:19)
D RKNNAPI: DRV: 1.5.2 (e67e5cb build: 2020-12-03 15:04:52)
D RKNNAPI: ==============================================
done
--> Running model
mobilenet_v1
-----TOP 5-----
[156]: 0.8603515625
[155]: 0.0833740234375
[205]: 0.0123443603515625
[284]: 0.00726318359375
[260]: 0.002262115478515625
done
--> Begin evaluate model performance
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
========================================================================
Total Time(us): 5573
FPS: 179.44
========================================================================
done
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$
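The "TOP 5" listing in the log above is simple post-processing on the model's output vector: sort class probabilities and keep the largest entries. A sketch in plain Python (the real test.py reads the vector from the RKNN inference result; the 6-class vector here is hypothetical, a real MobileNet output has 1001 entries):

```python
def top_k(probs, k=5):
    """Return (class index, probability) pairs for the k largest entries."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]

probs = [0.01, 0.02, 0.86, 0.05, 0.04, 0.02]
for idx, p in top_k(probs, 3):
    print(f"[{idx}]: {p}")        # same "[index]: probability" shape as the demo log
```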
In addition to the Python interface, a C/C++ interface for model
inference is also provided. Users can complete the model conversion on
the PC and then use the C/C++ API on the board to complete model
inference. The following shows running the demo.
# Before compiling, modify the cross-compiler path: vim build.sh and modify GCC_COMPILER
# GCC_COMPILER=/home/daijh/p/sdk/prebuilts/gcc/linux-x86/arm/gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf
# This is the path of a local 32-bit cross-compilation toolchain; modify it to the path of the cross-compilation toolchain in your SDK.
daijh@daijh:~$ cd sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo
daijh@daijh:sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo$ ./build.sh
# Put the compiled demo into the device
adb push rknn_mobilenet_demo/ /
# Run demo
cd rknn_mobilenet_demo
[root@RV1126_RV1109:/rknn_mobilenet_demo]# ./build/rknn_mobilenet_demo ./model/mobilenet_v1_rv1109_rv1126.rknn ./model/dog_224x224.jpg
model input num: 1, output num: 1
input tensors:
index=0 name= n_dims=4 dims=[1 224 224 3] n_elems=150528 size=150528 fmt=0 type=3 qnt_type=2 fl=127 zp=127 scale=0.007843
output tensors:
index=0 name= n_dims=2 dims=[0 0 1 1001] n_elems=1001 size=2002 fmt=0 type=1 qnt_type=0 fl=127 zp=127 scale=0.007843
rknn_run
155 - 0.091736
156 - 0.851074
205 - 0.013588
Development Document
After you install "RKNN-Toolkit" and have a preliminary understanding
and verification of the development process through the demo, you can
view the detailed RKNN development API to complete your own
development.
RKNN-Toolkit Document: sdk/external/rknn-toolkit/doc
C/C++ API Document: sdk/external/rknpu/rknn/doc