1. NPU¶
The RK1126/RK1109 NPU can only run inference with models in the RKNN format. Users who bring their own algorithm model therefore need to become familiar with the RKNN development workflow and convert the model to RKNN before use. This chapter explains the RKNN development tools for those users.
1.1. The Introduction Of RKNN¶
RKNN is the model format used by the Rockchip NPU platform; model files carry the suffix .rknn. Rockchip provides a complete set of Python model-conversion tools so that users can convert their self-developed algorithm models into RKNN models, and Rockchip also provides C/C++ and Python API interfaces.
1.2. RKNN-Toolkit¶
1.2.1. Introduction To Tools¶
RKNN-Toolkit is a development kit for model conversion, inference and performance evaluation on the PC and on Rockchip NPU platforms. Through the Python interface provided by the tool, users can easily accomplish the following:
Model conversion: supports converting Caffe, TensorFlow, TensorFlow Lite, ONNX, Darknet, PyTorch and MXNet models into RKNN models, and supports importing and exporting RKNN models, which can subsequently be loaded and used on the Rockchip NPU platform. Multi-input models are supported from version 1.2.0; PyTorch and MXNet are supported from version 1.3.0.
Quantization: supports converting floating-point models into quantized models. The quantization methods currently supported are asymmetric quantization (asymmetric_quantized-u8) and dynamic fixed-point quantization (dynamic_fixed_point-8 and dynamic_fixed_point-16). Starting with version 1.0.0, RKNN-Toolkit also supports hybrid quantization.
Model inference: can simulate the Rockchip NPU on the PC to run an RKNN model and obtain inference results, or distribute the RKNN model to a specified NPU device to run inference there.
Performance evaluation: can simulate the Rockchip NPU on the PC to run an RKNN model and evaluate its performance (including total time and per-layer time), or distribute the RKNN model to a specified NPU device to evaluate the model's performance when running on the real device.
Memory assessment: evaluates the model's runtime consumption of system and NPU memory. To use this function, the RKNN model must be distributed to the NPU device to run, and the relevant interfaces are called to obtain the memory usage information. This feature is supported from version 0.9.9.
Model pre-compilation: an RKNN model generated with pre-compilation loads faster on the hardware platform, and for some models the model size is also reduced; however, a pre-compiled RKNN model can only run on NPU devices. Currently only the x86_64 Ubuntu platform supports generating a pre-compiled RKNN model directly from the original model. RKNN-Toolkit has supported model pre-compilation since version 0.9.5 and updated the pre-compilation method in version 1.0.0; the updated pre-compiled model is not compatible with old drivers. Starting from version 1.4.0, an ordinary RKNN model can also be converted into a pre-compiled RKNN model via an NPU device.
Model segmentation: this function is used when multiple models run at the same time. A single model can be divided into multiple segments for execution on the NPU, so that the NPU execution time occupied by the models can be balanced and one model that takes too long does not delay the execution of the others. RKNN-Toolkit supports this feature from version 1.2.0. It requires hardware with a Rockchip NPU, and the NPU driver version must be greater than 0.9.8.
Custom operators: if the model contains operators that RKNN-Toolkit does not support, conversion fails. The custom operator function can then be used to add the unsupported operators so that the model can be converted and run normally. RKNN-Toolkit supports this feature from version 1.2.0. Refer to the Rockchip_Developer_Guide_RKNN_Toolkit_Custom_OP_CN documentation for the use and development of custom operators.
Quantization accuracy analysis: this function reports the Euclidean or cosine distance between the per-layer inference results before and after quantization, which helps analyze where quantization error arises and provides ideas for improving the accuracy of the quantized model. This feature is supported from version 1.3.0. Version 1.4.0 adds layer-by-layer accuracy analysis, which feeds each layer the correct floating-point input to eliminate error accumulating layer by layer and so reflects more accurately how quantization affects each individual layer.
Visualization: this function presents the features of RKNN-Toolkit through a graphical interface and simplifies the operation steps: model conversion, inference and so on can be completed by filling in forms and clicking buttons, without manually writing scripts. Refer to the Rockchip_user_guide_rknn_toolkit_visualization_cn documentation for details on how to use the visualization function. This feature is supported from version 1.3.0; version 1.4.0 improves support for multi-input models and supports new Rockchip NPU devices such as RK1806, RV1109 and RV1126.
Model optimization level: RKNN-Toolkit optimizes the model during conversion, and the default optimization options may have some influence on model accuracy. By setting the optimization level you can turn off some or all optimization options. For details refer to the optimization_level parameter of the config interface. This feature is supported from version 1.3.0.
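To make the asymmetric_quantized-u8 scheme and the distance metrics used by the quantization accuracy analysis concrete, here is a minimal pure-Python sketch. The function names and sample numbers are illustrative, not the toolkit's API; RKNN-Toolkit performs the equivalent steps internally.

```python
import math

def quantize_u8(values, scale, zero_point):
    """Asymmetric u8 quantization: map floats to [0, 255] via scale/zero-point."""
    return [min(255, max(0, round(v / scale + zero_point))) for v in values]

def dequantize_u8(qvalues, scale, zero_point):
    """Map u8 values back to floats: f = (q - zp) * scale."""
    return [(q - zero_point) * scale for q in qvalues]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A layer's float outputs and the same outputs after a quantize/dequantize
# round trip -- the gap between the two is what the accuracy analysis reports.
float_out = [0.12, -0.5, 0.93, 0.0]
scale, zp = 0.007843, 127  # values of the kind shown in the demos below
deq_out = dequantize_u8(quantize_u8(float_out, scale, zp), scale, zp)

print(cosine_distance(float_out, deq_out))
print(euclidean_distance(float_out, deq_out))
```

Both distances are near zero here; a layer where they grow large is a layer that quantization hurts.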
1.2.2. Environment Dependencies¶
Supported systems: Ubuntu 16.04 x64 (and above), Windows 7 x64 (and above), Mac OS X 10.13.5 x64 (and above), Debian 9.8 x64 (and above)
Python version: 3.5/3.6/3.7
Python dependencies:
'numpy == 1.16.3'
'scipy == 1.3.0'
'Pillow == 5.3.0'
'h5py == 2.8.0'
'lmdb == 0.93'
'networkx == 1.11'
'flatbuffers == 1.10',
'protobuf == 3.6.1'
'onnx == 1.4.1'
'onnx-tf == 1.2.1'
'flask == 1.0.2'
'tensorflow == 1.11.0' or 'tensorflow-gpu'
'dill == 0.2.8.2'
'ruamel.yaml == 0.15.81'
'psutil == 5.6.2'
'ply == 3.11'
'requests == 3.11'
'pytorch == 1.2.0'
'mxnet == 1.5.0'
PS:
Windows only provides the Python 3.6 installation package.
MacOS provides Python 3.6 and Python 3.7 packages.
The ARM64 platform (Debian 9 or 10 operating system) provides Python 3.5 (Debian 9) and Python 3.7 (Debian 10) installation packages.
Except on MacOS, the SciPy dependency is >= 1.1.0.
1.2.3. Quick Learning¶
The test environment is an Ubuntu 16.04 x86_64 PC host. For other platforms, refer to sdk/external/rknn-toolkit/doc/Rockchip_Quick_Start_RKNN_Toolkit_V1.4.0_XX.pdf.
1.2.3.1. PC HOST¶
RKNN-Toolkit installation
# Install python 3.5
sudo apt-get install python3.5
# Install pip3
sudo apt-get install python3-pip
# Get the RKNN-Toolkit installation package and then perform the following steps
cp sdk/external/rknn-toolkit ./ -rf
cd rknn-toolkit/package/
pip3 install tensorflow==1.11.0
pip3 install mxnet==1.5.0
pip3 install torch==1.2.0 torchvision==0.4.0
pip3 install opencv-python
pip3 install gluoncv
# Install RKNN-Toolkit
sudo pip3 install rknn_toolkit-1.4.0-cp35-cp35m-linux_x86_64.whl
# Check that the installation was successful and import the RKNN library
rk@rk:~/rknn-toolkit-v1.4.0/package$ python3
>>> from rknn.api import RKNN
>>>
The device does not enable the OTG function by default. If needed, first enable OTG in the kernel, then recompile and upgrade the kernel.
# sdk/kernel/arch/arm/boot/dts/rv1126-firefly-rk809.dtsi
&usbdrd_dwc3 {
	status = "okay";
	dr_mode = "otg"; /* enable otg */
	extcon = <&u2phy0>;
};
Connect the device to the host over USB OTG and run the demo:
cd examples/tflite/mobilenet_v1/
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$ python3.6 ./test.py
--> config model
done
--> Loading model
done
--> Building model
W The channel_mean_value filed will not be used in the future!
done
--> Export RKNN model
done
--> Init runtime environment
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:30)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:15:19)
D RKNNAPI: DRV: 1.5.2 (e67e5cb build: 2020-12-03 15:04:52)
D RKNNAPI: ==============================================
done
--> Running model
mobilenet_v1
-----TOP 5-----
[156]: 0.8603515625
[155]: 0.0833740234375
[205]: 0.0123443603515625
[284]: 0.00726318359375
[260]: 0.002262115478515625
done
--> Begin evaluate model performance
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
========================================================================
Total Time(us): 5573
FPS: 179.44
========================================================================
done
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$
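The FPS figure in the evaluation output is derived directly from the total time, which is reported in microseconds; a quick arithmetic check:

```python
# FPS printed by the performance evaluation is the reciprocal of the total
# per-inference time in microseconds: 10**6 / Total Time(us).
total_time_us = 5573
fps = 1_000_000 / total_time_us
print(f"FPS: {fps:.2f}")
```

This reproduces the `FPS: 179.44` shown in the transcript above.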
In addition to the Python interface, a C/C++ model inference interface is also provided: users can complete the model conversion on the PC and then perform model inference on the board using C/C++. Here is how to run the demo.
# Before compiling, modify the cross-compiler path: edit build.sh and change GCC_COMPILER
# GCC_COMPILER=/home/daijh/p/sdk/prebuilts/gcc/linux-x86/arm/gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf
# This is the path of a local 32-bit cross-compilation toolchain; change it to the cross-compilation toolchain in your SDK.
daijh@daijh:~$ cd sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo
daijh@daijh:sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo$ ./build.sh
# Put the compiled demo on the device
adb push rknn_mobilenet_demo/ /
# Run The Demo
cd rknn_mobilenet_demo
[root@RV1126_RV1109:/rknn_mobilenet_demo]# ./build/rknn_mobilenet_demo ./model/mobilenet_v1_rv1109_rv1126.rknn ./model/dog_224x224.jpg
model input num: 1, output num: 1
input tensors:
index=0 name= n_dims=4 dims=[1 224 224 3] n_elems=150528 size=150528 fmt=0 type=3 qnt_type=2 fl=127 zp=127 scale=0.007843
output tensors:
index=0 name= n_dims=2 dims=[0 0 1 1001] n_elems=1001 size=2002 fmt=0 type=1 qnt_type=0 fl=127 zp=127 scale=0.007843
rknn_run
155 - 0.091736
156 - 0.851074
205 - 0.013588
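The size fields in the tensor dump above follow from the element count and element width: the input is 1×224×224×3 one-byte elements, while the output's size of 2002 bytes for 1001 elements indicates a two-byte (16-bit float) element type. A small sketch of that bookkeeping (the function name is mine, not part of the RKNN API):

```python
# Reproduce the n_elems/size bookkeeping of the tensor dump above.
# Unused leading dims are printed as 0 and must be skipped.
def tensor_size(dims, elem_bytes):
    n_elems = 1
    for d in dims:
        if d:
            n_elems *= d
    return n_elems, n_elems * elem_bytes

print(tensor_size([1, 224, 224, 3], 1))  # input tensor, 1-byte u8 elements
print(tensor_size([0, 0, 1, 1001], 2))   # output tensor, 2-byte elements
```

The results match the `n_elems=150528 size=150528` and `n_elems=1001 size=2002` lines in the dump.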
1.2.3.2. RV1126 HOST¶
The test environment is an RV1126 host with a Debian10 file system. The following operations are performed on the RV1126.
Firefly Debian10 firmware RKNN Toolkit Lite installation steps:
No.1 Install the numpy/psutil/ruamel.yaml dependencies
# If pip3 is not installed, please install it with sudo apt-get update && sudo apt-get install python3-pip first
pip3 install numpy==1.16.3
pip3 install psutil==5.6.2
pip3 install ruamel.yaml==0.15.81
No.2 Install opencv-python. Installing it with pip3 keeps failing, so download the package directly instead.
Click: Resource Download to download rknn-toolkit-lite.rar, copy it to the rv1126 system and decompress it.
# The two deb packages (normally fetched with wget) are already in the rknn-toolkit-lite-v1.7.0.dev_0cfb22/requires/ directory
sudo apt-get install multiarch-support unrar
unrar x rknn-toolkit-lite.rar
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/requires/
sudo dpkg -i libjasper1_1.900.1-debian1-2.4+deb8u6_armhf.deb
sudo dpkg -i libjasper-dev_1.900.1-debian1-2.4+deb8u6_armhf.deb
sudo apt-get install libhdf5-dev
sudo apt-get install libatlas-base-dev
sudo apt-get install libqtgui4
sudo apt-get install libqt4-test
pip3 install opencv_python-4.0.1.24-cp37-cp37m-linux_armv7l.whl
No.3 Install RKNN Toolkit Lite
# Install RKNN Toolkit Lite using the following command
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/packages
pip3 install rknn_toolkit_lite-1.7.0.dev_0cfb22-cp37-cp37m-linux_armv7l.whl
No.4 Run the example (you need to switch to the root user to execute it)
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/examples-lite/inference_with_lite
python3 test.py
# The output is as follows:
root@firefly:/home/firefly/rknn-toolkit-lite-v1.7.0.dev_0cfb22/examples-lite/inference_with_lite# python3 test.py
--> list devices:
*************************
None devices connected.
*************************
done
--> query support target platform
**************************************************
Target platforms filled in RKNN model: ['RV1109']
Target platforms supported by this RKNN model: ['RK1109', 'RK1126', 'RV1109', 'RV1126']
**************************************************
done
--> Load RKNN model
done
--> Init runtime environment
done
--> get sdk version:
==============================================
RKNN VERSION:
API: librknn_runtime version 1.6.0 (6523e57 build: 2021-01-15 15:56:31 base: 1126)
DRV: 6.4.3.5.293908
==============================================
done
--> Running model
resnet18
-----TOP 5-----
[812]: 0.9993900656700134
[404]: 0.0004593880439642817
[657 833]: 2.9284517950145528e-05
[657 833]: 2.9284517950145528e-05
[895]: 1.850890475907363e-05
done
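The TOP 5 listings printed by these demos are simply the five largest entries of the output score vector. A minimal sketch of how such a list can be produced (names and sample values are illustrative, not the demo's actual code):

```python
# Pick the five highest-scoring class indices from a flat score vector,
# mirroring the "-----TOP 5-----" output of the demos.
def top5(scores):
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return [(i, scores[i]) for i in order[:5]]

scores = [0.0] * 1000
scores[812] = 0.99939   # sample values, loosely echoing the run above
scores[404] = 0.00046
for idx, score in top5(scores):
    print(f"[{idx}]: {score}")
```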
1.2.4. Develop Documentation¶
Once you have installed RKNN-Toolkit and gained a preliminary understanding of the development process by verifying the demos, you can consult the detailed RKNN development API documentation to carry out your own development.
RKNN-Toolkit documentation: sdk/external/rknn-toolkit/doc
C/C++ API documentation: sdk/external/rknpu/rknn/doc