NPU
"RK1126/RK1109 " NPU must use "RKNN" model when making model
inference. If users use their own algorithm model, they need to be
familiar with the development process of RKNN, and then convert their
model into RKNN for use. This chapter is aimed at these users to
explain the use of RKNN development tools.
Introduction To RKNN
RKNN is the model format used by the Rockchip NPU platform; model
files end with the suffix ".rknn". Rockchip provides a complete Python
tool for converting self-developed algorithm models into RKNN models,
and also provides "C/C++" and "Python" API interfaces.
RKNN-Toolkit
Introduction To Tools
"RKNN-Toolkit " is a development kit for users to carry out model
transformation, reasoning and performance evaluation on PC and
Rockchip NPU platforms. Users can easily complete the following
functions through the Python interface provided by the tool:
Model conversion: supports converting Caffe, TensorFlow,
TensorFlow Lite, ONNX, Darknet, PyTorch and MXNet models into RKNN
models, and supports importing/exporting RKNN models, which can then
be loaded and used on the Rockchip NPU platform. Multi-input models
are supported from version 1.2.0; PyTorch and MXNet are supported from
version 1.3.0.
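The conversion flow through the Python interface follows config → load → build → export. Below is a minimal sketch for a TFLite model. It assumes the documented RKNN-Toolkit API of the version covered here; the file names, dataset path and config parameters are illustrative placeholders and may differ between toolkit versions:

```python
# Sketch of the RKNN-Toolkit conversion flow (config -> load -> build -> export).
# The import is guarded so the sketch also loads on machines without RKNN-Toolkit.
try:
    from rknn.api import RKNN
except ImportError:
    RKNN = None  # RKNN-Toolkit not installed on this machine

def convert_tflite_to_rknn(tflite_path, rknn_path, dataset="./dataset.txt"):
    """Convert a TFLite model to a quantized RKNN model (placeholder paths)."""
    if RKNN is None:
        raise RuntimeError("RKNN-Toolkit is required for conversion")
    rknn = RKNN()
    # Quantization type and target platform; parameter names as documented
    rknn.config(quantized_dtype="asymmetric_quantized-u8",
                target_platform=["rv1126"])
    if rknn.load_tflite(model=tflite_path) != 0:
        raise RuntimeError("load failed")
    # do_quantization needs a dataset file listing calibration images
    if rknn.build(do_quantization=True, dataset=dataset) != 0:
        raise RuntimeError("build failed")
    if rknn.export_rknn(rknn_path) != 0:
        raise RuntimeError("export failed")
    rknn.release()

if __name__ == "__main__" and RKNN is not None:
    convert_tflite_to_rknn("mobilenet_v1.tflite", "mobilenet_v1.rknn")
```

The exported ".rknn" file can then be deployed to the NPU device or run in the PC simulator.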
Quantization: supports converting floating-point models into
quantized models. The currently supported quantization methods are
asymmetric quantization (asymmetric_quantized-u8) and dynamic
fixed-point quantization (dynamic_fixed_point-8 and
dynamic_fixed_point-16). Starting with version 1.0.0, RKNN-Toolkit
also supports hybrid quantization.
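For intuition, asymmetric u8 quantization maps float values onto the 0–255 range using a scale and a zero point. The snippet below illustrates the standard asymmetric affine scheme (it shows the math only, not RKNN's internal implementation; the scale/zero-point values are examples):

```python
def quantize_u8(x, scale, zero_point):
    """Asymmetric u8 quantization: q = clamp(round(x / scale) + zp, 0, 255)."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))

def dequantize_u8(q, scale, zero_point):
    """Inverse mapping: x ~= scale * (q - zp)."""
    return scale * (q - zero_point)

# A value survives the round trip only up to the quantization step (scale)
x = 0.5
q = quantize_u8(x, scale=0.007843, zero_point=127)
x_hat = dequantize_u8(q, 0.007843, 127)
print(q, round(x_hat, 4))  # the recovered value differs from x by < scale
```

Note that scale=0.007843 and zero point 127 are the same attributes printed for the input tensor in the C demo later in this chapter.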
Model inference: can simulate the Rockchip NPU on a PC to run
an RKNN model and obtain inference results, or distribute the RKNN
model to a specified NPU device for inference.
Performance evaluation: can simulate the Rockchip NPU on a PC
to run an RKNN model and evaluate its performance (including total
time and per-layer time), or distribute the RKNN model to run on a
specified NPU device to evaluate the model's performance on real
hardware.
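The FPS figure in the evaluation report follows directly from the total per-inference time in microseconds; using the numbers the mobilenet_v1 demo prints later in this chapter:

```python
def fps_from_total_time(total_time_us):
    """Frames per second from the per-inference total time in microseconds."""
    return 1_000_000 / total_time_us

# The mobilenet_v1 demo reports Total Time(us): 5573
print(round(fps_from_total_time(5573), 2))  # -> 179.44, matching the demo's FPS line
```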
Memory evaluation: evaluates the system and NPU memory
consumed at model runtime. To use this function, the RKNN model must
be distributed to an NPU device to run, and the relevant interfaces
are called to obtain memory usage information. This feature is
supported from version 0.9.9.
Model precompilation: RKNN models generated with
precompilation technology load faster on the hardware platform, and
for some models the model size is also reduced. However, precompiled
RKNN models can only run on NPU devices. Currently only the x86_64
Ubuntu platform supports generating a precompiled RKNN model directly
from the original model. RKNN-Toolkit has supported model
precompilation since version "0.9.5" and updated the precompilation
method in version "1.0.0"; precompiled models produced by the new
method are not compatible with older drivers. Starting from version
"1.4.0", an ordinary RKNN model can also be converted into a
precompiled RKNN model via an NPU device.
Model segmentation: this function is used in scenarios where
multiple models run at the same time. A single model can be split into
multiple segments for execution on the NPU, so as to balance the
execution time each model occupies on the NPU and prevent one
long-running model from blocking the timely execution of others.
RKNN-Toolkit supports this feature from version 1.2.0. It requires
hardware with a Rockchip NPU, and the NPU driver version must be
greater than 0.9.8.
Custom operators: if a model contains operators that
RKNN-Toolkit does not support, conversion fails during the model
conversion phase. In this case, the custom operator functionality can
be used to add the unsupported operators so that the model converts
and runs properly. RKNN-Toolkit supports this feature from version
1.2.0. Refer to the
Rockchip_Developer_Guide_RKNN_Toolkit_Custom_OP_CN documentation for
the usage and development of custom operators.
Quantization accuracy analysis: this function reports the
Euclidean or cosine distance between the per-layer inference results
before and after quantization, helping to analyze where quantization
error arises and providing ideas for improving the accuracy of
quantized models. This feature is supported from version 1.3.0.
Version 1.4.0 adds a layer-by-layer sub-function that feeds each layer
the correct floating-point values as input, eliminating error
accumulation across layers, which more accurately reflects the impact
of quantization on each individual layer.
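The per-layer comparison reduces to a distance metric between two output vectors. The plain-Python snippet below illustrates cosine similarity only; RKNN-Toolkit computes these metrics internally, and the two vectors here are made-up stand-ins for a layer's float and quantized outputs:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

float_out = [0.12, 0.85, 0.03]   # hypothetical float-model layer output
quant_out = [0.11, 0.86, 0.04]   # hypothetical quantized-model layer output
# Values near 1.0 mean quantization barely changed this layer's output
print(round(cosine_similarity(float_out, quant_out), 4))
```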
Visualization: presents the functions of RKNN-Toolkit through
a graphical interface, simplifying the operation steps. Users can
complete model conversion, inference and other functions by filling in
forms and clicking buttons, without manually writing scripts. Refer to
the Rockchip_user_guide_rknn_toolkit_visualization_cn documentation
for details on how to use the visualization function. This feature is
supported from version 1.3.0; version 1.4.0 improves support for
multi-input models and adds support for new Rockchip NPU devices such
as RK1806, RV1109 and RV1126.
Model optimization level: RKNN-Toolkit optimizes the model
during conversion, and the default optimization options may have some
influence on model accuracy. Some or all optimizations can be turned
off by setting the optimization level. For details on how to use the
optimization level, refer to the optimization_level parameter of the
config interface. This feature is supported from version 1.3.0.
Environment Dependencies
Supported systems: Ubuntu 16.04 x64 and above,
Windows 7 x64 and above, Mac OS X 10.13.5 x64 and above, Debian 9.8
x64 and above
Python versions: 3.5/3.6/3.7
Python dependencies:
'numpy == 1.16.3'
'scipy == 1.3.0'
'Pillow == 5.3.0'
'h5py == 2.8.0'
'lmdb == 0.93'
'networkx == 1.11'
'flatbuffers == 1.10'
'protobuf == 3.6.1'
'onnx == 1.4.1'
'onnx-tf == 1.2.1'
'flask == 1.0.2'
'tensorflow == 1.11.0' or 'tensorflow-gpu'
'dill == 0.2.8.2'
'ruamel.yaml == 0.15.81'
'psutil == 5.6.2'
'ply == 3.11'
'requests == 2.21.0'
'torch == 1.2.0'
'mxnet == 1.5.0'
PS:
Windows only provides the Python 3.6 installation package.
MacOS provides Python 3.6 and Python 3.7 installation packages.
The ARM64 platform (Debian 9 or 10 operating system)
provides Python 3.5 (Debian 9) and Python 3.7 (Debian 10)
installation packages.
Except for MacOS, the SciPy dependency is >=1.1.0.
Quick Learning
The test environment uses an "Ubuntu 16.04 x86_64 PC" host. For other
platforms, refer to
sdk/external/rknn-toolkit/doc/Rockchip_Quick_Start_RKNN_Toolkit_V1.4.0_XX.pdf.
PC HOST
RKNN-Toolkit installation
# Install python 3.5
sudo apt-get install python3.5
# Install pip3
sudo apt-get install python3-pip
# Get the RKNN-Toolkit installation package and then perform the following steps
cp -rf sdk/external/rknn-toolkit ./
cd rknn-toolkit/package/
pip3 install tensorflow==1.11.0
pip3 install mxnet==1.5.0
pip3 install torch==1.2.0 torchvision==0.4.0
pip3 install opencv-python
pip3 install gluoncv
# Install RKNN-Toolkit
sudo pip3 install rknn_toolkit-1.4.0-cp35-cp35m-linux_x86_64.whl
# Check that the installation was successful and import the RKNN library
rk@rk:~/rknn-toolkit-v1.4.0/package$ python3
>>> from rknn.api import RKNN
>>>
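Installation can also be verified from a script. The helper below is a hypothetical convenience function; it only probes whether the rknn package is importable on the current Python:

```python
import importlib.util

def rknn_toolkit_installed():
    """Return True if the rknn package can be found by this Python interpreter."""
    return importlib.util.find_spec("rknn") is not None

print("RKNN-Toolkit installed:", rknn_toolkit_installed())
```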
The device does not enable the OTG function by default. If necessary,
first enable OTG in the kernel, then compile and upgrade the kernel.
# sdk/kernel/arch/arm/boot/dts/rv1126-firefly-rk809.dtsi
&usbdrd_dwc3 {
status = "okay";
dr_mode = "otg"; #open otg
extcon = <&u2phy0>;
};
Connect the device to the host via USB OTG and run the demo:
cd examples/tflite/mobilenet_v1/
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$ python3.6 ./test.py
--> config model
done
--> Loading model
done
--> Building model
W The channel_mean_value filed will not be used in the future!
done
--> Export RKNN model
done
--> Init runtime environment
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:30)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:15:19)
D RKNNAPI: DRV: 1.5.2 (e67e5cb build: 2020-12-03 15:04:52)
D RKNNAPI: ==============================================
done
--> Running model
mobilenet_v1
-----TOP 5-----
[156]: 0.8603515625
[155]: 0.0833740234375
[205]: 0.0123443603515625
[284]: 0.00726318359375
[260]: 0.002262115478515625
done
--> Begin evaluate model performance
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
========================================================================
Total Time(us): 5573
FPS: 179.44
========================================================================
done
daijh@daijh:~/p/sdk/external/rknn-toolkit/examples/tflite/mobilenet_v1$
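The TOP-5 list in the output above is simply the five highest-scoring class indices of the model's output vector. A minimal reproduction of that post-processing step in plain Python (the score values below are made up to mimic the demo):

```python
def top_k(scores, k=5):
    """Return (index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Made-up 1001-class output vector; index 156 scores highest, as in the demo
scores = [0.0] * 1001
scores[156] = 0.86
scores[155] = 0.083
scores[205] = 0.012
for idx, score in top_k(scores):
    print(f"[{idx}]: {score}")
```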
In addition to the Python interface, we also provide a C/C++ model
inference interface: users can complete model conversion on the PC and
then perform model inference on the board using C/C++. Here is how to
run the demo.
# Before compiling, modify the cross-compiler path: vim build.sh and change GCC_COMPILER
# GCC_COMPILER=/home/daijh/p/sdk/prebuilts/gcc/linux-x86/arm/gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf
# This is the path of the author's local 32-bit cross-compilation toolchain; change it to the cross-compilation toolchain path in your SDK.
daijh@daijh:~$ cd sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo
daijh@daijh:sdk/external/rknpu/rknn/rknn_api/examples/rknn_mobilenet_demo$ ./build.sh
# Put the compiled demo on the device
adb push rknn_mobilenet_demo/ /
# Run The Demo
cd rknn_mobilenet_demo
[root@RV1126_RV1109:/rknn_mobilenet_demo]# ./build/rknn_mobilenet_demo ./model/mobilenet_v1_rv1109_rv1126.rknn ./model/dog_224x224.jpg
model input num: 1, output num: 1
input tensors:
index=0 name= n_dims=4 dims=[1 224 224 3] n_elems=150528 size=150528 fmt=0 type=3 qnt_type=2 fl=127 zp=127 scale=0.007843
output tensors:
index=0 name= n_dims=2 dims=[0 0 1 1001] n_elems=1001 size=2002 fmt=0 type=1 qnt_type=0 fl=127 zp=127 scale=0.007843
rknn_run
155 - 0.091736
156 - 0.851074
205 - 0.013588
RV1126 HOST
Test environment using "RV1126" host. The file system is "Debian10" .
The following operations are performed on "RV1126".
Firefly Debian10 firmware RKNN Toolkit Lite installation steps:
No.1 "numpy/psutils/ruamel.yaml" is a dependency on
"numpy/psutils/ruamel.yaml"
# If pip3 is not installed, please install it with sudo apt-get update && sudo apt-get install python3-pip first
pip3 install numpy==1.16.3
pip3 install psutil==5.6.2
pip3 install ruamel.yaml==0.15.81
No.2 Install "opencv-python". Installations with pip3 will continue to
fail, so download the package directly online.
Click "Resource Download" to download rknn-toolkit-lite.rar, copy it
to the RV1126 system, and decompress it.
# The two deb packages installed below are already provided in the rknn-toolkit-lite-v1.7.0.dev_0cfb22/requires/ directory
sudo apt-get install multiarch-support unrar
unrar x rknn-toolkit-lite.rar
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/requires/
sudo dpkg -i libjasper1_1.900.1-debian1-2.4+deb8u6_armhf.deb
sudo dpkg -i libjasper-dev_1.900.1-debian1-2.4+deb8u6_armhf.deb
sudo apt-get install libhdf5-dev
sudo apt-get install libatlas-base-dev
sudo apt-get install libqtgui4
sudo apt-get install libqt4-test
pip3 install opencv_python-4.0.1.24-cp37-cp37m-linux_armv7l.whl
No.3 Install RKNN Toolkit Lite
# Install RKNN Toolkit Lite using the following command
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/packages
pip3 install rknn_toolkit_lite-1.7.0.dev_0cfb22-cp37-cp37m-linux_armv7l.whl
No.4 Run the example (you need to switch to the root user to execute
it)
cd rknn-toolkit-lite/rknn-toolkit-lite-v1.7.0.dev_0cfb22/examples-lite/inference_with_lite
python3 test.py
# The output is as follows:
root@firefly:/home/firefly/rknn-toolkit-lite-v1.7.0.dev_0cfb22/examples-lite/inference_with_lite# python3 test.py
--> list devices:
*************************
None devices connected.
*************************
done
--> query support target platform
**************************************************
Target platforms filled in RKNN model: ['RV1109']
Target platforms supported by this RKNN model: ['RK1109', 'RK1126', 'RV1109', 'RV1126']
**************************************************
done
--> Load RKNN model
done
--> Init runtime environment
done
--> get sdk version:
==============================================
RKNN VERSION:
API: librknn_runtime version 1.6.0 (6523e57 build: 2021-01-15 15:56:31 base: 1126)
DRV: 6.4.3.5.293908
==============================================
done
--> Running model
resnet18
-----TOP 5-----
[812]: 0.9993900656700134
[404]: 0.0004593880439642817
[657 833]: 2.9284517950145528e-05
[657 833]: 2.9284517950145528e-05
[895]: 1.850890475907363e-05
done
Development Documentation
Once you have installed "RKNN-Toolkit" and gained a preliminary
understanding and verification of the development process through the
demos, you can consult the detailed RKNN development APIs to complete
your own development.
RKNN-Toolkit documentation: sdk/external/rknn-toolkit/doc
C/C++ API documentation: sdk/external/rknpu/rknn/doc