NPU

The RK3568 has an NPU (Neural Processing Unit), a neural network
acceleration engine with processing performance of up to 1 TOPS. To
use the NPU, download the RKNN SDK, which provides programming
interfaces for RK3566/RK3568 chip platforms with an NPU. The SDK helps
users deploy RKNN models exported by RKNN-Toolkit2 and speeds up the
implementation of AI applications.

The SDK package contains:

   └── RK_NPU_SDK_1.3.0
       ├── rknn-toolkit2-1.3.0        // toolkit2, for X86_64 PC
       │   ├── ...
       │   └── rknn_toolkit_lite2     // lite version of toolkit2, for arm64 platform
       ├── rknpu2_1.3.0.tar.gz        // this is the RKNN SDK, remember to extract it
       └── Rockchip_Quick_Start_RKNN_SDK_V1.3.0_CN.pdf

RKNN Model

RKNN is the model type used by the Rockchip NPU platform. An RKNN
model is a file with the suffix ".rknn". The RKNN SDK provides a
complete Python model conversion tool that lets users convert their
own models into RKNN models.

The RKNN model can run directly on the RK3568 platform. Demos are
provided under "rknpu2_1.3.0/examples". Refer to the "README.md" there
to compile the Android or Linux demo (a cross-compile environment is
required). You can also download the precompiled demo.

First, prepare the runtime environment on the AIO-3568J:

Android

   adb root && adb remount
   adb push rknpu2_1.3.0/runtime/RK356X/Android/librknn_api/arm64-v8a/* /vendor/lib64
   adb push rknpu2_1.3.0/runtime/RK356X/Android/librknn_api/arm64-v8a/* /vendor/lib

Linux

   adb push rknpu2_1.3.0/runtime/RK356X/Linux/librknn_api/aarch64/* /usr/lib

Push the demo to the AIO-3568J and run it:

   :/ # cd /data/rknn_ssd_demo_Android/    (use rknn_ssd_demo_Linux in Linux System)
   :/data/rknn_ssd_demo_Android # chmod 777 rknn_ssd_demo
   :/data/rknn_ssd_demo_Android # export LD_LIBRARY_PATH=./lib
   :/data/rknn_ssd_demo_Android # ./rknn_ssd_demo model/RK356X/ssd_inception_v2.rknn model/road.bmp (In linux it's bus.jpg)
   Loading model ...
   rknn_init ...
   model input num: 1, output num: 2
   input tensors:
     index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3], n_elems=270000, size=270000, fmt=NHWC, type=UINT8, qnt_type=AFFINE, zp=0, scale=0.007812
   output tensors:
     index=0, name=concat:0, n_dims=4, dims=[1, 1917, 1, 4], n_elems=7668, size=30672, fmt=NHWC, type=FP32, qnt_type=AFFINE, zp=53, scale=0.089455
     index=1, name=concat_1:0, n_dims=4, dims=[1, 1917, 91, 1], n_elems=174447, size=697788, fmt=NHWC, type=FP32, qnt_type=AFFINE, zp=53, scale=0.143593
   rknn_run
   loadLabelName
   ssd - loadLabelName ./model/coco_labels_list.txt
   loadBoxPriors
   person @ (13 125 59 212) 0.984696
   person @ (110 119 152 197) 0.969119
   bicycle @ (171 165 278 234) 0.969119
   person @ (206 113 256 216) 0.964519
   car @ (146 133 216 170) 0.959264
   person @ (49 133 58 156) 0.606060
   person @ (83 134 92 158) 0.606060
   person @ (96 135 106 162) 0.464163
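
The tensor attributes printed by the demo can be cross-checked with a little arithmetic: "n_elems" is the product of "dims", and "size" is "n_elems" times the element width. The sketch below is only an illustration; the helper names and the byte-width table are assumptions made for this example, and the dequantization formula is the usual affine convention rather than something confirmed by the SDK log.

```python
from math import prod  # Python 3.8+

# Assumed element widths for the type names that appear in the log.
BYTES_PER_ELEMENT = {"UINT8": 1, "INT8": 1, "FP16": 2, "FP32": 4}


def tensor_size(dims, dtype):
    """Return (n_elems, size_in_bytes) for a tensor shape and type name."""
    n_elems = prod(dims)
    return n_elems, n_elems * BYTES_PER_ELEMENT[dtype]


def dequantize(q, zp, scale):
    """Usual affine dequantization: real = (q - zp) * scale."""
    return (q - zp) * scale


print(tensor_size([1, 300, 300, 3], "UINT8"))   # (270000, 270000)
print(tensor_size([1, 1917, 1, 4], "FP32"))     # (7668, 30672)
print(tensor_size([1, 1917, 91, 1], "FP32"))    # (174447, 697788)
print(dequantize(128, 0, 0.007812))             # input scale is roughly 1/128
```

The three results match the "n_elems" and "size" fields of the input and output tensors in the log.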

RKNN-Toolkit-lite2

The demo above deploys the model with a C/C++ program, which requires
developers to be familiar with the RKNN API in order to customize it.

Toolkit-lite2 provides a Python interface that simplifies deploying
and running the model. It makes it easy for developers to get started
quickly and is highly recommended.

Toolkit-lite2 can only be used to deploy models on the arm64 platform,
NOT to convert models. For non-RKNN models, use the PC tools described
later.
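
A minimal sketch of what a Toolkit-lite2 deployment script might look like is shown below. It assumes the `RKNNLite` class from `rknnlite.api` with its `load_rknn`, `init_runtime`, `inference` and `release` calls, as used by the bundled demo; `run_on_board` and `top_k` are hypothetical helper names, and the model path and input array are left to the caller.

```python
def top_k(scores, k=5):
    """Return the k highest (index, score) pairs, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def run_on_board(model_path, input_array):
    """Load an .rknn file and run one inference on the board's NPU."""
    # Imported lazily: the wheel is only installed on the arm64 board.
    from rknnlite.api import RKNNLite

    rknn_lite = RKNNLite()
    try:
        if rknn_lite.load_rknn(model_path) != 0:
            raise RuntimeError("load_rknn failed")
        if rknn_lite.init_runtime() != 0:
            raise RuntimeError("init_runtime failed")
        # input_array is expected to be an image as a numpy array.
        return rknn_lite.inference(inputs=[input_array])
    finally:
        rknn_lite.release()


# The ranking helper works on any flat list of class scores.
print(top_k([0.1, 0.7, 0.2], k=2))
```

On the board, the first element of the list returned by `run_on_board` would typically be flattened and passed to `top_k` to print a TOP 5 table.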

Environment Dependence

System dependency: Toolkit-lite2 runs on Debian 10/11 (aarch64).
Note: this tool can only be installed on RK3568 Debian; Ubuntu is not
supported yet.

Python version: 3.7/3.9

Python libraries:

   numpy
   ruamel.yaml
   psutil

Toolkit-lite2 installation

   # 1) Install Python 3.7/3.9 and pip3
   sudo apt-get install python3 python3-dev python3-pip gcc
   # 2) Install dependent libraries
   sudo apt-get install -y python3-opencv
   sudo apt-get install -y python3-numpy
   # Note: Toolkit-lite2 itself does not depend on opencv-python, but the demo does. Skip it if you don't need image processing.
   # 3) Install Toolkit-lite2
   # Debian 10 ARM64 with Python 3.7
   pip3 install rknn_toolkit_lite2-1.2.0-cp37-cp37m-linux_aarch64.whl
   # Debian 11 ARM64 with Python 3.9
   pip3 install rknn_toolkit_lite2-1.2.0-cp39-cp39-linux_aarch64.whl
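
Before running a demo it can help to confirm that the interpreter and libraries listed above are actually present. The following is a small sketch using only the standard library; `missing_modules` is a hypothetical helper, and the module list assumes the import names `numpy`, `ruamel.yaml` and `psutil`.

```python
import importlib.util
import sys

SUPPORTED_PYTHON = {(3, 7), (3, 9)}
REQUIRED_MODULES = ["numpy", "ruamel.yaml", "psutil"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError is raised when a parent package (e.g. "ruamel") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Python version supported:", sys.version_info[:2] in SUPPORTED_PYTHON)
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty "Missing modules" list means all three libraries can be imported by this interpreter.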

Running Demo

First prepare the runtime environment by placing the libraries on the
RK3568 platform, as described in the "RKNN Model" section above.

Copy the demo "inference_with_lite" from the
rknn-toolkit2-1.3.0/rknn_toolkit_lite2/examples directory to the
AIO-3568J, then run it:

   root@firefly:~# cd inference_with_lite/
   root@firefly:~/inference_with_lite# python3 test.py
   --> Load RKNN model
   done
   --> Init runtime environment
   I RKNN: [03:46:35.193] RKNN Driver Information: version: 0.4.2
   I RKNN: [03:46:35.193] RKNN Runtime Information: librknnrt version: 1.2.0 (9db21b35d@2022-01-14T15:16:23)
   I RKNN: [03:46:35.194] RKNN Model Information: version: 1, toolkit version: 1.2.0(compiler version: 1.1.2b17 (2d31041c6@2022-01-10T17:56:44)), target: RKNPU lite, target platform: rk3566, framework name: PyTorch, framework layout: NCHW
   done
   --> Running model
   resnet18
   -----TOP 5-----
   [812]: 0.9996383190155029
   [404]: 0.00028062614728696644
   [657]: 1.6321087969117798e-05
   [833 895]: 1.015903580992017e-05
   [833 895]: 1.015903580992017e-05

   done

Non-RKNN Model

Models in other formats, such as Caffe or TensorFlow, must be
converted before they can run on the RK3568 platform. Use
RKNN-Toolkit2 to convert them into RKNN models.

RKNN-Toolkit2

Introduction of Tool

RKNN-Toolkit2 is a development kit that provides users with model
conversion, inference and performance evaluation on PC and Rockchip
NPU platforms. Users can easily complete the following functions
through the Python interface provided by the tool:

Model conversion: converts Caffe / TensorFlow / TensorFlow Lite / ONNX
/ Darknet / PyTorch models to RKNN models, and supports RKNN model
import and export so that the models can later be used on Rockchip NPU
platforms.

Quantization: converts a float model to a quantized model. The
quantization methods are asymmetric quantization
(asymmetric_quantized-8, asymmetric_quantized-16) and hybrid
quantization. Note: asymmetric_quantized-16 is not supported yet.

Model inference: simulates the Rockchip NPU on the PC to run an RKNN
model and get the inference results. The tool can also send the RKNN
model to a specified NPU device, run it there and fetch the inference
results.

Performance evaluation: sends the RKNN model to a specified NPU
device, runs it and evaluates the model's performance on the actual
device.

Memory evaluation: evaluates the model's memory consumption at
runtime. The RKNN model must first be sent to the NPU device and run
there; the relevant interface is then called to obtain the memory
information.

Quantization error analysis: reports the Euclidean or cosine distance
between each layer's inference results before and after the model is
quantized. This helps to analyze where quantization error arises and
suggests ways to improve the accuracy of quantized models.
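
To make the quantization and error-analysis items more concrete, here is a self-contained sketch of a generic asymmetric 8-bit scheme together with the two distance metrics. It is not RKNN-Toolkit2's internal algorithm; the function names and the sample activations are made up for illustration.

```python
import math


def quant_params(x_min, x_max, bits=8):
    """Scale and zero point mapping [x_min, x_max] onto a signed integer range."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = round(q_min - x_min / scale)
    return scale, zero_point, q_min, q_max


def fake_quantize(values, bits=8):
    """Quantize then dequantize, giving the values a quantized layer would see."""
    scale, zp, q_min, q_max = quant_params(min(values), max(values), bits)
    out = []
    for v in values:
        q = min(q_max, max(q_min, round(v / scale) + zp))  # clamp to int range
        out.append((q - zp) * scale)
    return out


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


layer_output = [-1.0, -0.25, 0.0, 0.4, 1.5]  # made-up float activations
quantized = fake_quantize(layer_output)
print("euclidean distance:", euclidean_distance(layer_output, quantized))
print("cosine distance:   ", cosine_distance(layer_output, quantized))
```

Small distances indicate that a layer survives quantization well; a layer with a large distance is a candidate for hybrid quantization.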

Environment Dependence

System requirements: Ubuntu 18.04 (x64) or later. Note: the Toolkit
can only be installed on a PC; Windows, macOS and Debian are not
supported yet.

Python version: 3.6/3.8

Python dependencies:

   #Python3.6
   cat doc/requirements_cp36-1.3.0.txt
   numpy==1.16.6
   onnx==1.7.0
   onnxoptimizer==0.1.0
   onnxruntime==1.6.0
   tensorflow==1.14.0
   tensorboard==1.14.0
   protobuf==3.12.0
   torch==1.6.0
   torchvision==0.7.0
   psutil==5.6.2
   ruamel.yaml==0.15.81
   scipy==1.2.1
   tqdm==4.27.0
   requests==2.21.0
   opencv-python==4.4.0.46
   PuLP==2.4
   scikit_image==0.17.2
   # if install bfloat16 failed, please install numpy manually first. "pip install numpy==1.16.6"
   bfloat16==1.1
   flatbuffers==1.12

   #Python3.8
   cat doc/requirements_cp38-1.3.0.txt
   numpy==1.17.3
   onnx==1.7.0
   onnxoptimizer==0.1.0
   onnxruntime==1.6.0
   tensorflow==2.2.0
   tensorboard==2.2.2
   protobuf==3.12.0
   torch==1.6.0
   torchvision==0.7.0
   psutil==5.6.2
   ruamel.yaml==0.15.81
   scipy==1.4.1
   tqdm==4.27.0
   requests==2.21.0
   opencv-python==4.4.0.46
   PuLP==2.4
   scikit_image==0.17.2
   # if install bfloat16 failed, please install numpy manually first. "pip install numpy==1.17.3"
   bfloat16==1.1
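
Because the versions above are pinned exactly, a quick comparison against the active environment can catch mismatches early. The sketch below is an illustration only: `parse_pins` and `mismatches` are hypothetical helpers, and `importlib.metadata` is part of the standard library only from Python 3.8 on, so the second helper does not apply to a Python 3.6 environment.

```python
def parse_pins(text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def mismatches(pins):
    """Return {name: (wanted, installed_or_None)} for packages that differ."""
    from importlib import metadata  # Python 3.8+

    result = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            result[name] = (wanted, installed)
    return result


sample = """
numpy==1.17.3
# if install bfloat16 failed, please install numpy manually first. "pip install numpy==1.17.3"
bfloat16==1.1
"""
print(parse_pins(sample))
```

Passing the contents of a requirements file to `parse_pins` and the result to `mismatches` lists every package whose installed version differs from the pin.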

RKNN-Toolkit2 installation

It is recommended to use virtualenv to manage the python environment
because there may be multiple versions of the python environment in
the system at the same time. Let's start with Python 3.6 as an
example:

   # 1) Install virtualenv, Python 3.6 and pip3
   sudo apt install virtualenv
   sudo apt-get install python3 python3-dev python3-pip
   # 2) Install dependent libraries
   sudo apt-get install libxslt1-dev zlib1g zlib1g-dev libglib2.0-0 libsm6 \
   libgl1-mesa-glx libprotobuf-dev gcc
   # 3) Create a virtualenv and install the Python dependencies, e.g. requirements_cp36-1.3.0.txt
   virtualenv -p /usr/bin/python3 venv
   source venv/bin/activate
   pip3 install -r doc/requirements_cp36-*.txt
   # 4) Install RKNN-Toolkit2 inside the virtualenv (no sudo), e.g. rknn_toolkit2-1.3.0_11912b58-cp36-cp36m-linux_x86_64.whl
   pip3 install packages/rknn_toolkit2*cp36*.whl
   # 5) Check whether RKNN-Toolkit2 is installed successfully; press Ctrl+D to exit
   (venv) firefly@T-chip:~/rknn-toolkit2$ python3
   >>> from rknn.api import RKNN
   >>>

The installation succeeded if importing the RKNN module raises no
error. A typical failure looks like this:

   >>> from rknn.api import RKNN
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ImportError: No module named 'rknn'

Model Conversion Demo

The Toolkit demos are under "rknn-toolkit2-1.3.0/examples". As an
example, we run a model conversion demo that shows the whole process:
converting a TFLite model to RKNN, exporting the model, running
inference on the NPU and fetching the results. For the detailed
implementation of the model conversion, refer to the demo source code
and to the documents listed at the end of this page.
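
The conversion part of that flow can be sketched roughly as follows. The calls (`config`, `load_tflite`, `build`, `export_rknn`, `release`) follow the usage in the bundled example; `check` and `convert_tflite` are hypothetical helper names, the file paths are supplied by the caller, and the mean/std values are simply the ones used by the mobilenet_v1 demo.

```python
def check(ret, step):
    """Raise if an RKNN-Toolkit2 call reports a non-zero return code."""
    if ret != 0:
        raise RuntimeError(f"{step} failed with code {ret}")


def convert_tflite(tflite_path, dataset_path, rknn_path, platform="rk3568"):
    """Convert a TFLite model to a quantized RKNN model and save it."""
    # Imported lazily: only available where RKNN-Toolkit2 is installed.
    from rknn.api import RKNN

    rknn = RKNN()
    try:
        rknn.config(mean_values=[128, 128, 128],
                    std_values=[128, 128, 128],
                    target_platform=platform)
        check(rknn.load_tflite(model=tflite_path), "load_tflite")
        # dataset_path points to a text file listing calibration images.
        check(rknn.build(do_quantization=True, dataset=dataset_path), "build")
        check(rknn.export_rknn(rknn_path), "export_rknn")
    finally:
        rknn.release()
```

For the demo directory this might be called as `convert_tflite("mobilenet_v1_1.0_224.tflite", "dataset.txt", "mobilenet_v1.rknn")`, where the output file name is only an example.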

Simulate the running example on PC

RKNN-Toolkit2 has a built-in simulator. Running a demo without
specifying a target deploys the model on this NPU simulator:

   (venv) firefly@T-chip:~/rknn-toolkit2-1.3.0$ cd examples/tflite/mobilenet_v1
   (venv) firefly@T-chip:~/rknn-toolkit2-1.3.0/examples/tflite/mobilenet_v1$ ls
   dataset.txt  dog_224x224.jpg  mobilenet_v1_1.0_224.tflite  test.py
   (venv) firefly@T-chip:~/rknn-toolkit2-1.3.0/examples/tflite/mobilenet_v1$ python3 test.py
   W __init__: rknn-toolkit2 version: 1.3.0-11912b58
   --> Config model
   W config: 'target_platform' is None, use rk3566 as default, Please set according to the actual platform!
   done
   --> Loading model
   INFO: Initialized TensorFlow Lite runtime.
   done
   --> Building model
   Analysing : 100%|█████████████████████████████████████████████████| 58/58 [00:00<00:00, 1869.33it/s]
   Quantizating : 100%|████████████████████████████████████████████████| 58/58 [00:00<00:00, 68.07it/s]
   W build: The default input dtype of 'input' is changed from 'float32' to 'int8' in rknn model for performance!
                         Please take care of this change when deploy rknn model with Runtime API!
   done
   --> Export rknn model
   done
   --> Init runtime environment
   Analysing : 100%|█████████████████████████████████████████████████| 60/60 [00:00<00:00, 1434.93it/s]
   Preparing : 100%|██████████████████████████████████████████████████| 60/60 [00:00<00:00, 373.17it/s]
   W init_runtime: target is None, use simulator!
   done
   --> Running model
   mobilenet_v1
   -----TOP 5-----
   [156]: 0.9345703125
   [155]: 0.0570068359375
   [205]: 0.00429534912109375
   [284]: 0.003116607666015625
   [285]: 0.00017178058624267578

   done

Run on AIO-3568J NPU connected to the PC

RKNN-Toolkit2 runs on the PC and connects to the AIO-3568J through the
PC's USB port. It transfers the RKNN model to the NPU of the
AIO-3568J, runs it there, and then fetches the inference results,
performance information, etc. from the AIO-3568J.

On Android, ADB must first be enabled on the AIO-3568J as described in
the section "ADB USE"; on Linux, ADB is enabled by default. Once ADB
is enabled, the device shows up in the device list:

   (venv) firefly@T-chip:~$ adb devices 
   List of devices attached
   XXXXXXXX	device
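
When more than one board is attached, the serial number from this list is needed to pick the right one. The sketch below parses the `adb devices` output from Python; `parse_adb_devices` and `connected_devices` are hypothetical helpers, and passing the serial on as the `device_id` argument of `rknn.init_runtime` is an assumption about how you would use it.

```python
import subprocess


def parse_adb_devices(output):
    """Return the serial numbers of all entries in the 'device' state."""
    serials = []
    for line in output.splitlines():
        fields = line.split()
        # Header and daemon messages never have "device" as second field.
        if len(fields) >= 2 and fields[1] == "device":
            serials.append(fields[0])
    return serials


def connected_devices():
    """Run `adb devices` on the PC and parse its output (needs adb in PATH)."""
    result = subprocess.run(["adb", "devices"], capture_output=True,
                            text=True, check=True)
    return parse_adb_devices(result.stdout)


sample = "List of devices attached\nXXXXXXXX\tdevice\n"
print(parse_adb_devices(sample))  # ['XXXXXXXX']
```

Entries in other states, such as "offline" or "unauthorized", are ignored on purpose because the Toolkit cannot reach them.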

First prepare the AIO-3568J environment: update librknnrt.so and run
rknn_server.

Android

   adb root && adb remount
   adb push rknpu2_1.3.0/runtime/RK356X/Android/rknn_server/arm64-v8a/vendor/bin/rknn_server /vendor/bin
   adb push rknpu2_1.3.0/runtime/RK356X/Android/librknn_api/arm64-v8a/librknnrt.so /vendor/lib64
   adb push rknpu2_1.3.0/runtime/RK356X/Android/librknn_api/arm64-v8a/librknnrt.so /vendor/lib
   adb shell reboot

   # The Android system runs rknn_server automatically; check it with the command "ps -ef | grep rknn_server"

Linux

   adb push rknpu2_1.3.0/runtime/RK356X/Linux/librknn_api/aarch64/* /usr/lib

   # The system generally ships with rknn_server; use "systemctl status rknn_server" to check whether the service is running
   # If the service doesn't exist or is not running, push rknn_server manually
   adb push rknpu2_1.3.0/runtime/RK356X/Linux/rknn_server/aarch64/usr/bin/* /usr/bin/

   # Then start it from the serial terminal of the board
   chmod +x /usr/bin/rknn_server
   /usr/bin/rknn_server

Then modify the demo file
rknn-toolkit2-1.3.0/examples/tflite/mobilenet_v1/test.py on the PC and
add the target platform to it:

   diff --git a/rknn-toolkit2-1.3.0/examples/tflite/mobilenet_v1/test.py b/examples/tflite/mobilenet_v1/test.py
   index 0507edb..fd2e070 100755
   --- a/examples/tflite/mobilenet_v1/test.py
   +++ b/examples/tflite/mobilenet_v1/test.py
   @@ -24,11 +24,11 @@ def show_outputs(outputs):
    if __name__ == '__main__':
    
        # Create RKNN object
   -    rknn = RKNN(verbose=True)
   +    rknn = RKNN()
    
        # Pre-process config
        print('--> Config model')
   -    rknn.config(mean_values=[128, 128, 128], std_values=[128, 128, 128])
   +    rknn.config(mean_values=[128, 128, 128], std_values=[128, 128, 128], target_platform='rk3568')
        print('done')
    
        # Load model
   @@ -62,7 +62,7 @@ if __name__ == '__main__':
    
        # Init runtime environment
        print('--> Init runtime environment')
   -    ret = rknn.init_runtime()
   +    ret = rknn.init_runtime(target='rk3568')
        if ret != 0:
            print('Init runtime environment failed!')
            exit(ret)
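
The "mean_values" and "std_values" passed to rknn.config describe how input pixels are normalized. Assuming the usual convention of subtracting the mean and dividing by the standard deviation per channel, the values used in this demo map 8-bit pixels onto roughly [-1, 1). The helper below is a hypothetical illustration of that arithmetic, not a Toolkit function.

```python
def normalize(pixel, mean=128, std=128):
    """Apply (pixel - mean) / std to a single channel value."""
    return (pixel - mean) / std


for p in (0, 128, 255):
    print(p, "->", normalize(p))
# 0 -> -1.0
# 128 -> 0.0
# 255 -> 0.9921875
```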

Run test.py on the host PC:

   (venv) firefly@T-chip:~/rknn-toolkit2-1.3.0/examples/tflite/mobilenet_v1$ python3 test.py 
   W __init__: rknn-toolkit2 version: 1.3.0-11912b58
   --> Config model
   done
   --> Loading model
   INFO: Initialized TensorFlow Lite runtime.
   done
   --> Building model
   Analysing : 100%|█████████████████████████████████████████████████| 58/58 [00:00<00:00, 1730.77it/s]
   Quantizating : 100%|███████████████████████████████████████████████| 58/58 [00:00<00:00, 366.86it/s]
   W build: The default input dtype of 'input' is changed from 'float32' to 'int8' in rknn model for performance!
                         Please take care of this change when deploy rknn model with Runtime API!
   done
   --> Export rknn model
   done
   --> Init runtime environment
   I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.1.0 (b5861e7@2020-11-23T11:50:36)
   D RKNNAPI: ==============================================
   D RKNNAPI: RKNN VERSION:
   D RKNNAPI:   API: 1.3.0 (121b661 build: 2022-04-29 11:07:20)
   D RKNNAPI:   DRV: rknn_server: 1.3.0 (121b661 build: 2022-04-29 11:11:34)
   D RKNNAPI:   DRV: rknnrt: 1.3.0 (9b36d4d74@2022-05-04T20:16:47)
   D RKNNAPI: ==============================================
   done
   --> Running model
   mobilenet_v1
   -----TOP 5-----
   [156]: 0.93505859375
   [155]: 0.057037353515625
   [205]: 0.0038814544677734375
   [284]: 0.0031185150146484375
   [285]: 0.00017189979553222656

   done
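
The scores from the board differ slightly from those of the simulator run. A small comparison like the one below, using the TOP 5 values copied from the two logs on this page, shows whether the ranking is unchanged and how far the scores drift. The helper names are made up for this example.

```python
# TOP 5 scores copied from the simulator log and the board log.
simulator = {156: 0.9345703125, 155: 0.0570068359375,
             205: 0.00429534912109375, 284: 0.003116607666015625,
             285: 0.00017178058624267578}
board = {156: 0.93505859375, 155: 0.057037353515625,
         205: 0.0038814544677734375, 284: 0.0031185150146484375,
         285: 0.00017189979553222656}


def ranking(scores):
    """Class indices ordered from highest to lowest score."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [cls for cls, _ in ordered]


def max_abs_diff(a, b):
    """Largest absolute score difference over the classes in `a`."""
    return max(abs(a[k] - b[k]) for k in a)


print("same ranking:", ranking(simulator) == ranking(board))  # True
print("max abs diff:", max_abs_diff(simulator, board))        # 0.00048828125
```

Here both runs rank the classes identically and the largest difference is below 0.001.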

Other Toolkit Demo

Other Toolkit demos, such as the quantization and accuracy analysis
demos, can be found under "rknn-toolkit2-1.3.0/examples/". For the
detailed implementation, refer to the demo source code and the
detailed development documents.

Detailed Development Documents

For development, refer to the PDF files in the docs directories of the
RKNN SDK and the Toolkit SDK.