YOLO and MediaPipe Image Object Detection Testing on UNIHIKER

Python, IoT, SBC, UNIHIKER | 10/09/2024

Introduction

In the previous three tutorials, How to Run YOLOv8 Object Detection Model on UNIHIKER Single Board Computer, How to Run YOLOv10 on UNIHIKER: A Step-by-Step Guide for Efficient Object Detection, and How to Install and Run Mediapipe on UNIHIKER, we covered the basics of running object detection models with the official code and ran initial tests and comparisons across several models. This article goes further: it compares the object detection models in more detail, looking at factors such as input size and model configuration, and closes with practical recommendations to help you choose the most suitable model for your image detection tasks.

Test Results of the YOLO and MediaPipe Models

Dark green indicates the fastest model with the best accuracy at each resolution.

Light green indicates the fastest model with the second-best accuracy at each resolution.

Based on these statistics, you can choose a model for image object detection by weighing the image resolution, the required speed, and the required accuracy.

The accuracy criteria used in this test are as follows (and similarly in the following sections):

Summary of Test Results

With the official Ultralytics library, the YOLO series models can be exported to ONNX format, but that export supports only half-precision (float16) quantization, not INT8. Because the UNIHIKER has no speed optimization for half-precision arithmetic, a float16 model runs at the same speed as a float32 one. For that reason, the YOLO series models in this section's comparison are not quantized.
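As a rough illustration of the export step described above, the sketch below calls the Ultralytics `YOLO.export` method with `format="onnx"`. The weights file `yolov8n.pt`, the input size of 320, and the helper names `export_args` and `export_yolo_to_onnx` are assumptions chosen for illustration, not necessarily the settings used in these tests.

```python
def export_args(imgsz: int, half: bool = False) -> dict:
    """Build keyword arguments for an Ultralytics ONNX export (hypothetical helper).

    half=False keeps float32 weights, matching the unquantized YOLO models
    compared in this article.
    """
    return {"format": "onnx", "imgsz": imgsz, "half": half}


def export_yolo_to_onnx(weights: str = "yolov8n.pt", imgsz: int = 320):
    """Export a YOLO model to ONNX at a fixed input size (illustrative values)."""
    # Imported lazily so export_args() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # load the pretrained weights
    return model.export(**export_args(imgsz))  # write the .onnx file


# Example call (requires the ultralytics package):
# export_yolo_to_onnx("yolov8n.pt", imgsz=320)
```

Exporting one file per input size makes it possible to compare the same model at several resolutions, which is the kind of comparison summarized in this section.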

The previous three tutorials already explained how to run image object detection with the official code, so we will not repeat that here; this article focuses only on the statistical analysis. Below is a summary of the test results:

The tests showed the following characteristics:

As the image resolution decreases, the detection time of the YOLO series drops significantly, while that of the MediaPipe models changes little. The YOLO series therefore has a significant speed advantage on low-resolution images;
After INT8 quantization, the two det models become significantly faster, with almost no loss of accuracy;
At larger resolutions, the MediaPipe models are significantly faster than the YOLO models but slightly less accurate, so choosing between them is a trade-off between accuracy and speed.
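Detection times like those compared above can be measured with a small timing helper. The following is a minimal sketch, not the script used for these tests: `average_latency_ms` is a hypothetical name, and `detect` stands for any callable that runs one detection on an image, such as a wrapper around a YOLO or MediaPipe inference call.

```python
import time
from statistics import mean


def average_latency_ms(detect, image, runs: int = 10, warmup: int = 2) -> float:
    """Return the mean time in milliseconds of detect(image) over `runs` calls.

    `detect` is a placeholder for whatever inference function is being tested.
    Warm-up calls are executed first and excluded from the average, because
    the first inferences often include one-time setup cost.
    """
    for _ in range(warmup):
        detect(image)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        detect(image)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)
```

Running the helper once per model and input size, with the same test image, gives numbers that can be compared directly across resolutions.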

Model selection recommendations

If the object is close or large, its features can be extracted easily from low-resolution images. In this case, a lower-resolution but faster model, such as the yolov10n series, is recommended;
If the object is far away or small, a higher resolution is needed to extract enough features, so use high-resolution images. If speed matters more, use ssd_mobilenet_v2; if accuracy matters more, choose yolov10n. You can refer to the following process to select a model.
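The two recommendations above can also be written as a small decision function. This is only a sketch of the rules as stated in the text; the function name `recommend_model`, its parameters, and the "low"/"high" resolution labels are hypothetical and stand in for whatever concrete input sizes you test with.

```python
def recommend_model(object_is_small_or_far: bool, prefer_speed: bool) -> tuple:
    """Return (resolution, model) following the recommendations in this article.

    The resolution is only a label ("low" or "high"); the concrete input size
    should come from your own measurements.
    """
    if not object_is_small_or_far:
        # Close or large objects: low-resolution input is enough, so pick
        # the faster low-resolution option.
        return ("low", "yolov10n")
    if prefer_speed:
        # Small or distant objects, speed matters more.
        return ("high", "ssd_mobilenet_v2")
    # Small or distant objects, accuracy matters more.
    return ("high", "yolov10n")


print(recommend_model(object_is_small_or_far=True, prefer_speed=True))
```

Note that when the object is close or large, the speed preference does not change the result, because the low-resolution model is already the fast choice.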

If you need any help or want to join more discussions, feel free to join our Discord: https://discord.gg/PVAWBMPwsk