How to use TensorFlow Lite Micro on UNIHIKER K10 (ESP32 S3) with Arduino IDE

News | Tags: DFRobot, AI, Python, UNIHIKER, STEM | 22/01/2025

Introduction

As IoT and embedded systems become more widespread, running machine learning inference efficiently on resource-constrained hardware has become a key challenge. The UNIHIKER K10 and TensorFlow Lite Micro (TFLM) together offer a lightweight platform for TinyML development on embedded devices. The same approach also applies to similar hardware platforms, such as other ESP32-S3 AI boards.

This article walks through the entire TinyML development process, from model training to deployment, using the DFRobot UNIHIKER K10, the Arduino IDE, and a TensorFlow Lite Micro Arduino library. It aims to help developers get started quickly with machine learning on resource-constrained devices. Code: https://github.com/polamaxu/TFLM

UNIHIKER K10

The UNIHIKER K10 is a board designed specifically for programming education, IoT, and AI projects in information technology courses.

Core:

MCU: ESP32-S3, a 32-bit dual-core processor with a clock speed of 240 MHz.

Wireless Communication:

Wi-Fi: Supports 2.4 GHz Wi-Fi.
Bluetooth: Supports Bluetooth 5.0.

Display Module:

Screen: 2.8-inch LCD.

Onboard Components:

Camera: Integrated camera for image capture and detection, suitable for AI applications like facial recognition and object detection.
Microphone: Equipped with two microphones to capture sound signals, supporting voice recognition and voice interaction functions.
Speaker: Built-in speaker for audio playback, used for voice prompts and music playback.
Sensors: Includes a digital ambient light sensor, temperature and humidity sensor, and an accelerometer, capable of detecting environmental and motion parameters to provide data for IoT applications.
RGB Lights: Three RGB lights capable of displaying multiple colors.
Buttons: Two buttons for input and control operations.

TensorFlow vs TensorFlow Lite Micro

TensorFlow is an open-source machine learning framework that provides a rich set of APIs, enabling developers to easily build, train, and deploy machine learning models.

TensorFlow Lite Micro (TFLM), also known as TensorFlow Lite for Microcontrollers, is a lightweight inference runtime from the TensorFlow project, designed specifically for resource-constrained devices such as microcontrollers. It lets developers deploy trained models on these devices so that they can process data and run inference locally, without relying on cloud computing.

TensorFlow: Suitable for complex model training and development in desktop or server environments, supporting highly customizable and intricate operations.
TensorFlow Lite for Microcontrollers (TFLM): Designed for embedded devices, it optimizes storage and computational efficiency, making it ideal for running machine learning models quickly in resource-constrained environments.

Optimizing Model Performance on Micro Devices (source: https://github.com/johnosbb/MicroTFLite)

TinyML Development Process (Software: Arduino IDE, Hardware: K10)

Data Collection:

Create an Arduino sketch to collect data from sensors or other input devices, forming the dataset required for model training.

Define and Train Model:

In the TensorFlow development environment (e.g., Google Colab), define a deep neural network (DNN) model and train it using the collected data.

Model Conversion and Saving:

Convert the trained model to TensorFlow Lite format and save it as a model.h file, which contains the model's binary representation (a FlatBuffer) as a C array.

Deploy:

Prepare the inference code in Arduino IDE, including the following steps:

- Import necessary header files (e.g., TensorFlow Lite Micro and the model file model.h).

- Define the TensorArena (memory buffer).

- Initialize the model.

- Set up input data and run inference.

- Read the inference output.

Testing and Optimization:

Use the serial debug tool to check the inference results and optimize model performance as needed.

MicroTFLite

MicroTFLite is a TensorFlow Lite Micro library designed for Arduino, aimed at simplifying the process of using TensorFlow Lite Micro on Arduino boards. The MicroTFLite library is suitable for a range of tasks such as classification, regression, and prediction.

Features:

Arduino-style API: MicroTFLite provides a typical Arduino-style API, avoiding the use of pointers or other C++ syntax constructs in Arduino code, making it more user-friendly for Arduino developers.
Support for Quantized and Floating Point Data: The library can handle both quantized data and raw floating-point values, automatically detecting the appropriate processing method based on the model's metadata.
Debugging Functions: It offers several functions to help developers understand the model deployment process and assist with debugging model issues.

Target Platforms:

MicroTFLite is compatible with a variety of embedded devices, including:

Arduino Nano series (e.g., Nano 33 BLE, Nano ESP32)
Arduino Nicla, Portenta series
ESP32 and Arduino Giga R1 WiFi

Supported Operators in MicroTFLite:

MicroTFLite supports a range of common machine learning operators:

Convolution and Pooling:

- 2D Convolution (CONV_2D), Depthwise Convolution (DEPTHWISE_CONV_2D)

- Max Pooling (MAX_POOL_2D), Average Pooling (AVERAGE_POOL_2D)

Fully Connected Layer: FULLY_CONNECTED

Activation Functions:

- ReLU, ReLU6, TANH, SIGMOID

MicroTFLite API Overview:

Model Initialization and Deployment:

Initialize Model: Use the ModelInit() function to initialize the TensorFlow Lite model and interpreter. This includes loading model data and allocating necessary memory space (e.g., tensorArena).
Load Model File: Supports loading pre-trained TensorFlow Lite models from binary files (such as model.h), enabling them to run on Arduino devices.

Input and Output Processing:

Set Input Data: Use the ModelSetInput() function to set input data into the model's input tensor. This function also supports quantization, automatically adjusting input values based on the model's quantization parameters.
Get Output Results: The ModelGetOutput() function allows you to read the model's output after inference, enabling developers to retrieve predictions or classification results.

Inference and Debugging:

Run Inference: The ModelRunInference() function starts the model's inference process. It performs the forward pass and generates output.
Print Tensor Information: Several functions are available to print tensor information, such as ModelPrintInputTensorDimensions() and ModelPrintOutputTensorDimensions(), which help developers understand the dimensions of input and output tensors.
Debugging and Debug Information: The ModelPrintTensorQuantizationParams() function prints the quantization parameters for input and output tensors, which is helpful for debugging quantized models. The ModelPrintMetadata() function displays the model's metadata, such as description and version, helping developers understand basic model information.
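
The quantization handling described for ModelSetInput() and ModelGetOutput() can be illustrated with the standard affine mapping used by int8 TensorFlow Lite models. The following is a minimal sketch of that arithmetic, not MicroTFLite's actual implementation; the scale and zero-point values are hypothetical and would normally be read from the model's tensor metadata.

```python
def quantize(value, scale, zero_point):
    """Map a float to int8 using q = round(value / scale) + zero_point."""
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range


def dequantize(q, scale, zero_point):
    """Map an int8 value back to a float: value = (q - zero_point) * scale."""
    return (q - zero_point) * scale


# Hypothetical quantization parameters, chosen only for illustration.
INPUT_SCALE = 0.05
INPUT_ZERO_POINT = -3

normalized_temperature = 1.5  # an input that is already z-score normalized
q_in = quantize(normalized_temperature, INPUT_SCALE, INPUT_ZERO_POINT)
restored = dequantize(q_in, INPUT_SCALE, INPUT_ZERO_POINT)
print(q_in, restored)
```

Values outside the representable range are clamped, which is one reason inputs should be normalized the same way at inference time as they were during training.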

Install MicroTFLite

1. In Arduino IDE 1.8.19, install the K10 board

Preferences -> Additional Boards Manager URLs:

https://downloadcd.dfrobot.com.cn/UNIHIKER/package_unihiker_index.json

Compiler warnings: None

Tools -> Boards Manager

Search for "esp32" and install the package.

2. In the Library Manager, search for "MicroTFLite" and install it.

1. Example Test: Real-time Industrial Equipment Fault Prediction Using Sensor Data

The ArduinoLite_preventive_maintenance project uses a feedforward neural network architecture consisting of an input layer, two dropout layers, one hidden layer, and an output layer to handle a binary classification task. It employs binary_crossentropy as the loss function and the Adam optimizer for training, with support for class weight adjustment and early stopping callbacks to ensure effective training even in cases of class imbalance. The example uses sensor data, such as speed, temperature, vibration, and current, simulated by random numbers. To reduce the model size, the model is quantized to int8.
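
To make the effect of class weighting on the binary cross-entropy loss concrete, here is a small standard-library sketch. It is not the project's training code: the "balanced" weighting heuristic (n_samples / (n_classes * class_count)) is an assumption, since the article does not say how the class weights were chosen, and the labels and probabilities are made up.

```python
import math


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n_classes = len(counts)
    return {c: len(labels) / (n_classes * n) for c, n in counts.items()}


def weighted_binary_crossentropy(labels, probs, weights, eps=1e-7):
    """Mean of w[y] * -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(labels)


# Made-up data: 8 healthy samples (0) and 2 failures (1).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.4, 0.7]

weights = balanced_class_weights(labels)
weighted = weighted_binary_crossentropy(labels, probs, weights)
unweighted = weighted_binary_crossentropy(labels, probs, {0: 1.0, 1: 1.0})
print(weights, weighted, unweighted)
```

With these made-up numbers the rare failure class receives a larger weight, so mistakes on failures contribute more to the loss than mistakes on healthy samples.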

The main functionalities of this program are:

Simulating the collection of various equipment parameters (speed, temperature, vibration, current).
Using these parameters for fault prediction.
Making actual fault determinations based on predefined thresholds.
Comparing predicted results with actual faults and calculating prediction accuracy.
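
The comparison and accuracy steps listed above can be sketched independently of the hardware. This is an illustrative Python version of the bookkeeping, with a hypothetical PredictionStats class and made-up scores; the recall metric is an addition that the Arduino sketch does not compute.

```python
class PredictionStats:
    """Running confusion-matrix counters for a binary classifier."""

    def __init__(self):
        self.tp = self.fp = self.tn = self.fn = 0

    def update(self, predicted, actual):
        """Record one prediction against the threshold-based ground truth."""
        if predicted and actual:
            self.tp += 1
        elif predicted and not actual:
            self.fp += 1
        elif not predicted and actual:
            self.fn += 1
        else:
            self.tn += 1

    @property
    def total(self):
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self):
        """Share of correct predictions, in percent."""
        return 100.0 * (self.tp + self.tn) / self.total if self.total else 0.0

    def recall(self):
        """Share of actual failures that the model predicted."""
        actual_failures = self.tp + self.fn
        return self.tp / actual_failures if actual_failures else 0.0


stats = PredictionStats()
# Made-up (model_output, actual_failure) pairs; 0.5 is the decision threshold.
for score, actual in [(0.9, True), (0.2, False), (0.7, False), (0.4, True)]:
    stats.update(score > 0.5, actual)
print(stats.accuracy(), stats.recall())
```

For a fault detector, tracking recall alongside accuracy can be useful because missed failures (false negatives) are usually the costlier error.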

Effects and Impact of Quantization:

Quantization in TensorFlow Lite for Microcontrollers (TFLM) is the process of converting model data from floating-point (float32) to integer (int8).

Reducing Model Size: Quantization converts floating-point numbers to integers, significantly reducing the model's storage requirements.
Improving Computational Efficiency: On resource-constrained devices, integer arithmetic is typically faster than floating-point arithmetic. Hardware with integer-oriented acceleration (such as the DSP extensions in ARM Cortex-M cores) can perform integer multiply and add operations more quickly, which improves the model's inference speed.
Accuracy Loss: Quantization introduces some accuracy loss because 8-bit integers cannot represent floating-point values exactly. However, with a suitable quantization strategy and model optimization, significant performance gains can be achieved while maintaining relatively high accuracy.
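
The size and accuracy trade-off can be checked with a few lines of arithmetic. The sketch below assumes a simple min/max affine quantization scheme and uses made-up weights; real converters choose parameters per tensor or per channel, so treat it as an illustration rather than the converter's algorithm.

```python
def quantization_params(min_val, max_val):
    """Derive scale and zero point so [min_val, max_val] spans int8 [-128, 127]."""
    scale = (max_val - min_val) / 255.0
    zero_point = round(-128 - min_val / scale)
    return scale, zero_point


def roundtrip(value, scale, zero_point):
    """Quantize a float to int8, then convert it back to a float."""
    q = max(-128, min(127, round(value / scale) + zero_point))
    return (q - zero_point) * scale


# Made-up weights standing in for one layer of a model.
weights = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.47, 0.93]
scale, zero_point = quantization_params(min(weights), max(weights))

max_error = max(abs(w - roundtrip(w, scale, zero_point)) for w in weights)
float32_bytes = len(weights) * 4  # 4 bytes per float32 value
int8_bytes = len(weights) * 1     # 1 byte per int8 value

print(f"scale={scale:.5f}, zero_point={zero_point}")
print(f"max round-trip error={max_error:.5f} (bound: scale/2={scale / 2:.5f})")
print(f"storage: {float32_bytes} bytes as float32 vs {int8_bytes} bytes as int8")
```

Weight storage shrinks by a factor of four, while the error on each value that falls inside the quantized range is bounded by half the scale.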

CODE

/* Copyright 2024 John O'Sullivan, TensorFlow Authors. All Rights Reserved.

This is an example program that runs TensorFlow Lite models using the MicroTFLite library.
It is mainly used for preventive maintenance of equipment by predicting potential failures from sensor data.

For more information, refer to the library documentation:
https://github.com/johnosbb/MicroTFLite

Licensed under the Apache License, Version 2.0 (the "License");
... License Information ...
==============================================================================*/

#include <MicroTFLite.h>
#include "model.h"

// Statistics of feature data (before scaling and balancing):
// RPM - Mean: 1603.866, Standard Deviation: 195.843
// Temperature (°C) - Mean: 24.354, Standard Deviation: 4.987
// Vibration (g) - Mean: 0.120, Standard Deviation: 0.020
// Current (A) - Mean: 3.494, Standard Deviation: 0.308
constexpr float tMean = 24.354f;      // Temperature mean value
constexpr float rpmMean = 1603.866f;  // RPM mean value
constexpr float vMean = 0.120f;       // Vibration mean value
constexpr float cMean = 3.494f;       // Current mean value
constexpr float tStd = 4.987f;        // Temperature standard deviation
constexpr float rpmStd = 195.843f;    // RPM standard deviation
constexpr float vStd = 0.020f;        // Vibration standard deviation
constexpr float cStd = 0.308f;        // Current standard deviation

// Thresholds for failure conditions
const float highTempThreshold = 30.0f;            // High temperature threshold (Celsius)
const float lowRpmThreshold = 1500.0f;            // Low RPM threshold
const float highVibrationThreshold = 0.60f;       // High vibration threshold (g)
const float abnormalCurrentLowThreshold = 0.2f;   // Low current threshold (Amps)
const float abnormalCurrentHighThreshold = 10.8f; // High current threshold (Amps)

// Prediction statistics counters
int totalPredictions = 0;    
int truePositives = 0;       
int falsePositives = 0;      
int trueNegatives = 0;       
int falseNegatives = 0;      
float rollingAccuracy = 0.0f; // Rolling accuracy calculation
bool showStatistics = false; // Whether to display statistics

// Allocate memory for TensorFlow Lite
constexpr int kTensorArenaSize = 4 * 1024;
alignas(16) uint8_t tensorArena[kTensorArenaSize];

void setup() {
    // Initialize serial communication and wait for the serial monitor to open
    Serial.begin(115200);
    while (!Serial);
    delay(5000);
    Serial.println("Preventive Maintenance Example.");
    Serial.println("Initializing TensorFlow Lite Micro Interpreter...");
    
    // Initialize TensorFlow Lite model
    if (!ModelInit(model, tensorArena, kTensorArenaSize)) {
        Serial.println("Model initialization failed!");
        while (true);
    }
    Serial.println("Model initialization done.");
    ModelPrintMetadata();
    ModelPrintTensorQuantizationParams();
    ModelPrintTensorInfo();
}

// Generate normally distributed random values using Box-Muller transform
float GenerateRandomValue(float mean, float stddev) {
    // Draw u1 from (0, 1] so that log(u1) is always finite
    float u1 = random(1, 10001) / 10000.0f;
    float u2 = random(0, 10000) / 10000.0f;
    // Box-Muller transform to generate normally distributed values
    float z0 = sqrt(-2.0f * log(u1)) * cos(2.0f * PI * u2);
    float value = mean + z0 * stddev;
    return value;
}

// Simulated function to read sensor data
float ReadRpm() {
    return GenerateRandomValue(rpmMean, rpmStd);
}

float ReadVibration() {
    return GenerateRandomValue(vMean, vStd);
}

float ReadTemperature() {
    return GenerateRandomValue(tMean, tStd);
}

float ReadCurrent() {
    return GenerateRandomValue(cMean, cStd);
}

// Check for failure conditions
bool CheckFailureConditions(float temperature, float rpm, float vibration, float current) {
    bool conditionMet = false;
    String failureReason = "";

    // Check if any parameter exceeds thresholds
    if (temperature > highTempThreshold) {
        conditionMet = true;
        failureReason += "High temperature; ";
    }
    if (rpm < lowRpmThreshold) {
        conditionMet = true;
        failureReason += "Low RPM; ";
    }
    if (vibration > highVibrationThreshold) {
        conditionMet = true;
        failureReason += "High vibration; ";
    }
    if (current < abnormalCurrentLowThreshold) {
        conditionMet = true;
        failureReason += "Low current; ";
    }
    if (current > abnormalCurrentHighThreshold) {
        conditionMet = true;
        failureReason += "High current; ";
    }

    if (conditionMet) {
        Serial.print("Warning: Sensor readings indicate potential failure. Cause: ");
        Serial.println(failureReason);
    }

    return conditionMet;
}

void loop() {
    // Read sensor data (simulated values)
    float rpm = ReadRpm();
    float temperature = ReadTemperature();
    float current = ReadCurrent();
    float vibration = ReadVibration();
    bool inputSetFailed = false;

    // Check actual failure conditions
    bool actualFailure = CheckFailureConditions(temperature, rpm, vibration, current);

    // Data normalization
    float temperatureN = (temperature - tMean) / tStd;
    float rpmN = (rpm - rpmMean) / rpmStd;
    float currentN = (current - cMean) / cStd;
    float vibrationN = (vibration - vMean) / vStd;

    // Input data into the model
    if (!ModelSetInput(temperatureN, 0))
        inputSetFailed = true;
    if (!ModelSetInput(rpmN, 1))
        inputSetFailed = true;
    if (!ModelSetInput(currentN, 2))
        inputSetFailed = true;
    if (!ModelSetInput(vibrationN, 3))
        inputSetFailed = true;

    // Skip inference if any input could not be set
    if (inputSetFailed) {
        Serial.println("Failed to set model input!");
        return;
    }

    // Run model inference
    if (!ModelRunInference()) {
        Serial.println("Model inference failed!");
        return;
    }

    // Get model output results
    float prediction = ModelGetOutput(0);
    bool predictedFailure = (prediction > 0.50f); // Predicted failure if value > 0.5

    // Update prediction statistics
    if (predictedFailure && actualFailure) {
        truePositives++;
    } else if (predictedFailure && !actualFailure) {
        falsePositives++;
        showStatistics = true;
    } else if (!predictedFailure && actualFailure) {
        falseNegatives++;
        showStatistics = true;
    } else if (!predictedFailure && !actualFailure) {
        trueNegatives++;
    }

    totalPredictions++;

    // Calculate rolling accuracy
    rollingAccuracy = (float)(truePositives + trueNegatives) / totalPredictions * 100.0f;

    // Display statistics when false positives or false negatives occur
    if (showStatistics) {
        Serial.println("-----------------------------");
        Serial.print("Prediction Confidence: ");
        Serial.println(prediction);
        Serial.print("Predicted failure: ");
        Serial.println(predictedFailure);
        Serial.print("Actual failure: ");
        Serial.println(actualFailure);
        Serial.print("RPM: ");
        Serial.print(rpm);
        Serial.print(" | Temperature: ");
        Serial.print(temperature);
        Serial.print(" °C");
        Serial.print(" | Current: ");
        Serial.print(current);
        Serial.print(" A");
        Serial.print(" | Vibration: ");
        Serial.print(vibration);
        Serial.print(" m/s^2\n");
        Serial.println("Total predictions: " + String(totalPredictions) + 
                      ", True Positives: " + String(truePositives) + 
                      ", False Positives: " + String(falsePositives) + 
                      ", True Negatives: " + String(trueNegatives) + 
                      ", False Negatives: " + String(falseNegatives) + 
                      ", Accuracy (%): " + String(rollingAccuracy));

        showStatistics = false;
    }

    delay(10000); // Wait for 10 seconds before the next check
}

2. Connect the PC to the K10 and upload the program.

3. K10 compilation result: The program uses about 10% of the program storage space and 9% of the dynamic memory.

Sketch uses 539913 bytes (10%) of program storage space. Maximum is 5242880 bytes.

Global variables use 29740 bytes (9%) of dynamic memory, leaving 297940 bytes for local variables.
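
The percentages in the compiler report can be reproduced with simple arithmetic. In this sketch the RAM maximum is derived from the two figures in the report, the truncation to a whole percent is an assumption about how the IDE rounds, and the share attributed to the tensor arena assumes the 4 KB global array is counted among the global variables.

```python
def usage_percent(used_bytes, max_bytes):
    """Whole-number percentage, truncated (assumed to match the IDE report)."""
    return used_bytes * 100 // max_bytes


FLASH_USED, FLASH_MAX = 539913, 5242880
RAM_USED, RAM_FREE = 29740, 297940
RAM_MAX = RAM_USED + RAM_FREE  # total dynamic memory implied by the report

TENSOR_ARENA = 4 * 1024  # kTensorArenaSize from the sketch, in bytes

print(usage_percent(FLASH_USED, FLASH_MAX))       # program storage, percent
print(usage_percent(RAM_USED, RAM_MAX))           # dynamic memory, percent
print(round(100 * TENSOR_ARENA / RAM_USED, 1))    # arena share of globals, percent
```

If a larger model needs a bigger arena, the same calculation gives a quick estimate of how much dynamic memory would remain.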

4. Serial Monitor output:

RPM: 1493.15
Temperature: 32.01°C
Current: 3.52 A
Vibration: 0.09 m/s²

At 7:59:07.295, the serial output indicated that the sensor readings showed potential failure conditions due to high temperature. At 7:59:37.328, the serial output again indicated potential failure conditions, including high temperature and low RPM.
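
As a worked example, the reading shown in the Serial Monitor output can be run through the same normalization and threshold rules as the sketch. The Python below only mirrors the constants from the Arduino code; the function names are illustrative.

```python
# (mean, standard deviation) per feature, copied from the sketch's constants.
FEATURE_STATS = {
    "temperature": (24.354, 4.987),
    "rpm": (1603.866, 195.843),
    "current": (3.494, 0.308),
    "vibration": (0.120, 0.020),
}


def normalize(reading):
    """Z-score each feature: (value - mean) / standard deviation."""
    return {
        name: (reading[name] - mean) / std
        for name, (mean, std) in FEATURE_STATS.items()
    }


def failure_reasons(reading):
    """Apply the same threshold rules as CheckFailureConditions()."""
    reasons = []
    if reading["temperature"] > 30.0:
        reasons.append("High temperature")
    if reading["rpm"] < 1500.0:
        reasons.append("Low RPM")
    if reading["vibration"] > 0.60:
        reasons.append("High vibration")
    if reading["current"] < 0.2:
        reasons.append("Low current")
    if reading["current"] > 10.8:
        reasons.append("High current")
    return reasons


# The reading shown in the Serial Monitor output above.
reading = {"rpm": 1493.15, "temperature": 32.01, "current": 3.52, "vibration": 0.09}

z = normalize(reading)
print({name: round(value, 3) for name, value in z.items()})
print(failure_reasons(reading))
```

Both the temperature and RPM rules fire for this reading, which is consistent with the high-temperature and low-RPM warning described above.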

Train and Deploy the Model

1. Install Python on the PC

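Download and install Python 3.6 or higher from the official Python website (https://www.python.org). Check that the version you choose is supported by the TensorFlow release you plan to install.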

2. Install TensorFlow and NumPy

Open the terminal (Command Prompt on Windows, Terminal on macOS or Linux), then run the following commands to install the required libraries:

CODE

# TensorFlow 2.x 
pip install tensorflow  
# NumPy 
pip install numpy

3. Define a simple regression network

The input is a number between 1 and 5, and the output is its square.

Save the script shown below as test.py, then run it from the terminal:

CODE

python test.py

CODE

"""
Environment requirements:
- Python 3.6+
- TensorFlow 2.x
- NumPy

Function description:
This program demonstrates a simple model that computes the square of a number.
It covers four steps: defining the model, saving it, converting it to TFLite format,
and generating a C array.
"""

import tensorflow as tf
import numpy as np

# 1. Prepare training data
# Create a simple dataset: input numbers from 1 to 5, output their square values
x_train = np.array([[1], [2], [3], [4], [5]], dtype=np.float32)
y_train = np.square(x_train)  # Calculate the square values as target output

# 2. Define the model
# Use the Sequential API to create a simple model
# The Lambda layer directly implements the square operation without the need for trainable parameters
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,), name='input_layer'),
    tf.keras.layers.Lambda(lambda x: x ** 2, name='square_layer')
])

# 3. Compile the model
# Note: Since the Lambda layer directly computes the square, no actual training is needed
model.compile(optimizer='sgd', loss='mean_squared_error')

# 4. Test model performance
print("Test results:")
for x in x_train:
    # predict() expects a batch, so reshape each sample to shape (1, 1)
    y_pred = model.predict(x.reshape(1, 1), verbose=0)
    print(f"Input: {x[0]}, Predicted output: {y_pred[0][0]}")

# 5. Save the model
# Save the model in HDF5 format
model.save('./saved_model.h5')

# 6. Convert to TFLite format
# TFLite format is better suited for running on embedded devices
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the TFLite model
with open('./model.tflite', 'wb') as f:
    f.write(tflite_model)

# 7. Convert the TFLite model to a C array
def convert_tflite_to_c_array(tflite_model_path, c_file_path):
    """
    Convert a TFLite model file to a C array format
    
    Parameters:
        tflite_model_path: Path to the TFLite model file
        c_file_path: Output C header file path
    """
    with open(tflite_model_path, 'rb') as file:
        content = file.read()

    with open(c_file_path, 'w') as file:
        # Name the array "model": the Arduino sketch passes `model` to ModelInit()
        file.write('const unsigned char model[] = {')
        for i, val in enumerate(content):
            if i % 12 == 0:  # Insert a newline every 12 numbers to improve readability
                file.write('\n  ')
            file.write('0x{:02x}, '.format(val))
        file.write('\n};\n')
        file.write('const int model_len = {};\n'.format(len(content)))

if __name__ == '__main__':
    # Write model.h, the header file that the Arduino sketch includes
    convert_tflite_to_c_array('./model.tflite', './model.h')
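
Before uploading, it can be worth checking that a generated header still encodes exactly the bytes of the .tflite file. The following is a minimal sketch of such a round-trip check using only the standard library; the helper names are hypothetical and the payload is a stand-in for real model data.

```python
import re


def bytes_to_c_array(data, array_name):
    """Render bytes as a C array definition plus a length constant."""
    lines = [f"const unsigned char {array_name}[] = {{"]
    for start in range(0, len(data), 12):  # 12 bytes per line
        chunk = data[start:start + 12]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"const int {array_name}_len = {len(data)};")
    return "\n".join(lines) + "\n"


def c_array_to_bytes(header_text):
    """Recover the bytes from the hex literals in a generated header."""
    hex_values = re.findall(r"0x([0-9a-fA-F]{2})\b", header_text)
    return bytes(int(h, 16) for h in hex_values)


# Stand-in payload; in practice this would be the contents of model.tflite.
payload = bytes(range(40))
header = bytes_to_c_array(payload, "model")

# The header must decode back to the original bytes.
assert c_array_to_bytes(header) == payload
print(header.splitlines()[0])
```

The same c_array_to_bytes() helper could be pointed at an existing header file's text to compare it against the .tflite file it was generated from.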

4. Upload the C model file to the K10 and run it with the MicroTFLite library

Place the model file (model.h) and the .ino file in the same folder, then upload the program using the Arduino IDE software.

CODE

#include <MicroTFLite.h>
#include "model.h" // Import model data

// Define memory area for storing intermediate tensors
constexpr int kTensorArenaSize = 2 * 1024; // Adjust based on model size
alignas(16) uint8_t tensorArena[kTensorArenaSize];

void setup() {
  // Initialize serial communication
  Serial.begin(9600);

  // Initialize model
  if (!ModelInit(model, tensorArena, kTensorArenaSize)) {
    Serial.println("Model initialization failed!");
    while (true); // Stop execution if initialization fails
  }

  Serial.println("Model initialization done.");
  ModelPrintMetadata();
  ModelPrintTensorInfo();
  ModelPrintInputTensorDimensions(); // Print input tensor dimensions
  ModelPrintOutputTensorDimensions(); // Print output tensor dimensions
}

void loop() {
  float input_value = 3.0; // Test input value

  // Display the input value
  Serial.print("Input value: ");
  Serial.println(input_value);

  // Set input value to the model
  if (!ModelSetInput(input_value, 0)) { // Set the first input
    Serial.println("Failed to set input value!");
    return;
  }

  // Run inference
  if (!ModelRunInference()) {
    Serial.println("Inference failed!");
    return;
  }

  // Get the model's output
  float prediction = ModelGetOutput(0);

  // Print the output result (predicted square value)
  Serial.print("Predicted output: ");
  Serial.println(prediction);

  delay(2000); // Run every 2 seconds
}

5. K10 compilation result: The program uses about 10% of the program storage space and 8% of the dynamic memory.

Sketch uses 530373 bytes (10%) of program storage space. Maximum is 5242880 bytes.

Global variables use 28492 bytes (8%) of dynamic memory, leaving 299188 bytes for local variables. Maximum is 327680 bytes.

6. Serial Monitor output:

15:42:46.823 -> Output Tensor Information:

15:42:46.823 -> Type: float32

15:42:46.823 -> Dimensions: 1 x 1

15:42:46.823 -> Input tensor dimensions: 2

15:42:46.823 -> Dimension 0: 1

15:42:46.823 -> Dimension input value: 3.00

15:42:46.823 -> Predicted output: 9.00

References

1. https://github.com/tensorflow/tflite-micro

2. https://github.com/johnosbb/MicroTFLite

3. https://github.com/polamaxu/TFLM

If you need any help or want to join more discussions, feel free to join our Discord: https://discord.gg/PVAWBMPwsk