As IoT and embedded systems become more widespread, running machine learning inference efficiently on resource-constrained hardware has become a key challenge. The K10 AI hardware and TensorFlow Lite Micro (TFLM) offer a lightweight solution for TinyML development on embedded devices. This tutorial also applies to similar hardware platforms, such as ESP32-S3 AI devices.
This article walks through the entire TinyML development process, from model training to deployment, using the DFRobot UNIHIKER K10 hardware, the Arduino IDE, and the TensorFlow Lite Micro Arduino library. It aims to help developers get started quickly with machine learning on resource-constrained devices. Code: https://github.com/polamaxu/TFLM
The UNIHIKER K10 is a board designed specifically for programming education, IoT, and AI projects in information technology courses.
TensorFlow is an open-source machine learning framework that provides a rich set of APIs, enabling developers to easily build, train, and deploy machine learning models.
TensorFlow Lite Micro is a lightweight inference runtime from the TensorFlow project, designed specifically for resource-constrained devices like microcontrollers. It allows developers to deploy machine learning models on these devices, enabling them to perform local data processing and inference without relying on cloud computing.

Data Collection:
Define and Train Model:
Model Conversion and Saving: the converted model is stored in a model.h file, which contains the model's binary representation in FlatBuffer format.
Deploy:
- Import necessary header files (e.g., TensorFlow Lite Micro and the model file model.h).
- Define the TensorArena (memory buffer).
- Initialize the model.
- Set up input data and run inference.
- Read the inference output.
Testing and Optimization:
MicroTFLite is a TensorFlow Lite Micro library designed for Arduino, aimed at simplifying the process of using TensorFlow Lite Micro on Arduino boards. The MicroTFLite library is suitable for a range of tasks such as classification, regression, and prediction.
MicroTFLite is compatible with a variety of embedded devices, including:
MicroTFLite supports a range of common machine learning operators:
- 2D Convolution (CONV_2D), Depthwise Convolution (DEPTHWISE_CONV_2D)
- Max Pooling (MAX_POOL_2D), Average Pooling (AVERAGE_POOL_2D)
- ReLU, ReLU6, TANH, SIGMOID
The library provides the following functions:
- ModelInit() initializes the TensorFlow Lite model and interpreter. This includes loading model data and allocating the necessary memory space (e.g., tensorArena).
- Models are provided as header files (model.h), enabling them to run on Arduino devices.
- ModelSetInput() sets input data into the model's input tensor. It also supports quantization, automatically adjusting input values based on the model's quantization parameters.
- ModelRunInference() starts the model's inference process. It performs the forward pass and generates output.
- ModelGetOutput() reads the model's output after inference, so developers can retrieve predictions or classification results.
- ModelPrintInputTensorDimensions() and ModelPrintOutputTensorDimensions() help developers understand the dimensions of the input and output tensors.
- ModelPrintTensorQuantizationParams() prints the quantization parameters for input and output tensors, which is helpful for debugging quantized models.
- ModelPrintMetadata() displays the model's metadata, such as description and version, helping developers understand basic model information.
Preferences -> Additional Boards Manager URLs:
https://downloadcd.dfrobot.com.cn/UNIHIKER/package_unihiker_index.json
Compiler warnings: None

Tools -> Boards Manager

Search for esp32 and install it.


1. Example Test: Real-time Industrial Equipment Fault Prediction Using Sensor Data
The ArduinoLite_preventive_maintenance project uses a feedforward neural network architecture consisting of an input layer, two dropout layers, one hidden layer, and an output layer to handle a binary classification task. It employs binary_crossentropy as the loss function and the Adam optimizer for training, with support for class weight adjustment and early stopping callbacks to ensure effective training even in cases of class imbalance. The example uses sensor data, such as speed, temperature, vibration, and current, simulated by random numbers. To reduce the model size, the model is quantized to int8.
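The article does not say how the class weights are chosen. As a minimal sketch, one common heuristic (the "balanced" scheme also used by scikit-learn) weights each class inversely to its frequency. The function name `balanced_class_weights` and the 90/10 label split below are illustrative assumptions, not taken from the project.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count more in the loss.
    This is an assumed heuristic, not necessarily the one the project uses.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical imbalanced dataset: 90 healthy samples (0), 10 failures (1).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this shape is what Keras accepts through the `class_weight` argument of `model.fit`.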
The main functionalities of this program are:
Effects and Impact of Quantization:
Quantization in TensorFlow Lite for Microcontrollers (TFLM) is the process of converting model data from floating-point (float32) to integer (int8).
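To make the float32-to-int8 mapping concrete, here is a minimal sketch of affine quantization, the scheme in which a real value is represented as `(q - zero_point) * scale`. The `scale` and `zero_point` values are made up for illustration; on a real model they are the parameters that ModelPrintTensorQuantizationParams() reports.

```python
def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to int8: round(x / scale) + zero_point, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate float value represented by an int8 code."""
    return (q - zero_point) * scale

# Illustrative (assumed) quantization parameters.
scale, zero_point = 0.05, -10

q = quantize(1.0, scale, zero_point)
x_back = dequantize(q, scale, zero_point)
print(q, x_back)
```

Each int8 value occupies one byte instead of the four bytes of a float32, which is where most of the size reduction comes from. The price is a rounding error of up to half a `scale` step, plus clipping for values outside the representable range.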
/* Copyright 2024 John O'Sullivan, TensorFlow Authors. All Rights Reserved.
This is an example program that runs TensorFlow Lite models using the MicroTFLite library.
It is mainly used for preventive maintenance of equipment by predicting potential failures from sensor data.
For more information, refer to the library documentation:
https://github.com/johnosbb/MicroTFLite
Licensed under the Apache License, Version 2.0 (the "License");
... License Information ...
==============================================================================*/
#include <MicroTFLite.h>
#include "model.h"
// Statistics of feature data (before scaling and balancing):
// RPM - Mean: 1603.866, Standard Deviation: 195.843
// Temperature (°C) - Mean: 24.354, Standard Deviation: 4.987
// Vibration (g) - Mean: 0.120, Standard Deviation: 0.020
// Current (A) - Mean: 3.494, Standard Deviation: 0.308
constexpr float tMean = 24.354f; // Temperature mean value
constexpr float rpmMean = 1603.866f; // RPM mean value
constexpr float vMean = 0.120f; // Vibration mean value
constexpr float cMean = 3.494f; // Current mean value
constexpr float tStd = 4.987f; // Temperature standard deviation
constexpr float rpmStd = 195.843f; // RPM standard deviation
constexpr float vStd = 0.020f; // Vibration standard deviation
constexpr float cStd = 0.308f; // Current standard deviation
// Thresholds for failure conditions
const float highTempThreshold = 30.0f; // High temperature threshold (Celsius)
const float lowRpmThreshold = 1500.0f; // Low RPM threshold
const float highVibrationThreshold = 0.60f; // High vibration threshold (g)
const float abnormalCurrentLowThreshold = 0.2f; // Low current threshold (Amps)
const float abnormalCurrentHighThreshold = 10.8f; // High current threshold (Amps)
// Prediction statistics counters
int totalPredictions = 0;
int truePositives = 0;
int falsePositives = 0;
int trueNegatives = 0;
int falseNegatives = 0;
float rollingAccuracy = 0.0f; // Rolling accuracy calculation
bool showStatistics = false; // Whether to display statistics
// Allocate memory for TensorFlow Lite
constexpr int kTensorArenaSize = 4 * 1024;
alignas(16) uint8_t tensorArena[kTensorArenaSize];
void setup() {
    // Initialize serial communication and wait for the serial monitor to open
    Serial.begin(115200);
    while (!Serial);
    delay(5000);
    Serial.println("Preventive Maintenance Example.");
    Serial.println("Initializing TensorFlow Lite Micro Interpreter...");
    // Initialize TensorFlow Lite model
    if (!ModelInit(model, tensorArena, kTensorArenaSize)) {
        Serial.println("Model initialization failed!");
        while (true);
    }
    Serial.println("Model initialization done.");
    ModelPrintMetadata();
    ModelPrintTensorQuantizationParams();
    ModelPrintTensorInfo();
}
// Generate normally distributed random values using the Box-Muller transform
float GenerateRandomValue(float mean, float stddev) {
    // u1 must be in (0, 1]: log(0) is undefined, so draw from 1..10000
    float u1 = random(1, 10001) / 10000.0f;
    float u2 = random(0, 10000) / 10000.0f;
    // Box-Muller transform to generate normally distributed values
    float z0 = sqrt(-2.0f * log(u1)) * cos(2.0f * PI * u2);
    float value = mean + z0 * stddev;
    return value;
}
// Simulated functions to read sensor data
float ReadRpm() {
    return GenerateRandomValue(rpmMean, rpmStd);
}
float ReadVibration() {
    return GenerateRandomValue(vMean, vStd);
}
float ReadTemperature() {
    return GenerateRandomValue(tMean, tStd);
}
float ReadCurrent() {
    return GenerateRandomValue(cMean, cStd);
}
// Check for failure conditions
bool CheckFailureConditions(float temperature, float rpm, float vibration, float current) {
    bool conditionMet = false;
    String failureReason = "";
    // Check if any parameter exceeds thresholds
    if (temperature > highTempThreshold) {
        conditionMet = true;
        failureReason += "High temperature; ";
    }
    if (rpm < lowRpmThreshold) {
        conditionMet = true;
        failureReason += "Low RPM; ";
    }
    if (vibration > highVibrationThreshold) {
        conditionMet = true;
        failureReason += "High vibration; ";
    }
    if (current < abnormalCurrentLowThreshold) {
        conditionMet = true;
        failureReason += "Low current; ";
    }
    if (current > abnormalCurrentHighThreshold) {
        conditionMet = true;
        failureReason += "High current; ";
    }
    if (conditionMet) {
        Serial.print("Warning: Sensor readings indicate potential failure. Cause: ");
        Serial.println(failureReason);
    }
    return conditionMet;
}
void loop() {
    // Read sensor data (simulated values)
    float rpm = ReadRpm();
    float temperature = ReadTemperature();
    float current = ReadCurrent();
    float vibration = ReadVibration();
    bool inputSetFailed = false;
    // Check actual failure conditions
    bool actualFailure = CheckFailureConditions(temperature, rpm, vibration, current);
    // Data normalization (z-score, using the training-set statistics)
    float temperatureN = (temperature - tMean) / tStd;
    float rpmN = (rpm - rpmMean) / rpmStd;
    float currentN = (current - cMean) / cStd;
    float vibrationN = (vibration - vMean) / vStd;
    // Input data into the model
    if (!ModelSetInput(temperatureN, 0))
        inputSetFailed = true;
    if (!ModelSetInput(rpmN, 1))
        inputSetFailed = true;
    if (!ModelSetInput(currentN, 2))
        inputSetFailed = true;
    if (!ModelSetInput(vibrationN, 3))
        inputSetFailed = true;
    // Skip inference if any input could not be set
    if (inputSetFailed) {
        Serial.println("Failed to set model input!");
        return;
    }
    // Run model inference
    if (!ModelRunInference()) {
        Serial.println("Model inference failed!");
        return;
    }
    // Get model output results
    float prediction = ModelGetOutput(0);
    bool predictedFailure = (prediction > 0.50f); // Predicted failure if value > 0.5
    // Update prediction statistics
    if (predictedFailure && actualFailure) {
        truePositives++;
    } else if (predictedFailure && !actualFailure) {
        falsePositives++;
        showStatistics = true;
    } else if (!predictedFailure && actualFailure) {
        falseNegatives++;
        showStatistics = true;
    } else if (!predictedFailure && !actualFailure) {
        trueNegatives++;
    }
    totalPredictions++;
    // Calculate rolling accuracy
    rollingAccuracy = (float)(truePositives + trueNegatives) / totalPredictions * 100.0f;
    // Display statistics when false positives or false negatives occur
    if (showStatistics) {
        Serial.println("-----------------------------");
        Serial.print("Prediction Confidence: ");
        Serial.println(prediction);
        Serial.print("Predicted failure: ");
        Serial.println(predictedFailure);
        Serial.print("Actual failure: ");
        Serial.println(actualFailure);
        Serial.print("RPM: ");
        Serial.print(rpm);
        Serial.print(" | Temperature: ");
        Serial.print(temperature);
        Serial.print(" °C");
        Serial.print(" | Current: ");
        Serial.print(current);
        Serial.print(" A");
        Serial.print(" | Vibration: ");
        Serial.print(vibration);
        Serial.print(" g\n"); // Vibration statistics above are given in g
        Serial.println("Total predictions: " + String(totalPredictions) +
                       ", True Positives: " + String(truePositives) +
                       ", False Positives: " + String(falsePositives) +
                       ", True Negatives: " + String(trueNegatives) +
                       ", False Negatives: " + String(falseNegatives) +
                       ", Accuracy (%): " + String(rollingAccuracy));
        showStatistics = false;
    }
    delay(10000); // Wait for 10 seconds before the next check
}
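The sketch reports only a rolling accuracy. When failures are rare, accuracy alone can look good even if most failures are missed, so it may help to derive precision and recall from the same four counters. The following host-side sketch shows the arithmetic; the function name `summarize_counts` and the example counts are hypothetical, not output from the board.

```python
def summarize_counts(tp: int, fp: int, tn: int, fn: int):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # how many alarms were real
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # how many failures were caught
    return accuracy, precision, recall

# Hypothetical counts after 100 predictions.
accuracy, precision, recall = summarize_counts(tp=8, fp=2, tn=85, fn=5)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```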

2. Connect the PC to the K10 and upload the program:

3. K10 compilation result: The program uses about 10% of program storage space and 9% of dynamic memory.
Sketch uses 539913 bytes (10%) of program storage space. Maximum is 5242880 bytes.
Global variables use 29740 bytes (9%) of dynamic memory, leaving 297940 bytes for local variables.
4. Serial Monitor output:
At 7:59:07.295, the serial output indicated that the sensor readings showed potential failure conditions due to high temperature. At 7:59:37.328, the serial output again indicated potential failure conditions, including high temperature and low RPM.
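Because the sensor readings are simulated from normal distributions, the expected rate of these warnings can be estimated from the sketch's own constants. The following is a rough host-side calculation using the normal CDF; the helper name `normal_cdf` is introduced here for illustration.

```python
import math

def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

# Means, standard deviations, and thresholds copied from the sketch's constants.
p_high_temp = 1.0 - normal_cdf(30.0, 24.354, 4.987)  # temperature > 30 °C
p_low_rpm = normal_cdf(1500.0, 1603.866, 195.843)    # RPM < 1500

print(f"P(high temperature) ~ {p_high_temp:.3f}")
print(f"P(low RPM)          ~ {p_low_rpm:.3f}")
```

Under these assumptions, roughly 13% of simulated readings exceed the temperature threshold and roughly 30% fall below the RPM threshold, so warnings like the ones above are expected to appear regularly.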

Download and install Python 3.6 or higher.
Open the terminal (Command Prompt on Windows, Terminal on macOS or Linux), then run the following commands to install the required libraries:
# TensorFlow 2.x
pip install tensorflow
# NumPy
pip install numpy
The example model takes a number between 1 and 5 as input and outputs its square.
Run the test.py file in the terminal:
python test.py
"""
Environment requirements:
- Python 3.6+
- TensorFlow 2.x
- NumPy
Function description:
This program demonstrates a simple regression model for predicting the square of a number.
It includes four main steps: defining the model, saving it, converting it to TFLite format, and generating a C array.
"""
import tensorflow as tf
import numpy as np

# 1. Prepare training data
# Create a simple dataset: input numbers from 1 to 5, output their square values
x_train = np.array([[1], [2], [3], [4], [5]], dtype=np.float32)
y_train = np.square(x_train)  # Calculate the square values as target output

# 2. Define the model
# Use the Sequential API to create a simple model
# The Lambda layer directly implements the square operation without the need for trainable parameters
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,), name='input_layer'),
    tf.keras.layers.Lambda(lambda x: x ** 2, name='square_layer')
])

# 3. Compile the model
# Note: Since the Lambda layer directly computes the square, no actual training is needed
model.compile(optimizer='sgd', loss='mean_squared_error')

# 4. Test model performance
print("Test results:")
for x in x_train:
    # Reshape each sample to (batch, features) = (1, 1) before predicting
    y_pred = model.predict(x.reshape(1, 1), verbose=0)
    print(f"Input: {x[0]}, Predicted output: {y_pred[0][0]}")

# 5. Save the model
# Save the model in HDF5 format
model.save('./saved_model.h5')

# 6. Convert to TFLite format
# TFLite format is better suited for running on embedded devices
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Save the TFLite model
with open('./model.tflite', 'wb') as f:
    f.write(tflite_model)

# 7. Convert the TFLite model to a C array
def convert_tflite_to_c_array(tflite_model_path, c_file_path):
    """
    Convert a TFLite model file to a C array format
    Parameters:
    tflite_model_path: Path to the TFLite model file
    c_file_path: Output C header file path
    """
    with open(tflite_model_path, 'rb') as file:
        content = file.read()
    with open(c_file_path, 'w') as file:
        # The array is named `model` so that it matches ModelInit(model, ...) in the sketch
        file.write('const unsigned char model[] = {')
        for i, val in enumerate(content):
            if i % 12 == 0:  # Insert a newline every 12 numbers to improve readability
                file.write('\n ')
            file.write('0x{:02x}, '.format(val))
        file.write('\n};\n')
        file.write('const int model_len = {};\n'.format(len(content)))

if __name__ == '__main__':
    # Write model.h, the file name the Arduino sketch includes
    convert_tflite_to_c_array('./model.tflite', './model.h')
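Before uploading, it can be worth checking that the generated header really contains the same bytes as the .tflite file. The sketch below parses the `0xNN` literals back out of a header and compares them with the original file. The function names are hypothetical, and the parser assumes the header lists the bytes as two-digit hex literals inside a single pair of braces, as the converter above does.

```python
import re

def c_array_to_bytes(header_text: str) -> bytes:
    """Recover raw bytes from a C header that lists them as 0xNN literals."""
    # Only look between the first '{' and the first '}' so that other
    # numbers in the header (such as the length constant) are ignored.
    body = header_text[header_text.index('{') + 1:header_text.index('}')]
    return bytes(int(tok, 16) for tok in re.findall(r'0x([0-9a-fA-F]{2})', body))

def header_matches_model(header_path: str, tflite_path: str) -> bool:
    """Return True if the header encodes exactly the bytes of the .tflite file."""
    with open(header_path, 'r') as f:
        recovered = c_array_to_bytes(f.read())
    with open(tflite_path, 'rb') as f:
        original = f.read()
    return recovered == original

# Small in-memory example in the same layout the converter writes.
sample = 'const unsigned char model[] = {\n 0x1c, 0x00, 0xff, \n};\nconst int model_len = 3;\n'
print(c_array_to_bytes(sample).hex())
```

After running the conversion script, calling `header_matches_model` with the paths of the generated header and the .tflite file should return True.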
Place the model file (model.h) and the .ino file in the same folder, then upload the program using the Arduino IDE software.
#include <MicroTFLite.h>
#include "model.h" // Import model data
// Define memory area for storing intermediate tensors
constexpr int kTensorArenaSize = 2 * 1024; // Adjust based on model size
alignas(16) uint8_t tensorArena[kTensorArenaSize];
void setup() {
    // Initialize serial communication
    Serial.begin(9600);
    // Initialize model
    if (!ModelInit(model, tensorArena, kTensorArenaSize)) {
        Serial.println("Model initialization failed!");
        while (true); // Stop execution if initialization fails
    }
    Serial.println("Model initialization done.");
    ModelPrintMetadata();
    ModelPrintTensorInfo();
    ModelPrintInputTensorDimensions(); // Print input tensor dimensions
    ModelPrintOutputTensorDimensions(); // Print output tensor dimensions
}
void loop() {
    float input_value = 3.0; // Test input value
    // Display the input value
    Serial.print("Input value: ");
    Serial.println(input_value);
    // Set input value to the model
    if (!ModelSetInput(input_value, 0)) { // Set the first input
        Serial.println("Failed to set input value!");
        return;
    }
    // Run inference
    if (!ModelRunInference()) {
        Serial.println("Inference failed!");
        return;
    }
    // Get the model's output
    float prediction = ModelGetOutput(0);
    // Print the output result (predicted square value)
    Serial.print("Predicted output: ");
    Serial.println(prediction);
    delay(2000); // Run every 2 seconds
}
Sketch uses 530373 bytes (10%) of program storage space. Maximum is 5242880 bytes.
Global variables use 28492 bytes (8%) of dynamic memory, leaving 299188 bytes for local variables. Maximum is 327680 bytes.
15:42:46.823 -> Output Tensor Information:
15:42:46.823 -> Type: float32
15:42:46.823 -> Dimensions: 1 x 1
15:42:46.823 -> Input tensor dimensions: 2
15:42:46.823 -> Dimension 0: 1
15:42:46.823 -> Dimension input value: 3.00
15:42:46.823 -> Predicted output: 9.00
References:
1. https://github.com/tensorflow/tflite-micro
2. https://github.com/johnosbb/MicroTFLite
3. https://github.com/polamaxu/TFLM
Related Project:
1. E-nose: MEMS AI Gas Detection with UNIHIKER K10(ESP32 S3) with Arduino IDE