Build a Face Tracking Robot with Arduino ML

What a Face-Tracking Robot Is and Why Arduino UNO Q Shines

A face-tracking robot is a small mobile platform that uses a camera and on-device machine learning to detect a human face in real time, then steers its wheels to follow that face without relying on cloud services or an external computer. In this project, you will build a compact desk robot around the Arduino UNO Q board, which makes it a practical introduction to Arduino machine learning and basic robotics. The UNO Q combines a Linux-based MPU with a real-time MCU, so it can run a face-detection model at about 10–15 frames per second while still generating precise servo control signals. All inference and steering logic stay on the board, which keeps latency low and protects privacy. By the end, you will have a face-tracking robot that responds smoothly and can be extended with more sensors or behaviors.

Hardware Setup: Chassis, Servos, Camera and Power

Start by building a simple differential-drive chassis for your face-tracking robot. Mount two continuous-rotation servos on opposite sides, then add a rear caster or skid so the robot balances. Place a small platform on top for the Arduino UNO Q and a USB camera. Continuous-rotation servos make motion control easy: a 1500 microsecond pulse stops the wheel, 1000 microseconds drives it at full speed in one direction, and 2000 microseconds drives it at full speed in the opposite direction. According to the project author, the closer the pulse is to 1500 microseconds, the slower the wheel turns. Wire both servos to a shared power rail and add a 100 µF capacitor across power and ground to absorb current spikes. Connect the servos’ signal lines to pins D3 and D6, then plug the webcam into a USB-A port on a USB-C splitter, which both powers the UNO Q and carries the video data.
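The pulse-width scheme above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name speed_to_pulse_us and the constants are hypothetical, the mapping is assumed to be linear (real servos are only roughly so), and which physical direction 2000 microseconds produces depends on the servo and how it is mounted.

```python
STOP_US = 1500       # neutral pulse: the wheel is stopped
FULL_SPAN_US = 500   # distance from neutral to either full-speed pulse


def speed_to_pulse_us(speed: float) -> int:
    """Map a signed speed in [-1.0, 1.0] to a pulse width in microseconds.

    Hypothetical helper: -1.0 maps to 1000 us, 0.0 to 1500 us, 1.0 to 2000 us.
    """
    speed = max(-1.0, min(1.0, speed))  # keep out-of-range requests in bounds
    return round(STOP_US + speed * FULL_SPAN_US)


if __name__ == "__main__":
    for s in (-1.0, -0.2, 0.0, 0.2, 1.0):
        print(s, speed_to_pulse_us(s))
```

Values near 0.0 land close to 1500 microseconds, which matches the observation that pulses near the neutral point turn the wheel slowly.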

Configuring Edge Impulse and Vision Bricks for On-Device Detection

With the hardware ready, you can focus on on-device detection. Using the Arduino App Lab environment, define an app.yaml file that pulls in the video_object_detection and web_ui bricks. The video_object_detection brick captures frames from the USB camera, runs the face-detection model, and forwards results to your Python code via callbacks. You can point this brick at a built-in lightweight face model or at an Edge Impulse model you trained yourself. Detections arrive as a dictionary in which the “face” label maps to a list of instances, each with a confidence and a bounding_box_xyxy tuple. This format gives you the face’s position in pixel coordinates for every frame, at around 10–15 FPS on the UNO Q. Because the entire workflow stays on the device, inference runs without any cloud API calls.
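As a rough sketch of how that dictionary might be consumed, the snippet below picks the highest-confidence face from a hand-written sample. It assumes each instance is a plain dict with "confidence" and "bounding_box_xyxy" keys, which follows the description above but is not confirmed against the brick's actual payload; the helper name most_confident_face, the min_confidence threshold, and the sample values are all made up for illustration.

```python
from typing import Optional


def most_confident_face(detections: dict, min_confidence: float = 0.5) -> Optional[dict]:
    """Return the highest-confidence "face" instance, or None if none qualifies.

    Assumes the layout {"face": [{"confidence": ..., "bounding_box_xyxy": ...}]}.
    """
    faces = [
        face for face in detections.get("face", [])
        if face["confidence"] >= min_confidence
    ]
    if not faces:
        return None
    return max(faces, key=lambda face: face["confidence"])


# Hand-written sample payload, not real output from the brick.
sample = {
    "face": [
        {"confidence": 0.62, "bounding_box_xyxy": (40, 60, 120, 160)},
        {"confidence": 0.91, "bounding_box_xyxy": (200, 80, 300, 200)},
    ]
}

best = most_confident_face(sample)
if best is not None:
    print(best["bounding_box_xyxy"])
```

Returning None when no face clears the threshold gives the caller an obvious place to stop the wheels instead of chasing a stale detection.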

Servo Control on the MCU: Bridge RPC and PWM Output

Next, configure the MCU side so the robot can move. In sketch/sketch.yaml, include the Arduino_RouterBridge, Servo, and MsgPack libraries, then write a minimal Arduino sketch that exposes a set_wheel_pwm function via Bridge RPC. Inside this function, clamp incoming pulse widths to 1000–2000 microseconds, optionally invert the right wheel pulse because its servo is mounted mirrored, and send the final values to leftServo and rightServo using writeMicroseconds() on pins D3 and D6. Bridge.provide_safe registers this function so it runs inside loop(), which keeps PWM writes safe and deterministic. The UNO Q’s split architecture lets the Linux MPU run non-deterministic ML workloads while the STM32-based MCU keeps servo timing stable. With this setup, the Python process can send updated wheel commands every frame through Bridge RPC, achieving smooth, low-latency motion even when the ML workload varies.
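The real set_wheel_pwm function lives in the Arduino sketch and is written in C++. The Python below only mirrors the clamp-and-invert arithmetic described above so it can be checked on a desktop. The names clamp_us and wheel_pulses are hypothetical, and reflecting the right pulse around 1500 microseconds is one assumed way to implement the inversion.

```python
MIN_US = 1000      # lower pulse limit
MAX_US = 2000      # upper pulse limit
NEUTRAL_US = 1500  # stop pulse


def clamp_us(pulse_us: float) -> int:
    """Clamp a requested pulse width to the 1000-2000 microsecond range."""
    return max(MIN_US, min(MAX_US, int(pulse_us)))


def wheel_pulses(left_us: float, right_us: float, invert_right: bool = True):
    """Return the (left, right) pulses that would be written to the servos.

    With invert_right set, the right pulse is reflected around neutral so that
    equal inputs drive both wheels in the same direction on the ground even
    though the right servo is mounted mirrored.
    """
    left = clamp_us(left_us)
    right = clamp_us(right_us)
    if invert_right:
        right = 2 * NEUTRAL_US - right  # e.g. 1600 becomes 1400
    return left, right


if __name__ == "__main__":
    print(wheel_pulses(1600, 1600))
    print(wheel_pulses(2500, 900))
```

Clamping before the reflection keeps the result inside the valid range, because reflecting any value between 1000 and 2000 around 1500 stays within the same bounds.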

From Face Detections to Steering and Live Tuning

Finally, implement the steering logic on the MPU. In Python, initialize the WebUI and VideoObjectDetection bricks, then register an on_detect_all callback. For each frame, pick the most confident face, compute the center of its bounding box, and normalize its x-coordinate so that 0.0 is the left edge of the image and 1.0 is the right edge. Subtract 0.5 to get an error value centered on the middle of the image, then feed that into a proportional controller that adjusts the left and right wheel pulses around 1500 microseconds. This proportional control turns the robot toward the face: a nonzero error speeds up one wheel and slows the other in proportion to how far the face is off-center, while a small error keeps the robot mostly straight. The web_ui brick hosts a Socket.IO-powered dashboard that lets you see detection overlays and tweak controller gains live without recompiling. All face detection, steering, and tuning stay on the UNO Q itself, split between the Linux MPU and the MCU, which keeps the system responsive and private.
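The steering math can be sketched as a small pure function. This is a minimal illustration, not the project's actual code: the name steering_pulses, the default gain kp=400.0, and the 640-pixel frame width in the example are assumptions, and which wheel should receive the positive correction depends on your wiring and on whether the MCU inverts the right servo.

```python
def steering_pulses(bbox_xyxy, frame_width, kp=400.0, base_us=1500):
    """Turn a face bounding box into (left_us, right_us, error).

    error is the horizontal offset of the box center from the image center,
    in the range -0.5 (far left) to +0.5 (far right).
    """
    x1, _, x2, _ = bbox_xyxy
    center_x = (x1 + x2) / 2.0
    error = center_x / frame_width - 0.5

    # Proportional term: a larger offset gives a larger wheel-speed difference.
    correction = kp * error
    left_us = round(base_us + correction)
    right_us = round(base_us - correction)

    # Stay inside the 1000-2000 microsecond range even with a large gain.
    left_us = max(1000, min(2000, left_us))
    right_us = max(1000, min(2000, right_us))
    return left_us, right_us, error


if __name__ == "__main__":
    # Face right of center in an assumed 640-pixel-wide frame.
    print(steering_pulses((430, 100, 530, 200), 640))
```

With base_us at the 1500 microsecond stop pulse, the robot pivots in place to face the target. Raising kp makes it turn harder for the same offset, and it is the kind of value the dashboard would let you tune live.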