Gesture-Driven Robotic Arm: Utilizing Computer Vision for Enhanced Human-Machine Interaction
Background
Gesture-controlled robotic arms offer significant benefits in various fields, particularly in a developing country
like Bangladesh. In industries, these robotic arms can enhance precision in manufacturing, reducing dependency on
manual labor while improving efficiency and safety in hazardous environments. For disabled individuals, such
technology can provide an affordable and intuitive solution for assistive devices, enabling them to regain
mobility and independence. In the medical field, gesture-controlled robotic arms can assist in remote surgeries
and rehabilitation therapy, allowing healthcare professionals to operate with greater accuracy and minimal
physical contact. Given Bangladesh's rapid industrialization and increasing demand for automation, integrating
gesture-controlled robotic arms can bridge the technological gap, enhance productivity, and improve the quality of
life for many. In this project, we have developed a miniature prototype of a gesture-controlled robotic arm.
Major Components
MediaPipe is an open-source framework for building pipelines that perform computer vision inference over arbitrary sensory data such as video or audio. It was developed by C. Lugaresi et al. at Google [3]. The framework provides many ready-made solutions, such as human pose estimation, hand tracking, and face landmark detection (Figure 4, Figure 5 & Figure 6) [4]. The human pose solution returns the 3D coordinates of the human body joints. We have used these landmarks in our project to extract the hand gestures that control the robot arm.
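As a minimal illustration (assuming the mediapipe and opencv-python packages are installed), the following Python sketch reads webcam frames and prints the 3D coordinates returned by the hand tracking solution:

# A minimal sketch: print the 3D coordinates of the tracked hand landmarks.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)  # default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB frames; OpenCV captures in BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for lm in results.multi_hand_landmarks[0].landmark:
            # x and y are normalized to [0, 1]; z is relative depth.
            print(lm.x, lm.y, lm.z)
    cv2.imshow("hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()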
A stepper motor (Figure 1) is an electromechanical device that converts electrical energy into mechanical motion. It is a brushless, synchronous electric motor that divides a full rotation into a large number of discrete steps, as shown in Figure 2 [1]. The stepper motor is controlled by energizing the stator windings one at a time: each energized winding becomes an electromagnetic pole that exerts a magnetic force on the rotor, pulling it toward the next position. Alternately magnetizing and demagnetizing the windings shifts the rotor gradually and allows it to turn with fine control. Once supply is provided to a stator winding, a magnetic field develops within the stator, and the rotor follows this rotating magnetic field, as shown in Figure 3 [2].
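To make the stepping action concrete, the sketch below simulates the full-step, one-coil-at-a-time energizing pattern used by small steppers such as the 28BYJ-48 of [2]. On real hardware these values would be written to the driver inputs, so the code here only prints them:

# Illustrative only: simulate the full-step (wave drive) sequence in which
# one of the four stator coils is energized at a time; on real hardware these
# patterns would be written to the driver inputs (e.g., a ULN2003 board [2]).
import time

FULL_STEP_SEQUENCE = [
    (1, 0, 0, 0),  # energize coil 1
    (0, 1, 0, 0),  # energize coil 2
    (0, 0, 1, 0),  # energize coil 3
    (0, 0, 0, 1),  # energize coil 4
]

def step(n_steps, delay=0.01):
    for i in range(n_steps):
        coils = FULL_STEP_SEQUENCE[i % len(FULL_STEP_SEQUENCE)]
        print(f"step {i}: coil states {coils}")
        time.sleep(delay)  # the step rate sets the rotation speed

step(8)  # two full electrical cycles; reverse the sequence to reverse rotation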
Workflow of The Robot Arm
The block diagram of this project is given in Figure 7. The computer captures real-time input from the camera and feeds it to the MediaPipe Machine Learning (ML) model. The model returns the 3D coordinates of the hand joints. Using these coordinates, we identify the gesture and send the corresponding command to the ESP32. The ESP32 then drives the stepper motors through the motor drivers according to the input received from the computer. The computer and the ESP32 communicate wirelessly via the WebSocket protocol.
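A condensed sketch of the computer-side loop is shown below, assuming the websocket-client package; the ESP32 address and the command strings are placeholders for illustration, not the exact values used in our code:

# Condensed computer-side loop: gesture result -> WebSocket -> ESP32.
from websocket import create_connection  # pip install websocket-client

def next_gesture_command():
    """Placeholder for the MediaPipe-based classifier described above;
    would return a command string such as "ARM1_UP", or None for no input."""
    return None

ws = create_connection("ws://192.168.0.50:81")  # example ESP32 address
try:
    while True:  # in the real program this runs once per camera frame
        command = next_gesture_command()
        if command:
            ws.send(command)  # the ESP32 turns each command into motor steps
finally:
    ws.close()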
Circuit Diagram
The circuit diagram is given in Figure 8. We have connected the three motor drivers to the ESP32 via a breadboard. We have used a 12 V DC, 5 A power supply to power the stepper motors, as they draw a high current. To power the ESP32 at 3.3 V, we have used a buck converter to step the 12 V down to 3.3 V. The three stepper motors drive the base, Arm 1, and Arm 2 of the robot arm. The ESP32 provides the control logic sequence to the motor drivers, which in turn power and drive the three stepper motors.
Control Logic
Figure 9 shows the gesture control logic used to operate the robot arm. We have taken a rectangular box that can be placed anywhere within the camera view by showing the number one with the hand. This makes the input more versatile, since the box is not fixed to a particular position and can be relocated whenever needed. If the hand is inside the rectangular box, no input is sent to the robot arm. If the hand is in fist mode, as shown in the figure, and is above or below the box, Arm 1 moves up or down respectively. The same applies to Arm 2, except that the hand must be in open-palm mode, as shown in the figure. If the hand switches quickly between fist and open-palm mode, the input changes accordingly. To rotate the base, the hand may be in either fist or palm mode, but it must be to the left or right of the rectangular box. The ESP32 receives signals as long as the hand remains outside the box. The robot arm also supports simultaneous input: the user can control Arm 1 and the base, or Arm 2 and the base, at the same time, which makes operation faster and more flexible. A minimal sketch of this classification logic is given below.
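The following Python sketch captures these rules; the command strings and the use of normalized coordinates are assumptions for illustration, not the exact names in our code:

# Sketch of the box-relative rules from Figure 9; the rectangle (x1, y1, x2, y2)
# is in normalized image coordinates, and the command names are illustrative.
def classify_gesture(hx, hy, is_fist, box):
    """Map a hand position (hx, hy) and hand shape to a list of commands."""
    x1, y1, x2, y2 = box
    vertical = "ARM1" if is_fist else "ARM2"  # fist -> Arm 1, open palm -> Arm 2
    commands = []
    if hy < y1:                                # hand above the box
        commands.append(vertical + "_UP")
    elif hy > y2:                              # hand below the box
        commands.append(vertical + "_DOWN")
    if hx < x1:                                # hand left of the box
        commands.append("BASE_LEFT")
    elif hx > x2:                              # hand right of the box
        commands.append("BASE_RIGHT")
    return commands                            # empty inside the box: no input

# A fist above and to the right of the box moves Arm 1 up while rotating the
# base, matching the simultaneous-input behaviour described above.
print(classify_gesture(0.9, 0.1, True, (0.3, 0.3, 0.7, 0.7)))
# -> ['ARM1_UP', 'BASE_RIGHT']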
Actual Design & Control
Figure 10 shows the design of the actual robot arm. We have made the arm from PVC board, nuts, bolts, and glue. The stepper motor used for the base sits inside the plastic box. The base supports 360-degree rotation. The arm segments also support full rotation and can move as long as there is room to do so. The stepper motors and the robot are joined by couplers that hold them together.
To control the robot arm, we had to write code for both the microcontroller and the computer. We have used C++ to program the ESP32 and the AccelStepper library to control the stepper motors. We have also used several networking libraries to create a WebSocket communication path for receiving signals from the computer. In addition, we have written HTML code for a web page that remotely controls the robot arm, as shown in Figure 11. The uppermost and lowermost buttons control Arm 2, the buttons between them control Arm 1, and the two side buttons control the base movement. For each input signal, the stepper motor moves 70 steps. From Figure 11, we see that the arm is quite flexible and can be moved to almost every position it can reach.
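Since the firmware itself is written in C++, the sketch below instead emulates the web page's buttons from Python; the command strings, address, and port are assumptions, with each message intended to trigger one 70-step move on the ESP32:

# Hypothetical Python stand-in for the control web page: each message should
# trigger one 70-step move on the ESP32.
from websocket import create_connection  # pip install websocket-client

COMMANDS = {
    "w": "ARM2_UP",   "s": "ARM2_DOWN",
    "e": "ARM1_UP",   "d": "ARM1_DOWN",
    "a": "BASE_LEFT", "f": "BASE_RIGHT",
}

ws = create_connection("ws://192.168.0.50:81")  # example ESP32 address
try:
    while True:
        key = input("command (w/s/e/d/a/f, q to quit): ").strip()
        if key == "q":
            break
        if key in COMMANDS:
            ws.send(COMMANDS[key])  # firmware moves the selected motor 70 steps
finally:
    ws.close()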
Figure 12 shows the proposed hand gesture control logic in action. We have used Python to code the computer vision part. The algorithm takes the camera input and passes it through the MediaPipe human pose prediction model, which returns the coordinates of the hand joints relative to the screen. To activate the control, the user has to show the number one with their hand. A rectangular box then appears at that position, which activates the control. The coordinate of the midpoint of the middle finger is used to identify whether the hand is inside or outside the box. If the hand is inside, no signal is sent. To send a signal, the hand must be outside the box; the signal is then sent to the ESP32 as discussed earlier for Figure 9. To stop sending signals to the ESP32 entirely, the user can show the number four with their hand, which disables the control. The proposed gesture mechanism provides easy-to-use control for work in various fields, and any user can quickly learn to operate the robot arm and command it at will.
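As an illustration of this activation logic, the sketch below counts raised fingers from the MediaPipe hand landmarks; the tip-above-joint test is a common heuristic and an assumption here, not necessarily the exact rule our code uses:

# Hedged sketch: count raised fingers from MediaPipe hand landmarks to detect
# the "one" (activate) and "four" (deactivate) gestures.
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky fingertip indices
FINGER_PIPS = [6, 10, 14, 18]   # the corresponding middle-joint indices

def count_raised_fingers(landmarks):
    """A finger counts as raised when its tip sits above its middle joint
    (smaller y, since image y grows downward and the hand is upright)."""
    return sum(landmarks[tip].y < landmarks[pip].y
               for tip, pip in zip(FINGER_TIPS, FINGER_PIPS))

def update_control_state(landmarks, active):
    n = count_raised_fingers(landmarks)
    if n == 1:
        return True    # "one" shown: place the box and activate control
    if n == 4:
        return False   # "four" shown: disable control entirely
    return active      # otherwise keep the current state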
Team Background
Our team consisted of the following members:
Md. Johir Raihan, Electronics and Communication Engineering
Jannatul Jeba, Electronics and Communication Engineering
Samiya Intesar, Electronics and Communication Engineering
MD Jakaria, Electronics and Communication Engineering
Limitations & Future Work
Despite the successful implementation of the gesture-controlled robotic arm, the project had several limitations.
Since the human hand gestures were extracted using a Machine Learning (ML) model, occasional misidentification of the hand coordinates affected accuracy. The arm, constructed from PVC board and couplers, exhibited some instability in movement due to material constraints. Furthermore, the small stepper motors overheated under heavy loads, limiting their efficiency. Nevertheless, the overall movement of the robotic arm was smooth and
satisfactory. For future improvements, using high-performance motors would enhance load-bearing capacity, and
increasing the number of motors could provide additional degrees of freedom, enabling the robot to perform more
complex tasks.
Event Photos
Our project was recognized as the best in our class, marking a proud achievement. We had the honor of sharing this
moment with our course teacher, Professor Dr. Shamim Ahsan.
References
[1] T. Agarwal, "Stepper Motor: Construction, Working, Types and Its Applications," ElProCus - Electronic Projects for Engineering Students, Oct. 24, 2013. https://www.elprocus.com/stepper-motor-types-advantages-applications/ (accessed Mar. 16, 2023).
[2] "In-Depth: Control 28BYJ-48 Stepper Motor with ULN2003 Driver & Arduino," Last Minute Engineers, Dec. 22, 2019. https://lastminuteengineers.com/28byj48-stepper-motor-arduino-tutorial/ (accessed Mar. 16, 2023).
[3] C. Lugaresi et al., "MediaPipe: A Framework for Building Perception Pipelines," arXiv, Jun. 14, 2019. https://arxiv.org/abs/1906.08172v1 (accessed Mar. 16, 2023).
[4] "MediaPipe," https://mediapipe.dev/ (accessed Mar. 16, 2023).