Abstract:
In this paper, the Deep Deterministic Policy Gradient (DDPG) reinforcement learning algorithm is employed to enable
a double-jointed robot arm to reach continuously changing target locations. The algorithm is evaluated by
training an agent to control the movement of this double-jointed robot arm. The architectures of the actor and critic networks are
meticulously designed and the DDPG hyperparameters are carefully tuned. An enhanced version of DDPG is also presented to
handle multiple robot arms simultaneously. The trained agents are successfully tested in the Unity Machine Learning Agents
environment for controlling both a single robot arm and multiple robot arms simultaneously. The tests demonstrate the robust
performance of the DDPG algorithm for robot arm maneuvering in complex environments.