dc.contributor.author |
Mohammed, Sharfuddin Waseem |
|
dc.contributor.author |
Garrapally, Vignesh |
|
dc.contributor.author |
Manchala, Suraj |
|
dc.contributor.author |
Reddy, Soora Narasimha |
|
dc.contributor.author |
Naligenti, Santosh Kumar |
|
dc.date.accessioned |
2022-03-09T08:20:46Z |
|
dc.date.available |
2022-03-09T08:20:46Z |
|
dc.date.issued |
2022-11-30 |
|
dc.identifier.issn |
2210-142X |
en_US |
dc.identifier.uri |
https://journal.uob.edu.bh:443/handle/123456789/4597 |
|
dc.description.abstract |
Yoga is a broad concept that connotes union. Given its spiritual and health benefits, yoga is now practiced by millions of people worldwide. As with any other exercise, it is crucial to perform yoga asanas correctly; a mistake made while performing an asana can result in severe, even life-threatening, injuries. In this paper, we propose a lightweight and robust architecture that recognizes yoga asanas from video input and can serve as a personal AI-powered yoga instructor. The proposed model is computationally efficient enough to be deployed even on entry-level smartphones, browsers, and smart TVs. Most existing techniques either rely on expensive hardware such as Kinect or require specialized feature extraction from raw inputs for each asana. Although these approaches achieve decent accuracy in controlled environments, they are complex to design and often fail in real-time settings with complex backgrounds. More recently, several research works have applied deep learning (DL) techniques extensively to asana recognition. The problem with the existing asana recognition methods in the literature is that they either demand high-end configurations or do not produce key points during recognition, which are crucial for pose correction at a later stage. Few publicly available datasets exist for training such a model; the dataset we use consists of six yoga asanas, namely Padmasana, Tadasana, Bhujangasana, Trikonasana, Shavasana, and Vrikshasana. In the proposed methodology, pose estimation is first performed using the state-of-the-art BlazePose architecture. Transformations are then applied to achieve scale and position independence. In the next stage, the transformed key points are passed to a novel model architecture that combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks.
The CNN component of the novel architecture extracts spatial features, whereas the LSTM networks capture how those features evolve over time. After precise tuning of hyperparameters, our system achieves a training accuracy of 95.29% and a test accuracy of 98.65% at 30 frames per second (FPS) in real time. To the best of our knowledge, this is the first work that is computationally efficient enough to process video input at 30 FPS while achieving decent accuracy compared to existing research works in the literature. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
University of Bahrain |
en_US |
dc.subject |
Deep Learning |
en_US |
dc.subject |
Computer Vision |
en_US |
dc.subject |
Blaze Pose |
en_US |
dc.subject |
Human Activity Recognition |
en_US |
dc.subject |
Yoga |
en_US |
dc.subject |
CNN |
en_US |
dc.subject |
LSTM |
en_US |
dc.subject |
Pose Estimation |
en_US |
dc.subject |
Recurrent Neural Networks |
en_US |
dc.title |
Recognition of Yoga Asana from Real-Time Videos using Blaze-pose |
en_US |
dc.identifier.doi |
https://dx.doi.org/10.12785/ijcds/1201104 |
en_US |
dc.volume |
12 |
en_US |
dc.issue |
1 |
en_US |
dc.pagestart |
1295 |
en_US
dc.pageend |
1304 |
en_US |
dc.contributor.authoraffiliation |
Professor at Kakatiya Institute of Technology and Science, Warangal, Telangana, India |
en_US |
dc.contributor.authoraffiliation |
Student at Kakatiya Institute of Technology and Science, Warangal, Telangana, India
en_US |
dc.source.title |
International Journal of Computing and Digital Systems |
en_US |
dc.abbreviatedsourcetitle |
IJCDS |
en_US |