University of Bahrain
Scientific Journals

Non-negative Matrix Factorization on a Multi-lingual Overlapped Speech Signal: A Signal and Perception Level Analysis


dc.contributor.author Nag, Nandini C
dc.contributor.author Shah, Milind S
dc.date.accessioned 2021-07-25T10:09:42Z
dc.date.available 2021-07-25T10:09:42Z
dc.date.issued 2021-07-25
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4323
dc.description.abstract A complex acoustic scenario, with overlapping speech from multiple speakers in the presence of noise, causes speech recognition to perform poorly in hands-free devices. The scenario is more complex still in India, a country where 96.71% of the population speaks one of the 22 scheduled languages. An audio source separation algorithm that mitigates interference from other speakers and enhances the intelligibility and quality of the source speech may therefore be added as a pre-processor in speech recognition systems. This research investigates the effectiveness of the non-negative matrix factorization (NMF) algorithm for source separation in an overlapping multi-lingual, multi-dialect, single-channel speech mixture, an inherent characteristic of the cocktail party problem in India. The objective is to analyze the signal-level and perception-level metrics of a speech source separated from a multi-lingual overlapped speech signal. The languages used are English and two Indo-Aryan languages, Marathi and Bengali. One experimental result shows that the source-to-distortion ratio (SDR) of the target source separated from English-Bengali and English-Marathi speech mixtures is 0.4 dB and 1.3 dB higher, respectively, than that from English-English speech mixtures. The experiments therefore indicate that sources are separated better from mixtures of different languages than from mixtures of the same language. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/4.0/ *
dc.subject Audio source separation en_US
dc.subject cocktail party problem en_US
dc.subject Multi-lingual scenario en_US
dc.subject Non-negative matrix factorization en_US
dc.title Non-negative Matrix Factorization on a Multi-lingual Overlapped Speech Signal: A Signal and Perception Level Analysis en_US
dc.identifier.doi https://dx.doi.org/10.12785/ijcds/110103
dc.contributor.authorcountry India en_US
dc.contributor.authorcountry India en_US
dc.contributor.authoraffiliation Electronics and Telecommunication Engineering Department, Fr. C. Rodrigues Institute of Technology, University of Mumbai, Sector 9A, Vashi, Navi Mumbai en_US
dc.contributor.authoraffiliation Electronics and Telecommunication Engineering Department, Fr. C. Rodrigues Institute of Technology, University of Mumbai, Sector 9A, Vashi, Navi Mumbai en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US
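
The abstract names two technical ingredients, NMF and the source-to-distortion ratio (SDR), without giving their form. The following is a minimal sketch of both, not the authors' implementation. It assumes NMF is applied to a non-negative magnitude spectrogram V and uses the standard Lee-Seung multiplicative updates for the Frobenius-norm cost; the record does not say which cost function or rank the paper uses. The function names nmf and sdr_db are hypothetical, and sdr_db is a simplified energy-ratio definition rather than the BSS Eval variant that the paper may have used.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm cost.
    The cost function and rank are assumptions; the record does not
    state what the paper uses.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps  # spectral basis vectors
    H = rng.random((rank, n_time)) + eps  # time-varying activations
    for _ in range(n_iter):
        # Each update keeps the factors non-negative because it only
        # multiplies by ratios of non-negative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    num = np.sum(reference ** 2) + eps
    den = np.sum(error ** 2) + eps
    return 10.0 * np.log10(num / den)


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (not real speech data).
    rng = np.random.default_rng(1)
    V = rng.random((64, 3)) @ rng.random((3, 100))
    W, H = nmf(V, rank=3)
    rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
    print("relative reconstruction error:", rel_err)
```

In a separation setting, the columns of W would typically be grouped per speaker and each source reconstructed from its own group of basis vectors and activations; how the paper assigns bases to speakers is not stated in this record.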



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.
