University of Bahrain
Scientific Journals

SGAKE: Semantic Graph-based Automatic Keyword Extraction from Hindi Text Documents

dc.contributor.author Joshi, Manju Lata
dc.contributor.author Mittal, Namita
dc.contributor.author Joshi, Nisheeth
dc.date.accessioned 2021-07-14T11:21:50Z
dc.date.available 2021-07-14T11:21:50Z
dc.date.issued 2021-07-14
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4290
dc.description.abstract Automatic keyword extraction is the automated process of identifying the terms that best describe the subject of a document. These terms can be key terms or key phrases that represent the most relevant information conveyed by the document. Keyword extraction techniques can be statistical, linguistic, machine learning based, graph-based, or a hybrid of these. Each approach has its strengths and limitations. This paper focuses on graph-based approaches, which rely on the exploration of network properties such as Degree, Structural Diversity Index, Strength, Clustering Coefficient, Neighborhood Size, PageRank, Closeness, Betweenness, Eigenvector Centrality, Hub, and Authority Score. In the proposed approach, the graph is constructed from semantic linkages between the terms in the document. These semantic linkages are extracted using Hindi WordNet as a background knowledge source. Fourteen different graph measures are then applied to extract the keywords. The experiments are conducted on Tourism and Health data sets in the Hindi language. The results of the proposed approach are evaluated and compared with the state-of-the-art approach TextRank as well as with human-annotated keywords. The results show that the closeness centrality measure yields better precision and recall than the other graph measures when matched against human-annotated keywords, while the authority score proves to be a good measure for producing keywords that match those of TextRank. The experiments show that the proposed semantic graph-based approach performs better than the state-of-the-art approach TextRank. The paper also explores the correlation between different graph-theoretic measures using different correlation methods. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/4.0/ *
dc.subject Automatic Keyword Extraction en_US
dc.subject Semantic Graph-based Keyword Extraction en_US
dc.subject Semantic Network en_US
dc.subject Hindi Text Documents en_US
dc.subject Hindi WordNet en_US
dc.title SGAKE: Semantic Graph-based Automatic Keyword Extraction from Hindi Text Documents en_US
dc.identifier.doi https://dx.doi.org/10.12785/ijcds/120130
dc.contributor.authorcountry India en_US
dc.contributor.authorcountry India en_US
dc.contributor.authorcountry India en_US
dc.contributor.authoraffiliation Banasthali University & ISIM en_US
dc.contributor.authoraffiliation MNIT Jaipur en_US
dc.contributor.authoraffiliation Banasthali University en_US
dc.source.title International Journal of Computing and Digital Systems en_US
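
The pipeline summarized in the abstract (build a term graph from semantic links, rank terms by a graph measure, compare against human-annotated keywords) can be illustrated with a minimal sketch. This is not the authors' implementation: the edge list, the English placeholder terms, the gold keyword set, and all function names are hypothetical, and the semantic links are hard-coded here rather than looked up in Hindi WordNet. Only closeness centrality is shown, scaled by the fraction of reachable nodes so that disconnected graphs are handled.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (term, term) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def closeness_centrality(graph, source):
    """Closeness of `source`: reachable nodes divided by the sum of
    shortest-path lengths, scaled by the fraction of the graph reached."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives unweighted shortest paths
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    if total == 0:  # isolated node
        return 0.0
    reachable = len(dist) - 1
    return (reachable / (len(graph) - 1)) * (reachable / total)


def top_keywords(graph, k):
    """Return the k terms with the highest closeness (ties broken by name)."""
    scores = {term: closeness_centrality(graph, term) for term in graph}
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


def precision_recall(predicted, gold):
    """Precision and recall of predicted keywords against a gold set."""
    hits = len(set(predicted) & set(gold))
    return hits / len(predicted), hits / len(gold)


# Hypothetical semantic links; in the paper these come from Hindi WordNet.
edges = [
    ("tourist", "travel"), ("tourist", "hotel"), ("tourist", "fort"),
    ("travel", "hotel"), ("fort", "palace"), ("palace", "temple"),
]
graph = build_graph(edges)
predicted = top_keywords(graph, 3)
gold = {"tourist", "fort", "temple"}  # hypothetical human annotation
precision, recall = precision_recall(predicted, gold)
print(predicted, precision, recall)
```

Any of the other measures named in the abstract (degree, betweenness, authority score, and so on) could be substituted for `closeness_centrality` in `top_keywords` without changing the rest of the sketch.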



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
