University of Bahrain
Scientific Journals

Text Classification on Cybercrime Cases From News Articles Using Supervised Learning


dc.contributor.author Farhan, Muhammad
dc.contributor.author Mutalib, Sofianita
dc.contributor.author Yusof Darus, Mohamad
dc.contributor.author Ismail, Azlan
dc.contributor.author Mokayed, Hamam
dc.contributor.author Abdul-Rahman, Shuzlina
dc.contributor.author Nizam, Muhamad
dc.date.accessioned 2024-07-11T11:39:44Z
dc.date.available 2024-07-11T11:39:44Z
dc.date.issued 2024-07-11
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/5807
dc.description.abstract The number of cybercrime cases in Malaysia has increased, especially since the pandemic. The nation has drawn up numerous strategic plans, including the Malaysia Cyber Security Strategy (MCSS), which set a baseline for countering cybercrime. One of its pillars is Enhancing Capacity and Capability Building, Awareness, and Education. To raise awareness effectively, the taxonomy of cybercrime must be easy for citizens to understand. This project studies the classification of news articles using supervised models that ease the classification of cybercrime types. Five supervised models, each combined with two feature extractors, were examined. The models were evaluated using percentage splits of 70:30 and 80:20, with accuracy, F1-measure, and precision as metrics. Random Forest with the TF-IDF feature extractor produced the best result, achieving an accuracy of 94.01% and standing out for its precision. Naïve Bayes with the Word2vec feature extractor performed worst, with an accuracy of 73.48%. The research also analyzed the textual data by examining word frequency and interpreting topics based on the class labels Cybercrime Type 1 and Cybercrime Type 2. For each class of cybercrime news, topics were uncovered using latent Dirichlet allocation (LDA) and interpreted using ChatGPT. The analysis and the classification results are visualized in a Power BI dashboard to aid comprehension. Future research could narrow the scope of the data to local Malay news for more targeted insights. en_US
dc.language.iso en_US en_US
dc.publisher University of Bahrain en_US
dc.subject Article News en_US
dc.subject Cybercrime en_US
dc.subject Machine Learning en_US
dc.subject Text Classification en_US
dc.subject Topic Identification en_US
dc.title Text Classification on Cybercrime Cases From News Articles Using Supervised Learning en_US
dc.identifier.doi XXXXXX
dc.volume 17 en_US
dc.issue 1 en_US
dc.pagestart 1 en_US
dc.pageend 11 en_US
dc.contributor.authorcountry 40450 Shah Alam, Selangor, Malaysia en_US
dc.contributor.authorcountry Luleå, Sweden en_US
dc.contributor.authoraffiliation School of Computing Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA en_US
dc.contributor.authoraffiliation Institute of Big Data Analytics and Artificial Intelligence, Universiti Teknologi MARA en_US
dc.contributor.authoraffiliation Department of Computer Science, Electrical and Space Engineering, Luleå tekniska universitet en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US

