A Multi-Task Learning Approach for News Classification and Number of Reader Prediction

Authors

DOI:

https://doi.org/10.25139/inform.v10i2.9549

Keywords:

Multi-Task Learning, CNN, News Classification, News Prediction

Abstract

The large volume of online news content makes it difficult to manage and organize information effectively, a problem that bears directly on efforts to raise literacy rates in Indonesia. As the number of news articles continues to grow, there is a need for a robust system that can categorize news and predict the number of readers in order to assess its impact on literacy. This study introduces a Multi-Task Learning (MTL) approach that uses data from websites to address news classification and reader prediction simultaneously. Cross-entropy loss is applied in the model to address class imbalance. The research compares the performance of two MTL architectures, a Dense architecture and a CNN architecture, on both classifying news and predicting reader numbers. The Dense architecture outperforms the CNN architecture, achieving 94% accuracy and a 99% AUC-ROC score, whereas the CNN model achieves 91% accuracy and a 98% AUC-ROC score. These results highlight the effectiveness of the Dense architecture for classifying online news and predicting reader engagement. The findings offer insights for improving news sorting systems and could support literacy initiatives in Indonesia by providing more accurate predictive models of online news consumption. They also indicate that integrating Multi-Task Learning into news classification systems can improve content management and give a deeper understanding of how the public interacts with news.
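The multi-task setup described in the abstract can be illustrated with a minimal sketch: one shared dense layer feeds a classification head (news category) and a regression head (reader count), and the two losses are summed into a single training objective. Everything specific here is an assumption made for illustration and is not taken from the paper: the layer sizes, the hand-picked weights, the names `multitask_forward` and `joint_loss`, the use of squared error for the reader-count task, and the `reg_weight` balancing factor.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_forward(x, params):
    """Shared representation followed by two task-specific heads."""
    hidden = relu(dense(x, params["shared_w"], params["shared_b"]))
    # Head 1: probabilities over news categories.
    class_probs = softmax(dense(hidden, params["cls_w"], params["cls_b"]))
    # Head 2: a single scalar, the predicted (scaled) reader count.
    reader_pred = dense(hidden, params["reg_w"], params["reg_b"])[0]
    return class_probs, reader_pred


def joint_loss(class_probs, true_class, reader_pred, true_readers,
               reg_weight=0.5):
    """Cross-entropy for the category plus weighted squared error
    for the reader count (the weighting scheme is an assumption)."""
    cross_entropy = -math.log(class_probs[true_class])
    squared_error = (reader_pred - true_readers) ** 2
    return cross_entropy + reg_weight * squared_error


# Hypothetical parameters: 3 input features, 2 shared units, 3 categories.
params = {
    "shared_w": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "shared_b": [0.0, 0.1],
    "cls_w": [[1.0, -1.0], [0.0, 0.5], [-0.5, 0.2]],
    "cls_b": [0.0, 0.0, 0.0],
    "reg_w": [[0.7, 0.3]],
    "reg_b": [0.05],
}
x = [1.0, 2.0, 0.5]  # stand-in for a vectorized news article

probs, readers = multitask_forward(x, params)
loss = joint_loss(probs, true_class=1, reader_pred=readers, true_readers=1.0)
print("class probabilities:", probs)
print("predicted readers (scaled):", readers)
print("joint loss:", loss)
```

Because both heads read the same hidden vector, gradients from either task would update the shared layer during training, which is the mechanism that lets the two tasks inform each other. A CNN variant would replace the shared dense layer with convolutional feature extraction while keeping the two heads.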

Published

2025-07-04

How to Cite

A Multi-Task Learning Approach for News Classification and Number of Reader Prediction. (2025). Inform : Jurnal Ilmiah Bidang Teknologi Informasi Dan Komunikasi, 10(2), 97–102. https://doi.org/10.25139/inform.v10i2.9549

Issue

Section

Articles
