Entity Extraction and Annotation for Job Title and Job Descriptions Using Bert-Based Model

  • Anindo Saka Fitri, Information System Department, Universitas Pembangunan Veteran Jawa Timur
  • Seftin Fitri Ana Wati, Information System Department, Universitas Pembangunan Veteran Jawa Timur
  • Herlambang Haryo Putra, Arion Digital Media
  • Suryo Widodo, Mathematics Education Department, Universitas Nusantara PGRI Kediri
  • Arizia Aulia Aziiza, Information System Department, Universitas Surabaya
Keywords: Named Entity Recognition, Bert-Based Model, Job Vacancy, Deep Learning, Language Processing

Abstract

This paper investigates Named Entity Recognition (NER) in the Indonesian job vacancy domain using BERT-based models. It presents a detailed data collection and preprocessing methodology, followed by fine-tuning of a BERT-based model for NER. The dataset comprises 48,673 job vacancies collected from the JobStreet website in July 2023, with a focus on multi-entity recognition covering job titles and job descriptions. An original annotation algorithm was developed using Python and Laravel for precise entity recognition. The paper also provides an extensive literature review of NER and BERT-based models and discusses their relevance to the Indonesian job market. The BERT-based model attains an average accuracy of 78.5%, a precision of 79.7%, a recall of 81.1%, and an F1 score of 80.8% on the NER task. The study concludes by discussing implications, limitations, and future directions, underscoring the model’s potential to streamline job matching and recruitment in Indonesia and beyond. Its contributions are a robust framework for NER in job vacancies, evidence of the potential for improved job matching, and proposed enhancements for future model development and application to other languages and regions.
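
The abstract does not describe the annotation algorithm in detail. As a minimal sketch only, assuming annotation works by matching a known job title against tokenized vacancy text and emitting BIO tags (a common NER labeling scheme, not confirmed as the authors' method), the idea might look like the following. The function name `bio_tag`, the label `JOB_TITLE`, and the sample sentence are hypothetical.

```python
def bio_tag(tokens, entity_tokens, label):
    """Tag each occurrence of entity_tokens inside tokens with B-/I- labels.

    Matching is case-insensitive and exact on whole tokens; this is an
    illustrative assumption, not the procedure used in the paper.
    """
    tags = ["O"] * len(tokens)
    n = len(entity_tokens)
    if n == 0:
        return tags
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in entity_tokens]
    i = 0
    while i <= len(tokens) - n:
        if lowered[i:i + n] == target:
            tags[i] = f"B-{label}"           # first token of the entity
            for j in range(i + 1, i + n):
                tags[j] = f"I-{label}"       # remaining tokens of the entity
            i += n                           # skip past the matched span
        else:
            i += 1
    return tags


# Hypothetical vacancy sentence (Indonesian): "Wanted: Data Analyst to analyze sales data"
tokens = "Dicari Data Analyst untuk menganalisis data penjualan".split()
tags = bio_tag(tokens, ["Data", "Analyst"], "JOB_TITLE")
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Token-level tags of this kind are the usual training target when fine-tuning a BERT-based model for token classification; a real pipeline would also need to align the tags with the model's subword tokens.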


Published
2025-01-31
How to Cite
Fitri, A. S., Fitri Ana Wati, S., Putra, H. H., Widodo, S., & Aziiza, A. A. (2025). Entity Extraction and Annotation for Job Title and Job Descriptions Using Bert-Based Model. Inform : Jurnal Ilmiah Bidang Teknologi Informasi Dan Komunikasi, 10(1), 73-77. https://doi.org/10.25139/inform.v10i1.7367