TOPIC MODELLING ADUAN MASYARAKAT PEKANBARU MENGGUNAKAN METODE LATENT DIRICHLET ALLOCATION (Topic Modelling of Pekanbaru Public Complaints Using the Latent Dirichlet Allocation Method)
Date
2023-11
Publisher
Elfitra
Abstract
The people of Pekanbaru can submit complaints to the Pekanbaru City Government
through SP4N-LAPOR via its website or application. Data from this system show that
the same community issues keep recurring: similar complaints are submitted repeatedly
over time, which indicates that the issues have not been fully resolved and that the
relevant agencies lack a deep understanding of trends in community issues. Therefore,
this research models
topics using Latent Dirichlet Allocation (LDA) to group the data into topics that
represent the most frequently emerging issues. By understanding these topic trends, the
government and relevant agencies can be more responsive in addressing emerging
problems. The data consist of 345 text entries obtained from the Pekanbaru City
Information and Communication Office. The collected data then undergo
preprocessing stages, including text cleaning, tokenization, normalization, stopword
removal, and stemming. After preprocessing, word weighting is performed using Term
Frequency-Inverse Document Frequency (TF-IDF). In building the LDA model, coherence
scores are used to determine the optimal number of topics. Experiments are conducted
with 50 and 100 iterations, and each iteration setting is tested with 3, 5, and 7
topics. The results show that 5 is the most suitable number of topics. The five topics
identified are public services, order, Covid-19, government assistance, and data
management.
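
The preprocessing and weighting steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three sample complaints and the stopword list are invented, the function names `preprocess` and `tf_idf` are hypothetical, normalization and stemming are omitted because they require language-specific dictionaries, and the TF-IDF variant shown (relative term frequency multiplied by the natural log of N/df) is one common choice, not necessarily the one used in the study.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, invented for this illustration only.
STOPWORDS = {"the", "is", "in", "on", "of", "a", "for", "not", "has", "been"}


def preprocess(text):
    """Lowercase, replace non-letters with spaces, split on whitespace, drop stopwords."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Toy complaints, invented for illustration; not taken from the SP4N-LAPOR data.
complaints = [
    "The street light on Jalan Sudirman is broken.",
    "Garbage has not been collected for a week!",
    "The street in front of the market is flooded.",
]

tokens = [preprocess(c) for c in complaints]
weights = tf_idf(tokens)

# Show the highest-weighted terms of each complaint.
for doc_weights in weights:
    top = sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print([term for term, _ in top])
```

In this sketch a term that appears in several complaints, such as "street", receives a lower weight than a term unique to one complaint, which is the effect TF-IDF weighting is meant to produce before the weighted documents are passed to a topic model.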
Keywords
Coherence Score, LDA, Preprocessing, TF-IDF, Topic Modelling
Citation
Perpustakaan