Performance Evaluation of LSTM and GRU Models for Movie Genre Classification Based on Subtitle Dialogs Using Augmented Data and Cross-Validation

Ni Luh Putu Yonita Putri Utami; Desy Purnami Singgih Putri; Ni Kadek Dwi Rusjayanthi

doi:10.31294/inf.v12i2.25897


Abstract

This study evaluates and compares Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models for classifying movie genres from subtitle dialogs. To address class imbalance across genres, data augmentation was applied to create balanced datasets with 500 and 700 samples per genre, in addition to the original dataset. Both classifiers used Word2Vec word embeddings as input to an LSTM or GRU architecture with a single hidden layer and dropout regularization. Model performance was measured by accuracy and further validated with 5-fold cross-validation. The best test accuracy was obtained on the dataset with 700 samples per genre: 91% for LSTM and 92% for GRU. Cross-validation showed stable performance, with average accuracies of 0.68 for LSTM and 0.67 for GRU. A paired t-test yielded a p-value of 0.341, indicating no statistically significant difference between the two models. These findings suggest that both LSTM and GRU are effective for genre classification based on subtitle dialogs. Data augmentation is a key contribution of this study, as it improved model performance on underrepresented genres. This research supports the development of automated movie recommendation systems that use subtitle-based genre prediction.
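The paired comparison described in the abstract can be illustrated with a minimal sketch. It assumes each model is scored on the same five folds and computes the paired t statistic from the per-fold differences. The function name `paired_t_statistic` and the per-fold accuracies are hypothetical: the fold values are invented for illustration and chosen only so that their means match the reported averages of 0.68 and 0.67. They are not the study's data, and the resulting statistic is not the study's result.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on per-fold scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same folds")
    # Pairing: fold i of model A is compared only with fold i of model B.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two folds are required")
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical per-fold accuracies, invented for illustration only.
lstm_folds = [0.70, 0.66, 0.69, 0.67, 0.68]
gru_folds = [0.68, 0.67, 0.66, 0.68, 0.66]

t_stat, dof = paired_t_statistic(lstm_folds, gru_folds)

# Two-sided critical value of Student's t for 4 degrees of freedom, alpha = 0.05.
# The significance level is an assumption; the abstract does not state one.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4

print(f"t = {t_stat:.3f}, df = {dof}, significant at 0.05: {significant}")
```

With only five folds the test has four degrees of freedom, so a difference of about one percentage point in mean accuracy is easily absorbed by fold-to-fold variation. This is consistent with the abstract's conclusion that the two models do not differ significantly.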

Keywords

LSTM, GRU, Data Augmentation


