Analysis Name Entity Disambiguation Using Mining Evidence Method

Adelya Astari, Moch. Arif Bijaksana, Arie Ardiyanti Suryani


The hadith are the second source of guidance and Islamic teaching after the Qur'an. One of the most authentic (sahih) hadith collections is Sahih al-Bukhari. Each hadith in Sahih Bukhari has a chain of narrators, a hadith number, and its own content. The hadith tradition also includes a discipline that studies the history of the narrators, known as the science of Rijalul Hadith. Some narrators in Sahih Bukhari share the same name, which creates ambiguity between names. As a result, ordinary readers find it difficult to tell whether two identical names refer to the same person or to different people. To address this problem, a name disambiguation solution is built that resolves the ambiguity by examining the name, hadith number, chain of narrators, content topic, circle, country, and companions of the Prophet, taken from the last three names before the Prophet in the chain of narrators. The solution uses the Mining Evidence method together with several other approaches, namely document association labels, word association labels, context similarity, cosine similarity, and word2vec, to obtain similarity values between all name entities. Once the similarity values are obtained, the data are grouped using a clustering algorithm. The system is expected to perform well as measured by precision, recall, and accuracy computed from a confusion matrix.
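The similarity, grouping, and evaluation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each name mention is represented as a bag of hypothetical evidence tokens (placeholders such as `chain:n1` or `topic:t1`), similarity is plain cosine similarity over those bags (no word2vec), the unspecified clustering algorithm is replaced by simple single-link grouping with an assumed threshold of 0.5, and precision, recall, and accuracy are computed over mention pairs. All data below are invented toy values.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of evidence tokens."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_mentions(features, threshold):
    """Single-link grouping (an assumed stand-in for the clustering step):
    mentions linked by a similarity >= threshold end up in the same cluster."""
    parent = list(range(len(features)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(features)), 2):
        if cosine_similarity(features[i], features[j]) >= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(features))]


def pairwise_scores(predicted, gold):
    """Precision, recall, and accuracy from a confusion matrix over mention pairs."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(gold)), 2):
        same_pred = predicted[i] == predicted[j]
        same_gold = gold[i] == gold[j]
        if same_pred and same_gold:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Toy data: four mentions of one ambiguous name, each with hypothetical evidence.
mentions = [
    Counter(["chain:n1", "chain:n2", "topic:t1", "region:r1"]),
    Counter(["chain:n1", "chain:n2", "topic:t2", "region:r1"]),
    Counter(["chain:n7", "chain:n8", "topic:t1", "region:r2"]),
    Counter(["chain:n7", "chain:n8", "topic:t3", "region:r2"]),
]
gold = ["person_1", "person_1", "person_2", "person_2"]  # assumed ground truth

labels = cluster_mentions(mentions, threshold=0.5)
print(labels, pairwise_scores(labels, gold))
```

Pairwise evaluation is used here because cluster identifiers are arbitrary; comparing whether each pair of mentions is grouped together avoids having to align predicted clusters with gold identities.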


Disambiguation, Entity Name, Mining Evidence, Sahih Bukhari, Similarity






Copyright (c) 2020 Paradigma - Jurnal Komputer dan Informatika

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Published by LPPM Universitas Bina Sarana Informatika

Jl. Kramat Raya No.98, Kwitang, Kec. Senen, Kota Jakarta Pusat, DKI Jakarta 10450
Telephone: 021-21231170, ext. 704 / 705
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License