Talk show segmentation system based on Twitter using K-medoids clustering algorithm
DOI: https://doi.org/10.24036/jptk.v3i3.15123
Keywords: twitter segmentation, k-medoids clustering, cosine similarity, data transformation, silhouette coefficient
Abstract
Innovation in television talk shows can be a threat: the audience splits into groups, which can lower a program's rating. Ratings influence which companies buy advertising services, and because advertising sales are the largest source of income, a television company with falling ratings can go bankrupt. One way to address this problem is to analyze public opinion, since the results indicate how attractive the program is to the public. However, manual analysis takes a long time and can only be done by a competent person, so a process is needed that produces results quickly and can be run by anyone. This study uses K-Medoids clustering to identify public opinion. Clustering, an unsupervised learning technique, is combined with a labeling process: tweet data from the previous episode are labeled and then used to obtain predicted labels for the other cluster members. Before clustering, the tweet data pass through text preprocessing and are then transformed into numeric form based on word occurrence. The transformed data are clustered using Cosine Similarity as the proximity measure, and the labels of the cluster medoids are applied to the unlabeled tweet data. Evaluated with the Silhouette Coefficient, the clustering scored 0.19. Nevertheless, the method successfully predicted public opinion, achieving an accuracy of 80%.
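The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the six toy tweets, the two sentiment labels, and the initial medoid choice are all hypothetical, and the alternating medoid update used here is one common K-Medoids variant that may differ from the procedure in the study. The tweets are assumed to be already preprocessed and tokenized.

```python
import math
from collections import Counter

def bag_of_words(token_lists):
    """Turn tokenized tweets into word-count vectors over a shared vocabulary."""
    vocab = sorted({w for tokens in token_lists for w in tokens})
    return [[Counter(tokens)[w] for w in vocab] for tokens in token_lists]

def cosine_distance(a, b):
    """1 - cosine similarity; treated as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 if na == 0 or nb == 0 else 1.0 - dot / (na * nb)

def assign_to_medoids(dist, medoids):
    """Index of the nearest medoid (cluster id) for every point."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, medoids, max_iter=100):
    """Alternating K-Medoids on a precomputed distance matrix."""
    medoids = list(medoids)
    for _ in range(max_iter):
        assign = assign_to_medoids(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, a in enumerate(assign) if a == c]
            if not members:            # empty cluster: keep the old medoid
                new_medoids.append(old)
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:     # converged
            break
        medoids = new_medoids
    return medoids, assign_to_medoids(dist, medoids)

def silhouette(dist, assign):
    """Mean silhouette coefficient; assumes at least two clusters."""
    n, clusters, scores = len(dist), set(assign), []
    for i in range(n):
        own = [j for j in range(n) if assign[j] == assign[i] and j != i]
        if not own:                    # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(sum(dist[i][j] for j in range(n) if assign[j] == c) / assign.count(c)
                for c in clusters if c != assign[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n

# Hypothetical, already-preprocessed tweets (not data from the study).
tweets = [
    ["great", "show", "funny", "host"],
    ["funny", "host", "great", "guest"],
    ["great", "funny", "show"],
    ["boring", "show", "bad", "topic"],
    ["bad", "topic", "boring", "guest"],
    ["boring", "bad", "show"],
]
vectors = bag_of_words(tweets)
dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

# Assumption: tweets 0 and 3 serve as the initial medoids.
medoids, assign = k_medoids(dist, medoids=[0, 3])

# Assumption: an annotator has labeled the medoid tweet of each cluster.
medoid_labels = ["positive", "negative"]
predicted = [medoid_labels[c] for c in assign]
score = silhouette(dist, assign)
print(medoids, predicted, round(score, 2))
```

Working from a precomputed distance matrix keeps the clustering step independent of how the tweets were vectorized, so another transformation or proximity measure could be swapped in without touching `k_medoids`.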
License
Copyright (c) 2020 Kharisma Jevi Shafira Sepyanto, Yulison Herry Chrisnanto, Fajri Rakhmat Umbara
This work is licensed under a Creative Commons Attribution 4.0 International License.