TY - JOUR
AU - Salih, Niyaz
PY - 2022/12/30
Y2 - 2024/03/29
TI - Semantic-Based K-Means Clustering for IMDB Top 100 Movies
JF - Journal of Applied Science and Technology Trends
JA - JASTT
VL - 3
IS - 02
SE -
DO - 10.38094/jastt302138
UR - https://jastt.org/index.php/jasttpath/article/view/138
SP - 112 - 115
AB - The volume of textual documents on the internet is growing rapidly. Electronic structured databases archive offline and online documents, e-mails, webpages, blogs, and social network posts. Without appropriate ranking and clustering, and with only nonspecific classification, it is difficult to retain and access these documents. K-means is a frequently used clustering method. However, the distance-based K-means method still has flaws in determining the proximity of meaning, or semantics, between data items. To get around this issue, semantic similarity can be estimated by measuring the level of similarity between objects in a cluster. This research provides a method for clustering documents based on semantic similarity. The approach is carried out by defining document synopses from the IMDB and Wikipedia databases using the NLTK dictionary. We provide a semantic-based K-means clustering approach that assesses not only the similarity of the data represented as a vector space model with TF-IDF, but also the semantic similarity of the data. Using precision, recall, and F-measure, we demonstrate how well the semantic-based K-means clustering technique works on experimental findings from the IMDB and Wikipedia top 100 movies datasets.
ER -