Semantic-based tweet clustering and summarization technique using folksonomy and user influence
Title: Semantic based Clustering and Summarization of Tweets exploiting Folksonomy and User Influence
Author: Lee Dong-ho
Keywords: Twitter; Clustering; Document Summarization; Tag Cluster; K-Means
Issue Date: 2015-08
Publisher: Korea Information Science Society
Citation: Database Research, v. 31, no. 2, pp. 104-119
Abstract:
Recently, with the development of the Internet and the popularization of smart devices, many users can easily access a vast amount of information. As a result, the use of social networking services such as Twitter and Facebook has increased rapidly, and information on a wide range of topics is being created. However, it takes considerable time and effort for a user to find the tweets they want among the vast number being created. This paper proposes a semantic-based K-means clustering algorithm for clustering this large volume of tweets. In addition to the similarity between data represented in the vector model, as measured by the existing K-means clustering algorithm, the proposed algorithm also considers the semantic similarity between the data when clustering. To extract the most meaningful tweets from each cluster, we analyze the influence of each Twitter user and propose a tweet summarization technique that builds on our previously proposed document summarization technique. Lastly, an experiment on the Twitter dataset provided by RepLab2013 demonstrates the superiority of the semantic-based K-means clustering algorithm and the tweet summarization technique.

Recently, with the development of Internet technologies and the spread of smart devices, many users have been able to easily access a large amount of information. For this reason, the use of social network services such as Twitter and Facebook has been increasing rapidly, creating massive amounts of data on various topics. However, finding the necessary information in massively generated tweets is hard and requires too much time and effort, because users must manually review all of the tweets. In this paper, we propose a semantic based K-Means clustering algorithm for clustering massive numbers of tweets, which measures not only the similarity between data represented in the vector space model but also the semantic similarity between the data. To extract the most meaningful tweets in each cluster, we also propose a new tweet summarization technique that analyzes user information to measure the influence of users and exploits our previously proposed document summarization method. Finally, through experimental results on the RepLab2013 Twitter dataset, we show the superiority of the semantic based K-Means clustering algorithm and the tweet summarization technique.
URI:
https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002018597
https://repository.hanyang.ac.kr/handle/20.500.11754/186039
ISSN: 1598-9798
Appears in Collections: ETC[S] > ETC
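The abstract describes clustering that combines vector space similarity with semantic similarity, but this record does not give the formulas. The following is only a minimal sketch of that general idea, not the paper's method. It assumes that each tweet is a (text, tag set) pair, that semantic similarity can be approximated by Jaccard overlap of tags, and that the two scores are blended with a weight `alpha`. All function names, the tag-overlap measure, and `alpha` are hypothetical. Cluster representatives here are themselves tweets, so the assignment step is closer to a medoid-style variant than to true centroid averaging.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_overlap(tags_a: set, tags_b: set) -> float:
    """Jaccard overlap of two tag sets (assumed stand-in for semantic similarity)."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


def combined_similarity(tweet_a, tweet_b, alpha=0.5):
    """Blend term-vector and tag-based similarity; alpha is an assumed weight."""
    text_a, tags_a = tweet_a
    text_b, tags_b = tweet_b
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    lexical = cosine_similarity(vec_a, vec_b)
    semantic = tag_overlap(tags_a, tags_b)
    return alpha * lexical + (1 - alpha) * semantic


def assign_to_nearest(tweets, centers, alpha=0.5):
    """One assignment step: index of the most similar representative per tweet."""
    return [
        max(range(len(centers)),
            key=lambda k: combined_similarity(t, centers[k], alpha))
        for t in tweets
    ]


# Toy data, invented purely for illustration.
tweets = [
    ("new phone battery life", {"tech", "phone"}),
    ("election results tonight", {"politics"}),
]
centers = [
    ("phone battery review", {"tech"}),
    ("election vote count", {"politics"}),
]
labels = assign_to_nearest(tweets, centers)
```

Setting `alpha=1.0` in this sketch reduces the score to plain term-vector cosine similarity, which makes it easy to compare against a purely lexical baseline.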