Ain Shams Engineering Journal (Jun 2014)
Arabic summarization in Tw
Abstract
Twitter, an online microblogging service, enables its users to write and read text-based posts known as “tweets”. It has become one of the most commonly used social networks. However, an important problem is that the tweets returned when searching for a topic phrase are sorted only by recency, not by relevance. This forces the user to read through the tweets manually in order to understand what they are primarily saying about the particular topic. Some strategies have been developed for summarizing English microblogs, but Arabic microblog summarization is still an active research area. This paper presents a machine-learning-based solution for summarizing Arabic microblogging posts, and more specifically posts written in the Egyptian dialect. The goal is to produce a short summary of the Arabic tweets related to a specific topic with less time and effort. The proposed strategy is evaluated, and the results are compared with those obtained by well-known multi-document summarization algorithms, including SumBasic, TF-IDF, PageRank, and MEAD, as well as with human summaries.
Keywords