Data in Brief (Oct 2023)

ANAD: Arabic news article dataset

  • Mohammed Altamimi,
  • Abdulaziz M. Alayba

Journal volume & issue
Vol. 50
p. 109460

Abstract

In this paper, we present a Modern Standard Arabic dataset of news articles collected over a one-year period, from 01/01/2021 to 12/31/2021. In total, over 500,000 articles were collected from 12 Arabic news websites, selected to cover a variety of topics, including sports, economy, local news, politics, tech, tourism, entertainment, cars, health, and art. The dataset is intended to let data scientists explore and experiment in the field of natural language processing, and it can also be used to develop machine learning and deep learning models that classify articles by topic. The dataset is available for download at https://github.com/alaybaa/ArabicArticlesDataset/tree/main.

Keywords