Urdu Handwritten Characters Data Visualization and Recognition Using Distributed Stochastic Neighborhood Embedding and Deep Network

Mujtaba Husnain; Malik Muhammad Saad Missen; Shahzad Mumtaz; Dost Muhammad Khan; Mickäel Coustaty; Muhammad Muzzamil Luqman; Jean-Marc Ogier; Hizbullah Khattak; Sikandar Ali; Ali Samad

doi:10.1155/2021/4383037

Complexity (Jan 2021)

Affiliations

Mujtaba Husnain: Department of Information Technology, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
Malik Muhammad Saad Missen: Department of Information Technology, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
Shahzad Mumtaz: Department of Information Technology, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
Dost Muhammad Khan: Department of Information Technology, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
Mickäel Coustaty: L3i Lab, Université of La Rochelle, Av. Michel Crépeau, 17000 La Rochelle, France
Muhammad Muzzamil Luqman: L3i Lab, Université of La Rochelle, Av. Michel Crépeau, 17000 La Rochelle, France
Jean-Marc Ogier: L3i Lab, Université of La Rochelle, Av. Michel Crépeau, 17000 La Rochelle, France
Hizbullah Khattak: Department of Information Technology, Hazara University Mansehra, 21120 Khyber Pakhtunkhwa, Pakistan
Sikandar Ali: Department of Information Technology, The University of Haripur, Khyber Pakhtunkhwa, Pakistan
Ali Samad: Department of Information Technology, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan

DOI: https://doi.org/10.1155/2021/4383037
Journal volume & issue: Vol. 2021

Abstract

In this paper, we use the two-dimensional representation obtained by applying t-distributed Stochastic Neighbor Embedding (t-SNE) to high-dimensional data of Urdu handwritten characters and numerals. The instances of the dataset used in the experiments are grouped into multiple classes according to shape similarity. We performed three tasks in order: (i) we created a state-of-the-art dataset of both Urdu handwritten characters and numerals by inviting native Urdu writers from different social and academic groups, since no publicly available dataset of this type exists to date; (ii) we applied classical dimensionality-reduction and data-visualization approaches, namely Principal Component Analysis (PCA) and Autoencoders (AE), and compared them with t-SNE; and (iii) we used the reduced dimensions obtained through PCA, AE, and t-SNE to recognize Urdu handwritten characters and numerals with a deep network, a Convolutional Neural Network (CNN). The recognition accuracy achieved for Urdu characters and numerals is found to be much better than that of other approaches to the same task. The novelty lies in the fact that the reduced dimensions, rather than the whole multidimensional data, are used for the first time to recognize Urdu handwritten text at the character level. This reduces computation time while maintaining the same accuracy, compared with the processing time of recognition approaches applied to other datasets for the same task using the whole data.
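
The reduce-then-recognize idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the PCA branch (t-SNE and autoencoders are not reproduced), it substitutes a simple nearest-centroid classifier for the CNN, and it runs on synthetic 16-dimensional vectors rather than Urdu character images. All function names, dimensions, and sample counts below are assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pca_2d(rows, iters=200):
    """Return (means, axes): feature means and the top-2 principal axes."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)
    axes = []
    for _ in range(2):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):  # power iteration toward the dominant eigenvector
            w = [dot(row, v) for row in cov]
            norm = dot(w, w) ** 0.5
            v = [x / norm for x in w]
        eigval = dot(v, [dot(row, v) for row in cov])
        axes.append(v)
        # Deflate: remove the found component so the next pass finds another axis.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return means, axes


def transform(rows, means, axes):
    """Project each row onto the two principal axes."""
    return [[dot([x - m for x, m in zip(r, means)], v) for v in axes]
            for r in rows]


def fit_centroids(points, labels):
    """Stand-in classifier (NOT a CNN): one 2-D centroid per class."""
    sums = {}
    for p, y in zip(points, labels):
        s = sums.setdefault(y, [0.0, 0.0, 0])
        s[0] += p[0]
        s[1] += p[1]
        s[2] += 1
    return {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items()}


def predict(points, centroids):
    def dist2(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
    return [min(centroids, key=lambda y: dist2(p, centroids[y])) for p in points]


def make_data(n_per_class, rng, dim=16):
    """Synthetic two-class data (assumption): class 1 is shifted in half the features."""
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            rows.append([rng.gauss(3.0 * label if j < dim // 2 else 0.0, 1.0)
                         for j in range(dim)])
            labels.append(label)
    return rows, labels


rng = random.Random(42)
train_x, train_y = make_data(40, rng)
test_x, test_y = make_data(20, rng)

means, axes = fit_pca_2d(train_x)          # fit the reduction on training data only
train_2d = transform(train_x, means, axes)
test_2d = transform(test_x, means, axes)

centroids = fit_centroids(train_2d, train_y)
pred = predict(test_2d, centroids)
accuracy = sum(p == y for p, y in zip(pred, test_y)) / len(test_y)
print(f"accuracy on 2-D features: {accuracy:.2f}")
```

The classifier here only ever sees two features per sample instead of sixteen, which is the source of the computational saving the abstract refers to. Note that PCA yields a projection that can be reused on unseen samples, whereas standard t-SNE does not provide such a mapping, so a t-SNE variant of this sketch would need a different evaluation setup.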

Published in Complexity

ISSN: 1076-2787 (Print); 1099-0526 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/8503
