Universe (Sep 2022)

A Preliminary Study of Large Scale Pulsar Candidate Sifting Based on Parallel Hybrid Clustering

  • Zhi Ma,
  • Zi-Yi You,
  • Ying Liu,
  • Shi-Jun Dang,
  • Dan-Dan Zhang,
  • Ru-Shuang Zhao,
  • Pei Wang,
  • Si-Yao Li,
  • Ai-Jun Dong

DOI
https://doi.org/10.3390/universe8090461
Journal volume & issue
Vol. 8, no. 9
p. 461

Abstract

Pulsar candidate sifting is an essential part of pulsar analysis pipelines for discovering new pulsars. To address the problem of mining the large volume of pulsar data produced by the Five-hundred-meter Aperture Spherical radio Telescope (FAST), a parallel pulsar candidate sifting algorithm based on semi-supervised clustering is proposed. It adopts a hybrid clustering scheme based on density hierarchy and a partition method, combined with a Spark-based parallel model and a sliding-window-based partition strategy. Experiments on two datasets, HTRU2 (from the High Time Resolution Universe Survey) and AOD-FAST (Actual Observation Data from FAST), show that the algorithm identifies pulsars effectively: on HTRU2, the Precision and Recall are 0.946 and 0.905, and on AOD-FAST they are 0.787 and 0.994, respectively. The running time on both datasets is also significantly reduced compared with serial execution. The proposed algorithm thus offers a feasible approach to astronomical data mining of FAST observations.
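
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how Precision and Recall are conventionally computed for a binary pulsar/non-pulsar labelling. The function name `precision_recall` and the toy label lists are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = pulsar."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (illustrative only, not from HTRU2 or AOD-FAST).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Read this way, a high Recall with a lower Precision, as reported on AOD-FAST, means that few true pulsars are missed at the cost of more false candidates passing the sift.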

Keywords