The Astrophysical Journal (Jan 2024)

Anomaly Detection and Approximate Similarity Searches of Transients in Real-time Data Streams

  • P. D. Aleo,
  • A. W. Engel,
  • G. Narayan,
  • C. R. Angus,
  • K. Malanchev,
  • K. Auchettl,
  • V. F. Baldassare,
  • A. Berres,
  • T. J. L. de Boer,
  • B. M. Boyd,
  • K. C. Chambers,
  • K. W. Davis,
  • N. Esquivel,
  • D. Farias,
  • R. J. Foley,
  • A. Gagliano,
  • C. Gall,
  • H. Gao,
  • S. Gomez,
  • M. Grayling,
  • D. O. Jones,
  • C.-C. Lin,
  • E. A. Magnier,
  • K. S. Mandel,
  • T. Matheson,
  • S. I. Raimundo,
  • V. G. Shah,
  • M. D. Soraisam,
  • K. M. de Soto,
  • S. Vicencio,
  • V. A. Villar,
  • R. J. Wainscoat

DOI
https://doi.org/10.3847/1538-4357/ad6869
Journal volume & issue
Vol. 974, no. 2
p. 172

Abstract

We present Lightcurve Anomaly Identification and Similarity Search (LAISS), an automated pipeline to detect anomalous astrophysical transients in real-time data streams. We deploy our anomaly detection model on the nightly Zwicky Transient Facility (ZTF) Alert Stream via the ANTARES broker, identifying a manageable ∼1–5 candidates per night for expert vetting and coordinating follow-up observations. Our method leverages statistical light-curve and contextual host galaxy features within a random forest classifier, tagging transients of rare classes (spectroscopic anomalies), of uncommon host galaxy environments (contextual anomalies), and of peculiar or interaction-powered phenomena (behavioral anomalies). Moreover, we demonstrate the power of a low-latency (∼ms) approximate similarity search method to find transient analogs with similar light-curve evolution and host galaxy environments. We use analogs for data-driven discovery, characterization, (re)classification, and imputation in retrospective and real-time searches. To date, we have identified ∼50 previously known and previously missed rare transients from real-time and retrospective searches, including but not limited to superluminous supernovae (SLSNe), tidal disruption events, SNe IIn, SNe IIb, SNe I-CSM, SNe Ia-91bg-like, SNe Ib, SNe Ic, SNe Ic-BL, and M31 novae. Lastly, we report the discovery of 325 total transients, all observed between 2018 and 2021 and absent from public catalogs (∼1% of all ZTF Astronomical Transient reports to the Transient Name Server through 2021). These methods enable a systematic approach to finding the “needle in the haystack” in large-volume data streams. Because of its integration with the ANTARES broker, LAISS is built to detect exciting transients in Rubin data.
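
To illustrate the analog search described in the abstract, here is a minimal sketch of a nearest-neighbor lookup over a table of per-transient features. It is not the LAISS implementation. It uses an exact brute-force search as a stand-in for the approximate, low-latency index the abstract mentions, and the function name `find_analogs`, the random feature table, and the choice of Euclidean distance on standardized features are all assumptions made for illustration.

```python
import numpy as np


def find_analogs(features, query_index, k=3):
    """Return indices of the k rows closest to features[query_index].

    Hypothetical helper: exact brute-force search, standing in for an
    approximate nearest-neighbor index.
    """
    # Standardize each column so no single feature dominates the distance.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    scaled = (features - mean) / std

    # Euclidean distance from the query object to every object.
    dists = np.linalg.norm(scaled - scaled[query_index], axis=1)
    dists[query_index] = np.inf  # exclude the query itself

    # Indices of the k smallest distances, nearest first.
    return np.argsort(dists)[:k]


# Toy data (assumption): 200 made-up transients with 5 made-up features each,
# standing in for light-curve and host galaxy features.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5))
features[17] = features[42] + 1e-3  # plant a near-duplicate of object 42

analogs = find_analogs(features, query_index=42, k=3)
```

In this toy setup the planted near-duplicate (row 17) comes back as the closest analog of row 42. An approximate index trades a small amount of recall for much faster queries on large feature tables, which is what makes millisecond-scale lookups feasible.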

Keywords