Maven: a multimodal foundation model for supernova science

Gemma Zhang; Thomas Helfer; Alexander T Gagliano; Siddharth Mishra-Sharma; V Ashley Villar

doi:10.1088/2632-2153/ad990d

Machine Learning: Science and Technology (Jan 2024)

Affiliations

Gemma Zhang: The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Boston, MA, United States of America; Department of Physics, Harvard University, Cambridge, MA 02138, United States of America
Thomas Helfer: Institute for Advanced Computational Science, Stony Brook University, Stony Brook, NY 11794, United States of America
Alexander T Gagliano: The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Boston, MA, United States of America; Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; Center for Astrophysics | Harvard & Smithsonian, 60 Garden Street, MS-16, Cambridge, MA 02138, United States of America
Siddharth Mishra-Sharma: The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Boston, MA, United States of America; Department of Physics, Harvard University, Cambridge, MA 02138, United States of America; Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America
V Ashley Villar: The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Boston, MA, United States of America; Center for Astrophysics | Harvard & Smithsonian, 60 Garden Street, MS-16, Cambridge, MA 02138, United States of America

DOI: https://doi.org/10.1088/2632-2153/ad990d
Journal volume & issue: Vol. 5, no. 4, p. 045069

Abstract

A common setting in astronomy is the availability of a small number of high-quality observations alongside larger amounts of either lower-quality observations or synthetic data from simplified models. Time-domain astrophysics is a canonical example of this imbalance, with the number of supernovae observed photometrically outpacing the number observed spectroscopically by multiple orders of magnitude. At the same time, no data-driven models exist to understand these photometric and spectroscopic observables in a common context. Contrastive learning objectives, which have grown in popularity for aligning distinct data modalities in a shared embedding space, provide a potential solution to extract information from these modalities. We present Maven, the first foundation model for supernova science. To construct Maven, we first pre-train our model to align photometry and spectroscopy from 0.5 M synthetic supernovae using a contrastive objective. We then fine-tune the model on 4702 observed supernovae from the Zwicky Transient Facility. Maven reaches state-of-the-art performance on both classification and redshift estimation, despite the embeddings not being explicitly optimized for these tasks. Through ablation studies, we show that pre-training with synthetic data improves overall performance. In the upcoming era of the Vera C. Rubin Observatory, Maven will serve as a valuable tool for leveraging large, unlabeled and multimodal time-domain datasets.
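
As an illustration of the kind of contrastive objective the abstract refers to, the following is a minimal sketch of a generic CLIP-style symmetric loss that pulls matched photometry and spectroscopy embeddings together and pushes mismatched pairs apart. It is not taken from the paper: the function names, the temperature value, and the use of plain Python lists in place of encoder outputs are all assumptions made for illustration, and the objective actually used to train Maven may differ.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(phot_emb, spec_emb, temperature=0.1):
    """Hypothetical CLIP-style loss over paired embeddings.

    phot_emb[i] and spec_emb[i] are assumed to come from the same supernova.
    The temperature of 0.1 is an illustrative choice, not a value from the paper.
    """
    p = [l2_normalize(v) for v in phot_emb]
    s = [l2_normalize(v) for v in spec_emb]
    # Cosine similarity between every photometry/spectroscopy pair, scaled.
    logits = [
        [sum(a * b for a, b in zip(pi, sj)) / temperature for sj in s]
        for pi in p
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the photometry-to-spectrum and spectrum-to-photometry directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy usage: correctly paired embeddings should score a lower loss than
# the same embeddings with the pairing swapped.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this sketch the embeddings are fixed toy vectors. In a real setting they would be produced by separate learned encoders for each modality, and the loss would be minimized with respect to the encoder parameters.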

Published in Machine Learning: Science and Technology

ISSN: 2632-2153 (Online)
Publisher: IOP Publishing
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://iopscience.iop.org/journal/2632-2153
