Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK; Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; Corresponding author
Ivan Tomic
Deep Medicine, Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, UK; Corresponding author
Levi Waldron
Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA; Institute for Implementation Science and Population Health, City University of New York, New York, NY, USA
Ludwig Geistlinger
Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA; Institute for Implementation Science and Population Health, City University of New York, New York, NY, USA
Max Kuhn
RStudio, PBC, Boston, MA, USA
Rachel L. Spreng
Duke Human Vaccine Institute, Duke University, Durham, NC, USA
Lindsay C. Dahora
Duke Human Vaccine Institute, Duke University, Durham, NC, USA
Kelly E. Seaton
Duke Human Vaccine Institute, Duke University, Durham, NC, USA
Georgia Tomaras
Duke Human Vaccine Institute, Duke University, Durham, NC, USA
Jennifer Hill
Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK
Niharika A. Duggal
MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Institute of Inflammation and Ageing, University of Birmingham Research Labs, Birmingham, UK
Ross D. Pollock
Centre for Human and Applied Physiological Sciences, King's College London, UK
Norman R. Lazarus
Centre for Human and Applied Physiological Sciences, King's College London, UK
Stephen D.R. Harridge
Centre for Human and Applied Physiological Sciences, King's College London, UK
Janet M. Lord
MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Institute of Inflammation and Ageing, University of Birmingham Research Labs, Birmingham, UK; NIHR Birmingham Biomedical Research Centre, University Hospital Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
Purvesh Khatri
Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
Andrew J. Pollard
Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, UK
Mark M. Davis
Institute of Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA; Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA; Corresponding author
Summary: Data analysis and knowledge discovery have become increasingly important in biology and medicine as biological datasets grow more complex, but the sophisticated programming skills and in-depth understanding of algorithms they require pose barriers that keep most biologists and clinicians from performing such research. We have developed modular open-source software, SIMON, to facilitate the application of 180+ state-of-the-art machine-learning algorithms to high-dimensional biomedical data. With an easy-to-use graphical user interface, standardized pipelines, and an automated approach to machine learning and other statistical analysis methods, SIMON helps to identify optimal algorithms and provides a resource that empowers both non-technical and technical researchers to identify crucial patterns in biomedical data.
The Bigger Picture: Over the past years, technological advances have enabled the generation of large amounts of data at multiple scales. The integration of high-dimensional data is particularly important in the biomedical sciences, as such data can be used to identify biological mechanisms and predict clinical outcomes well in advance of their occurrence. Because powerful analytical tools that the average biomedical researcher can use are lacking, translation of such knowledge has been extremely slow. We have developed open-source software, SIMON, to facilitate the application of machine learning to high-dimensional biomedical data. In SIMON, analysis is performed using an intuitive graphical user interface and a standardized, automated machine-learning approach, allowing non-technical researchers to identify patterns, extract knowledge from high-dimensional data, and build high-quality predictive models.