Removing the effects of the site in brain imaging machine-learning – Measurement and extendable benchmark

Aleix Solanes; Corentin J Gosling; Lydia Fortea; María Ortuño; Elisabet Lopez-Soley; Sara Llufriu; Santiago Madero; Eloy Martinez-Heras; Edith Pomarol-Clotet; Elisabeth Solana; Eduard Vieta; Joaquim Radua

NeuroImage (Jan 2023)

Affiliations

Aleix Solanes: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; Department of Psychiatry and Forensic Medicine, Autonomous University of Barcelona, Barcelona, Spain
Corentin J Gosling: DysCo Lab, Paris Nanterre University, Nanterre, France; Laboratoire de Psychopathologie et Processus de Santé, Université de Paris, Paris, France
Lydia Fortea: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; Biomedical Network Research Centre on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain; University of Barcelona, Barcelona, Spain
María Ortuño: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
Elisabet Lopez-Soley: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; University of Barcelona, Barcelona, Spain; Center of Neuroimmunology, Laboratory of Advanced Imaging in Neuroimmunological Diseases, Hospital Clinic Barcelona
Sara Llufriu: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; University of Barcelona, Barcelona, Spain; Center of Neuroimmunology, Laboratory of Advanced Imaging in Neuroimmunological Diseases, Hospital Clinic Barcelona
Santiago Madero: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; Biomedical Network Research Centre on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain; University of Barcelona, Barcelona, Spain; Barcelona Bipolar Disorders and Depressive Unit, Institute of Neurosciences, Hospital Clinic, Barcelona, Spain
Eloy Martinez-Heras: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; University of Barcelona, Barcelona, Spain; Center of Neuroimmunology, Laboratory of Advanced Imaging in Neuroimmunological Diseases, Hospital Clinic Barcelona
Edith Pomarol-Clotet: Biomedical Network Research Centre on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain; FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain; Benito Menni CASM, Sant Boi de Llobregat, Barcelona, Spain
Elisabeth Solana: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; University of Barcelona, Barcelona, Spain; Center of Neuroimmunology, Laboratory of Advanced Imaging in Neuroimmunological Diseases, Hospital Clinic Barcelona
Eduard Vieta: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; Biomedical Network Research Centre on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain; University of Barcelona, Barcelona, Spain; Barcelona Bipolar Disorders and Depressive Unit, Institute of Neurosciences, Hospital Clinic, Barcelona, Spain
Joaquim Radua: Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; Biomedical Network Research Centre on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain; University of Barcelona, Barcelona, Spain; Department of Psychosis Studies, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, United Kingdom; Centre for Psychiatric Research and Education, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Corresponding author at: IDIBAPS: Institut d'Investigacions Biomediques August Pi i Sunyer, Rosselló 149, 08036 Barcelona, Spain.

Journal volume & issue: Vol. 265, p. 119800

Abstract

Multisite machine-learning neuroimaging studies, such as those conducted by the ENIGMA Consortium, need to remove the differences between sites to avoid effects of the site (EoS) that may prevent or spuriously help the creation of prediction models, leading to impoverished or inflated prediction accuracy. Unfortunately, we have shown earlier that current Methods Aiming to Remove the EoS (MAREoS, e.g., ComBat) cannot remove complex EoS (e.g., those including interactions between regions), and complex EoS may bias the accuracy. To overcome this hurdle, groups worldwide are developing novel MAREoS. However, their effectiveness cannot be assessed from the accuracy alone, because EoS may either inflate or shrink the accuracy, and a MAREoS may both remove the EoS and degrade the data. In this work, we propose a strategy to measure the effectiveness of a MAREoS in removing different types of EoS. For MAREoS developers, we provide two multisite MRI datasets with only simple true effects (i.e., detectable by most machine-learning algorithms) and two with only simple EoS (i.e., removable by most MAREoS). First, developers should use these datasets to fit machine-learning algorithms after applying the MAREoS. Second, they should use the formulas we provide to calculate the relative accuracy change associated with the MAREoS in each dataset and derive an EoS-removal effectiveness statistic. We also offer similar datasets and formulas for complex true effects and EoS that include first-order interactions. For machine-learning researchers, we provide an extendable benchmark website to show: a) the types of EoS they should remove for each given machine-learning algorithm and b) the effectiveness of each MAREoS for removing each type of EoS. Notably, a MAREoS that can only remove simple EoS may suffice for simple machine-learning algorithms, whereas more complex algorithms need a MAREoS that can remove more complex EoS. For instance, ComBat removes all simple EoS as needed for predictions based on simple lasso algorithms, but it leaves residual complex EoS that may bias the predictions based on standard support vector machine algorithms.
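The idea of a "relative accuracy change" on a dataset that contains only EoS can be illustrated with a toy simulation. The sketch below is not the paper's method: the data are synthetic, per-site mean centering is only a stand-in for a real MAREoS such as ComBat, the classifier is a simple nearest-centroid rule, and the function `relative_accuracy_change` is a hypothetical definition chosen for illustration (the paper's actual formulas are not reproduced in this abstract). All names in the code are assumptions.

```python
import random
from statistics import mean

def simulate(n_per_site=2000, seed=0):
    """Two sites, one feature, NO true effect: the feature depends only on site.
    Patient prevalence differs by site, so the site offset leaks label information."""
    rng = random.Random(seed)
    rows = []
    for site, offset, p_patient in (("A", 2.0, 0.8), ("B", 0.0, 0.2)):
        for _ in range(n_per_site):
            label = 1 if rng.random() < p_patient else 0
            rows.append((site, rng.gauss(0.0, 1.0) + offset, label))
    rng.shuffle(rows)
    return rows

def center_by_site(rows):
    """Toy stand-in for a MAREoS (assumption): subtract each site's mean feature value."""
    site_mean = {s: mean(x for site, x, _ in rows if site == s)
                 for s in {site for site, _, _ in rows}}
    return [(site, x - site_mean[site], y) for site, x, y in rows]

def holdout_accuracy(rows):
    """Nearest-centroid classifier fitted on the first half, scored on the second half."""
    half = len(rows) // 2
    train, test = rows[:half], rows[half:]
    m0 = mean(x for _, x, y in train if y == 0)
    m1 = mean(x for _, x, y in train if y == 1)
    hits = sum((abs(x - m1) < abs(x - m0)) == (y == 1) for _, x, y in test)
    return hits / len(test)

def relative_accuracy_change(acc_before, acc_after):
    """Hypothetical definition for illustration, not the paper's formula."""
    return (acc_after - acc_before) / acc_before

data = simulate()
acc_raw = holdout_accuracy(data)                         # inflated by the site offset
acc_harmonized = holdout_accuracy(center_by_site(data))  # should fall toward chance
change = relative_accuracy_change(acc_raw, acc_harmonized)
print(f"raw={acc_raw:.3f} harmonized={acc_harmonized:.3f} change={change:+.3f}")
```

Because this dataset has no true effect, any accuracy above chance before harmonization comes from the site, and a negative change after harmonization indicates that the simple EoS was removed. A mean shift of this kind is a simple EoS; an EoS involving interactions between features would not be removed by this centering step, which is the limitation the abstract describes for current MAREoS.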

Published in NeuroImage

ISSN: 1053-8119 (Print); 1095-9572 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry
Website: https://www.journals.elsevier.com/neuroimage
