Signals (Mar 2024)

Large Language Model-Informed X-ray Photoelectron Spectroscopy Data Analysis

  • J. de Curtò
  • I. de Zarzà
  • Gemma Roig
  • Carlos T. Calafate

DOI
https://doi.org/10.3390/signals5020010
Journal volume & issue
Vol. 5, no. 2, pp. 181–201

Abstract

X-ray photoelectron spectroscopy (XPS) remains a fundamental technique in materials science, offering insight into the chemical states and electronic structure of a material. However, interpreting XPS spectra can be complex, requiring deep expertise and often sophisticated curve-fitting methods. In this study, we present a novel approach to the analysis of XPS data that integrates large language models (LLMs), specifically OpenAI’s GPT-3.5/4 Turbo, to provide guidance during the data analysis process. Working in the framework of the CIRCE-NAPP beamline at the CELLS ALBA Synchrotron facility, where data are obtained using ambient pressure X-ray photoelectron spectroscopy (APXPS), we implement robust curve-fitting techniques on APXPS spectra, highlighting complex cases that include overlapping peaks, diverse chemical states, and the presence of noise. After curve fitting, we engage the LLM to facilitate the interpretation of the fitted parameters, leaning on its extensive training data to simulate an interaction resembling expert consultation. The manuscript also presents a real use case utilizing GPT-4 and Meta’s LLaMA-2 and describes the integration of the functionality into the TANGO control system. Our methodology not only offers a fresh perspective on XPS data analysis, but also introduces a new dimension of artificial intelligence (AI) integration into scientific research. It showcases the power of LLMs in enhancing the interpretative process, particularly in scenarios where expert knowledge may not be immediately available. Despite the inherent limitations of LLMs, their potential in materials science research is promising, opening doors to a future in which AI assists in the transformation of raw data into meaningful scientific knowledge.

Keywords