Zhipu Xuebao (May 2024)
Development of a Computing Platform for Data Independent Acquisition Tensor Data
Abstract
Data independent acquisition tensor (DIAT) is a specialized mass spectrometry data format designed for processing and analyzing proteomics data acquired by data independent acquisition (DIA). Compared with conventional approaches, the DIAT format simplifies data visualization and supports efficient training of deep learning models, which makes it convenient for users. However, the DIAT format was introduced only recently. As a result, the principles and methods behind it are described mainly in the specialized literature. In addition, the lack of a robust software platform supporting DIAT-format data is a key obstacle to its wider application across scientific domains. To address these problems, this study introduced a software tool for handling and analyzing DIAT data. Built on the PyQt framework, the software provided four DIAT-related functions: DIAT data format conversion, visualization of DIAT data, training of classification models on DIAT data, and prediction of DIAT data labels using trained models. Each function was tested with real mass spectrometry data. The test results showed that the software enabled users to carry out complex DIAT data analysis with little effort, which lowered the entry barrier for users and brought new momentum to DIA mass spectrometry data analysis.
In summary, DIAT is a new mass spectrometry data format tailored for fast processing and analysis of proteomics data acquired by DIA. This study introduced a software tool, built on the PyQt framework, that aims to overcome the current limitations of the still-new DIAT format. The software was validated by testing on real data; it facilitated complex DIAT data analysis and made DIA mass spectrometry data analysis more accessible. This approach holds promise for advancing research and applications in proteomics.
Keywords