Neural network developed by a TUM team uses all the spectral information for identification, missing fewer proteins and making 100 times fewer mistakes than conventional analysis of mass spectrometry data
Using artificial intelligence, researchers at the Technical University of Munich (TUM) have succeeded in making the mass analysis of proteins from any organism significantly faster than before and almost error-free. This new approach is set to bring about considerable change in the field of proteomics, as it can be applied in both basic and clinical research.
The genome of any organism contains the blueprints for thousands of proteins which control almost all the functions of life. Defective proteins lead to serious diseases, such as cancer, diabetes or dementia. Therefore, proteins are also very important targets for drugs.
To better understand life processes and diseases and develop more appropriate therapies, it is necessary for as many proteins as possible to be analysed simultaneously. At present, mass spectrometry is used in order to determine the type and quantity of proteins in a biological system. However, the current methods of data analysis continue to produce many mistakes.
A team at the Technical University of Munich led by bioinformatics scientist Mathias Wilhelm and biochemist Bernhard Küster, Professor of Proteomics and Bioanalytics, has now succeeded in using proteomic data to train a neural network in such a way that it is able to recognise proteins much more quickly and with almost no errors.
Mass spectrometers do not measure proteins directly. They analyse smaller parts consisting of amino acid sequences with up to 30 building blocks. The measured spectra of these chains are compared with databases in order to assign them to a specific protein. However, the evaluation software can only use part of the information that the spectra contain. Therefore, certain proteins are not recognised or are recognised incorrectly.
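To picture what using only part of the spectral information can mean, consider the following toy sketch in Python. It is not the Prosit software or any real search engine: the peak lists are invented for illustration, and the function names (`match_peaks`, `shared_peak_count`, `cosine_score`) and the 0.02 tolerance are assumptions. The sketch compares one measured spectrum with two hypothetical candidates, first by counting peaks at matching positions only, then by also comparing peak intensities.

```python
import math

# Toy spectra as lists of (m/z, relative intensity). All values are invented.
measured    = [(175.12, 0.9), (288.20, 0.2), (401.29, 1.0), (514.37, 0.1)]
candidate_a = [(175.12, 1.0), (288.20, 0.3), (401.29, 0.9), (514.37, 0.1)]
candidate_b = [(175.12, 0.1), (288.20, 1.0), (401.29, 0.2), (514.37, 0.9)]


def match_peaks(measured, candidate, tol=0.02):
    """Pair each candidate peak with the closest measured peak within tol.

    Returns (candidate_intensity, measured_intensity) pairs; the measured
    intensity is 0.0 when no measured peak lies within the tolerance.
    """
    pairs = []
    for mz_c, int_c in candidate:
        best = None
        for mz_m, int_m in measured:
            if abs(mz_m - mz_c) <= tol:
                if best is None or abs(mz_m - mz_c) < abs(best[0] - mz_c):
                    best = (mz_m, int_m)
        pairs.append((int_c, best[1] if best else 0.0))
    return pairs


def shared_peak_count(measured, candidate, tol=0.02):
    """Score using peak positions only: how many candidate peaks were found."""
    return sum(1 for _, int_m in match_peaks(measured, candidate, tol) if int_m > 0)


def cosine_score(measured, candidate, tol=0.02):
    """Score using positions and intensities: cosine similarity of matched peaks."""
    pairs = match_peaks(measured, candidate, tol)
    dot = sum(c * m for c, m in pairs)
    norm_c = math.sqrt(sum(c * c for c, _ in pairs))
    norm_m = math.sqrt(sum(m * m for _, m in pairs))
    if norm_c == 0 or norm_m == 0:
        return 0.0
    return dot / (norm_c * norm_m)


for name, cand in (("A", candidate_a), ("B", candidate_b)):
    print(name, shared_peak_count(measured, cand), round(cosine_score(measured, cand), 3))
```

In this toy case both candidates match all four peak positions, so a score based on positions alone cannot tell them apart, while the intensity-aware score clearly favours candidate A. When intensities are ignored, a wrong candidate can look just as convincing as the right one.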
"This is a serious problem," explains Küster. The neural network developed by the TUM team uses all the information of the spectra for the process of identification. "We miss fewer proteins and make 100 times fewer mistakes," says Bernhard Küster.
"Prosit", as the researchers call the AI software, is "applicable to all organisms in the world, even if their proteomes have never been examined before," explains Mathias Wilhelm. "This enables research which was previously inconceivable."
With the help of 100 million mass spectra, the algorithm has been so extensively trained that it can be used for all common mass spectrometers without any additional training. "Our system is the global leader in this field," says Küster.
Clinics, biotech companies, pharmaceutical companies and research institutes are using high-performance devices of this kind; the market is already worth billions. With "Prosit", it will be possible to develop even more powerful instruments in the future. Researchers and physicians will also be able to search for biomarkers in patients' blood or urine more quickly and more reliably, and to monitor the effectiveness of therapies.
The researchers also have high hopes for fundamental research. "The method can be used to track down new regulatory mechanisms in cells," says Küster. "We hope to gain a considerable amount of knowledge here, which, in the medium and long term, will be reflected in the treatment of diseases suffered by humans, animals and plants."
Wilhelm also expects that "AI methods such as Prosit will soon change the field of proteomics, as they can be used in almost every area of protein research".