
PeakLink: A new peptide peak linking method in LC-MS/MS using wavelet and SVM

Mehrab Ghanat Bari, Xuepo Ma and Jianqiu Zhang

Department of Electrical and Computer Engineering, The University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78246, USA.

ABSTRACT

Motivation: In liquid chromatography-mass spectrometry/tandem mass spectrometry (LC-MS/MS), a tandem MS search can provide confident peptide sequence and retention time information, from which the LC-MS peaks of peptides can be located for quantification. However, a peptide’s LC-MS peaks can be located precisely only when the peptide has been randomly selected and identified by tandem MS. To investigate protein expression changes across multiple runs, it is therefore necessary to link peptide peaks in runs with tandem identification to their corresponding peaks in runs without identification. Previously, peptide peaks have been linked based on similarities in retention time, mass, or peak shape after retention time alignment, which corrects mean retention time shifts between runs. However, linking accuracy remains limited, especially for complex samples collected under different conditions. Consequently, large-scale proteomics studies that require comparing the protein expression profiles of hundreds of patients cannot be carried out effectively.

Method: We propose a new method, PeakLink (PL), which uses information in both the time and frequency domains as input to a non-linear support vector machine (SVM) classifier. PL first applies a threshold on retention time to remove candidate corresponding peaks with excessively large elution time shifts, then calculates the correlation between each pair of candidate peaks after removing noise through wavelet transformation. After retention time and peak shape correlation are converted to statistical scores, an SVM classifier is trained and applied to differentiate corresponding from non-corresponding peptide peaks.
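
The first two steps described above (the retention time gate and the correlation of wavelet-denoised peak shapes) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the wavelet family, decomposition depth, or thresholding rule, so the one-level Haar transform with soft thresholding is an assumption, as are the function names, the `(retention_time, intensity_profile)` peak representation, and the toy values. The conversion to statistical scores and the SVM classifier are not shown; in a full pipeline the returned features would be passed on to those stages.

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar denoising (assumed scheme): soft-threshold the details."""
    n = len(signal)
    padded = list(signal) + ([signal[-1]] if n % 2 else [])  # pad to even length
    s = math.sqrt(2.0)
    approx = [(padded[i] + padded[i + 1]) / s for i in range(0, len(padded), 2)]
    detail = [(padded[i] - padded[i + 1]) / s for i in range(0, len(padded), 2)]
    # Soft thresholding shrinks small, noise-dominated detail coefficients.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])  # inverse Haar step
    return out[:n]


def pearson(x, y):
    """Pearson correlation of two profiles, truncated to a common length."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / math.sqrt(vx * vy)


def candidate_features(ref_peak, candidates, max_rt_shift, threshold=0.5):
    """Return (rt, rt_shift, shape_correlation) for candidates passing the gate."""
    ref_rt, ref_profile = ref_peak
    ref_clean = haar_denoise(ref_profile, threshold)
    features = []
    for rt, profile in candidates:
        shift = abs(rt - ref_rt)
        if shift > max_rt_shift:
            continue  # excessively large elution time shift: drop the candidate
        corr = pearson(ref_clean, haar_denoise(profile, threshold))
        features.append((rt, shift, corr))
    return features


# Toy data (hypothetical): an identified peak and three candidates in another run.
ref = (30.0, [0, 1, 4, 9, 4, 1, 0, 0])
cands = [
    (30.4, [0, 1, 4, 9, 4, 1, 0, 0]),  # same shape, small shift
    (45.0, [0, 1, 4, 9, 4, 1, 0, 0]),  # same shape, but outside the RT gate
    (29.8, [9, 0, 0, 1, 0, 0, 9, 0]),  # nearby, but a different shape
]
feats = candidate_features(ref, cands, max_rt_shift=2.0)
for rt, shift, corr in feats:
    print(f"rt={rt:.1f} shift={shift:.2f} corr={corr:.3f}")
```

In this toy case the distant candidate is discarded by the gate, and the two remaining candidates are separated by their shape correlation. The pair of values per candidate (retention time shift and shape correlation) corresponds to the two quantities that the abstract says are converted to statistical scores before classification.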

Results: PL is tested in two challenging cases, in which LC-MS/MS samples are collected from different disease states and from different labs. The test results show a significant improvement in linking accuracy compared with other algorithms.