Interpreting Individual Peaks – part 1
If you have ever looked at a mass-spectrometer output file, you may have wondered how it is possible to make sense of all of these peaks. I mean, there really are a lot of them! Here, we will talk a bit about the basic rules which govern the behaviour of peptides in MS1 and MS2 spectra, and which are used to identify the boundaries of groups of peaks belonging to the same peptide.
First, though, we may need to define the notion of “peptide”. “Oh, that’s easy”, I hear you say. A peptide is a bit of protein – usually a digestion product, but it can also be an unusually short protein, right? The limit is usually set, arbitrarily, at 50 amino acids, I believe. Yes, true[1], but in proteomics we need to add a few nuances: when proteomics people mention a “peptide”, they usually mean all of the copies of the same peptide currently present in the sample. Usually, especially when speaking of behaviour inside a mass spectrometer, a “peptide” is understood as a collection of molecules sharing not just a single primary sequence, but also the same post-translational modification state and charge. I sometimes refer to these as “entities” or “species”.
This does not mean that we never fall back on using “peptide” to refer to a single copy of a peptide. That would be too simple. In fact, in the lines below I do on a few occasions use the word peptide to refer to a single copy. Context, as always, is everything.
Now, let us consider a single peptide and its behaviour in the different scans of a standard Data-Dependent Acquisition experiment:
[1] Although, with my slightly OCD mind, I have never really liked the idea of distinguishing between proteins and peptides: it is just too blurry, where one ends and the other starts is never really clear.
MS1 level
How many peaks would we expect from a single peptide as defined above? Well, it is one sequence in a single charge and modification state, so maybe, 1 peak? Wrong. It turns out I have left out another source of variation that will affect the number of peaks, namely, isotopic composition. Indeed, all isotopically distinct versions of a peptide as defined above are considered as variants of the same peptide.
Now, as you surely know, each element exists in nature as one main light isotope and several heavier isotopes, which differ from the standard atom by one or more additional neutrons. Each additional neutron results in a mass shift of approximately +1 Da (= 1 “amu”, atomic mass unit)[1]. The probability – which we will note PL – of a given atom in a peptide being one of its heavier isotopes is low, usually in the 0.01 range. However, even a short peptide easily contains a hundred or more atoms. If we use the very rough approximation of PL = 0.01, then the chart below represents the probabilities of peptides of 100, 200, 300, 400 or 500 atoms having between +0 and +10 neutrons:
[1] Thanks to the strong nuclear force, the precise value depends on atomic nucleus context and how much the additional neutron changes nuclear stability. Amazingly, Orbitraps are able to resolve the minuscule difference between +1 neutron (hydrogen -> deuterium), +1 neutron (12C -> 13C) or +1 neutron (14N -> 15N). This remarkable technological feat makes NeuCode SILAC or TMT-10plex labelling possible.
What the hell?!?! I am doing graphs in Excel? I feel like I just cheated on ggplot2!
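For those who would rather not open Excel, here is a minimal Python sketch of the same kind of calculation. It assumes that the chart comes from a plain binomial model: every atom independently has probability PL = 0.01 of being heavy, and every heavy atom adds exactly one neutron. That is already a simplification, since some heavy isotopes add two. The names `P_HEAVY` and `extra_neutron_probs` are made up for this illustration.

```python
from math import comb

# Very rough per-atom probability of a heavy isotope (PL in the text).
P_HEAVY = 0.01


def extra_neutron_probs(n_atoms, max_extra=10, p=P_HEAVY):
    """Binomial probability that a peptide of n_atoms carries k extra
    neutrons, for k = 0..max_extra (assumes one neutron per heavy atom)."""
    return [
        comb(n_atoms, k) * p**k * (1 - p) ** (n_atoms - k)
        for k in range(max_extra + 1)
    ]


# One row per peptide size, one column per number of extra neutrons.
for n_atoms in (100, 200, 300, 400, 500):
    probs = extra_neutron_probs(n_atoms)
    print(n_atoms, " ".join(f"{q:.3f}" for q in probs))
```

Under this model, the most likely number of extra neutrons grows with peptide size: for 100 atoms it is +1, and for 500 atoms it is +5, so the tallest peak of the envelope moves away from the monoisotopic peak as peptides get bigger.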
(the numbers after each amino acid letter are the number of additional neutrons)
Since for each peptide all other amino acids will still be subject to the same statistical distribution as above, this means that the labelling will very slightly modify the shape of each envelope (we know the isotope present for all atoms from the labelled amino acids), but most importantly it will shift them to the right: the monoisotopic peak for a peptide containing a single R6 or R10 and with charge +2 will be shifted by 3 and 5 Th, respectively, to the right relative to an R0 peptide. Thus, a mixture of the same peptide labelled with R0, R6 and R10 will materialise as three distinguishable isotopic envelopes:
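The 3 and 5 Th figures are simply the added mass divided by the charge. As a quick check, here is a small Python sketch of that arithmetic. It assumes the usual SILAC label compositions (R6 = six 13C atoms, R10 = six 13C plus four 15N atoms) and uses the actual isotope mass differences rather than a flat +1 Da per neutron. The names `LABELS` and `label_mz_shift` are hypothetical and only there for illustration.

```python
# Mass difference between heavy and light isotope, in Da.
DELTA_13C = 1.003355  # 13C minus 12C
DELTA_15N = 0.997035  # 15N minus 14N

# Assumed number of (13C, 15N) atoms in each labelled arginine.
LABELS = {"R0": (0, 0), "R6": (6, 0), "R10": (6, 4)}


def label_mz_shift(label, charge, n_labelled=1):
    """m/z shift (in Th) of the monoisotopic peak relative to the R0 peptide,
    for a peptide carrying n_labelled copies of the given arginine label."""
    n_13c, n_15n = LABELS[label]
    mass_shift = n_labelled * (n_13c * DELTA_13C + n_15n * DELTA_15N)
    return mass_shift / charge


# Shifts for a doubly charged peptide containing a single labelled arginine.
for label in LABELS:
    print(label, round(label_mz_shift(label, charge=2), 3))
```

The same function shows why charge matters when reading these envelopes: at charge +3 the same labels shift the peaks by only about 2 and 3.3 Th.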
MS2 level
At the MS2 level, things become waaaay more complicated because of fragmentation. So in order for this blog entry to stay a reasonable size, I will delay this until after the Christmas break.