OpenMS
PeakPickerWavelet

A tool for peak detection in profile data. Executes the peak picking with the algorithm described in Lange et al. (2006) Proc. PSB-06.

Potential predecessor tools: BaselineFilter, NoiseFilterGaussian, NoiseFilterSGolay
Potential successor tools: any tool operating on MS peak data (in mzML format)

The conversion of the "raw" ion count data acquired by the machine into peak lists for further processing is usually called peak picking. The choice of algorithm should depend mainly on the resolution of the data. As the name implies, the high_res algorithm is suited to high-resolution data, whereas for low-resolution data the wavelet algorithm can resolve highly convoluted and asymmetric signals, separate overlapping peaks, and apply nonlinear optimization.

TOPP_example_signalprocessing_parameters is explained in the TOPP tutorial.

The command line parameters of this tool are:

PeakPickerWavelet -- Finds mass spectrometric peaks in profile mass spectra.
Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/TOPP_PeakPickerWavelet.html
Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  PeakPickerWavelet <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <file>*        Input profile data file  (valid formats: 'mzML')
  -out <file>*       Output peak file  (valid formats: 'mzML')
                     
Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/release/3.0.0/html/TOPP_PeakPickerWavelet.html
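
The invocation described by the usage text can also be scripted. The following Python sketch assembles an argument list from the options shown above (-in, -out, -ini, -threads) and runs the tool only if a PeakPickerWavelet executable is on the PATH and the input file exists. The helper name build_peakpicker_command and the file names are hypothetical and not part of OpenMS.

```python
import os
import shutil
import subprocess


def build_peakpicker_command(in_file, out_file, ini_file=None, threads=1):
    """Assemble a PeakPickerWavelet argument list (hypothetical helper)."""
    # -in and -out are the two mandatory options of the tool.
    cmd = ["PeakPickerWavelet", "-in", in_file, "-out", out_file]
    # -ini is optional and points to a TOPP INI file with further settings.
    if ini_file is not None:
        cmd += ["-ini", ini_file]
    cmd += ["-threads", str(threads)]
    return cmd


# File names are placeholders chosen for illustration.
cmd = build_peakpicker_command("profile.mzML", "picked.mzML", threads=4)
print(" ".join(cmd))

# Execute only when the tool is installed and the input file is present.
if shutil.which(cmd[0]) is not None and os.path.exists("profile.mzML"):
    subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.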

INI file documentation of this tool:

+ PeakPickerWavelet: Finds mass spectrometric peaks in profile mass spectra.
version (default: 3.0.0): Version of the tool that generated this parameters file.
++ 1: Instance '1' section for 'PeakPickerWavelet'
in (input file; allowed: *.mzML): input profile data file
out (output file; allowed: *.mzML): output peak file
write_peak_meta_data (default: false; allowed: true, false): Write additional information about the picked peaks (maximal intensity, left and right area...) into the mzML-file. Attention: this can blow up files, since seven arrays are stored per spectrum!
log: Name of log file (created only when specified)
debug (default: 0): Sets the debug level
threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
no_progress (default: false; allowed: true, false): Disables progress logging to command line
force (default: false; allowed: true, false): Overrides tool-specific checks
test (default: false; allowed: true, false): Enables the test mode (needed for internal use only)
+++ algorithm: Algorithm parameters section
signal_to_noise (default: 1.0; range: 0.0:∞): Minimal signal to noise ratio for a peak to be picked.
centroid_percentage (default: 0.8; range: 0.0:1.0): Percentage of the maximum height that the raw data points must exceed to be taken into account for the calculation of the centroid. If it is 1 the centroid position corresponds to the position of the highest intensity.
peak_width (default: 0.15; range: 0.0:∞): Approximate fwhm of the peaks.
estimate_peak_width (default: false; allowed: true, false): Flag if the average peak width shall be estimated. Attention: when this flag is set, the peak_width is ignored.
fwhm_lower_bound_factor (default: 0.7; range: 0.0:∞): Factor that calculates the minimal fwhm value from the peak_width. All peaks with width smaller than fwhm_lower_bound_factor * peak_width are discarded.
fwhm_upper_bound_factor (default: 20.0; range: 0.0:∞): Factor that calculates the maximal fwhm value from the peak_width. All peaks with width greater than fwhm_upper_bound_factor * peak_width are discarded.
optimization (default: no; allowed: no, one_dimensional, two_dimensional): If the peak parameters position, intensity and left/right width shall be optimized, set optimization to one_dimensional or two_dimensional.
++++ thresholds
peak_bound (default: 10.0; range: 0.0:∞): Minimal peak intensity.
peak_bound_ms2_level (default: 10.0; range: 0.0:∞): Minimal peak intensity for MS/MS peaks.
correlation (default: 0.5; range: 0.0:1.0): Minimal correlation of a peak and the raw signal. If a peak has a lower correlation it is skipped.
noise_level (default: 0.1; range: 0.0:∞): Noise level for the search of the peak endpoints.
search_radius (default: 3; range: 0:∞): Search radius for the search of the maximum in the signal after a maximum in the cwt was found.
++++ wavelet_transform
spacing (default: 1.0e-03; range: 0.0:∞): Spacing of the CWT. Note that the accuracy of the picked peak's centroid position depends on the raw data spacing, i.e., 50% of raw peak distance at most.
++++ optimization
iterations (default: 400; range: 1:∞): Maximal number of iterations for the fitting step.
+++++ penalties
position (default: 0.0; range: 0.0:∞): Penalty term for the fitting of the position: if it differs too much from the initial one it can be penalized.
left_width (default: 1.0; range: 0.0:∞): Penalty term for the fitting of the left width: if the left width differs too much from the initial one during the fitting it can be penalized.
right_width (default: 1.0; range: 0.0:∞): Penalty term for the fitting of the right width: if the right width differs too much from the initial one during the fitting it can be penalized.
height (default: 1.0; range: 0.0:∞): Penalty term for the fitting of the intensity (only used in 2D optimization): if it gets negative during the fitting it can be penalized.
+++++ 2d
tolerance_mz (default: 2.2; range: 0.0:∞): mz tolerance for cluster construction.
max_peak_distance (default: 1.2; range: 0.0:∞): Maximal peak distance in mz in a cluster.
++++ deconvolution
deconvolution (default: false; allowed: true, false): If you want heavily overlapping peaks to be separated, set this value to "true".
asym_threshold (default: 0.3; range: 0.0:∞): If the symmetry of a peak is smaller than asym_threshold it is assumed that it consists of more than one peak and the deconvolution procedure is started.
left_width (default: 2.0; range: 0.0:∞): 1/left_width is the initial value for the left width of the peaks found in the deconvolution step.
right_width (default: 2.0; range: 0.0:∞): 1/right_width is the initial value for the right width of the peaks found in the deconvolution step.
scaling (default: 0.12; range: 0.0:∞): Initial scaling of the cwt used in the separation of heavily overlapping peaks. The initial value is used for charge 1, for higher charges it is adapted to scaling/charge.
+++++ fitting
fwhm_threshold (default: 0.7; range: 0.0:∞): If the FWHM of a peak is higher than 'fwhm_threshold' it is assumed that it consists of more than one peak and the deconvolution procedure is started.
eps_abs (default: 9.999999747378752e-06; range: 0.0:∞): If the absolute error gets smaller than this value the fitting is stopped.
eps_rel (default: 9.999999747378752e-06; range: 0.0:∞): If the relative error gets smaller than this value the fitting is stopped.
max_iteration (default: 10; range: 1:∞): Maximal number of iterations for the fitting step.
++++++ penalties
position (default: 0.0; range: 0.0:∞): Penalty term for the fitting of the peak position: if the position changes more than 0.5 Da during the fitting it can be penalized, as well as discrepancies of the peptide mass rule.
height (default: 1.0; range: 0.0:∞): Penalty term for the fitting of the intensity: if it gets negative during the fitting it can be penalized.
left_width (default: 0.0; range: 0.0:∞): Penalty term for the fitting of the left width: if the left width gets too broad or negative during the fitting it can be penalized.
right_width (default: 0.0; range: 0.0:∞): Penalty term for the fitting of the right width: if the right width gets too broad or negative during the fitting it can be penalized.
++++ SignalToNoiseEstimationParameter
max_intensity (default: -1; range: -1:∞): Maximal intensity considered for histogram construction. By default, it will be calculated automatically (see auto_mode). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities EQUAL/ABOVE 'max_intensity' will not be added to the histogram. If you choose 'max_intensity' too small, the noise estimate might be too small as well. If chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', which increases runtime).
auto_max_stdev_factor (default: 3.0; range: 0.0:999.0): Parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev
auto_max_percentile (default: 95; range: 0:100): Parameter for 'max_intensity' estimation (if 'auto_mode' == 1): auto_max_percentile th percentile
auto_mode (default: 0; range: -1:1): Method to use to determine maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method
win_len (default: 200.0; range: 1.0:∞): Window length in Thomson.
bin_count (default: 30; range: 3:∞): Number of bins for intensity values.
stdev_mp (default: 3.0; range: 0.01:999.0): Multiplier for stdev.
min_required_elements (default: 10; range: 1:∞): Minimum number of elements required in a window (otherwise it is considered sparse).
noise_for_empty_window (default: 1.0e20): Noise value used for sparse windows.
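
As a worked example of how peak_width, fwhm_lower_bound_factor and fwhm_upper_bound_factor interact, the following Python sketch computes the FWHM window implied by the default values and filters a list of made-up peak widths against it. It follows only the parameter descriptions above and is not the OpenMS implementation. The function names and the sample widths are hypothetical.

```python
def fwhm_bounds(peak_width=0.15, lower_factor=0.7, upper_factor=20.0):
    """Return (minimal, maximal) FWHM derived from peak_width."""
    return lower_factor * peak_width, upper_factor * peak_width


def keep_by_fwhm(fwhm_values, peak_width=0.15, lower_factor=0.7, upper_factor=20.0):
    """Drop widths below the lower bound or above the upper bound."""
    lower, upper = fwhm_bounds(peak_width, lower_factor, upper_factor)
    return [w for w in fwhm_values if lower <= w <= upper]


lower, upper = fwhm_bounds()
print(f"accepted FWHM range: {lower:.3f} to {upper:.3f}")

# Made-up widths: the first is too narrow, the last is too broad.
kept = keep_by_fwhm([0.05, 0.12, 1.0, 3.5])
print(kept)
```

With the defaults the window spans roughly 0.105 to 3.0, so widening peak_width shifts both bounds proportionally.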

For the parameters of the algorithm section see the algorithm documentation:
PeakPickerCWT
In the following table, you can find example values of the most important algorithm parameters for different instrument types.
These parameters are not valid for all instruments of that type, but can be used as a starting point for finding suitable parameters.

Parameter                     Q-TOF   LTQ Orbitrap
signal_to_noise               2       0
peak_width ("wavelet" only)   0.1     0.012
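
The starting values in the table can be kept in a small lookup and turned into command-line overrides. The Python sketch below assumes that parameters of the algorithm subsection can be set as -algorithm:<name> <value>, which is the usual TOPP convention for nested parameters. Verify this against the --helphelp output of your installation. The names STARTING_VALUES and algorithm_overrides are hypothetical.

```python
# Example starting values copied from the table above; they are a starting
# point for tuning, not settings valid for every instrument of that type.
STARTING_VALUES = {
    "Q-TOF": {"signal_to_noise": 2, "peak_width": 0.1},
    "LTQ Orbitrap": {"signal_to_noise": 0, "peak_width": 0.012},
}


def algorithm_overrides(instrument):
    """Render the starting values as command-line arguments.

    Assumption: nested parameters are addressed as -algorithm:<name>.
    """
    args = []
    for name, value in STARTING_VALUES[instrument].items():
        args += [f"-algorithm:{name}", str(value)]
    return args


qtof_args = algorithm_overrides("Q-TOF")
print(" ".join(qtof_args))
```

The resulting list can be appended to a PeakPickerWavelet argument list that already contains -in and -out.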

To improve the results of the peak detection on low-resolution data, NoiseFilterSGolay or NoiseFilterGaussian and BaselineFilter can be applied beforehand. For high-resolution data this is not necessary.
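
The preprocessing chain for low-resolution data can be sketched as a list of commands in Python. The sketch makes two assumptions that the text above does not state: the filter tools accept the same -in and -out options as PeakPickerWavelet, and smoothing runs before baseline removal. The function name low_res_pipeline and the intermediate file names are hypothetical.

```python
def low_res_pipeline(raw_file, picked_file, smoother="NoiseFilterGaussian"):
    """Return the commands for smoothing, baseline removal and peak picking."""
    if smoother not in ("NoiseFilterGaussian", "NoiseFilterSGolay"):
        raise ValueError(f"unknown smoothing tool: {smoother}")
    # Intermediate file names are placeholders chosen for illustration.
    smoothed = "smoothed.mzML"
    corrected = "baseline_corrected.mzML"
    return [
        [smoother, "-in", raw_file, "-out", smoothed],
        ["BaselineFilter", "-in", smoothed, "-out", corrected],
        ["PeakPickerWavelet", "-in", corrected, "-out", picked_file],
    ]


steps = low_res_pipeline("profile.mzML", "picked.mzML")
for step in steps:
    print(" ".join(step))
```

Each inner list can be passed to subprocess.run once the tools are installed. The output of one step is the input of the next.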