OpenMS
PeakPickerHiRes

A tool for peak detection in profile data. Executes peak picking with the high_res algorithm.

potential predecessor tools    → PeakPickerHiRes →    potential successor tools
BaselineFilter                                        any tool operating on MS peak data (in mzML format)
NoiseFilterGaussian
NoiseFilterSGolay

Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).

The conversion of the "raw" ion count data acquired by the machine into peak lists for further processing is usually called peak picking or centroiding. The choice of the algorithm should mainly depend on the resolution of the data. As the name implies, the high_res algorithm is fit for high resolution (Orbitrap or FTICR) data.

Example signal processing parameters (TOPP_example_signalprocessing_parameters) are explained in the TOPP tutorial.

The command line parameters of this tool are:

PeakPickerHiRes -- Finds mass spectrometric peaks in profile mass spectra.
Full documentation: http://www.openms.de/doxygen/release/3.1.0/html/TOPP_PeakPickerHiRes.html
Version: 3.1.0 Oct 18 2023, 10:27:18, Revision: 17a07f8
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  PeakPickerHiRes <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <file>*        Input profile data file  (valid formats: 'mzML')
  -out <file>*       Output peak file  (valid formats: 'mzML')
                     
Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/release/3.1.0/html/TOPP_PeakPickerHiRes.html
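
As a usage illustration, the following Python sketch assembles a command line from the options listed above. It only builds the argument list and does not run the tool. The helper name `build_command` and the file names are hypothetical; the flags are the ones shown in the help output.

```python
from typing import List, Optional


def build_command(in_file: str, out_file: str, threads: int = 1,
                  ini: Optional[str] = None) -> List[str]:
    """Assemble a PeakPickerHiRes call from the documented options."""
    cmd = ["PeakPickerHiRes", "-in", in_file, "-out", out_file,
           "-threads", str(threads)]
    if ini is not None:
        # Optional TOPP INI file carrying the algorithm parameters.
        cmd += ["-ini", ini]
    return cmd


# Hypothetical file names, for illustration only.
command = build_command("profile.mzML", "picked.mzML", threads=4)
print(" ".join(command))
```

On a system where OpenMS is installed, the resulting list could be passed to `subprocess.run(command, check=True)`.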

INI file documentation of this tool:

+ PeakPickerHiRes: Finds mass spectrometric peaks in profile mass spectra.
version (default: 3.1.0): Version of the tool that generated this parameters file.
++ 1: Instance '1' section for 'PeakPickerHiRes'
in (input file; allowed: *.mzML): input profile data file
out (output file; allowed: *.mzML): output peak file
processOption (default: inmemory; allowed: inmemory, lowmemory): Whether to load all data and process them in-memory or whether to process the data on the fly (lowmemory) without loading the whole file into memory first
log: Name of log file (created only when specified)
debug (default: 0): Sets the debug level
threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
no_progress (default: false; allowed: true, false): Disables progress logging to command line
force (default: false; allowed: true, false): Overrides tool-specific checks
test (default: false; allowed: true, false): Enables the test mode (needed for internal use only)
+++ algorithm: Algorithm parameters section
signal_to_noise (default: 0.0; range: 0.0:∞): Minimal signal-to-noise ratio for a peak to be picked (0.0 disables SNT estimation!)
spacing_difference_gap (default: 4.0; range: 0.0:∞): The extension of a peak is stopped if the spacing between two subsequent data points exceeds 'spacing_difference_gap * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. '0' to disable the constraint. Not applicable to chromatograms.
spacing_difference (default: 1.5; range: 0.0:∞): Maximum allowed difference between points during peak extension, in multiples of the minimal difference between the peak apex and its two neighboring points. If this difference is exceeded a missing point is assumed (see parameter 'missing'). A higher value implies a less stringent peak definition, since individual signals within the peak are allowed to be further apart. '0' to disable the constraint. Not applicable to chromatograms.
missing (default: 1; range: 0:∞): Maximum number of missing points allowed when extending a peak to the left or to the right. A missing data point occurs if the spacing between two subsequent data points exceeds 'spacing_difference * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. Not applicable to chromatograms.
ms_levels (default: []; range: 1:∞): List of MS levels for which the peak picking is applied. If empty, auto mode is enabled, all peaks which aren't picked yet will get picked. Other scans are copied to the output without changes.
report_FWHM (default: false; allowed: true, false): Add metadata for FWHM (as floatDataArray named 'FWHM' or 'FWHM_ppm', depending on param 'report_FWHM_unit') for each picked peak.
report_FWHM_unit (default: relative; allowed: relative, absolute): Unit of FWHM. Either absolute in the unit of input, e.g. 'm/z' for spectra, or relative as ppm (only sensible for spectra, not chromatograms).
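
The interplay of 'spacing_difference', 'spacing_difference_gap' and 'missing' can be sketched in Python as follows. This is a simplified reading of the parameter descriptions above, not the OpenMS implementation. The function name `extend_right` is hypothetical, and the rule that extension also stops when the intensity rises again is an assumption added to keep the sketch self-contained.

```python
from typing import Sequence


def extend_right(mz: Sequence[float], intensity: Sequence[float], apex: int,
                 spacing_difference: float = 1.5,
                 spacing_difference_gap: float = 4.0,
                 missing: int = 1) -> int:
    """Return the index of the last point included when extending a peak
    to the right of `apex` (which must have a neighbour on both sides)."""
    # 'min_spacing': the smaller of the two spacings from the apex to its neighbours.
    min_spacing = min(mz[apex] - mz[apex - 1], mz[apex + 1] - mz[apex])
    missing_count = 0
    last = apex
    for k in range(apex, len(mz) - 1):
        gap = mz[k + 1] - mz[k]
        # Hard stop: the gap exceeds spacing_difference_gap * min_spacing (0 disables).
        if spacing_difference_gap > 0 and gap > spacing_difference_gap * min_spacing:
            break
        # Soft constraint: a larger-than-expected gap counts as one missing point.
        if spacing_difference > 0 and gap > spacing_difference * min_spacing:
            missing_count += 1
            if missing_count > missing:
                break
        # Assumption of this sketch: the flank must not rise again.
        if intensity[k + 1] > intensity[k]:
            break
        last = k + 1
    return last


mz = [100.00, 100.01, 100.02, 100.03, 100.05, 100.06, 100.20]
inten = [10.0, 50.0, 100.0, 60.0, 30.0, 10.0, 5.0]
end_default = extend_right(mz, inten, apex=2)            # one missing point tolerated
end_strict = extend_right(mz, inten, apex=2, missing=0)  # stops at the first wide gap
```

With the defaults, the wider step from 100.03 to 100.05 is tolerated as one missing point and extension ends before the large jump to 100.20. With `missing=0`, extension already stops at 100.03.
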
++++ SignalToNoise
max_intensity (default: -1; range: -1:∞): maximal intensity considered for histogram construction. By default, it will be calculated automatically (see auto_mode). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities EQUAL/ABOVE 'max_intensity' will be added to the LAST histogram bin. If you choose 'max_intensity' too small, the noise estimate might be too small as well. If chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', which increases runtime). In general, the Median-S/N estimator is more robust to a manual max_intensity than the MeanIterative-S/N.
auto_max_stdev_factor (default: 3.0; range: 0.0:999.0): parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev
auto_max_percentile (default: 95; range: 0:100): parameter for 'max_intensity' estimation (if 'auto_mode' == 1): auto_max_percentile th percentile
auto_mode (default: 0; range: -1:1): method to use to determine maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method
win_len (default: 200.0; range: 1.0:∞): window length in Thomson
bin_count (default: 30; range: 3:∞): number of bins for intensity values
min_required_elements (default: 10; range: 1:∞): minimum number of elements required in a window (otherwise it is considered sparse)
noise_for_empty_window (default: 1.0e20): noise value used for sparse windows
write_log_messages (default: true; allowed: true, false): Write out log messages in case of sparse windows or median in rightmost histogram bin
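
The three 'auto_mode' settings and the role of the last histogram bin can be illustrated with a short Python sketch. It follows the parameter descriptions above but is not the OpenMS code. The function names are hypothetical, and the use of the population standard deviation, a nearest-rank percentile, and equal-width bins starting at zero are assumptions.

```python
import math
import statistics
from typing import Sequence


def estimate_max_intensity(intensities: Sequence[float], auto_mode: int = 0,
                           max_intensity: float = -1.0,
                           auto_max_stdev_factor: float = 3.0,
                           auto_max_percentile: int = 95) -> float:
    """Choose the maximal intensity used for histogram construction."""
    if auto_mode == -1:
        # Manual mode: use the user-supplied 'max_intensity' as is.
        return max_intensity
    if auto_mode == 0:
        # mean + 'auto_max_stdev_factor' * stdev (population stdev assumed here).
        return (statistics.mean(intensities)
                + auto_max_stdev_factor * statistics.pstdev(intensities))
    if auto_mode == 1:
        # Nearest-rank percentile (the percentile definition is an assumption).
        ordered = sorted(intensities)
        rank = max(1, math.ceil(auto_max_percentile * len(ordered) / 100))
        return ordered[rank - 1]
    raise ValueError("auto_mode must be -1, 0 or 1")


def histogram_bin(value: float, max_intensity: float, bin_count: int = 30) -> int:
    """Map an intensity to a bin; values equal to or above 'max_intensity'
    go to the last bin (equal-width bins from 0 are assumed)."""
    bin_width = max_intensity / bin_count
    return min(int(value / bin_width), bin_count - 1)


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
by_stdev = estimate_max_intensity(data, auto_mode=0)
by_percentile = estimate_max_intensity(data, auto_mode=1)
```

The sketch also shows why a 'max_intensity' that is too small distorts the estimate: every intensity at or above it collapses into the last bin.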

For the parameters of the algorithm section see the algorithm documentation: PeakPickerHiRes

Be aware that applying the algorithm to already picked data results in an error message and program exit or corrupted output data. Advanced users may skip the check for already centroided data using the flag "-force" (useful e.g. if spectrum annotations in the data files are wrong).

In the following table, you can find example values of the most important algorithm parameters for different instrument types.
These parameters are not valid for all instruments of that type, but can be used as a starting point for finding suitable parameters.

                  Q-TOF   LTQ Orbitrap
signal_to_noise   2       0