OpenMS
ProteomicsLFQ

ProteomicsLFQ performs label-free quantification of peptides and proteins.
Input:

  • Spectra in mzML format
  • Identifications in idXML or mzIdentML format with posterior error probabilities as score type. To generate these, we suggest running:
    1. PeptideIndexer to annotate target and decoy information.
    2. PSMFeatureExtractor to annotate percolator features.
    3. PercolatorAdapter tool (score_type = 'q-value', -post-processing-tdc)
    4. IDFilter (pep:score = 0.01) to filter PSMs at 1% FDR
  • An experimental design file:
    (see ExperimentalDesign for details)
  • A protein database with appended decoy sequences in FASTA format
    (e.g., generated by the OpenMS DecoyDatabase tool)
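
As a minimal sketch of how these inputs fit together, the following Python snippet assembles a ProteomicsLFQ argument list from options that appear in the tool's help output (-in, -ids, -design, -fasta, -out). The helper `build_lfq_command` and all file names are hypothetical and not part of OpenMS. The pairwise check on file base names is an assumption used to illustrate that ID files must be given in the same order as the spectra files. The snippet only builds the command and does not run the tool.

```python
from pathlib import Path


def build_lfq_command(spectra, ids, design, fasta, out_mztab):
    """Assemble a ProteomicsLFQ argument list (hypothetical helper, not part of OpenMS)."""
    if len(spectra) != len(ids):
        raise ValueError("need exactly one identification file per spectra file")
    # ID files must be provided in the same order as the spectra files.
    # Assumption for this sketch: paired files share the same base name.
    for mzml, idfile in zip(spectra, ids):
        if Path(mzml).stem != Path(idfile).stem:
            raise ValueError(f"order mismatch: {mzml} vs {idfile}")
    cmd = ["ProteomicsLFQ", "-in", *spectra, "-ids", *ids]
    cmd += ["-design", design, "-fasta", fasta, "-out", out_mztab]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_lfq_command(
    spectra=["run1.mzML", "run2.mzML"],
    ids=["run1.idXML", "run2.idXML"],
    design="design.tsv",
    fasta="target_decoy.fasta",
    out_mztab="lfq.mzTab",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a system where ProteomicsLFQ is installed and on the PATH.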

Processing:

ProteomicsLFQ has different methods to extract features: ID-based (targeted only), or both ID-based and untargeted.

  1. The first method performs targeted feature detection, using RT and m/z information derived from identification data to extract features. Note: only identifications found in a particular MS run are used to extract features in the same run. No transfer of IDs (match between runs) is performed.
  2. The second method adds untargeted feature detection to obtain quantities from unidentified features. Transfer of IDs (match between runs) is performed by transferring feature identifications to co-eluting, unidentified features with similar mass and RT in other runs.
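
The idea behind the ID transfer in the second method can be illustrated with a small conceptual sketch. It is not the OpenMS implementation: the `Feature` class, the function `transfer_ids`, and the RT tolerance of 30 seconds are hypothetical, and the 10 ppm m/z tolerance merely mirrors the default of -Linking:distance_MZ:max_difference. Retention times are assumed to be already aligned across runs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    run: str
    rt: float                      # aligned retention time in seconds
    mz: float
    peptide: Optional[str] = None  # None marks an unidentified feature


def transfer_ids(features: List[Feature], rt_tol: float = 30.0,
                 mz_tol_ppm: float = 10.0) -> int:
    """Copy peptide IDs to unidentified features in other runs that are close in RT and m/z."""
    # Only features identified from the start act as sources (no chained transfers).
    identified = [f for f in features if f.peptide is not None]
    transferred = 0
    for target in features:
        if target.peptide is not None:
            continue
        best, best_drt = None, None
        for source in identified:
            if source.run == target.run:
                continue  # transfer only between different runs
            drt = abs(source.rt - target.rt)
            dmz_ppm = abs(source.mz - target.mz) / source.mz * 1e6
            if drt <= rt_tol and dmz_ppm <= mz_tol_ppm:
                if best is None or drt < best_drt:
                    best, best_drt = source, drt
        if best is not None:
            target.peptide = best.peptide
            transferred += 1
    return transferred


features = [
    Feature("run1", 1200.0, 500.2500, "PEPTIDEK"),
    Feature("run2", 1205.0, 500.2510),  # co-eluting, about 2 ppm away
    Feature("run2", 1800.0, 500.2500),  # same mass, but far away in RT
]
n_transferred = transfer_ids(features)
```

In this toy example only the second feature receives the identification, because the third one does not co-elute with the identified feature.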

Requantification:

  1. Optionally, a requantification step is performed that tries to fill NA values. If a peptide has been quantified in more than half of all maps, the peptide is selected for requantification. In that case, the mean observed RT (and theoretical m/z) of the peptide is used to perform a second round of targeted extraction.

Output:

  • mzTab file with analysis results
  • MSstats file with analysis results for statistical downstream analysis in MSstats
  • ConsensusXML file for visualization and further processing in OpenMS
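
The selection rule of the requantification step described above (a peptide quantified in more than half of all maps is re-extracted at its mean observed RT) can be sketched as follows. The function name and the input layout are hypothetical. Skipping peptides that are already quantified in every map is an assumption of this sketch, made because such peptides have no NA values left to fill.

```python
from statistics import mean
from typing import Dict, Optional


def requantification_targets(
    observed_rt: Dict[str, Dict[str, Optional[float]]], n_maps: int
) -> Dict[str, float]:
    """Return {peptide: mean observed RT} for peptides selected for requantification.

    observed_rt maps peptide -> {map name -> RT in seconds, or None if not quantified}.
    """
    targets = {}
    for peptide, per_map in observed_rt.items():
        rts = [rt for rt in per_map.values() if rt is not None]
        quantified_in_majority = len(rts) > n_maps / 2
        has_missing_values = len(rts) < n_maps  # assumption: nothing to fill otherwise
        if quantified_in_majority and has_missing_values:
            targets[peptide] = mean(rts)
    return targets


observed = {
    "PEPTIDEK": {"m1": 1200.0, "m2": 1210.0, "m3": 1190.0, "m4": None},  # 3 of 4 maps
    "SAMPLER": {"m1": 900.0, "m2": None, "m3": None, "m4": 905.0},       # 2 of 4 maps
    "ELVISK": {"m1": 600.0, "m2": 601.0, "m3": 599.0, "m4": 600.0},      # complete
}
targets = requantification_targets(observed, n_maps=4)
```

Here only PEPTIDEK is selected: SAMPLER is quantified in exactly half of the maps, which does not satisfy "more than half", and ELVISK has no missing values.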

experiments TODO:

  • change percentage of missingness in ID transfer
  • disable elution peak fit

    Potential scripts to perform the search can be found under src/tests/topp/ProteomicsLFQTestScripts

    The command line parameters of this tool are:

    ProteomicsLFQ -- A standard proteomics LFQ pipeline.
    Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/UTILS_ProteomicsLFQ.html
    Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
    To cite OpenMS:
     + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
       mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.
    
    Usage:
      ProteomicsLFQ <options>
    
    Options (mandatory options marked with '*'):
      -in <file list>*                                           Input files (valid formats: 'mzML')
      -ids <file list>*                                          Identifications filtered at PSM level (e.g., 
                                                                 q-value < 0.01).And annotated with PEP as main 
                                                                 score.
                                                                 We suggest using:
                                                                 1. PSMFeatureExtractor to annotate percolator 
                                                                 features.
                                                                 2. PercolatorAdapter tool (score_type = 'q-value
                                                                 ', -post-processing-tdc)
                                                                 ...
                                                                 ra files. (valid formats: 'idXML', 'mzId')
      -design <file>                                             Design file (valid formats: 'tsv')
      -fasta <file>                                              Fasta file (valid formats: 'fasta')
      -out <file>*                                               Output mzTab file (valid formats: 'mzTab')
      -out_msstats <file>                                        Output MSstats input file (valid formats: 'csv')
    
      -out_triqler <file>                                        Output Triqler input file (valid formats: 'tsv')
    
      -out_cxml <file>                                           Output consensusXML file (valid formats: 'consen
                                                                 susXML')
      -proteinFDR <threshold>                                    Protein FDR threshold (0.05=5%). (default: '0.05
                                                                 ') (min: '0.0' max: '1.0')
      -picked_proteinFDR <choice>                                Use a picked protein FDR? (default: 'false') 
                                                                 (valid: 'true', 'false')
      -psmFDR <threshold>                                        FDR threshold for sub-protein level (e.g. 0.05=5
                                                                 %). Use -FDR_type to choose the level. Cutoff 
                                                                 is applied at the highest level. If Bayesian 
                                                                 inference was chosen, it is equivalent with a 
                                                                 peptide FDR (default: '1.0') (min: '0.0' max: 
                                                                 '1.0')
      -FDR_type <threshold>                                      Sub-protein FDR level. PSM, PSM+peptide (best 
                                                                 PSM q-value). (default: 'PSM') (valid: 'PSM', 
                                                                 'PSM+peptide')
      -quantification_method <option>                            Feature_intensity: MS1 signal.
                                                                 spectral_counting: PSM counts. (default: 'featur
                                                                 e_intensity') (valid: 'feature_intensity', 'spec
                                                                 tral_counting')
      -targeted_only <option>                                    True: Only ID based quantification.
                                                                 false: include unidentified features so they 
                                                                 can be linked to identified ones (=match between
                                                                  runs). (default: 'false') (valid: 'true', 'fals
                                                                 e')
      -transfer_ids <option>                                     Requantification using mean of aligned RTs of a 
                                                                 peptide feature.
                                                                 Only applies to peptides that were quantified 
                                                                 in more than 50% of all runs (of a fraction). 
                                                                 (default: 'false') (valid: 'false', 'mean')
    
    Centroiding:
      -Centroiding:signal_to_noise <value>                       Minimal signal-to-noise ratio for a peak to be 
                                                                 picked (0.0 disables SNT estimation!) (default: 
                                                                 '0.0') (min: '0.0')
      -Centroiding:ms_levels <numbers>                           List of MS levels for which the peak picking is 
                                                                 applied. If empty, auto mode is enabled, all 
                                                                 peaks which aren't picked yet will get picked. 
                                                                 Other scans are copied to the output without 
                                                                 changes. (min: '1')
    
    PeptideQuantification:
      -PeptideQuantification:quantify_decoys                     Whether decoy peptides should be quantified (tru
                                                                 e) or skipped (false).
      -PeptideQuantification:min_psm_cutoff <text>               Minimum score for the best PSM of a spectrum to 
                                                                 be used as seed. Use 'none' for no cutoff. (defa
                                                                 ult: 'none')
    
    Parameters for ion chromatogram extraction:
      -PeptideQuantification:extract:batch_size <number>         Nr of peptides used in each batch of chromatogra
                                                                 m extraction. Smaller values decrease memory 
                                                                 usage but increase runtime. (default: '5000') 
                                                                 (min: '1')
      -PeptideQuantification:extract:mz_window <value>           M/z window size for chromatogram extraction (uni
                                                                 t: ppm if 1 or greater, else Da/Th) (default: 
                                                                 '10.0') (min: '0.0')
    
    Parameters for detecting features in extracted ion chromatograms:
      -PeptideQuantification:detect:mapping_tolerance <value>    RT tolerance (plus/minus) for mapping peptide 
                                                                 IDs to features. Absolute value in seconds if 1 
                                                                 or greater, else relative to the RT span of the 
                                                                 feature. (default: '0.0') (min: '0.0')
    
    Parameters for scoring features using a support vector machine (SVM):
      -PeptideQuantification:svm:log2_p <values>                 Values to try for the SVM parameter 'epsilon' 
                                                                 during parameter optimization (epsilon-SVR only)
                                                                 . A value 'x' is used as 'epsilon = 2^x'. (defau
                                                                 lt: '[-15.0 -12.0 -9.0 -6.0 -3.32192809489 0.0 
                                                                 3.32192809489 6.0 9.0 12.0 15.0]')
    
    Parameters for fitting exp. mod. Gaussians to mass traces.:
      -PeptideQuantification:EMGScoring:max_iteration <number>   Maximum number of iterations for EMG fitting. 
                                                                 (default: '100') (min: '1')
      -PeptideQuantification:EMGScoring:init_mom                 Alternative initial parameters for fitting throu
                                                                 gh method of moments.
    
    Alignment:
      -Alignment:model_type <choice>                             Options to control the modeling of retention 
                                                                 time transformations from data (default: 'b_spli
                                                                 ne') (valid: 'linear', 'b_spline', 'lowess', 
                                                                 'interpolated')
    
    Alignment:model:
      -Alignment:model:type <choice>                             Type of model (default: 'b_spline') (valid: 'lin
                                                                 ear', 'b_spline', 'lowess', 'interpolated')
    
    Parameters for 'linear' model:
      -Alignment:model:linear:symmetric_regression               Perform linear regression on 'y - x' vs. 'y + 
                                                                 x', instead of on 'y' vs. 'x'.
      -Alignment:model:linear:x_weight <choice>                  Weight x values (default: 'x') (valid: '1/x', 
                                                                 '1/x2', 'ln(x)', 'x')
      -Alignment:model:linear:y_weight <choice>                  Weight y values (default: 'y') (valid: '1/y', 
                                                                 '1/y2', 'ln(y)', 'y')
      -Alignment:model:linear:x_datum_min <value>                Minimum x value (default: '1.0e-15')
      -Alignment:model:linear:x_datum_max <value>                Maximum x value (default: '1.0e15')
      -Alignment:model:linear:y_datum_min <value>                Minimum y value (default: '1.0e-15')
      -Alignment:model:linear:y_datum_max <value>                Maximum y value (default: '1.0e15')
    
    Parameters for 'b_spline' model:
      -Alignment:model:b_spline:wavelength <value>               Determines the amount of smoothing by setting 
                                                                 the number of nodes for the B-spline. The number
                                                                  is chosen so that the spline approximates a 
                                                                 low-pass filter with this cutoff wavelength. 
                                                                 The wavelength is given in the same units as 
                                                                 the data; a higher value means more smoothing. 
                                                                 '0' sets the number of nodes to twice the number
                                                                  of input points. (default: '0.0') (min: '0.0')
      -Alignment:model:b_spline:num_nodes <number>               Number of nodes for B-spline fitting. Overrides 
                                                                 'wavelength' if set (to two or greater). A lower
                                                                  value means more smoothing. (default: '5') (min
                                                                 : '0')
      -Alignment:model:b_spline:extrapolate <choice>             Method to use for extrapolation beyond the origi
                                                                 nal data range. 'linear': Linear extrapolation 
                                                                 using the slope of the B-spline at the correspon
                                                                 ding endpoint. 'b_spline': Use the B-spline (as 
                                                                 for interpolation). 'constant': Use the constant
                                                                  value of the B-spline at the corresponding endp
                                                                 oint. 'global_linear': Use a linear fit through 
                                                                 the data (which will most probably introduce 
                                                                 discontinuities at the ends of the data range). 
                                                                 (default: 'linear') (valid: 'linear', 'b_spline'
                                                                 , 'constant', 'global_linear')
      -Alignment:model:b_spline:boundary_condition <number>      Boundary condition at B-spline endpoints: 0 (val
                                                                 ue zero), 1 (first derivative zero) or 2 (second
                                                                  derivative zero) (default: '2') (min: '0' max: 
                                                                 '2')
    
    Parameters for 'lowess' model:
      -Alignment:model:lowess:span <value>                       Fraction of datapoints (f) to use for each local
                                                                  regression (determines the amount of smoothing)
                                                                 . Choosing this parameter in the range .2 to .8 
                                                                 usually results in a good fit. (default: '0.6666
                                                                 66666666667') (min: '0.0' max: '1.0')
      -Alignment:model:lowess:num_iterations <number>            Number of robustifying iterations for lowess 
                                                                 fitting. (default: '3') (min: '0')
      -Alignment:model:lowess:delta <value>                      Nonnegative parameter which may be used to save 
                                                                 computations (recommended value is 0.01 of the 
                                                                 range of the input, e.g. for data ranging from 
                                                                 1000 seconds to 2000 seconds, it could be set 
                                                                 to 10). Setting a negative value will automatica
                                                                 lly do this. (default: '-1.0')
      -Alignment:model:lowess:interpolation_type <choice>        Method to use for interpolation between datapoin
                                                                 ts computed by lowess. 'linear': Linear interpol
                                                                 ation. 'cspline': Use the cubic spline for inter
                                                                 polation. 'akima': Use an akima spline for inter
                                                                 polation (default: 'cspline') (valid: 'linear', 
                                                                 'cspline', 'akima')
      -Alignment:model:lowess:extrapolation_type <choice>        Method to use for extrapolation outside the data
                                                                  range. 'two-point-linear': Uses a line through 
                                                                 the first and last point to extrapolate. 'four-p
                                                                 oint-linear': Uses a line through the first and 
                                                                 second point to extrapolate in front and and a 
                                                                 line through the last and second-to-last point 
                                                                 in the end. 'global-linear': Uses a linear regre
                                                                 ssion to fit a line through all data points and 
                                                                 use it for interpolation. (default: 'four-point-
                                                                 linear') (valid: 'two-point-linear', 'four-point
                                                                 -linear', 'global-linear')
    
    Parameters for 'interpolated' model:
      -Alignment:model:interpolated:interpolation_type <choice>  Type of interpolation to apply. (default: 'cspli
                                                                 ne') (valid: 'linear', 'cspline', 'akima')
      -Alignment:model:interpolated:extrapolation_type <choice>  Type of extrapolation to apply: two-point-linear
                                                                 : use the first and last data point to build a 
                                                                 single linear model, four-point-linear: build 
                                                                 two linear models on both ends using the first 
                                                                 two / last two points, global-linear: use all 
                                                                 points to build a single linear model. Note that
                                                                  global-linear may not be continuous at the bord
                                                                 er. (default: 'two-point-linear') (valid: 'two-p
                                                                 oint-linear', 'four-point-linear', 'global-linea
                                                                 r')
    
    Alignment:align_algorithm:
      -Alignment:align_algorithm:score_type <text>               Name of the score type to use for ranking and 
                                                                 filtering (.oms input only). If left empty, a 
                                                                 score type is picked automatically.
      -Alignment:align_algorithm:min_run_occur <number>          Minimum number of runs (incl. reference, if any)
                                                                  in which a peptide must occur to be used for 
                                                                 the alignment.
                                                                 Unless you have very few runs or identifications
                                                                 , increase this value to focus on more informati
                                                                 ve peptides. (default: '2') (min: '2')
      -Alignment:align_algorithm:max_rt_shift <value>            Maximum realistic RT difference for a peptide 
                                                                 (median per run vs. reference). Peptides with 
                                                                 higher shifts (outliers) are not used to compute
                                                                  the alignment.
                                                                 If 0, no limit (disable filter); if > 1, the 
                                                                 final value in seconds; if <= 1, taken as a frac
                                                                 tion of the range of the reference RT scale. 
                                                                 (default: '0.1') (min: '0.0')
      -Alignment:align_algorithm:use_adducts <choice>            If IDs contain adducts, treat differently adduct
                                                                 ed variants of the same molecule as different. 
                                                                 (default: 'true') (valid: 'true', 'false')
    
    Linking:
      -Linking:nr_partitions <number>                            How many partitions in m/z space should be used 
                                                                 for the algorithm (more partitions means faster 
                                                                 runtime and more memory efficient execution). 
                                                                 (default: '100') (min: '1')
      -Linking:min_nr_diffs_per_bin <number>                     If IDs are used: How many differences from match
                                                                 ing IDs should be used to calculate a linking 
                                                                 tolerance for unIDed features in an RT region. 
                                                                 RT regions will be extended until that number 
                                                                 is reached. (default: '50') (min: '5')
      -Linking:min_IDscore_forTolCalc <value>                    If IDs are used: What is the minimum score of 
                                                                 an ID to assume a reliable match for tolerance 
                                                                 calculation. Check your current score type! (def
                                                                 ault: '1.0')
      -Linking:noID_penalty <value>                              If IDs are used: For the normalized distances, 
                                                                 how high should the penalty for missing IDs be? 
                                                                 0 = no bias, 1 = IDs inside the max tolerances 
                                                                 always preferred (even if much further away). 
                                                                 (default: '0.0') (min: '0.0' max: '1.0')
    
    Distance component based on m/z differences:
      -Linking:distance_MZ:max_difference <value>                Never pair features with larger m/z distance 
                                                                 (unit defined by 'unit') (default: '10.0') (min:
                                                                  '0.0')
      -Linking:distance_MZ:unit <choice>                         Unit of the 'max_difference' parameter (default:
                                                                  'ppm') (valid: 'Da', 'ppm')
    
    ProteinQuantification:
      -ProteinQuantification:method <choice>                     - top - quantify based on three most abundant 
                                                                 peptides (number can be changed in 'top').
                                                                 - iBAQ (intensity based absolute quantification)
                                                                 , calculate the sum of all peptide peak intensit
                                                                 ies divided by the number of theoretically obser
                                                                 vable tryptic peptides (https://rdcu.be/cND1J). 
                                                                 Warning: only consensusXML or featureXML input 
                                                                 is allowed! (default: 'top') (valid: 'top', 'iBA
                                                                 Q')
      -ProteinQuantification:best_charge_and_fraction            Distinguish between fraction and charge states 
                                                                 of a peptide. For peptides, abundances will be 
                                                                 reported separately for each fraction and charge
                                                                 ;
                                                                 for proteins, abundances will be computed based 
                                                                 only on the most prevalent charge observed of 
                                                                 each peptide (over all fractions).
                                                                 By default, abundances are summed over all charg
                                                                 e states.
    
    Additional options for custom quantification using top N peptides.:
      -ProteinQuantification:top:N <number>                      Calculate protein abundance from this number of 
                                                                 proteotypic peptides (most abundant first; '0' 
                                                                 for all) (default: '3') (min: '0')
      -ProteinQuantification:top:aggregate <choice>              Aggregation method used to compute protein abund
                                                                 ances from peptide abundances (default: 'median'
                                                                 ) (valid: 'median', 'mean', 'weighted_mean', 
                                                                 'sum')
    
    Additional options for consensus maps (and identification results comprising multiple runs):
      -ProteinQuantification:consensus:normalize                 Scale peptide abundances so that medians of all 
                                                                 samples are equal
      -ProteinQuantification:consensus:fix_peptides              Use the same peptides for protein quantification
                                                                  across all samples.
                                                                 With 'N 0',all peptides that occur in every samp
                                                                 le are considered.
                                                                 Otherwise ('N'), the N peptides that occur in 
                                                                 the most samples (independently of each other) 
                                                                 are selected,
                                                                 breaking ties by total abundance (there is no 
                                                                 guarantee that the best co-ocurring peptides 
                                                                 are chosen!).
    
                                                                 
    Common UTIL options:
      -ini <file>                                                Use the given TOPP INI file
      -threads <n>                                               Sets the number of threads allowed to be used 
                                                                 by the TOPP tool (default: '1')
      -write_ini <file>                                          Writes the default configuration file
      --help                                                     Shows options
      --helphelp                                                 Shows all options (including advanced)
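
To make the 'top' protein quantification option more concrete, the following sketch aggregates the N most abundant peptide abundances of one protein with the median, following the documented defaults (-ProteinQuantification:top:N 3, -ProteinQuantification:top:aggregate median, and '0' meaning all peptides). The function `top_n_abundance` and the abundance values are hypothetical. How OpenMS treats proteins with fewer than N peptides, or which peptides count as proteotypic, is not modeled here.

```python
from statistics import median
from typing import Dict


def top_n_abundance(peptide_abundances: Dict[str, float], n: int = 3) -> float:
    """Median abundance of the n most abundant peptides of one protein (n = 0 uses all)."""
    if not peptide_abundances:
        raise ValueError("no peptide abundances given")
    ranked = sorted(peptide_abundances.values(), reverse=True)  # most abundant first
    selected = ranked if n == 0 else ranked[:n]
    return median(selected)


# Hypothetical peptide abundances of a single protein.
abundances = {"PEPA": 4.0e6, "PEPB": 1.0e6, "PEPC": 2.5e6, "PEPD": 5.0e5}
top3 = top_n_abundance(abundances)           # median of 4.0e6, 2.5e6, 1.0e6
all_peps = top_n_abundance(abundances, n=0)  # median of all four values
```

With these values, the top-3 median is 2.5e6, while using all peptides lowers the result to 1.75e6 because the least abundant peptide enters the calculation.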
    
    

    INI file documentation of this tool:

    Legend:
    required parameter
    advanced parameter
    +ProteomicsLFQA standard proteomics LFQ pipeline.
    version3.0.0 Version of the tool that generated this parameters file.
    ++1Instance '1' section for 'ProteomicsLFQ'
    in[] Input filesinput file*.mzML
    ids[] Identifications filtered at PSM level (e.g., q-value < 0.01).And annotated with PEP as main score.
    We suggest using:
    1. PSMFeatureExtractor to annotate percolator features.
    2. PercolatorAdapter tool (score_type = 'q-value', -post-processing-tdc)
    3. IDFilter (pep:score = 0.05)
    To obtain well calibrated PEPs and an initial reduction of PSMs
    ID files must be provided in same order as spectra files.
    input file*.idXML, *.mzId
    design design fileinput file*.tsv
    fasta fasta fileinput file*.fasta
    out output mzTab fileoutput file*.mzTab
    out_msstats output MSstats input fileoutput file*.csv
    out_triqler output Triqler input fileoutput file*.tsv
    out_cxml output consensusXML fileoutput file*.consensusXML
    proteinFDR0.05 Protein FDR threshold (0.05=5%).0.0:1.0
    picked_proteinFDRfalse Use a picked protein FDR?true, false
    psmFDR1.0 FDR threshold for sub-protein level (e.g. 0.05=5%). Use -FDR_type to choose the level. Cutoff is applied at the highest level. If Bayesian inference was chosen, it is equivalent with a peptide FDR0.0:1.0
    FDR_typePSM Sub-protein FDR level. PSM, PSM+peptide (best PSM q-value).PSM, PSM+peptide
    protein_inferenceaggregation Infer proteins:
    aggregation = aggregates all peptide scores across a protein (using the best score)
    bayesian = computes a posterior probability for every protein based on a Bayesian network.
    Note: 'bayesian' only uses and reports the best PSM per peptide.
    aggregation, bayesian
    protein_quantificationunique_peptides Quantify proteins based on:
    unique_peptides = use peptides mapping to single proteins or a group of indistinguishable proteins(according to the set of experimentally identified peptides).
    strictly_unique_peptides = use peptides mapping to a unique single protein only.
    shared_peptides = use shared peptides only for its best group (by inference score)
    unique_peptides, strictly_unique_peptides, shared_peptides
    quantification_methodfeature_intensity feature_intensity: MS1 signal.
    spectral_counting: PSM counts.
    feature_intensity, spectral_counting
    targeted_onlyfalse true: Only ID based quantification.
    false: include unidentified features so they can be linked to identified ones (=match between runs).
    true, false
    transfer_idsfalse Requantification using mean of aligned RTs of a peptide feature.
    Only applies to peptides that were quantified in more than 50% of all runs (of a fraction).
    false, mean
    mass_recalibrationfalse Mass recalibration.true, false
    alignment_orderstar If star, aligns all maps to the reference with most IDs; if treeguided, calculates a guiding tree first.star, treeguided
    keep_feature_top_psm_onlytrue If false, also keeps lower ranked PSMs that have the top-scoring sequence as a candidate per feature in the same file.true, false
    log Name of log file (created only when specified)
    debug0 Sets the debug level
    threads1 Sets the number of threads allowed to be used by the TOPP tool
    no_progressfalse Disables progress logging to command linetrue, false
    forcefalse Overrides tool-specific checkstrue, false
    testfalse Enables the test mode (needed for internal use only)true, false
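As a rough illustration of the 'transfer_ids' selection rule (only peptides quantified in more than 50% of all runs are requantified), here is a minimal Python sketch. The function name and the dictionary layout are hypothetical and are not part of the OpenMS API; the sketch also ignores fractions.

```python
def select_for_requantification(quantified_runs, n_runs):
    """Return peptides quantified in more than 50% of n_runs runs.

    quantified_runs: dict mapping peptide sequence -> set of run indices
    in which the peptide has a quantity (hypothetical layout).
    """
    return sorted(
        peptide
        for peptide, runs in quantified_runs.items()
        if len(runs) > n_runs / 2
    )


observed = {"PEPTIDEK": {0, 1, 2}, "SAMPLER": {0, 1}, "ELVISK": {3}}
selected = select_for_requantification(observed, n_runs=4)
```

With four runs, a peptide seen in exactly two runs is not selected, because the rule requires strictly more than half.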
    +++Seeding Parameters for seeding of untargeted features
    intThreshold1.0e04 Peak intensity threshold applied in seed detection.
    charge2:5 Charge range considered for untargeted feature seeds.
    traceRTTolerance3.0 Combines all spectra in the tolerance window to stabilize identification of isotope patterns. Controls sensitivity (low value) vs. specificity (high value) of feature seeds.
    +++Centroiding
    signal_to_noise0.0 Minimal signal-to-noise ratio for a peak to be picked (0.0 disables SNT estimation!)0.0:∞
    spacing_difference_gap4.0 The extension of a peak is stopped if the spacing between two subsequent data points exceeds 'spacing_difference_gap * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. '0' to disable the constraint. Not applicable to chromatograms.0.0:∞
    spacing_difference1.5 Maximum allowed difference between points during peak extension, in multiples of the minimal difference between the peak apex and its two neighboring points. If this difference is exceeded a missing point is assumed (see parameter 'missing'). A higher value implies a less stringent peak definition, since individual signals within the peak are allowed to be further apart. '0' to disable the constraint. Not applicable to chromatograms.0.0:∞
    missing1 Maximum number of missing points allowed when extending a peak to the left or to the right. A missing data point occurs if the spacing between two subsequent data points exceeds 'spacing_difference * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. Not applicable to chromatograms.0:∞
    ms_levels[] List of MS levels for which the peak picking is applied. If empty, auto mode is enabled, all peaks which aren't picked yet will get picked. Other scans are copied to the output without changes.1:∞
    report_FWHMfalse Add metadata for FWHM (as floatDataArray named 'FWHM' or 'FWHM_ppm', depending on param 'report_FWHM_unit') for each picked peak.true, false
    report_FWHM_unitrelative Unit of FWHM. Either absolute in the unit of input, e.g. 'm/z' for spectra, or relative as ppm (only sensible for spectra, not chromatograms).relative, absolute
    ++++SignalToNoise
    max_intensity-1 maximal intensity considered for histogram construction. By default, it will be calculated automatically (see auto_mode). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities EQUAL/ABOVE 'max_intensity' will be added to the LAST histogram bin. If you choose 'max_intensity' too small, the noise estimate might be too small as well. If chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', which increases runtime). In general, the Median-S/N estimator is more robust to a manual max_intensity than the MeanIterative-S/N.-1:∞
    auto_max_stdev_factor3.0 parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev0.0:999.0
    auto_max_percentile95 parameter for 'max_intensity' estimation (if 'auto_mode' == 1): auto_max_percentile th percentile0:100
    auto_mode0 method to use to determine maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method-1:1
    win_len200.0 window length in Thomson1.0:∞
    bin_count30 number of bins for intensity values3:∞
    min_required_elements10 minimum number of elements required in a window (otherwise it is considered sparse)1:∞
    noise_for_empty_window1.0e20 noise value used for sparse windows
    write_log_messagestrue Write out log messages in case of sparse windows or median in rightmost histogram bintrue, false
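The three 'auto_mode' settings of the SignalToNoise section can be sketched as follows. This is only an illustration of the description above, not the OpenMS implementation: the use of the population standard deviation and of a nearest-rank percentile are assumptions, and the function name is hypothetical.

```python
import statistics


def estimate_max_intensity(intensities, auto_mode=0, max_intensity=-1.0,
                           auto_max_stdev_factor=3.0, auto_max_percentile=95):
    """Pick the upper histogram bound according to 'auto_mode'."""
    if auto_mode == -1:
        # Manual mode: use the user-supplied 'max_intensity' as is.
        return max_intensity
    if auto_mode == 0:
        # mean + 'auto_max_stdev_factor' * stdev (population stdev assumed).
        mean = statistics.mean(intensities)
        return mean + auto_max_stdev_factor * statistics.pstdev(intensities)
    if auto_mode == 1:
        # Nearest-rank percentile (assumption), computed with integer math.
        ordered = sorted(intensities)
        rank = max(1, (auto_max_percentile * len(ordered) + 99) // 100)
        return ordered[rank - 1]
    raise ValueError("auto_mode must be -1, 0 or 1")
```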
    +++PeptideQuantification
    candidates_out Optional output file with feature candidates.output file
    debug0 Debug level for feature detection.0:∞
    quantify_decoysfalse Whether decoy peptides should be quantified (true) or skipped (false).true, false
    min_psm_cutoffnone Minimum score for the best PSM of a spectrum to be used as seed. Use 'none' for no cutoff.
    ++++extract Parameters for ion chromatogram extraction
    batch_size5000 Nr of peptides used in each batch of chromatogram extraction. Smaller values decrease memory usage but increase runtime.1:∞
    mz_window10.0 m/z window size for chromatogram extraction (unit: ppm if 1 or greater, else Da/Th)0.0:∞
    n_isotopes2 Number of isotopes to include in each peptide assay.2:∞
    isotope_pmin0.0 Minimum probability for an isotope to be included in the assay for a peptide. If set, this parameter takes precedence over 'extract:n_isotopes'.0.0:1.0
    rt_quantile0.95 Quantile of the RT deviations between aligned internal and external IDs to use for scaling the RT extraction window0.0:1.0
    rt_window0.0 RT window size (in sec.) for chromatogram extraction. If set, this parameter takes precedence over 'extract:rt_quantile'.0.0:∞
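The unit rule for 'mz_window' (ppm if 1 or greater, else Da/Th) can be made concrete with a small sketch. The function name is hypothetical, and whether the window is applied as a full width or a half width around the target m/z is not stated here, so the sketch only converts the window size to Th.

```python
def mz_window_in_th(mz_window, target_mz):
    """Convert the 'mz_window' setting to an absolute size in Th."""
    if mz_window >= 1.0:
        # Values of 1 or greater are interpreted as ppm of the target m/z.
        return target_mz * mz_window * 1e-6
    # Values below 1 are already absolute (Da/Th).
    return mz_window


default_window = mz_window_in_th(10.0, 500.0)   # 10 ppm at m/z 500
absolute_window = mz_window_in_th(0.02, 500.0)  # 0.02 Th, independent of m/z
```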
    ++++detect Parameters for detecting features in extracted ion chromatograms
    min_peak_width0.2 Minimum elution peak width. Absolute value in seconds if 1 or greater, else relative to 'peak_width'.0.0:∞
    signal_to_noise0.8 Signal-to-noise threshold for OpenSWATH feature detection0.1:∞
    mapping_tolerance0.0 RT tolerance (plus/minus) for mapping peptide IDs to features. Absolute value in seconds if 1 or greater, else relative to the RT span of the feature.0.0:∞
    ++++svm Parameters for scoring features using a support vector machine (SVM)
    samples10000 Number of observations to use for training ('0' for all)0:∞
    no_selectionfalse By default, roughly the same number of positive and negative observations, with the same intensity distribution, are selected for training. This aims to reduce biases, but also reduces the amount of training data. Set this flag to skip this procedure and consider all available observations (subject to 'svm:samples').true, false
    xval_out Output file: SVM cross-validation (parameter optimization) resultsoutput file*.csv
    kernelRBF SVM kernelRBF, linear
    xval5 Number of partitions for cross-validation (parameter optimization)1:∞
    log2_C[-2.0, 5.0, 15.0] Values to try for the SVM parameter 'C' during parameter optimization. A value 'x' is used as 'C = 2^x'.
    log2_gamma[-3.0, -1.0, 2.0] Values to try for the SVM parameter 'gamma' during parameter optimization (RBF kernel only). A value 'x' is used as 'gamma = 2^x'.
    log2_p[-15.0, -12.0, -9.0, -6.0, -3.32192809489, 0.0, 3.32192809489, 6.0, 9.0, 12.0, 15.0] Values to try for the SVM parameter 'epsilon' during parameter optimization (epsilon-SVR only). A value 'x' is used as 'epsilon = 2^x'.
    epsilon1.0e-03 Stopping criterion0.0:∞
    cache_size100.0 Size of the kernel cache (in MB)1.0:∞
    no_shrinkingfalse Disable the shrinking heuristicstrue, false
    predictorspeak_apices_sum,var_xcorr_coelution,var_xcorr_shape,var_library_sangle,var_intensity_score,sn_ratio,var_log_sn_score,var_elution_model_fit_score,xx_lda_prelim_score,var_ms1_isotope_correlation_score,var_ms1_isotope_overlap_score,var_massdev_score,main_var_xx_swath_prelim_score Names of OpenSWATH scores to use as predictors for the SVM (comma-separated list)
    min_prob0.9 Minimum probability of correctness, as predicted by the SVM, required to retain a feature candidate0.0:1.0
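The 'log2_C' and 'log2_gamma' lists are exponents, not raw values. The sketch below expands the defaults into the actual (C, gamma) candidates; treating the search as a full Cartesian grid is an assumption made for illustration.

```python
import itertools

log2_C = [-2.0, 5.0, 15.0]
log2_gamma = [-3.0, -1.0, 2.0]

# Each exponent x is used as 2^x; pair every C with every gamma (assumed grid).
grid = [(2.0 ** c, 2.0 ** g) for c, g in itertools.product(log2_C, log2_gamma)]

# The odd-looking 'log2_p' entry 3.32192809489 is log2(10), i.e. epsilon = 10.
epsilon_from_default = 2.0 ** 3.32192809489
```

With the defaults this gives nine combinations, from C = 0.25, gamma = 0.125 up to C = 32768, gamma = 4.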
    ++++model Parameters for fitting elution models to features
    typesymmetric Type of elution model to fit to featuressymmetric, asymmetric, none
    add_zeros0.2 Add zero-intensity points outside the feature range to constrain the model fit. This parameter sets the weight given to these points during model fitting; '0' to disable.0.0:∞
    unweighted_fitfalse Suppress weighting of mass traces according to theoretical intensities when fitting elution modelstrue, false
    no_imputationfalse If fitting the elution model fails for a feature, set its intensity to zero instead of imputing a value from the initial intensity estimatetrue, false
    each_tracefalse Fit elution model to each individual mass tracetrue, false
    +++++check Parameters for checking the validity of elution models (and rejecting them if necessary)
    min_area1.0 Lower bound for the area under the curve of a valid elution model0.0:∞
    boundaries0.5 Time points corresponding to this fraction of the elution model height have to be within the data region used for model fitting0.0:1.0
    width10.0 Upper limit for acceptable widths of elution models (Gaussian or EGH), expressed in terms of modified (median-based) z-scores. '0' to disable. Not applied to individual mass traces (parameter 'each_trace').0.0:∞
    asymmetry10.0 Upper limit for acceptable asymmetry of elution models (EGH only), expressed in terms of modified (median-based) z-scores. '0' to disable. Not applied to individual mass traces (parameter 'each_trace').0.0:∞
    ++++EMGScoring Parameters for fitting exp. mod. Gaussians to mass traces.
    max_iteration100 Maximum number of iterations for EMG fitting.1:∞
    init_momfalse Alternative initial parameters for fitting through method of moments.true, false
    +++Alignment
    model_typeb_spline Options to control the modeling of retention time transformations from datalinear, b_spline, lowess, interpolated
    ++++model
    typeb_spline Type of modellinear, b_spline, lowess, interpolated
    +++++linear Parameters for 'linear' model
    symmetric_regressionfalse Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'.true, false
    x_weightx Weight x values1/x, 1/x2, ln(x), x
    y_weighty Weight y values1/y, 1/y2, ln(y), y
    x_datum_min1.0e-15 Minimum x value
    x_datum_max1.0e15 Maximum x value
    y_datum_min1.0e-15 Minimum y value
    y_datum_max1.0e15 Maximum y value
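To show what 'symmetric_regression' means for the 'linear' model, the sketch below regresses 'y - x' on 'y + x' with ordinary least squares and converts the result back to a line in x and y. The helper names and the back-transformation are illustrative assumptions and are not taken from the OpenMS code.

```python
def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def symmetric_fit(xs, ys):
    """Regress (y - x) on (y + x), then express the result as y = a + b * x."""
    sums = [y + x for x, y in zip(xs, ys)]
    diffs = [y - x for x, y in zip(xs, ys)]
    a, b = fit_line(sums, diffs)  # y - x = a + b * (y + x)
    # Solve for y: y * (1 - b) = a + x * (1 + b); undefined if b == 1.
    return a / (1 - b), (1 + b) / (1 - b)


intercept, slope = symmetric_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

For noise-free data both variants recover the same line; they differ only in how errors in x and y are treated.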
    +++++b_spline Parameters for 'b_spline' model
    wavelength0.0 Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points.0.0:∞
    num_nodes5 Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing.0:∞
    extrapolatelinear Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range).linear, b_spline, constant, global_linear
    boundary_condition2 Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero)0:2
    +++++lowess Parameters for 'lowess' model
    span0.666666666666667 Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit.0.0:1.0
    num_iterations3 Number of robustifying iterations for lowess fitting.0:∞
    delta-1.0 Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.
    interpolation_typecspline Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolationlinear, cspline, akima
    extrapolation_typefour-point-linear Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and use it for extrapolation.two-point-linear, four-point-linear, global-linear
    +++++interpolated Parameters for 'interpolated' model
    interpolation_typecspline Type of interpolation to apply.linear, cspline, akima
    extrapolation_typetwo-point-linear Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border.two-point-linear, four-point-linear, global-linear
    ++++align_algorithm
    score_type Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically.
    score_cutofffalse Use only IDs above a score cut-off (parameter 'min_score') for alignment?true, false
    min_score0.05 If 'score_cutoff' is 'true': Minimum score for an ID to be considered.
    Unless you have very few runs or identifications, increase this value to focus on more informative peptides.
    min_run_occur2 Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment.
    Unless you have very few runs or identifications, increase this value to focus on more informative peptides.
    2:∞
    max_rt_shift0.1 Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment.
    If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale.
    0.0:∞
    use_unassigned_peptidesfalse Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used.true, false
    use_feature_rttrue When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used.
    Precludes 'use_unassigned_peptides'.
    true, false
    use_adductstrue If IDs contain adducts, treat differently adducted variants of the same molecule as different.true, false
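The three interpretations of 'max_rt_shift' in the align_algorithm section (0 disables the filter, values above 1 are seconds, values up to 1 are a fraction of the reference RT range) can be written out as a small sketch. The function name and arguments are hypothetical.

```python
def rt_shift_limit(max_rt_shift, ref_rt_min, ref_rt_max):
    """Return the RT shift limit in seconds, or None if the filter is off."""
    if max_rt_shift == 0:
        return None  # no limit
    if max_rt_shift > 1:
        return max_rt_shift  # already given in seconds
    # Fraction of the range of the reference RT scale.
    return max_rt_shift * (ref_rt_max - ref_rt_min)


default_limit = rt_shift_limit(0.1, 0.0, 3600.0)  # 10% of a one-hour gradient
```

With the default of 0.1 and a reference run spanning 3600 s, peptides shifted by more than about 360 s would be treated as outliers.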
    +++Linking
    use_identificationstrue Never link features that are annotated with different peptides (only the best hit per peptide identification is taken into account).true, false
    nr_partitions100 How many partitions in m/z space should be used for the algorithm (more partitions means faster runtime and more memory efficient execution).1:∞
    min_nr_diffs_per_bin50 If IDs are used: How many differences from matching IDs should be used to calculate a linking tolerance for unIDed features in an RT region. RT regions will be extended until that number is reached.5:∞
    min_IDscore_forTolCalc1.0 If IDs are used: What is the minimum score of an ID to assume a reliable match for tolerance calculation. Check your current score type!
    noID_penalty0.0 If IDs are used: For the normalized distances, how high should the penalty for missing IDs be? 0 = no bias, 1 = IDs inside the max tolerances always preferred (even if much further away).0.0:1.0
    ignore_chargefalse false [default]: pairing requires equal charge state (or at least one unknown charge '0'); true: Pairing irrespective of charge statetrue, false
    ignore_adducttrue false: pairing requires equal adducts (or at least one without adduct annotation); true [default]: pairing irrespective of adductstrue, false
    ++++distance_RT Distance component based on RT differences
    exponent1.0 Normalized RT differences ([0-1], relative to 'max_difference') are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)0.0:∞
    weight1.0 Final RT distances are weighted by this factor0.0:∞
    ++++distance_MZ Distance component based on m/z differences
    max_difference10.0 Never pair features with larger m/z distance (unit defined by 'unit')0.0:∞
    unitppm Unit of the 'max_difference' parameterDa, ppm
    exponent2.0 Normalized ([0-1], relative to 'max_difference') m/z differences are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)0.0:∞
    weight5.0 Final m/z distances are weighted by this factor0.0:∞
    ++++distance_intensity Distance component based on differences in relative intensity (usually relative to highest peak in the whole data set)
    exponent1.0 Differences in relative intensity ([0-1]) are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)0.0:∞
    weight0.1 Final intensity distances are weighted by this factor0.0:∞
    log_transformdisabled Log-transform intensities? If disabled, d = |int_f2 - int_f1| / int_max. If enabled, d = |log(int_f2 + 1) - log(int_f1 + 1)| / log(int_max + 1).enabled, disabled
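The distance components of the Linking section can be illustrated with a short sketch. The intensity formula follows the 'log_transform' description; the 'weighted_component' helper is a direct reading of "normalized difference raised to 'exponent', then multiplied by 'weight'". Both function names are hypothetical, and how the components are combined into one final distance is not shown because it is not described here.

```python
import math


def intensity_difference(int_f1, int_f2, int_max, log_transform=False):
    """Relative intensity difference d in [0, 1] between two features."""
    if log_transform:
        return (abs(math.log(int_f2 + 1) - math.log(int_f1 + 1))
                / math.log(int_max + 1))
    return abs(int_f2 - int_f1) / int_max


def weighted_component(diff, max_difference, exponent, weight):
    """Normalize diff by max_difference, raise to exponent, apply weight.

    Returns None when diff exceeds max_difference (features never paired).
    """
    if abs(diff) > max_difference:
        return None
    return weight * (abs(diff) / max_difference) ** exponent


# m/z defaults: max_difference = 10 ppm, exponent = 2, weight = 5.
mz_part = weighted_component(5.0, 10.0, 2.0, 5.0)
```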
    +++ProteinQuantification
    methodtop - top - quantify based on three most abundant peptides (number can be changed in 'top').
    - iBAQ (intensity based absolute quantification), calculate the sum of all peptide peak intensities divided by the number of theoretically observable tryptic peptides (https://rdcu.be/cND1J). Warning: only consensusXML or featureXML input is allowed!
    top, iBAQ
    best_charge_and_fractionfalse Distinguish between fraction and charge states of a peptide. For peptides, abundances will be reported separately for each fraction and charge;
    for proteins, abundances will be computed based only on the most prevalent charge observed of each peptide (over all fractions).
    By default, abundances are summed over all charge states.
    true, false
    ++++top Additional options for custom quantification using top N peptides.
    N3 Calculate protein abundance from this number of proteotypic peptides (most abundant first; '0' for all)0:∞
    aggregatemedian Aggregation method used to compute protein abundances from peptide abundancesmedian, mean, weighted_mean, sum
    include_alltrue Include results for proteins with fewer proteotypic peptides than indicated by 'N' (no effect if 'N' is 0 or 1)true, false
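A minimal sketch of the 'top' options (N, aggregate, include_all) applied to the peptide abundances of one protein in one sample. The function is hypothetical and simplified: it assumes the input already contains only proteotypic peptides, and it leaves out 'weighted_mean' because its weighting is not described here.

```python
import statistics


def protein_abundance(peptide_abundances, n=3, aggregate="median",
                      include_all=True):
    """Aggregate the n most abundant peptides ('0' for all)."""
    ranked = sorted(peptide_abundances, reverse=True)
    if not ranked:
        return None
    if n > 0:
        if len(ranked) < n and not include_all:
            return None  # too few peptides and 'include_all' is false
        ranked = ranked[:n]
    if aggregate == "median":
        return statistics.median(ranked)
    if aggregate == "mean":
        return statistics.mean(ranked)
    if aggregate == "sum":
        return sum(ranked)
    raise ValueError("unsupported aggregate in this sketch: " + aggregate)


top3_median = protein_abundance([10.0, 40.0, 20.0, 30.0])
```

Here the three most abundant peptides are 40, 30 and 20, so the default median aggregation reports 30.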
    ++++consensus Additional options for consensus maps (and identification results comprising multiple runs)
    normalizefalse Scale peptide abundances so that medians of all samples are equaltrue, false
    fix_peptidesfalse Use the same peptides for protein quantification across all samples.
    With 'N 0', all peptides that occur in every sample are considered.
    Otherwise ('N'), the N peptides that occur in the most samples (independently of each other) are selected,
    breaking ties by total abundance (there is no guarantee that the best co-occurring peptides are chosen!).
    true, false