FLASHDeconv 2.0 beta+

FLASHDeconv 2.0 beta+ is out! It supports MS1 and MS2 deconvolution and various output formats (e.g., *.tsv, *.mzML, *.msalign, and *.feature). The stable version of FLASHDeconv 2.0 will be officially integrated into OpenMS 2.7.0, which is to be released in the near future. This version introduces an important parameter that significantly reduces proteoform identification errors caused by incorrect precursor masses: -min_precursor_snr (see below). Precursor SNR is the SNR within the m/z range of the precursor envelope. Even if a mass is represented by a perfect isotope profile, a low precursor SNR caused by signal from other proteoforms in the same range (i.e., coelution) often leads to a false proteoform identification, because the precursor ion and the fragment ions do not match. Low-SNR precursors also often represent low harmonic mass artifacts. We found that complex samples contain a high proportion of low-SNR precursors, which lead to false positives that are hard to filter out with the target-decoy approach. FLASHDeconv 2.0 beta+ also supports TopPIC identification better than the previous version by generating all the msalign and feature files that TopPIC takes as input.

Installation

FLASHDeconv installation files (OpenMS-2.x.0-HEAD-; *.exe for Windows, *.dmg for macOS, and *.deb for Linux) and the source code (*-src.tar.gz) can be found here.

Parameters

The basic FLASHDeconv parameters are listed by simply running FLASHDeconv. Only -in and -out are mandatory. The advanced parameters are listed by running FLASHDeconv --helphelp. The parameters fall into three categories: FLASHDeconv tool parameters, FLASHDeconv algorithm parameters, and FeatureTracing algorithm parameters. The basic parameters in each category are described first, followed by the advanced ones.

Basic tool parameters: 

  • -in: input file (only *.mzML files are currently accepted).
  • -out: *.tsv file for feature level deconvolution results.
  • -out_spec: *.tsv files for spectrum level deconvolution results. Files should be specified per MS level.
  • -out_mzml: *.mzML file for MS1 and MS2 deconvoluted spectra.
  • -out_promex: *.ms1ft (ProMex output format) file. Only MS1 deconvoluted masses are written.
  • -out_topFD: *.msalign (TopFD output format) files. Files should be specified per MS level.
  • -out_topFD_feature: *.feature (TopFD feature output format) files. Files should be specified per MS level.
  • -mzml_mass_charge: specifies the charge of deconvoluted masses (-1, 0, or +1) in mzML output.
  • -preceding_MS1_count: specifies how many preceding MS1 spectra are searched for the precursor mass of a given MS2 spectrum. In top-down proteomics, some precursor peaks in MS2 are not among the deconvoluted masses of the MS1 spectrum immediately preceding the MS2 spectrum. In such cases, increasing this parameter extends the search to earlier MS1 spectra and helps determine exact precursor masses.
  • -write_detail: writes more detailed peak information (in the spectrum-level deconvolution *.tsv files).
  • -use_ensemble_spectrum: if set, all spectra are merged into a single ensemble spectrum (per MS level) and deconvolution is done for the ensemble spectrum. Basic peak picking is done for each ensemble spectrum.

 

Basic algorithm parameters (with prefix Algorithm: )

  • -Algorithm:tol: tolerance for each MS level in PPM.
  • -Algorithm:min_mass: minimum deconvoluted mass.
  • -Algorithm:max_mass: maximum deconvoluted mass.
  • -Algorithm:min_charge: minimum charge of MS1 peaks. This can be set negative for negative mode MS runs (as in RNA sequencing). For MS2, minimum charge is set to 1.
  • -Algorithm:max_charge: maximum charge of MS1 peaks. This can be set negative for negative mode MS runs (as in RNA sequencing). For MS2, maximum charge is set to its precursor charge.
  • -Algorithm:min_isotope_cosine: Cosine threshold between avg. and observed isotope pattern for MS1, 2, …
  • -Algorithm:min_qscore: QScore threshold. QScore is the probability that a mass is identified.
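
The cosine used by -Algorithm:min_isotope_cosine can be read as the ordinary cosine similarity between two intensity vectors. As a sketch, assume a is the theoretical isotope intensity vector and b the observed one, aligned isotope by isotope. The symbols a, b, and i are introduced here for illustration only, and the exact alignment FLASHDeconv applies is not described in this document.

```latex
\cos(a, b) = \frac{\sum_{i} a_i \, b_i}{\sqrt{\sum_{i} a_i^{2}} \; \sqrt{\sum_{i} b_i^{2}}}
```

Because intensities are non-negative, the value lies between 0 and 1 and depends only on the shape of the pattern, not on its absolute intensity. A mass whose pattern scores below the threshold is discarded.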

 

Basic FeatureTracing parameters (with prefix FeatureTracing: )

  • -FeatureTracing:mass_error_da: mass tolerance in Da for feature tracing.
  • -FeatureTracing:min_sample_rate: minimum fraction of scans along the feature trace that must contain a peak. To raise feature detection sensitivity, lower this value toward 0.
  • -FeatureTracing:min_trace_length: minimum expected length of a feature in seconds.

 

Advanced tool parameters: 

  • -min_precursor_snr: minimum precursor SNR (default 1.0)
  • -max_MS_level: specifies the maximum MS level.
  • -use_RNA_averagine: if set to 1, the RNA averagine model is used instead of the protein model.

 

Advanced algorithm parameters (with prefix Algorithm: )

  • -Algorithm:min_mz: minimum m/z value in Th.
  • -Algorithm:max_mz: maximum m/z value in Th.
  • -Algorithm:min_rt: minimum retention time in seconds.
  • -Algorithm:max_rt: maximum retention time in seconds.
  • -Algorithm:min_peaks: minimum number of supporting peaks (per MS level). Supporting peaks are the peaks that explain a mass. For instance, the peaks of distinct charge states or of water/NH3 loss are supporting peaks.
  • -Algorithm:min_mass_count: minimum number of deconvoluted masses per spectrum. Only used for real-time deconvolution.
  • -Algorithm:min_intensity: minimum peak intensity to consider.
  • -Algorithm:rt_window: retention time window for MS1 deconvolution.

 

Advanced FeatureTracing parameters (with prefix FeatureTracing: )

  • -FeatureTracing:quant_method: method of quantification for mass traces. ‘area’ is recommended for LC data and ‘median’ for direct injection data. ‘max_height’ simply uses the most intense peak in the trace.
  • -FeatureTracing:max_trace_length: maximum expected length of a feature in seconds.
  • -FeatureTracing:min_isotope_cosine: cosine threshold between avg. and observed isotope pattern for mass features. If not set, it is controlled by the -Algorithm:min_isotope_cosine option.

 

Running FLASHDeconv

FLASHDeconv currently has no GUI and runs only from the command line. The FLASHDeconv executable is located in the [OpenMS path]/bin directory.

The mandatory options are -in and -out. FLASHDeconv 2.0 accepts only mzML files as input. The basic parameters can be adjusted to match the instrument setup. To convert raw files to mzML, we recommend MSConvert. For MS1, either no peak picking or the vendor-provided peak picking may be used. For MS2, the vendor-provided peak picking is recommended.
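
The conversion step could look like the following sketch, which applies vendor peak picking to MS2 only. It assumes the ProteoWizard msconvert command-line tool is on the PATH. The raw file path, the output directory, and the RUN switch are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: RAW_FILE and MZML_DIR are hypothetical example paths.
RAW_FILE="${RAW_FILE:-/User/me/data/infile.raw}"
MZML_DIR="${MZML_DIR:-/User/me/data}"

# msconvert filter: vendor-provided peak picking restricted to MS level 2.
PEAK_FILTER="peakPicking vendor msLevel=2"

# Show the command; run it only when RUN=1 is set in the environment.
echo "msconvert $RAW_FILE --mzML --filter \"$PEAK_FILTER\" -o $MZML_DIR"
if [ "${RUN:-0}" = "1" ]; then
  msconvert "$RAW_FILE" --mzML --filter "$PEAK_FILTER" -o "$MZML_DIR"
fi
```

Changing the filter to msLevel=1- would pick peaks on all MS levels, which matches the option of using vendor peak picking for MS1 as well.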

For example, to deconvolute /User/me/data/infile.mzml and write the result to /User/me/out/outfilefeature.tsv, run the following command in the directory where FLASHDeconv is installed:

./FLASHDeconv -in /User/me/data/infile.mzml -out /User/me/out/outfilefeature.tsv
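
Optional parameters are appended to the same command line. The sketch below adds a deconvoluted mzML output, the precursor SNR threshold, and a mass range. The parameter names come from the lists above, but the output file name, the mass limits, and the RUN switch are assumptions made for illustration, not recommended settings.

```shell
#!/bin/sh
# Sketch only: paths and numeric values are illustrative assumptions.
FLASHDECONV="${FLASHDECONV:-./FLASHDeconv}"
IN_FILE="/User/me/data/infile.mzml"
OUT_DIR="/User/me/out"

# Assemble the argument list in the positional parameters ($@):
#   -out_mzml           optional deconvoluted spectra in mzML
#   -min_precursor_snr  1.0 is the documented default
#   min_mass/max_mass   hypothetical mass range
set -- \
  -in "$IN_FILE" \
  -out "$OUT_DIR/outfilefeature.tsv" \
  -out_mzml "$OUT_DIR/outfile_deconv.mzML" \
  -min_precursor_snr 1.0 \
  -Algorithm:min_mass 500 \
  -Algorithm:max_mass 50000

# Show the full command; run it only when RUN=1 is set in the environment.
CMD_LINE="$FLASHDECONV $*"
echo "$CMD_LINE"
if [ "${RUN:-0}" = "1" ]; then
  "$FLASHDECONV" "$@"
fi
```

Printing the command before running it makes it easy to check that algorithm parameters carry the Algorithm: prefix while tool parameters do not.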

The batch script description will be added here soon.
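
In the meantime, a whole directory can be processed with a small shell loop. The following is a minimal sketch and not the official batch script. The function names, the DRY_RUN switch, and the _feature.tsv suffix are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: a hypothetical wrapper that calls FLASHDeconv once per *.mzML file.
FLASHDECONV="${FLASHDECONV:-./FLASHDeconv}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the feature file path for one input: <out_dir>/<input name>_feature.tsv
feature_out_path() {
  # $1 = input mzML path, $2 = output directory
  base="$(basename "$1")"
  printf '%s/%s_feature.tsv\n' "$2" "${base%.*}"
}

# Run (or print) one FLASHDeconv call per *.mzML file in a directory.
run_batch() {
  # $1 = directory with *.mzML files, $2 = output directory
  for f in "$1"/*.mzML; do
    [ -e "$f" ] || continue   # the glob matched nothing
    out="$(feature_out_path "$f" "$2")"
    if [ "$DRY_RUN" = "1" ]; then
      echo "$FLASHDECONV -in $f -out $out"
    else
      mkdir -p "$2"
      "$FLASHDECONV" -in "$f" -out "$out" || echo "FLASHDeconv failed for $f" >&2
    fi
  done
}
```

Calling run_batch /User/me/data /User/me/out prints one command per input file. Setting DRY_RUN=0 executes them and reports any file for which FLASHDeconv returns an error. The glob is case sensitive, so files ending in .mzml would need their own pattern.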

 

Output files

  • Deconvoluted feature file (*.tsv) specified by -out
  • (optional) Deconvoluted MSn spectra files (*.tsv) specified by -out_spec
  • (optional) Deconvoluted mzML spectra file (*.mzML) specified by -out_mzml
  • (optional) Deconvoluted MS1 masses in ProMex output format (*.ms1ft) specified by -out_promex
  • (optional) Deconvoluted MSn spectra files in TopFD output format (*.msalign) specified by -out_topFD
  • (optional) Deconvoluted MSn feature files in TopFD feature output format (*.feature) specified by -out_topFD_feature
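
The columns of the *.tsv outputs are not listed here, so a quick way to inspect a result is to print its header. The helpers below are a sketch that assumes the first line of the file is a tab-separated header row. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the first line of the *.tsv file is a header row.

# Print the column names of a tab-separated file, one per line.
list_tsv_columns() {
  head -n 1 "$1" | tr '\t' '\n'
}

# Count the data rows, i.e. all lines after the header.
count_tsv_rows() {
  tail -n +2 "$1" | wc -l | tr -d ' '
}
```

For instance, list_tsv_columns /User/me/out/outfilefeature.tsv shows which fields the feature file contains, and count_tsv_rows reports how many features were written.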

Example datasets

Mass spectrometry datasets (*.raw and *.mzML) and the corresponding results have been uploaded to MassIVE (https://massive.ucsd.edu) and are available under accession number MSV000084001.