FLASHDeconv 2.0 beta+ with a GUI!!

A GUI is finally here. The GUI command is in the [OpenMS path]/bin folder: go to [OpenMS path]/bin and run FLASHDeconvWizard.

FLASHDeconv 2.0 beta+ is out. It works for MS1 and MS2 deconvolution and supports various output formats (e.g., *.tsv, *.mzML, *.msalign, and *.feature). The stable version of FLASHDeconv 2.0 will be officially integrated in OpenMS 2.7.0, to be released in the near future.

This version introduces an important parameter that significantly reduces proteoform identification errors caused by incorrect precursor masses: -min_precursor_snr (see below). Precursor SNR is the signal-to-noise ratio (SNR) within the precursor envelope m/z range. Even if a mass is represented by a perfect isotope profile, when its precursor SNR is low because signal from other proteoforms is present in the same range (i.e., coelution), the proteoform identification is often false, due to a mismatch between the precursor ion and the fragment ions. Precursors of low SNR also often represent low harmonic mass artifacts. We found that complex samples contain a high proportion of low-SNR precursors, leading to false positives that are hard to filter out using the target-decoy approach.

FLASHDeconv 2.0 beta+ also supports TopPIC identification better than the previous version by generating all the msalign and feature files that TopPIC takes as input.

Installation

FLASHDeconv installation files (OpenMS-2.x.0-HEAD-, *.exe for Windows, *.dmg for Mac, and *.deb for Linux) and source code (*-src.tar.gz) can be found here.

Parameters

FLASHDeconv basic parameters are listed by simply running FLASHDeconv. Only -in and -out are mandatory. FLASHDeconv advanced parameters are listed by running FLASHDeconv --helphelp. FLASHDeconv parameters fall into three categories: FLASHDeconv tool parameters, FLASHDeconv algorithm parameters, and FeatureTracing algorithm parameters. Below, the basic parameters of each category are described first, followed by the advanced ones.

Basic tool parameters: 

  • -in: input file (only *.mzML files are currently accepted).
  • -out: *.tsv file for feature level deconvolution results.
  • -out_spec: *.tsv files for spectrum level deconvolution results. Files should be specified per MS level.
  • -out_mzml: *.mzML file for MS1 and MS2 deconvoluted spectra.
  • -out_promex: *.ms1ft (promex output format) file. Only MS1 deconvoluted masses are written.
  • -out_topFD: *.msalign (TopFD output format) files. Files should be specified per MS level.
  • -out_topFD_feature: *.feature (TopFD feature output format) files. Files should be specified per MS level.
  • -mzml_mass_charge: specifies the charge of deconvoluted masses (-1, 0, or +1) in mzML output.
  • -preceding_MS1_count: specifies how many preceding MS1 spectra are searched for the precursor mass of a given MS2 spectrum. In top-down proteomics, some precursor peaks in MS2 are not part of the deconvoluted masses in the MS1 spectrum immediately preceding the MS2 spectrum. In such cases, increasing this parameter extends the search to earlier MS1 spectra and helps determine exact precursor masses.
  • -write_detail: writes more detailed peak information (in the spectrum level deconvolution *.tsv files).
  • -use_ensemble_spectrum: if set, all spectra are merged into a single ensemble spectrum (per MS level) and deconvolution is done for the ensemble spectrum. Basic peak picking is done for each ensemble spectrum.
  • -min_precursor_snr: minimum precursor SNR (default 1.0)
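
As a rough illustration of how the basic tool parameters combine, the sketch below assembles one FLASHDeconv call that requests spectrum level *.tsv files for MS1 and MS2, a deconvoluted mzML file, and a stricter precursor SNR cut-off. All file paths and the value 2.0 are hypothetical, and passing one -out_spec file name per MS level (MS1 first, then MS2) is an assumption based on the parameter descriptions above. The command is only printed, not executed.

```shell
# Hypothetical input and output locations; adjust to your own data.
in=/User/me/data/infile.mzml
outdir=/User/me/out

# One -out_spec file per MS level (assumed order: MS1, then MS2).
# 2.0 is an illustrative threshold, not a recommended value.
set -- -in "$in" \
       -out "$outdir/features.tsv" \
       -out_spec "$outdir/ms1_spec.tsv" "$outdir/ms2_spec.tsv" \
       -out_mzml "$outdir/deconv.mzML" \
       -min_precursor_snr 2.0

# Dry run: print the command. Remove "echo" to execute it from [OpenMS path]/bin.
echo ./FLASHDeconv "$@"
```

Printing the command first makes it easy to check the per-MS-level file names before starting a long run.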

Basic algorithm parameters (with prefix Algorithm: )

  • -Algorithm:tol: tolerance for each MS level in PPM.
  • -Algorithm:min_mass: minimum deconvoluted mass.
  • -Algorithm:max_mass: maximum deconvoluted mass.
  • -Algorithm:min_charge: minimum charge of MS1 peaks. This can be set negative for negative mode MS runs (as in RNA sequencing). For MS2, minimum charge is set to 1.
  • -Algorithm:max_charge: maximum charge of MS1 peaks. This can be set negative for negative mode MS runs (as in RNA sequencing). For MS2, maximum charge is set to its precursor charge.
  • -Algorithm:min_isotope_cosine: Cosine threshold between avg. and observed isotope pattern for MS1, 2, …
  • -Algorithm:min_qscore: QScore threshold. QScore is the probability that a mass is identified.
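
Because -Algorithm:tol is given in ppm, the absolute tolerance scales with the value it is applied to. The short calculation below converts a ppm value into an absolute m/z window, assuming the tolerance is applied to peak m/z values. The 10 ppm tolerance and the m/z of 1000 Th are example values chosen for illustration, not FLASHDeconv defaults, and ppm_to_th is a hypothetical helper name.

```shell
# Absolute tolerance (Th) = m/z * ppm * 1e-6
ppm_to_th() {
  awk -v mz="$1" -v ppm="$2" 'BEGIN { printf "%.4f\n", mz * ppm * 1e-6 }'
}

# 10 ppm at m/z 1000 corresponds to a window of 0.01 Th.
ppm_to_th 1000 10   # prints 0.0100
```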

 

Basic FeatureTracing parameters (with prefix FeatureTracing: )

  • -FeatureTracing:mass_error_da: mass tolerance in Da for feature tracing.
  • -FeatureTracing:min_sample_rate: minimum fraction of scans along the feature trace that must contain a peak. To raise feature detection sensitivity, lower this value close to 0.
  • -FeatureTracing:min_trace_length: minimum expected length of a feature in seconds.

 

Advanced tool parameters: 

  • -max_MS_level: specifies the maximum MS level.
  • -use_RNA_averagine: if set to 1, the RNA averagine model is used instead of the protein model.

 

Advanced algorithm parameters (with prefix Algorithm: )

  • -Algorithm:min_mz: minimum m/z value in Th.
  • -Algorithm:max_mz: maximum m/z value in Th.
  • -Algorithm:min_rt: minimum retention time in seconds.
  • -Algorithm:max_rt: maximum retention time in seconds.
  • -Algorithm:min_peaks: minimum number of peaks of consecutive charge states per MS level (e.g., -Algorithm:min_peaks 4 2 to specify 4 and 2 for MS1 and MS2, respectively). This affects only highly charged peaks (>8). Peaks of low charge are detected based on the m/z distance between isotopes.
  • -Algorithm:min_mass_count: minimum number of deconvoluted masses per spectrum. Only used for real time deconvolution.
  • -Algorithm:min_intensity: minimum peak intensity to consider. Default is 100 to remove extremely low intensity peaks (e.g., in Bruker spectra).
  • -Algorithm:rt_window: retention time window for MS1 deconvolution.

 

Advanced FeatureTracing parameters (with prefix FeatureTracing: )

  • -FeatureTracing:quant_method: Method of quantification for mass traces. For LC data ‘area’ is recommended, ‘median’ for direct injection data. ‘max_height’ simply uses the most intense peak in the trace.
  • -FeatureTracing:max_trace_length: maximum expected length of a feature in seconds.
  • -FeatureTracing:min_isotope_cosine: Cosine threshold between avg. and observed isotope pattern for mass features. If not set, it is controlled by the -Algorithm:min_isotope_cosine option.

Running FLASHDeconv with GUI

The GUI command is found under the [OpenMS path]/bin directory. From the bin directory, type

./FLASHDeconvWizard

The FLASHDeconvWizard window then opens.

From the “LC-MS files” menu you can select (possibly multiple) mzML files to analyze. The selected files are analyzed with the same parameter set.

Then, in the “Run FLASHDeconv” menu, you can control all the parameters and output options.

The default output folder is [home directory]/FLASHDeconvOut. You may change it using the Browse button on the right side. Below it are four output toggle buttons.

If “masses per spectrum” is selected, spectrum level deconvolution results (per MS level) are generated (in tsv format). If “mzML” is selected, they are generated in mzML format.

“Promex (*.ms1ft)” triggers the Promex format output generation (only for MS1), and “TopFD (*.msalign,*.feature)” triggers the TopFD format output generation (both msalign and feature formats). The box below the toggle buttons controls the parameters. By default it shows only the basic parameters.

The “Log” menu shows the log from FLASHDeconv, which can be checked during or after a FLASHDeconv run.

 

Running FLASHDeconv on command line

The FLASHDeconv executable can be found under the [OpenMS path]/bin directory.

The mandatory options are -in and -out. FLASHDeconv 2.0 only takes mzML files as input. Basic parameters can be adjusted according to the instrument setup. To convert raw files to mzML, we recommend MSConvert. For MS1, either no peak picking or the vendor-provided peak picking method may be used. For MS2, vendor-provided peak picking methods are recommended.
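
A possible MSConvert call matching these recommendations is sketched below. It assumes that ProteoWizard's msconvert command line tool is installed and that its peakPicking filter with the vendor option and an msLevel range is available in your version; convert_raw is a hypothetical helper name, and the paths in the usage line are hypothetical as well. Here vendor peak picking is applied to MS2 and higher only, leaving MS1 unpicked.

```shell
# Convert one raw file to mzML with vendor peak picking on MS level 2 and above.
# $1: raw file, $2: output folder
convert_raw() {
  msconvert "$1" --mzML --filter "peakPicking vendor msLevel=2-" -o "$2"
}

# Usage (hypothetical paths):
# convert_raw /User/me/data/infile.raw /User/me/data
```

To pick MS1 peaks with the vendor method as well, the msLevel range would start at 1 instead of 2.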

For example, to deconvolute /User/me/data/infile.mzml and write the result to /User/me/out/outfilefeature.tsv, run FLASHDeconv as follows from the directory where FLASHDeconv is installed:

./FLASHDeconv -in /User/me/data/infile.mzml -out /User/me/out/outfilefeature.tsv
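
To process several files with the same settings from the command line, the single call above can be wrapped in a loop. The sketch below is one way to do this; deconv_all is a hypothetical helper name, the folders are hypothetical, and the output name is simply derived from each input name. Each command is printed rather than executed, so remove echo to actually run it.

```shell
# Print one FLASHDeconv call per *.mzML file in a folder.
# $1: input folder, $2: output folder
deconv_all() {
  indir=$1
  outdir=$2
  for f in "$indir"/*.mzML; do
    [ -e "$f" ] || continue          # skip if the pattern matched nothing
    name=$(basename "$f" .mzML)      # e.g. infile.mzML -> infile
    echo ./FLASHDeconv -in "$f" -out "$outdir/${name}_feature.tsv"
  done
}

# Hypothetical folders; adjust to your own layout.
deconv_all /User/me/data /User/me/out
```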

 

Output files

  • Deconvoluted feature file (*.tsv) specified by -out
  • (optional) Deconvoluted MSn spectra files (*.tsv) specified by -out_spec
  • (optional) Deconvoluted mzML spectra file (*.mzML) specified by -out_mzml
  • (optional) Deconvoluted MS1 masses in Promex output format (*.ms1ft) specified by -out_promex
  • (optional) Deconvoluted MSn spectra files in TopFD output format (*.msalign) specified by -out_topFD
  • (optional) Deconvoluted MSn feature files in TopFD output format (*.feature) specified by -out_topFD_feature
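
Since the *.tsv outputs are tab-separated text, a quick way to see which columns a given file contains is to print its first line, one field per row. The helper below is a generic sketch that assumes the first line is a header and makes no assumption about the actual column names; list_columns is a hypothetical helper name, and the file path in the usage line is hypothetical.

```shell
# Print the fields of the first line of a tab-separated file, one per line.
list_columns() {
  head -n 1 "$1" | tr '\t' '\n'
}

# Usage (hypothetical path):
# list_columns /User/me/out/outfilefeature.tsv
```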

Example datasets

Mass spectrometry datasets (*.raw and *.mzML) and the corresponding results have been uploaded to MassIVE (https://massive.ucsd.edu) and are available under accession number MSV000084001.