FLASHDeconv: Ultrafast, high-quality feature deconvolution for top-down proteomics

Feature deconvolution, the determination of intact proteoform masses, is crucial for native and denatured top-down proteomics but currently suffers from long runtimes and frequent artifacts. We present FLASHDeconv, an algorithm based on a simple transformation of mass spectra that turns deconvolution into a search for constant patterns, thus greatly accelerating the process. We show higher deconvolution quality and execution two to three orders of magnitude faster than existing approaches.

 

Note:

The current version is a beta version.

Installation:

FLASHDeconv installation files (*.exe for Windows, *.dmg for Mac, and *.deb for Linux) are available here.

 

Parameters:

The basic parameters of FLASHDeconv are listed by simply running FLASHDeconv.

  • -in: input file or directory (only *.mzML files are currently accepted)
  • -out: output file prefix; [prefix].tsv is generated.
  • -tol: tolerance in ppm (default: 10)
  • -minC: minimum charge of peaks (default: 2)
  • -maxC: maximum charge of peaks (default: 100)
  • -minM: minimum mass of peaks (default: 1,000)
  • -maxM: maximum mass of peaks (default: 100,000)
  • -minIC: cosine threshold between the averagine and the observed isotope pattern (default: 0.7)
  • -minCC: cosine threshold between the per-charge intensities and the fitted Gaussian distribution (default: 0.7)
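
As a rough illustration of what the -tol and -minIC/-minCC thresholds measure, the sketch below computes a ppm window and a cosine similarity with awk. It assumes the standard definitions (ppm as a relative tolerance, value × ppm / 1,000,000; cosine as the normalized dot product of two intensity vectors). The helper names ppm_window and cosine are hypothetical, and how FLASHDeconv builds the averagine or Gaussian reference vectors internally is not covered here.

```shell
# ppm_window VALUE PPM
# Absolute tolerance corresponding to a relative tolerance in ppm.
ppm_window() {
  awk -v v="$1" -v ppm="$2" 'BEGIN { printf "%.4f\n", v * ppm / 1000000 }'
}

# cosine A B
# Cosine similarity of two comma-separated intensity lists of equal length.
cosine() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, ","); m = split(b, y, ",")
    if (n != m) { print "length mismatch" > "/dev/stderr"; exit 1 }
    for (i = 1; i <= n; i++) {
      dot += x[i] * y[i]   # dot product
      xx  += x[i] * x[i]   # squared norm of A
      yy  += y[i] * y[i]   # squared norm of B
    }
    if (xx == 0 || yy == 0) { print "0.000"; exit }  # undefined for zero vectors; report 0
    printf "%.3f\n", dot / sqrt(xx * yy)
  }'
}

ppm_window 1000 10       # prints 0.0100
cosine "1,2,3" "2,4,6"   # prints 1.000 (same shape, different scale)
cosine "1,0" "0,1"       # prints 0.000 (no overlap)
```

Under this reading, with the default -minIC of 0.7, an observed isotope pattern whose cosine against the reference pattern falls below 0.7 would be rejected.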

 

The advanced parameters of FLASHDeconv are listed by running FLASHDeconv with the --helphelp option.

  • -minICS: similar to -minIC option but this threshold is applied for each spectrum (instead of each feature; default: 0.5)
  • -minCCS: similar to -minCC option but this threshold is applied for each spectrum (instead of each feature; default: 0.5)
  • -minIT: intensity threshold (default: 0)
  • -RTwindow: RT window in seconds. When 0, 2% of the total gradient time is used (default: 0)
  • -minRTspan: minimum RT span for features in seconds (default: 1)
  • -writeSpecDeconv: write deconvoluted masses per spectrum (default: 0)
  • -jitter: jitter the universal pattern to generate decoy features (the output file will be named [prefix]Decoy.tsv; default: 0)
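
The fallback rule for -RTwindow can be worked through with a small calculation. This is only a sketch of the arithmetic stated above: the helper name rt_window is hypothetical, and the total gradient time is passed in by hand here, since how FLASHDeconv determines it from the input is not described.

```shell
# rt_window RTWINDOW GRADIENT_SECONDS
# Effective RT window in seconds: the given value, or 2% of the
# total gradient time when the value is 0.
rt_window() {
  awk -v w="$1" -v g="$2" 'BEGIN {
    if (w == 0) w = 0.02 * g
    printf "%.1f\n", w
  }'
}

rt_window 0 3600    # prints 72.0 (2% of a 60-minute gradient)
rt_window 30 3600   # prints 30.0 (explicit value is kept)
```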

 

Running FLASHDeconv:

No GUI is currently available; FLASHDeconv runs on the command line only.

The -in and -out options are mandatory. The basic parameters can be adjusted according to the instrument setup. When converting raw files to mzML input, we recommend not applying any peak picking method.
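
A call that adjusts some basic parameters might be assembled as in the following dry-run sketch, which only prints the command instead of executing it. The helper name flashdeconv_cmd, the paths, and the parameter values (-tol 5, -minC 5, -maxC 50) are illustrative assumptions and not recommendations for any particular instrument; the flag names are those listed under Parameters.

```shell
# flashdeconv_cmd IN OUT [extra options...]
# Prints a FLASHDeconv command line with the two mandatory options
# first, followed by any optional parameters. Display only: paths
# containing spaces would need quoting before the line is run.
flashdeconv_cmd() {
  in_path=$1
  out_path=$2
  shift 2
  set -- ./FLASHDeconv -in "$in_path" -out "$out_path" "$@"
  printf '%s\n' "$*"
}

flashdeconv_cmd /User/me/data/infile.mzml /User/me/out/outfile -tol 5 -minC 5 -maxC 50
```

Printing the command first makes it easy to check the options before starting a long run; the printed line can then be executed in the directory where FLASHDeconv is installed.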

 

You can specify a file or a directory for -in and -out options.

 

For example, to deconvolute /User/me/data/infile.mzml and write the result to /User/me/out/outfilefeature.tsv, run FLASHDeconv as follows from the directory where it is installed.

  1. -in [infile] -out [outfile]
    ./FLASHDeconv -in /User/me/data/infile.mzml -out /User/me/out/outfile

    In the /User/me/out/ directory, outfilefeature.tsv will be generated.

  2. -in [infile] -out [outdir]
    ./FLASHDeconv -in /User/me/data/infile.mzml -out /User/me/out/

    In the /User/me/out/ directory, infilefeature.tsv will be generated (the output filename follows the input filename).

  3. -in [dir] -out [file]
    ./FLASHDeconv -in /User/me/data/ -out /User/me/out/outfile

    FLASHDeconv will find all mzML files in /User/me/data/ (recursively) and process them. In the /User/me/out/ directory, outfilefeature.tsv will be generated (all features are written to this single file).

  4. -in [dir] -out [dir]
    ./FLASHDeconv -in /User/me/data/ -out /User/me/out/

    FLASHDeconv will find all mzML files in /User/me/data/ (recursively) and process them. In the /User/me/out/ directory, one output file will be generated for each input file.
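
The naming rules in the four cases above can be summarized in a short sketch that predicts the feature file name for one input file. It is an illustration under stated assumptions, not FLASHDeconv's own logic: the helper name expected_output is hypothetical, a trailing slash is used to recognize a directory (the tool itself may check the file system instead), and the "feature.tsv" suffix is taken from the examples above.

```shell
# expected_output IN_FILE OUT
# Prints the feature file path expected for one input file.
expected_output() {
  in_file=$1
  out=$2
  case $out in
    */)
      # -out is a directory: the output name follows the input name
      base=$(basename "$in_file")
      printf '%s%sfeature.tsv\n' "$out" "${base%.*}"
      ;;
    *)
      # -out is a file prefix: the suffix is appended to the prefix
      printf '%sfeature.tsv\n' "$out"
      ;;
  esac
}

expected_output /User/me/data/infile.mzml /User/me/out/outfile
# prints /User/me/out/outfilefeature.tsv
expected_output /User/me/data/infile.mzml /User/me/out/
# prints /User/me/out/infilefeature.tsv
```

When -in is a directory, the same rule would apply to each mzML file found: a file prefix collects all features in one file, while a directory yields one file per input.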

 

Output file:

Under construction.

 

Example datasets:

Mass spectrometry datasets (*.raw and *.mzML) and the corresponding results have been uploaded to MassIVE (https://massive.ucsd.edu) and are available under accession number MSV000084001.