OpenMS
DatabaseFilter

The DatabaseFilter tool filters a protein database in FASTA format according to one or more filtering criteria.

The resulting database is written as output. Depending on the reporting method (method="whitelist" or method="blacklist"), only those entries are retained that passed all filters ("whitelist") or failed at least one filter ("blacklist").

Implemented filter criteria:

accession: Filter the database according to the set of protein accessions contained in an identification file (idXML, mzIdentML).
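
The whitelist/blacklist behaviour of the accession filter can be illustrated with a small Python sketch. It is not the OpenMS implementation: the helper names (parse_fasta, accession_of, filter_database) are hypothetical, and it assumes that a protein's accession is the first whitespace-delimited token of its FASTA header, which may differ from how the tool matches accessions.

```python
from typing import List, Set, Tuple

Record = Tuple[str, str]  # (header line without '>', concatenated sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def accession_of(header: str) -> str:
    """Assumption: the accession is the first whitespace-delimited token."""
    tokens = header.split()
    return tokens[0] if tokens else ""


def filter_database(records: List[Record], identified: Set[str],
                    method: str = "whitelist") -> List[Record]:
    """Keep entries that pass the accession filter (whitelist) or fail it (blacklist)."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError("method must be 'whitelist' or 'blacklist'")
    keep_matches = method == "whitelist"
    return [rec for rec in records
            if (accession_of(rec[0]) in identified) == keep_matches]
```

With a three-entry database and the identified accessions P1 and P3, the whitelist method would keep P1 and P3, while the blacklist method would keep only P2.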

The command line parameters of this tool are:

DatabaseFilter -- Filters a protein database (FASTA format) based on identified proteins
Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/UTILS_DatabaseFilter.html
Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  DatabaseFilter <options>

Options (mandatory options marked with '*'):
  -in <file>*        Input FASTA file, containing a database. (valid formats: 'fasta')
  -id <file>*        Input file containing identified peptides and proteins. (valid formats: 'idXML', 'mzid')

  -method <choice>   Switch between white-/blacklisting (default: 'whitelist') (valid: 'whitelist', 'blacklist')
  -out <file>*       Output FASTA file where the reduced database will be written to. (valid formats: 'fasta')
                     
Common UTIL options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)
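
As an illustration of how the options above fit together, the following Python sketch assembles a DatabaseFilter command line and, optionally, runs it. The helper names (build_command, run_filter) and the file names are hypothetical; only the flags -in, -id, -method, -out and -threads are taken from the option list above, and the tool is assumed to be available on PATH.

```python
import shutil
import subprocess
from typing import List


def build_command(in_fasta: str, id_file: str, out_fasta: str,
                  method: str = "whitelist", threads: int = 1) -> List[str]:
    """Assemble the argument list using only the options listed in the help text."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError("method must be 'whitelist' or 'blacklist'")
    return [
        "DatabaseFilter",
        "-in", in_fasta,        # mandatory: input FASTA database
        "-id", id_file,         # mandatory: idXML or mzid identifications
        "-method", method,      # optional: defaults to 'whitelist'
        "-out", out_fasta,      # mandatory: reduced FASTA database
        "-threads", str(threads),
    ]


def run_filter(cmd: List[str]) -> None:
    """Run the assembled command; raises if the tool is missing or exits non-zero."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(cmd[0] + " was not found on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names; replace them with real paths before running.
    command = build_command("database.fasta", "identifications.idXML",
                            "filtered.fasta")
    print(" ".join(command))
```

Calling run_filter(command) would then execute the tool. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.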

INI file documentation of this tool:

Legend: parameters marked with '*' are required.

Name              Default    Description                                                       Type / restrictions
+DatabaseFilter              Filters a protein database (FASTA format) based on identified proteins
  version         3.0.0      Version of the tool that generated this parameters file.
  ++1                        Instance '1' section for 'DatabaseFilter'
    in*                      Input FASTA file, containing a database.                          input file, *.fasta
    id*                      Input file containing identified peptides and proteins.           input file, *.idXML, *.mzid
    method        whitelist  Switch between white-/blacklisting                                whitelist, blacklist
    out*                     Output FASTA file where the reduced database will be written to.  output file, *.fasta
    log                      Name of log file (created only when specified)
    debug         0          Sets the debug level
    threads       1          Sets the number of threads allowed to be used by the TOPP tool
    no_progress   false      Disables progress logging to command line                         true, false
    force         false      Overrides tool-specific checks                                    true, false
    test          false      Enables the test mode (needed for internal use only)              true, false