Filters protein identification engine results by different criteria.
| potential predecessor tools | IDFilter | potential successor tools |
| MascotAdapter (or other ID engines) | | PeptideIndexer |
| IDFileConverter | | ProteinInference |
| FalseDiscoveryRate | | IDMapper |
| ConsensusID | | |
This tool filters the identifications produced by a peptide/protein identification tool such as Mascot. The available filters are described in the parameter listing below.
To enable a filter, change its parameter from the default value. All active filters are applied in order.
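As an illustration of enabling filters by overriding their defaults, the following Python sketch assembles an IDFilter command line using only options that appear in the parameter listing of this page. The helper `build_idfilter_command` and the file names are hypothetical, and the command is only built and printed, not executed.

```python
import shlex

def build_idfilter_command(in_file, out_file, options):
    """Assemble an IDFilter call as an argument list.

    `options` maps option names (without the leading '-') to values;
    a value of True marks a flag that takes no argument.
    """
    cmd = ["IDFilter", "-in", in_file, "-out", out_file]
    for name, value in options.items():
        cmd.append("-" + name)
        if value is not True:
            cmd.append(str(value))
    return cmd

# Hypothetical file names. Options that are not listed here keep their
# default values, so the corresponding filters stay inactive.
cmd = build_idfilter_command(
    "search_results.idXML",
    "filtered.idXML",
    {"best:n_peptide_hits": 1, "min_length": 6, "unique": True},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a system where IDFilter is installed.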
The command line parameters of this tool are:
IDFilter -- Filters results from protein or peptide identification engines based on different criteria.
Version: 2.0.0 Jul 29 2015, 20:40:22, Revision: GIT-NOTFOUND
Usage:
IDFilter <options>
Options (mandatory options marked with '*'):
-in <file>* Input file (valid formats: 'idXML')
-out <file>* Output file (valid formats: 'idXML')
Filtering by precursor RT or m/z:
-precursor:rt [min]:[max] Retention time range to extract. (default: ':')
-precursor:mz [min]:[max] Mass-to-charge range to extract. (default: ':')
-precursor:allow_missing When filtering by precursor RT or m/z, keep peptide IDs with
missing precursor information ('RT'/'MZ' meta values)?
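The '[min]:[max]' range syntax and the effect of 'allow_missing' can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation: the function names are hypothetical, and treating the bounds as inclusive is an assumption.

```python
def parse_range(text):
    """Split '[min]:[max]' into (lower, upper); an empty side means unbounded."""
    lower, _, upper = text.partition(":")
    return (float(lower) if lower else None,
            float(upper) if upper else None)

def in_range(value, bounds, allow_missing=False):
    """Return True if `value` passes the range filter described by `bounds`."""
    lower, upper = bounds
    if lower is None and upper is None:
        # Default ':' leaves the filter inactive, so everything passes.
        return True
    if value is None:
        # Missing precursor information passes only with allow_missing.
        return allow_missing
    # Assumption: both bounds are inclusive.
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True
```

For example, `in_range(1500.0, parse_range("1200:1800"))` is true, while a peptide ID without an RT value is rejected by the same range unless `allow_missing=True` is given.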
Filtering by peptide/protein score. To enable any of the filters below, just change their default value. All
active filters will be applied in order:
-score:pep <score> The score which should be reached by a peptide hit to be kept.
The score is dependent on the most recent(!) preprocessing -
it could be Mascot scores (if a MascotAdapter was applied
before), or an FDR (if FalseDiscoveryRate was applied before), etc.
(default: '0')
-score:prot <score> The score which should be reached by a protein hit to be kept.
Use in combination with 'delete_unreferenced_peptide_hits' to
remove affected peptides. (default: '0')
Filtering by significance threshold:
-thresh:pep <fraction> Keep a peptide hit only if its score is above this fraction of
the peptide significance threshold. (default: '0')
-thresh:prot <fraction> Keep a protein hit only if its score is above this fraction of
the protein significance threshold. Use in combination with
'delete_unreferenced_peptide_hits' to remove affected peptides.
(default: '0')
Filtering by whitelisting (only instances also present in a whitelist file can pass):
-whitelist:proteins <file> Filename of a FASTA file containing protein sequences.
All peptides that are not a substring of a sequence in this file
are removed.
All proteins whose accession is not present in this file are
removed. (valid formats: 'fasta')
-whitelist:by_seq_only Match peptides with FASTA file by sequence instead of accession
and disable protein filtering.
-whitelist:protein_accessions <accessions> All peptides that are not referencing at least one of the
provided protein accessions are removed.
Only proteins of the provided list are retained.
Filtering by blacklisting (only instances not present in a blacklist file can pass):
-blacklist:peptides <file> Peptides having the same sequence and modification assignment
as any peptide in this file will be filtered out. Use with the
'blacklist:ignore_modifications' flag to only compare by sequence.
(valid formats: 'idXML')
-blacklist:ignore_modifications Compare blacklisted peptides by sequence only.
Filtering by RT predicted by 'RTPredict':
-rt:p_value <float> Retention time filtering by the p-value predicted by RTPredict.
(default: '0' min: '0' max: '1')
-rt:p_value_1st_dim <float> Retention time filtering by the p-value predicted by RTPredict
for first dimension. (default: '0' min: '0' max: '1')
Filtering by mz:
-mz:error <float> Filtering by deviation to theoretical mass (disabled for
negative values). (default: '-1')
-mz:unit <String> Absolute or relative error. (default: 'ppm' valid: 'Da', 'ppm')
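The difference between the two 'mz:unit' settings can be shown with a short Python sketch. It uses the standard definition of a relative error in parts per million; the function names are hypothetical, and the inclusive comparison against the threshold is an assumption rather than documented behavior.

```python
def deviation(observed, theoretical, unit="ppm"):
    """Absolute ('Da') or relative ('ppm') deviation from the theoretical value."""
    delta = abs(observed - theoretical)
    if unit == "Da":
        return delta
    if unit == "ppm":
        return delta / theoretical * 1e6
    raise ValueError("unit must be 'Da' or 'ppm'")

def passes_error_filter(observed, theoretical, max_error=-1.0, unit="ppm"):
    """A negative max_error disables the filter, mirroring the default of -1."""
    if max_error < 0:
        return True
    # Assumption: a deviation equal to the threshold still passes.
    return deviation(observed, theoretical, unit) <= max_error
```

A deviation of 0.005 at a theoretical value of 500 corresponds to about 10 ppm, so it passes a 20 ppm threshold but fails a 5 ppm threshold.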
Filtering best hits per spectrum (for peptides) or from proteins:
-best:n_peptide_hits <integer> Keep only the 'n' highest scoring peptide hits per spectrum
(for n>0). (default: '0' min: '0')
-best:n_protein_hits <integer> Keep only the 'n' highest scoring protein hits (for n>0).
(default: '0' min: '0')
-best:strict Keep only the highest scoring peptide hit.
Similar to n_peptide_hits=1, but if there are two or more
highest scoring hits, none are kept.
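The difference between 'best:n_peptide_hits' with n=1 and 'best:strict' shows up only when the top score is tied. The Python sketch below illustrates this for the hits of a single spectrum. It assumes that hits are (sequence, score) pairs and that a higher score is better; both are simplifications made for the example, and the function names are hypothetical.

```python
def keep_best_n(hits, n):
    """Keep the n highest scoring hits; n <= 0 leaves the list unchanged."""
    if n <= 0:
        return list(hits)
    # Assumption: higher scores are better.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:n]

def keep_best_strict(hits):
    """Keep the single best hit, or nothing if the top score is tied."""
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return []
    return ranked[:1]

# Two hits share the top score of 40.0.
tied = [("PEPTIDEK", 40.0), ("PEPTLDEK", 40.0), ("AAAK", 10.0)]
print(keep_best_n(tied, 1))    # one of the tied hits survives
print(keep_best_strict(tied))  # no hit survives
```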
-min_length <integer> Keep only peptide hits with a length greater than or equal to this
value. Value 0 will have no filter effect. (default: '0' min: '0')
-max_length <integer> Keep only peptide hits with a length less than or equal to this value.
Value 0 will have no filter effect. Value is overridden by
min_length, i.e. if max_length < min_length, max_length will be
ignored. (default: '0' min: '0')
-min_charge <integer> Keep only peptide hits for tandem spectra with charge greater
or equal this value. (default: '1' min: '1')
-var_mods Keep only peptide hits with variable modifications (fixed
modifications from SearchParameters will be ignored).
-unique If a peptide hit occurs more than once per PSM, only one
instance is kept.
-unique_per_protein Only peptides matching exactly one protein are kept. Remember
that isoforms count as different proteins!
-keep_unreferenced_protein_hits Proteins not referenced by a peptide are retained in the ids.
-remove_decoys Remove proteins according to the information in the user
parameters. Usually used in combination with
'delete_unreferenced_peptide_hits'.
-delete_unreferenced_peptide_hits Peptides not referenced by any protein are deleted in the ids.
Usually used in combination with 'score:prot' or 'thresh:prot'.
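The interaction between 'min_length' and 'max_length' described above (0 disables a bound, and 'max_length' is ignored when it is smaller than 'min_length') can be sketched in Python. The function name is hypothetical, and the sketch assumes the peptide is given as a plain amino-acid string without modification annotations.

```python
def passes_length_filter(sequence, min_length=0, max_length=0):
    """Apply the min_length/max_length rules to one peptide sequence."""
    length = len(sequence)
    if min_length > 0 and length < min_length:
        return False
    # max_length is ignored when it is 0 or smaller than min_length.
    max_active = max_length > 0 and max_length >= min_length
    if max_active and length > max_length:
        return False
    return True
```

With `min_length=6` and `max_length=4`, a 14-residue peptide passes because the upper bound is ignored; with `max_length=10` it is removed.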
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool
(default: '1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
INI file documentation of this tool:
| OpenMS / TOPP release 2.0.0 | Documentation generated on Thu Jul 30 2015 03:13:00 using doxygen 1.8.9.1 |