RCSB PDB Protein Data Bank A Member of the wwPDB
An Information Portal to Biological Macromolecular Structures

Sequence Redundancy Analysis in TargetDB

The number of unique (non-redundant) targets and structures in TargetDB is calculated monthly by clustering sequences with the BLASTClust program.

BLASTClust program thresholds:

  • -S (similarity threshold) is set as a percentage of identical residues: 100, 90, 70, 50, or 30.
  • -L (minimum length coverage) is set to the default: 0.9.
  • -b is set to F (false), indicating that the coverage specified by the -S and -L thresholds is required on only one sequence of a pair.
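
As an illustration of the thresholds listed above, the following minimal sketch builds one BLASTClust command line per identity threshold. It assumes the legacy NCBI `blastclust` executable with its `-i` (input FASTA), `-o` (output file), and `-p T` (protein) options; the file names and the helper `blastclust_command` are hypothetical and are not taken from the TargetDB pipeline. The sketch only prints the commands and does not run them.

```python
# Identity thresholds listed on this page (percent identical residues).
IDENTITY_THRESHOLDS = [100, 90, 70, 50, 30]


def blastclust_command(fasta_path, out_path, identity, length_coverage=0.9):
    """Return a blastclust argument list for one identity threshold.

    fasta_path and out_path are hypothetical file names chosen for
    illustration only.
    """
    return [
        "blastclust",
        "-i", fasta_path,             # input FASTA file
        "-o", out_path,               # output file with the clusters
        "-p", "T",                    # sequences are proteins
        "-S", str(identity),          # percent identity threshold
        "-L", str(length_coverage),   # minimum length coverage (default 0.9)
        "-b", "F",                    # coverage required on only one sequence of a pair
    ]


if __name__ == "__main__":
    for identity in IDENTITY_THRESHOLDS:
        cmd = blastclust_command(
            "targets.fasta", "clusters_%d.txt" % identity, identity
        )
        print(" ".join(cmd))
```

Each printed line could then be passed to a shell or to `subprocess.run`, provided the `blastclust` binary is installed.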

Protein sequences shorter than 20 amino acids are excluded from clustering.
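
The length cutoff could be applied before clustering with a filter along the following lines. This is a sketch under stated assumptions: the input is assumed to be FASTA text, and the names `parse_fasta`, `filter_short`, and `MIN_LENGTH` are hypothetical. How TargetDB actually performs this step is not described on this page.

```python
MIN_LENGTH = 20  # sequences with fewer residues are excluded


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_short(records, min_length=MIN_LENGTH):
    """Keep only records whose sequence has at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


if __name__ == "__main__":
    example = ">short\n" + "A" * 19 + "\n>long\n" + "M" * 10 + "\n" + "K" * 10 + "\n"
    for header, seq in filter_short(parse_fasta(example)):
        print(header, len(seq))
```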

Target sequence redundancy calculations are based on comparison with all protein sequences in TargetDB that are in the same experimental status category.

Sequence redundancy calculations for structures released in the PDB are based on comparison with all protein sequences in the PDB.

© RCSB PDB