PSI TargetDB

TargetDB Statistics Summary Report

Last updated: Mar 11 2010



Target Status Statistics

Total number of targets deposited by worldwide SG Centers in TargetDB: 246866

Table 1: TargetDB Status Statistics

Status | Total Number of Targets | % Relative to "Cloned" Targets | % Relative to "Expressed" Targets | % Relative to "Purified" Targets | % Relative to "Crystallized" Targets
Cloned | 174533 | 100.0 | - | - | -
Expressed | 122094 | 70.0 | 100.0 | - | -
Soluble | 46549 | 26.7 | 38.1 | - | -
Purified | 42798 | 24.5 | 35.1 | 100.0 | -
Crystallized | 14421 | 8.3 | 11.8 | 33.7 | 100.0
Diffraction-quality Crystals | 7614 | 4.4 | 6.2 | 17.8 | 52.8
Diffraction | 6534 | 3.7 | 5.4 | 15.3 | 45.3
NMR Assigned | 2135 | 1.2 | 1.7 | 5.0 | -
HSQC | 3885 | 2.2 | 3.2 | 9.1 | -
Crystal Structure | 5148 | 2.9 | 4.2 | 12.0 | 35.7
NMR Structure | 2034 | 1.2 | 1.7 | 4.8 | -
In PDB [Note 1] | 7470 | 4.3 | 6.1 | 17.5 | 38
Work Stopped | 33829 | - | - | - | -
Test Target | 94 | - | - | - | -
Other | 8139 | - | - | - | -


Note 1:   The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptides with different ligands). Conversely, multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 1: Experimental Status in TargetDB


This graph is normalized relative to the number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 71% of all targets in TargetDB.
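
The normalization described here is plain ratio arithmetic, and the published figures can be reproduced from the raw counts. The following is a minimal Python sketch using counts from Table 1; the helper name `percent_of` and the dictionary layout are illustrative assumptions and not part of TargetDB.

```python
# Counts copied from Table 1 and from the total stated above it.
TOTAL_TARGETS = 246866
counts = {
    "Cloned": 174533,
    "Expressed": 122094,
    "Purified": 42798,
    "Crystallized": 14421,
    "Crystal Structure": 5148,
}


def percent_of(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all deposited targets that reached the "Cloned" status.
cloned_share = round(100.0 * counts["Cloned"] / TOTAL_TARGETS)

# Each status relative to "Cloned", as in the first percentage column of Table 1.
relative_to_cloned = {
    status: percent_of(n, counts["Cloned"]) for status, n in counts.items()
}

print(cloned_share)
print(relative_to_cloned)
```

The same helper applies to the other percentage columns by changing the denominator, for example `percent_of(counts["Crystal Structure"], counts["Crystallized"])`.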


Table 2: TargetDB Status Statistics by Organism

These statistics are derived by mapping target sequences to GenBank using a sequence identity cutoff of >= 98%.

Organism Total Number1 Work Stopped Cloned Expressed Purified Crystallized Crystal Structure NMR Structure In PDB2
Viruses121820879455324955371852
Archaea1662219421307792033865141268559797
Bacteria1575621957111670687801311501124838634974602
Prokaryota1741782151312977997000350131265945475555397
Yeast3054543197016891152118591560
Plasmodium4975416278312011946519323
Trypanosoma5287703443175729258908
Leishmania86692714181212037113920016
Arabidopsis779038433814111530284355389
Rice16812114374124101
Worm14460289212225555548011430536
Drosophila98122230176531081119
Mouse63291300404632101671376132667799
Human16089195484606731322464820511971402
Eukaryota663851137040811230317238159253314611985
Uncultured or unidentified26736170137753611019


Note 1:   Total counts in this table may differ from the total number of targets and structures. A target is counted under more than one organism if:
- the target is mapped to different organisms
- the target is a hybrid complex (for example, a complex of human and mouse polypeptides)
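
The effect of this counting rule can be sketched in a few lines of Python. The target identifiers and organism assignments below are invented for illustration only; they are not TargetDB records, and `count_by_organism` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical target-to-organism assignments (identifiers are made up).
target_organisms = {
    "T001": {"Human"},
    "T002": {"Human", "Mouse"},  # sequence maps to both organisms
    "T003": {"Mouse"},
    "T004": {"Human", "Mouse"},  # hybrid complex of human and mouse polypeptides
}


def count_by_organism(mapping):
    """Count each target once for every organism it is assigned to."""
    tally = Counter()
    for organisms in mapping.values():
        tally.update(organisms)
    return tally


per_organism = count_by_organism(target_organisms)
unique_targets = len(target_organisms)

print(dict(per_organism))
print(sum(per_organism.values()), unique_targets)
```

In this toy data set the per-organism counts add up to 6 although there are only 4 distinct targets, which is the kind of discrepancy the note describes.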

Figure 2: Source Organisms in TargetDB


Deposited Structure Statistics

Number of released X-Ray structures reported to TargetDB: 5969

Number of released NMR structures reported to TargetDB: 1865

Number of released Cryo-Electron Microscopy structures reported to TargetDB: 3

Total number of released structures from worldwide SG Centers reported to TargetDB: 7837


Table 3: PDB Status Statistics for Structural Genomics Structures

Status | All Centers | PSI Centers | Non-PSI SG Centers in North America | SG Centers in Europe | SG Centers in Asia
Total Deposited | 8079 | 4875 | 379 | 152 | 2714
Released | 7837 | 4652 | 371 | 148 | 2709
Release on Publication | 8 | 1 | 2 | 0 | 5
Release on Certain Date | 0 | 0 | 0 | 0 | 0
In Process | 234 | 222 | 8 | 4 | 0
1:   Some PDB IDs are cross-referenced by different centers. Example: PDB_id 106Y is associated with the SPINE and TB centers. As a result, the number of structures in the "All Centers" column can differ from the direct sum of the numbers of structures from the individual projects or geographical regions.
2:   "Total Deposited" counts all structures in the PDB, including structures released to the public and structures that are in the process of being released ("Release on Publication", "Release on Certain Date", etc.).
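
Both notes can be checked against the figures in Table 3. The Python sketch below is only an illustration: the counts are the Table 3 values as read from this report, and the variable names are assumptions introduced here.

```python
# Counts from Table 3, in the column order:
# All Centers, PSI, Non-PSI North America, Europe, Asia.
table3 = {
    "Total Deposited": [8079, 4875, 379, 152, 2714],
    "Released": [7837, 4652, 371, 148, 2709],
    "Release on Publication": [8, 1, 2, 0, 5],
    "Release on Certain Date": [0, 0, 0, 0, 0],
    "In Process": [234, 222, 8, 4, 0],
}

ALL = 0  # index of the "All Centers" column

# Note 2: "Total Deposited" is the sum of the individual release statuses.
status_sum = sum(
    row[ALL] for name, row in table3.items() if name != "Total Deposited"
)

# Note 1: summing the four center groups counts cross-referenced PDB IDs
# more than once, so the group sum exceeds the "All Centers" figure.
group_sum = sum(table3["Total Deposited"][1:])
excess = group_sum - table3["Total Deposited"][ALL]

print(status_sum, group_sum, excess)
```

With these figures the statuses add up to the 8079 deposited structures, while the four groups add up to 8120, an excess of 41 that is consistent with cross-referenced entries.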

Figure 3: Structures Released by SG Centers by Year


Sequence Redundancy Statistics

Table 4: TargetDB Sequence Redundancy Statistics by Experimental Status

Sequence Identity (%) | Novel Targets, Status: Selected | Novel Targets, Status: Cloned | Novel Targets, Status: Expressed | Novel Targets, Status: Purified | Novel Targets, Status: Crystallized | Novel Targets, Status: Crystal Structure | Novel Targets, Status: NMR Structure | Novel Targets, Status: In PDB
<100 | 162049 | 116756 | 84468 | 31911 | 12006 | 4278 | 1945 | 6404
<90 | 149171 | 109736 | 79848 | 30469 | 11548 | 4103 | 1927 | 6204
<70 | 132694 | 99997 | 73384 | 28686 | 11194 | 4039 | 1813 | 6022
<50 | 104120 | 81765 | 60523 | 24699 | 10216 | 3822 | 1678 | 5617
<30 | 56795 | 48009 | 35313 | 16190 | 7608 | 3210 | 1478 | 4679
Last updated: 10-03-08
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of the sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in TargetDB that are in the same experimental status category and are at least 20 amino acids long.
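
The general idea of threshold-based clustering can be sketched as follows. This is not BLASTClust itself: it is a minimal Python illustration that assumes single-linkage clustering on precomputed pairwise identities, assumes that the number of novel targets corresponds to the number of resulting clusters, and omits any alignment or length-coverage criteria. The function name, sequence labels, and identity values are all hypothetical.

```python
def count_clusters(lengths, identities, threshold, min_length=20):
    """Count single-linkage clusters among sufficiently long sequences.

    lengths:    dict of sequence id -> length in amino acids
    identities: dict of (id_a, id_b) -> percent sequence identity
    threshold:  pairs at or above this identity join the same cluster
    min_length: sequences shorter than this are ignored
    """
    kept = [s for s, n in lengths.items() if n >= min_length]
    parent = {s: s for s in kept}

    def find(s):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), identity in identities.items():
        if a in parent and b in parent and identity >= threshold:
            parent[find(a)] = find(b)

    return len({find(s) for s in kept})


# Hypothetical sequences and pairwise identities, for illustration only.
lengths = {"A": 250, "B": 240, "C": 300, "D": 15}
identities = {
    ("A", "B"): 95.0,
    ("B", "C"): 45.0,
    ("A", "C"): 28.0,
    ("A", "D"): 99.0,  # ignored: D is shorter than 20 residues
}

for threshold in (100, 90, 30):
    print(threshold, count_clusters(lengths, identities, threshold))
```

Lowering the threshold merges more sequences into the same cluster, which is why the counts in each column of Table 4 fall as the identity cutoff decreases from <100 to <30.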

Table 5: Sequence Redundancy Statistics for Structures Released by SG Centers in the PDB by Year

Year | Released Structures | Number of Released Structures <30% Sequence Identity at Time of Release | Percent (%) of Released Structures <30% Sequence Identity at Time of Release
<= 2000 | 98 | 33 | 34
2001 | 73 | 24 | 33
2002 | 171 | 59 | 35
2003 | 413 | 156 | 38
2004 | 955 | 377 | 39
2005 | 1062 | 358 | 34
2006 | 1156 | 445 | 38
2007 | 1598 | 561 | 35
2008 | 1061 | 496 | 47
2009 | 1073 | 475 | 44
2010 | 168 | 81 | 48
Total | 7828 | 3065 | 39
Last updated: 10-03-11
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of the sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in the PDB that are at least 20 amino acids long.
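
The percentage column and the totals row of Table 5 follow directly from the two count columns. The short Python sketch below recomputes them; the tuple layout and the helper name `percent_novel` are assumptions made for this illustration.

```python
# (year label, released structures, released structures below 30% identity)
table5 = [
    ("<= 2000", 98, 33),
    ("2001", 73, 24),
    ("2002", 171, 59),
    ("2003", 413, 156),
    ("2004", 955, 377),
    ("2005", 1062, 358),
    ("2006", 1156, 445),
    ("2007", 1598, 561),
    ("2008", 1061, 496),
    ("2009", 1073, 475),
    ("2010", 168, 81),
]


def percent_novel(released, novel):
    """Percentage of released structures below 30% identity, as a whole number."""
    return round(100.0 * novel / released)


total_released = sum(released for _, released, _ in table5)
total_novel = sum(novel for _, _, novel in table5)

for year, released, novel in table5:
    print(year, percent_novel(released, novel))
print("Total", total_released, total_novel,
      percent_novel(total_released, total_novel))
```

The recomputed totals (7828 released, 3065 below 30% identity, 39%) agree with the last row of the table.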

Figure 4: Comparison of Novel Structures with Number of Structures Released By SG Centers


Summary Statistics Reports by Project or Geographical Region: