PSI TargetDB

TargetDB Statistics Summary Report

Last updated: Jul 2 2009



Target Status Statistics

Total number of targets deposited by worldwide SG Centers in TargetDB: 222177

Table 1: TargetDB Status Statistics

Status | Total Number of Targets | (%) Relative to "Cloned" Targets | (%) Relative to "Expressed" Targets | (%) Relative to "Purified" Targets | (%) Relative to "Crystallized" Targets
Cloned | 149354 | 100.0 | - | - | -
Expressed | 101435 | 67.9 | 100.0 | - | -
Soluble | 38556 | 25.8 | 38.0 | - | -
Purified | 36908 | 24.7 | 36.4 | 100.0 | -
Crystallized | 12834 | 8.6 | 12.7 | 34.8 | 100.0
Diffraction-quality Crystals | 6505 | 4.4 | 6.4 | 17.6 | 50.7
Diffraction | 5835 | 3.9 | 5.8 | 15.8 | 45.5
NMR Assigned | 2075 | 1.4 | 2.0 | 5.6 | -
HSQC | 3861 | 2.6 | 3.8 | 10.5 | -
Crystal Structure | 4633 | 3.1 | 4.6 | 12.6 | 36.1
NMR Structure | 1968 | 1.3 | 1.9 | 5.3 | -
In PDB (Note 1) | 6811 | 4.6 | 6.7 | 18.5 | 38
Work Stopped | 37080 | - | - | - | -
Test Target | 104 | - | - | - | -
Other | 10494 | - | - | - | -

Last updated: Jul 2 2009

Note 1:   The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptides with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 1: Experimental Status in TargetDB

Last updated: Jul 2 2009

This graph is normalized relative to the number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 67% of all targets in TargetDB.
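
The percentages in Table 1 and the normalization used for Figure 1 are simple ratios of target counts. The following is a minimal Python sketch of that arithmetic, using counts copied from Table 1. The helper name `percent_of` is illustrative and is not part of TargetDB.

```python
# Counts copied from Table 1 (Jul 2 2009 snapshot).
TOTAL_TARGETS = 222177
counts = {
    "Cloned": 149354,
    "Expressed": 101435,
    "Purified": 36908,
    "Crystallized": 12834,
    "Crystal Structure": 4633,
}


def percent_of(count, reference):
    """Return count as a percentage of reference, rounded to one decimal."""
    return round(100.0 * count / reference, 1)


# Each status relative to the number of cloned targets (Figure 1 normalization).
relative_to_cloned = {
    status: percent_of(n, counts["Cloned"]) for status, n in counts.items()
}

# Share of all deposited targets that reached status "Cloned".
cloned_share = percent_of(counts["Cloned"], TOTAL_TARGETS)
```

Passing a different reference count (for example `counts["Expressed"]`) gives the other "Relative to" columns in the same way.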


Table 2: TargetDB Status Statistics by Organism

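Organism | Total Number (Note 1) | Work Stopped | Cloned | Expressed | Purified | Crystallized | Crystal Structure | NMR Structure | In PDB (Note 2)
Total Viruses | 769 | 118 | 411 | 260 | 140 | 34 | 27 | 10 | 34
Archaea | 15203 | 2348 | 11401 | 7904 | 3472 | 1326 | 635 | 52 | 740
Bacteria | 135570 | 19122 | 95147 | 70874 | 26335 | 9902 | 3458 | 437 | 4042
Total Prokaryotes | 150773 | 21470 | 106548 | 78778 | 29807 | 11228 | 4093 | 489 | 4782
Yeast | 2759 | 681 | 1974 | 1368 | 810 | 120 | 60 | 14 | 58
Plasmodium | 5201 | 335 | 2958 | 1263 | 201 | 69 | 20 | 0 | 20
Trypanosoma | 6437 | 92 | 3975 | 1931 | 301 | 59 | 9 | 0 | 8
Leishmania | 9599 | 288 | 4576 | 2211 | 404 | 146 | 21 | 0 | 17
Arabidopsis | 8133 | 4965 | 4026 | 1194 | 341 | 78 | 37 | 54 | 92
Rice | 136 | 101 | 128 | 62 | 13 | 4 | 1 | 0 | 1
Nematode | 15175 | 3467 | 12741 | 5687 | 466 | 103 | 30 | 7 | 38
Fly | 959 | 290 | 173 | 96 | 42 | 5 | 3 | 4 | 8
Mouse | 2770 | 794 | 2155 | 1645 | 789 | 216 | 68 | 268 | 340
Human | 14179 | 4033 | 7483 | 5297 | 2884 | 550 | 162 | 1086 | 1262
Other Eukaryotes | 3016 | 448 | 1955 | 1409 | 520 | 152 | 82 | 15 | 98
Total Eukaryotes | 68364 | 15494 | 42144 | 22163 | 6771 | 1502 | 493 | 1448 | 1942
Synthetic | 4 | 0 | 4 | 4 | 4 | 1 | 1 | 2 | 3
Unknown | 42 | 0 | 29 | 23 | 1 | 0 | 0 | 0 | 0
Total | 219952 | 37082 | 149136 | 101228 | 36723 | 12765 | 4614 | 1949 | 6761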

Last updated: Jul 2 2009

Note 1:   Total counts in this table may differ from the total number of targets. If a target is a hybrid complex (for example, a complex of human and mouse polypeptides), it is counted under each of the organism classifications involved.

Note 2:   The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptides with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 2: Source Organisms in TargetDB

Last updated: Jul 2 2009


Deposited Structure Statistics

Number of released X-Ray structures reported to TargetDB: 5321

Number of released NMR structures reported to TargetDB: 1781

Number of released Cryo-Electron Microscopy structures reported to TargetDB: 3

Total number of released structures from worldwide SG Centers reported to TargetDB: 7105


Table 3: PDB Status Statistics for Structural Genomics Structures

Status | All Centers | PSI Centers | Non-PSI SG Centers in North America | SG Centers in Europe | SG Centers in Asia
Total Deposited | 7367 | 4296 | 252 | 138 | 2708
Released | 7105 | 4068 | 243 | 134 | 2689
Release on Publication | 19 | 0 | 3 | 0 | 16
Release on Certain Date | 2 | 1 | 0 | 0 | 1
In Process | 241 | 227 | 8 | 4 | 2
Last updated: Jul 2 2009
Note 1:   Some PDB IDs are cross-referenced by different centers. Example: PDB_id 106Y is associated with the SPINE and TB centers. As a result, the number of structures in the "All Centers" column can differ from the direct sum of the numbers of structures from the individual projects/geographical regions.
Note 2:   "Total Deposited" covers all structures in the PDB, including structures released to the public and structures that are still in the process of being released ("Release on Publication", "Release on Certain Date", etc.).
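
The effect of cross-referenced PDB IDs described in the first note above can be illustrated with a small set computation. This is only a sketch: "106Y" is the shared entry cited in the note, while the other IDs (`ID_A`, `ID_B`, `ID_C`) and the variable names are invented placeholders, not real TargetDB data.

```python
# Hypothetical mapping of centers to the PDB IDs they report.
# "106Y" is the shared entry from the note; the other IDs are placeholders.
ids_by_center = {
    "SPINE": {"106Y", "ID_A", "ID_B"},
    "TB": {"106Y", "ID_C"},
}

# A direct sum counts a cross-referenced ID once per center.
direct_sum = sum(len(ids) for ids in ids_by_center.values())

# An "All Centers" style figure counts each distinct PDB ID once.
distinct_ids = set().union(*ids_by_center.values())
all_centers = len(distinct_ids)

# IDs reported by more than one center account for the difference.
overlap = direct_sum - all_centers
```

Here the direct sum is larger than the distinct count by exactly the number of extra references to shared IDs.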

Figure 3: Structures Released by SG Centers by Year

Last updated: Jul 2 2009


Sequence Redundancy Statistics

Table 4: TargetDB Sequence Redundancy Statistics by Experimental Status

Sequence Identity (%) | Novel Targets (Status: Selected) | Novel Targets (Status: Cloned) | Novel Targets (Status: Expressed) | Novel Targets (Status: Purified) | Novel Targets (Status: Crystallized) | Novel Targets (Status: Crystal Structure) | Novel Targets (Status: NMR Structure) | Novel Targets (Status: In PDB)
<100 | 150190 | 109891 | 75052 | 29127 | 10881 | 3944 | 1880 | 5994
<90 | 138310 | 103316 | 71081 | 27794 | 10451 | 3774 | 1862 | 5797
<70 | 124089 | 94424 | 65769 | 26228 | 10152 | 3719 | 1740 | 5614
<50 | 98933 | 77779 | 55003 | 22651 | 9313 | 3526 | 1582 | 5215
<30 | 50434 | 43342 | 31649 | 14512 | 6970 | 2945 | 1169 | 4114
Last updated: 09-04-28
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. The calculations are based on comparison to all protein sequences in TargetDB that are in the same experimental status category and are at least 20 amino acids long.
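
As a rough illustration of how a clustering-based count might be obtained, the sketch below groups sequences by single linkage at a given identity threshold and counts the resulting clusters. It does not call BLASTClust. The identity measure (`toy_identity`, a position-by-position match fraction with no alignment), the function names, and the example sequences are all assumptions made for illustration, and treating the number of clusters as the "novel targets" count is likewise an assumption. The real calculation relies on BLAST alignments and the BLASTClust threshold settings.

```python
MIN_LENGTH = 20  # shorter sequences are excluded, as stated in the note above


def toy_identity(a, b):
    """Percent of matching positions over the shorter sequence (no alignment)."""
    n = min(len(a), len(b))
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / n


def count_clusters(sequences, threshold):
    """Single-linkage clustering: link any pair with identity >= threshold."""
    seqs = [s for s in sequences if len(s) >= MIN_LENGTH]
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if toy_identity(seqs[i], seqs[j]) >= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    return len({find(i) for i in range(len(seqs))})


# Invented example sequences (not TargetDB entries).
seq_a = "MKTAYIAKQRQISFVKSHFSRQ"
seq_b = seq_a[:-1] + "A"   # differs from seq_a at one position
seq_c = "G" * 22           # unrelated to seq_a and seq_b
too_short = "MKT"          # dropped by the length filter

examples = [seq_a, seq_b, seq_c, too_short]
clusters_at_90 = count_clusters(examples, 90)
clusters_at_100 = count_clusters(examples, 100)
```

Lowering the threshold merges more sequences into shared clusters, which matches the trend in Table 4, where the counts fall as the identity cutoff decreases.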

Table 5: Sequence Redundancy Statistics for Structures Released by SG Centers in the PDB by Year

Year | Released Structures | Number of Released Structures <30% Sequence Identity at Time of Release | Percent (%) of Released Structures <30% Sequence Identity at Time of Release
<= 2000 | 97 | 34 | 35
2001 | 73 | 25 | 34
2002 | 171 | 59 | 35
2003 | 414 | 157 | 38
2004 | 955 | 383 | 40
2005 | 1062 | 366 | 34
2006 | 1157 | 452 | 39
2007 | 1597 | 572 | 36
2008 | 1063 | 509 | 48
2009 | 516 | 236 | 46
Total | 7105 | 2793 | 39
Last updated: 09-07-02
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. The calculations are based on comparison to all protein sequences in the PDB that are at least 20 amino acids long.

Figure 4: Comparison of Novel Structures with Number of Structures Released By SG Centers

Last updated: 09-07-02

Summary Statistics Reports by Project or Geographical Region: