PSI TargetDB

Statistics Summary Report for PSI Centers

Last updated: Mar 11 2010


PSI-2 Centers:

ATCG3D, CESG, CHTSB, CSMP, JCSG, ISFI, MCSG, NESG, NYCOMPS, NYSGXRC

PSI-1 Centers:

BSGC, SECSG, SGPP, TB



Target Status Statistics

Total number of targets deposited by PSI Centers to TargetDB: 227169

Table 1: Status Statistics for PSI Centers

Status | Total Number of Targets | (%) Relative to "Cloned" Targets | (%) Relative to "Expressed" Targets | (%) Relative to "Purified" Targets | (%) Relative to "Crystallized" Targets
Cloned | 158810 | 100.0 | - | - | -
Expressed | 109865 | 69.2 | 100.0 | - | -
Soluble | 38296 | 24.1 | 34.9 | - | -
Purified | 34375 | 21.6 | 31.3 | 100.0 | -
Crystallized | 11032 | 6.9 | 10.0 | 32.1 | 100.0
Diffraction-quality Crystals | 5551 | 3.5 | 5.1 | 16.1 | 50.3
Diffraction | 4478 | 2.8 | 4.1 | 13.0 | 40.6
NMR Assigned | 678 | 0.4 | 0.6 | 2.0 | -
HSQC | 2359 | 1.5 | 2.1 | 6.9 | -
Crystal Structure | 3410 | 2.1 | 3.1 | 9.9 | 30.9
NMR Structure | 609 | 0.4 | 0.6 | 1.8 | -
In PDB [1] | 4248 | 2.7 | 3.9 | 12.4 | 33
Work Stopped | 33417 | - | - | - | -
Test Target | 93 | - | - | - | -
Other | 8139 | - | - | - | -

Last updated: Mar 11 2010
Note 1:   The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptides with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 1: Target Experimental Status for PSI Centers

Last updated: Mar 11 2010

This graph is normalized relative to number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 70% of TargetDB.
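
The percentages in Table 1 and the 70% figure above are plain ratios of target counts. The following is a minimal Python sketch that reproduces a few of them from the counts in Table 1; the dictionary and function names are illustrative assumptions and are not part of TargetDB.

```python
# Counts copied from Table 1 and from the total number of deposited targets.
TOTAL_DEPOSITED = 227169
COUNTS = {
    "Cloned": 158810,
    "Expressed": 109865,
    "Soluble": 38296,
    "Purified": 34375,
    "Crystallized": 11032,
    "Diffraction-quality Crystals": 5551,
}


def percent_relative_to(status, reference):
    """Percentage of targets in `status` relative to targets in `reference`."""
    return round(100.0 * COUNTS[status] / COUNTS[reference], 1)


def cloned_share_of_targetdb():
    """Percentage of all deposited targets that reached status "Cloned"."""
    return round(100.0 * COUNTS["Cloned"] / TOTAL_DEPOSITED, 1)


if __name__ == "__main__":
    print(percent_relative_to("Expressed", "Cloned"))
    print(percent_relative_to("Crystallized", "Purified"))
    print(cloned_share_of_targetdb())
```

Normalizing to "Cloned" rather than to all deposited targets keeps targets that were selected but never entered the laboratory pipeline out of the denominator.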

Table 2: Status Statistics for PSI Centers by Organism

These statistics are derived by mapping target sequences to GenBank using a >=98% sequence identity cutoff.

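Organism | Total Number [1] | Work Stopped | Cloned | Expressed | Purified | Crystallized | Crystal Structure | NMR Structure | In PDB [2]
Viruses | 844 | 205 | 506 | 418 | 193 | 35 | 19 | 13 | 34
Archaea | 15000 | 1928 | 11501 | 8044 | 2825 | 801 | 259 | 58 | 372
Bacteria | 144863 | 19241 | 107132 | 80798 | 26832 | 9193 | 2821 | 452 | 3408
Prokaryota | 159857 | 21169 | 118629 | 88838 | 29655 | 9993 | 3079 | 509 | 3778
Yeast | 2678 | 524 | 1659 | 1423 | 1044 | 101 | 42 | 11 | 52
Plasmodium | 4954 | 416 | 2771 | 1192 | 187 | 62 | 16 | 0 | 17
Trypanosoma | 5285 | 70 | 3441 | 1755 | 291 | 58 | 9 | 0 | 8
Leishmania | 8664 | 271 | 4177 | 2118 | 369 | 139 | 20 | 0 | 16
Arabidopsis | 7748 | 3842 | 3772 | 1073 | 264 | 82 | 35 | 21 | 57
Rice | 168 | 121 | 143 | 74 | 12 | 4 | 1 | 0 | 1
Worm | 14457 | 2892 | 12223 | 5553 | 478 | 114 | 30 | 3 | 34
Drosophila | 948 | 22 | 205 | 154 | 36 | 5 | 4 | 2 | 6
Mouse | 4923 | 1288 | 2712 | 1900 | 490 | 143 | 44 | 13 | 64
Human | 13204 | 1933 | 5803 | 4136 | 1070 | 188 | 69 | 36 | 124
Eukaryota | 61527 | 11298 | 36558 | 19116 | 4245 | 904 | 282 | 85 | 399
Uncultured or unidentified | 238 | 33 | 146 | 126 | 69 | 34 | 10 | 0 | 17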

Last updated: Mar 11 2010

Note 1:   Total counts in this table may differ from the total number of targets and structures. A target is counted under more than one organism if:
- the target is mapped to different organisms
- the target is a hybrid complex (for example: a complex of human and mouse polypeptides)

Figure 2: Source Organisms in PSI Centers

Last updated: Mar 11 2010


Deposited Structure Statistics for PSI Centers

Number of Released X-Ray Structures: 4207

Number of Released NMR Structures: 445

Total number of released structures from PSI Centers in the PDB: 4652

Table 3: PDB Status Statistics for Structures from PSI Centers

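PDB Status | ATCG3D | BSGC | CESG | CHTSB | CSMP | JCSG | ISFI | MCSG | NESGC | NYCOMPS | NYSGXRC | SECSG | SGPP | TB | Total
Total Deposited | 14 | 88 | 145 | 12 | 12 | 1001 | 31 | 1188 | 828 | 8 | 908 | 92 | 41 | 515 | 4875
Released | 14 | 88 | 144 | 12 | 9 | 985 | 13 | 1167 | 812 | 8 | 893 | 92 | 41 | 382 | 4652
In Process | 0 | 0 | 1 | 0 | 2 | 16 | 18 | 21 | 16 | 0 | 15 | 0 | 0 | 133 | 222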
Last updated: Mar 11 2010
Note 1:   "Total Deposited" counts all structures in the PDB, including structures released to the public and structures that are in the process of being released ("Released on Publication", "Released on Certain Date", etc.).
Note 2:  Some PDB IDs are cross-referenced by different centers. Therefore the "Total" number of structures can differ from the direct sum of the numbers of structures from individual centers.
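
Note 2 comes down to counting unique PDB IDs instead of summing per-center counts. The sketch below illustrates the difference; the center names and IDs are placeholders invented for the example (not real PDB entries), and the function is hypothetical rather than TargetDB's own code.

```python
def summarize(ids_by_center):
    """Return (direct_sum, unique_total) for a mapping of center -> set of PDB IDs.

    direct_sum counts a cross-referenced ID once per center that lists it;
    unique_total counts each ID exactly once.
    """
    direct_sum = sum(len(ids) for ids in ids_by_center.values())
    unique_total = len(set().union(*ids_by_center.values()))
    return direct_sum, unique_total


# Placeholder data: "id_3" is cross-referenced by both centers.
example = {
    "center_A": {"id_1", "id_2", "id_3"},
    "center_B": {"id_3", "id_4"},
}

if __name__ == "__main__":
    direct_sum, unique_total = summarize(example)
    print(direct_sum, unique_total)
```

The gap between the two numbers equals the number of extra listings of cross-referenced IDs.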

Figure 3: Structures Released by PSI Centers by Year

Last updated: Mar 11 2010

Sequence Redundancy Statistics

Table 5: Sequence Redundancy Statistics for PSI Centers by Experimental Status

Sequence Identity (%) | Novel Targets (Status: Selected) | Novel Targets (Status: Cloned) | Novel Targets (Status: Expressed) | Novel Targets (Status: Purified) | Novel Targets (Status: Crystallized) | Novel Targets (Status: Crystal Structure) | Novel Targets (Status: NMR Structure) | Novel Targets (Status: in PDB)
<100 | 149922 | 106971 | 76921 | 26099 | 9564 | 3108 | 570 | 3813
<90 | 138775 | 101199 | 73186 | 25148 | 9299 | 3081 | 569 | 3782
<70 | 123781 | 92484 | 67392 | 23802 | 9031 | 3041 | 564 | 3741
<50 | 98049 | 76411 | 56058 | 20798 | 8364 | 2939 | 545 | 3601
<30 | 54415 | 45794 | 33386 | 14183 | 6558 | 2625 | 521 | 3167
Last updated: 10-03-08
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the listed percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in TargetDB that are in the same experimental status category and at least 20 amino acids long.
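
The clustering step can be illustrated with a small sketch. It assumes single-linkage clustering (the approach BLASTClust is generally described as using), assumes that the number of "novel targets" at a threshold corresponds to the number of clusters, and takes precomputed pairwise identities as input instead of running BLAST. The function name, the input format, and the example values are all hypothetical; BLASTClust's length-coverage setting is not modeled.

```python
def count_clusters(lengths, identities, threshold, min_length=20):
    """Count single-linkage clusters at a percent-identity threshold.

    lengths:    dict mapping sequence id -> length in amino acids
    identities: dict mapping (id_a, id_b) -> percent identity (hypothetical input)
    threshold:  sequences with identity >= threshold are linked
    """
    # Keep only sequences that are at least `min_length` residues long.
    kept = [seq for seq, n in lengths.items() if n >= min_length]
    parent = {seq: seq for seq in kept}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), identity in identities.items():
        if a in parent and b in parent and identity >= threshold:
            parent[find(a)] = find(b)

    return len({find(seq) for seq in kept})


# Made-up example: t4 is shorter than 20 residues and is ignored.
lengths = {"t1": 120, "t2": 118, "t3": 95, "t4": 15}
identities = {
    ("t1", "t2"): 92.0,
    ("t2", "t3"): 45.0,
    ("t1", "t3"): 28.0,
    ("t3", "t4"): 99.0,
}

if __name__ == "__main__":
    for threshold in (100, 90, 30):
        print(threshold, count_clusters(lengths, identities, threshold))
```

Lowering the threshold merges more sequences into the same cluster, which is consistent with the counts in Table 5 decreasing from the <100 row to the <30 row.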

Table 6: Sequence Redundancy Statistics for Structures Released by PSI Centers by Year

Year | Released Structures | Number of Released Structures with <30% Identity at Time of Release | Percent (%) of Released Structures with <30% Identity at Time of Release
<= 2000 | 59 | 22 | 37
2001 | 47 | 18 | 38
2002 | 113 | 45 | 40
2003 | 228 | 108 | 47
2004 | 557 | 252 | 45
2005 | 494 | 255 | 52
2006 | 693 | 373 | 54
2007 | 753 | 448 | 60
2008 | 706 | 448 | 63
2009 | 850 | 455 | 54
2010 | 143 | 80 | 56
Total | 4643 | 2504 | 54
Last updated: 10-03-11
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the listed percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in the PDB that are at least 20 amino acids long.

Figure 4:   Comparison of Novel Structures with Number of Structures Released by PSI Centers by Year

Last updated: 10-03-11