Metrics Describing Progress of the Protein Structure Initiative
Updated March 01, 2010
I. Progress of the PSI
II. Number of Experimental Structures and Residues
III. Impact and Classification of PSI-2 Structures
IV. Novel Modeling Leverage
V. Biological Theme Targets & Structures
- Biological Theme Targets & Structures by PSI Center
- Biomedical Targets & Structures by PSI Center
- Metagenomic Targets & Structures by PSI Center
- Community Nominated Targets & Structures by PSI Center
VI. PSI PFAM Domain Family Coverage
Glossary of Terms
- BIG: BioInformatics Group. A team of bioinformaticians from the PSI-2 large-scale production centers that coordinates target selection and progress evaluation
- ALL-PSI: All PSI centers (PSI-1 and PSI-2)
- ALL PDB: Entire Protein Data Bank
- PSI-1: The pilot phase of the Protein Structure Initiative (PSI), which ran from 09-01-2000 to 06-30-2005
- PSI-2: The production phase of the Protein Structure Initiative (PSI) (07-01-2005 - ongoing). Statistics on this page reflect PSI-2 data deposited after 07-01-2005 and released before 03-01-2010
- LSC: The four Protein Structure Initiative (PSI) large-scale production centers, namely MCSG, JCSG, NESG, and NYSGXRC
- Total Structures: Total number of structures in the PDB at the time of deposition. This includes multiple structures of the same protein sequence determined by different methods (e.g., NMR versus X-ray crystallography), in different crystal forms, under different solution conditions, or bound to different ligands.
- Distinct Structures: Total number of structures with non-redundant sequences (less than 98% sequence identity)
- Distinct Residues: Total number of residues in structures with non-redundant sequences (less than 98% sequence identity)
- Novel Structures: Total number of novel structures with less than 30% sequence identity to an existing structure at the time of PDB deposition
- Novel Residues: Total number of residues in structures with less than 30% sequence identity to an existing structure at the time of PDB deposition
- X-Ray Structures: Total number of structures determined using X-ray crystallography
- NMR Structures: Total number of structures determined using NMR methods
- Membrane Proteins: Total number of structures of membrane proteins
- Eukaryotic Proteins: Total number of structures from eukaryotic organisms
- Prokaryotic Proteins: Total number of structures from prokaryotic organisms
- Human Proteins: Total number of structures of human proteins
- Other Proteins: Total number of structures of viral and unknown source proteins
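The two identity thresholds in the glossary (98% for distinct, 30% for novel) can be written as simple predicates. The sketch below is only an illustration: it assumes the maximum percent sequence identity against the relevant comparison set has already been computed elsewhere. The function and constant names are hypothetical and are not part of any PSI tool.

```python
# Thresholds taken from the glossary definitions above (percent identity).
DISTINCT_IDENTITY_CUTOFF = 98.0
NOVEL_IDENTITY_CUTOFF = 30.0


def is_distinct(max_identity_to_kept_set: float) -> bool:
    """Hypothetical helper: True if the sequence shares less than 98%
    identity with every sequence already kept in the non-redundant set."""
    return max_identity_to_kept_set < DISTINCT_IDENTITY_CUTOFF


def is_novel(max_identity_to_existing_pdb: float) -> bool:
    """Hypothetical helper: True if the sequence shares less than 30%
    identity with every structure in the PDB at the time of deposition."""
    return max_identity_to_existing_pdb < NOVEL_IDENTITY_CUTOFF


if __name__ == "__main__":
    # Assumed example values; not taken from any PSI data set.
    for identity in (25.0, 60.0, 99.0):
        print(identity, is_distinct(identity), is_novel(identity))
```

Both comparisons are strict, matching the "less than" wording of the definitions.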
I. Progress of the PSI
PDB deposition statistics
| PSI grant period | LSC | ALL-PSI |
| PSI-2 grant year 2005 | 419 | 456 |
| PSI-2 grant year 2006 | 620 | 700 |
| PSI-2 grant year 2007 | 681 | 717 |
| PSI-2 grant year 2008 | 796 | 818 |
| PSI-2 grant year 2009 | 447 | 470 |
| PSI-2 current depositions | 2963 | 3161 |
| PSI-1+PSI-2 | 3831 | 4544 |
PSI-2 grant year 2005:
PSI-2 structures deposited after July 1, 2005; released before June 30, 2006
PSI-2 grant year 2006:
PSI-2 structures deposited after July 1, 2006; released before June 30, 2007
PSI-2 grant year 2007:
PSI-2 structures deposited after July 1, 2007; released before July 1, 2008
PSI-2 grant year 2008:
PSI-2 structures deposited after July 1, 2008; released before July 1, 2009
PSI-2 grant year 2009:
PSI-2 structures deposited after July 1, 2009; released before July 1, 2010
PSI-2 current depositions:
PSI-2 structures deposited after July 1, 2005; released before March 01, 2010
PSI-1 + PSI-2:
All PSI structures released before March 01, 2010
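As a consistency check on the deposition table, the five grant-year rows can be summed and compared with the "PSI-2 current depositions" row. The sketch below copies the counts from the table; the variable and function names are illustrative only.

```python
# Counts copied from the "PDB deposition statistics" table.
yearly_counts = {
    2005: {"LSC": 419, "ALL-PSI": 456},
    2006: {"LSC": 620, "ALL-PSI": 700},
    2007: {"LSC": 681, "ALL-PSI": 717},
    2008: {"LSC": 796, "ALL-PSI": 818},
    2009: {"LSC": 447, "ALL-PSI": 470},
}
current_depositions = {"LSC": 2963, "ALL-PSI": 3161}


def column_total(rows: dict, column: str) -> int:
    """Sum one column over all grant-year rows."""
    return sum(row[column] for row in rows.values())


if __name__ == "__main__":
    for column, reported in current_depositions.items():
        total = column_total(yearly_counts, column)
        print(f"{column}: summed={total}, reported={reported}, match={total == reported}")
```

Both columns agree with the reported totals, so the grant-year rows partition the PSI-2 depositions counted on this page.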
Numbers of experimental structures from PSI
Start of PSI Deposition:
Sep 2000: Structures deposited before PSI grant initiation (2000-09-01)
Each subsequent bar shows the total number of PSI structures deposited as of the first day of the indicated month.
For example, there were 97 PSI-derived structures deposited as of September 1, 2001.
II. Number of Experimental Structures and Residues
Structures determined in PSI-1
| Center | Total Structures | Distinct Structures | Distinct Residues | Novel Structures | Novel Residues |
| LSC | 868 | 767 | 174843 | 458 | 100522 |
| ALL-PSI-1 | 1383 | 1127 | 259397 | 631 | 138343 |
| ALL PDB | 34141 | 11505 | 2784172 | 4658 | 1133938 |
Calculations in this table are based on PDB data deposited before July 1, 2005
Structures determined in PSI-2
| Center | Total Structures | Distinct Structures | Distinct Residues | Novel Structures | Novel Residues |
| LSC | 2963 | 2708 | 614963 | 1794 | 402872 |
| ALL-PSI-2 | 3161 | 2795 | 635617 | 1848 | 415064 |
| ALL PDB | 29439 | 11297 | 2781889 | 4658 | 1102275 |
Calculations in this table are based on PDB data deposited after July 1, 2005 and released before March 01, 2010.
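One way to read the PSI-2 table is to compare the number of novel structures with the number of distinct structures in each row. The sketch below computes that ratio from the values in the table. It assumes the two columns are directly comparable, which this page does not state explicitly, and the names are illustrative.

```python
# Distinct and novel structure counts copied from the PSI-2 table.
psi2_table = {
    "LSC": {"distinct": 2708, "novel": 1794},
    "ALL-PSI-2": {"distinct": 2795, "novel": 1848},
    "ALL PDB": {"distinct": 11297, "novel": 4658},
}


def novel_to_distinct_ratio(row: dict) -> float:
    """Ratio of novel structures to distinct structures for one table row."""
    return row["novel"] / row["distinct"]


if __name__ == "__main__":
    for center, row in psi2_table.items():
        print(f"{center}: {novel_to_distinct_ratio(row):.3f}")
```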
III. Impact and Classification of PSI-2 Structures
Calculations in this section are based on PDB data deposited after July 1, 2005 and released before March 01, 2010.
Classification of PSI-2 structures
| Center | Total Structures | X-ray | NMR | Membrane Proteins | Eukaryotes | Human | Other | Prokaryotes |
| LSC | 2963 | 2679 | 284 | 18 | 143 | 72 | 38 | 2782 |
| ALL-PSI-2 | 3161 | 2848 | 313 | 43 | 294 | 114 | 42 | 2827 |
Classification of PSI-2 structures by organism
| Eukaryotes Total | 294 |
| Prokaryotes Total | 2827 |
| Other Organisms Total | 42 |
Detailed PSI-2 structure counts for eukaryotes
| Eukaryotes Total | 294 |
| Aequorea victoria | 1 |
| Anopheles gambiae | 2 |
| Arabidopsis thaliana | 56 |
| Aspergillus fumigatus | 1 |
| Aspergillus oryzae | 2 |
| Babesia bovis | 1 |
| Bos taurus | 3 |
| Brugia malayi | 1 |
| Caenorhabditis elegans | 10 |
| Candida albicans | 1 |
| Coccidioides immitis | 1 |
| Cyanidioschyzon merolae | 1 |
| Danio rerio | 7 |
| Drosophila melanogaster | 3 |
| Encephalitozoon cuniculi | 4 |
| Engyodontium album | 1 |
| Entamoeba histolytica | 1 |
| Galdieria sulphuraria | 7 |
| Gibberella zeae | 2 |
| Homo sapiens | 114 |
| Methanocaldococcus jannaschii | 1 |
| Mus musculus | 32 |
| Oncorhynchus mykiss | 1 |
| Oryza sativa | 1 |
| Pentadiplandra brazzeana | 1 |
| Plasmodium falciparum | 1 |
| Rana pipiens | 2 |
| Rattus norvegicus | 4 |
| Saccharomyces cerevisiae | 22 |
| Schizosaccharomyces pombe | 2 |
| Solanum lycopersicum | 1 |
| Spinacia oleracea | 1 |
| Sus scrofa | 1 |
| Toxoplasma gondii | 3 |
| Trypanosoma brucei | 1 |
| Xenopus laevis | 1 |
Detailed PSI-2 structure counts for prokaryotes
| Prokaryotes Total | 2827 |
Detailed PSI-2 structure counts for other organisms
| Other Organisms Total | 42 |
| Artificial gene | 2 |
| Uncultured marine organism | 6 |
| Unidentified | 11 |
| Pseudomonas phage phi12 | 1 |
| Staphylococcus phage 37 | 1 |
| SARS coronavirus | 12 |
| Murid herpesvirus 4 | 2 |
| Homo sapiens, Bacteriophage T4 | 1 |
| Homo sapiens, Enterobacteria phage T4 | 1 |
| SARS coronavirus Tor2 | 1 |
| Influenza A virus | 1 |
| Mengo virus | 1 |
| Vaccinia virus WR | 1 |
| Homo sapiens, Enterobacteria phage T4, Homo sapiens | 1 |
IV. Novel Modeling Leverage
Calculations in this section updated March 1, 2010
Modeling leverage calculations are provided by Dr. Lukasz Jaroszewski, affiliated with JCSG
Leverage provided by PSI-2 structures
| Center | Total Leverage | Total Residue Leverage | Novel Leverage | Novel Residue Leverage |
| LSC | 1058498 | 259321521 | 218056 | 55255716 |
| PSI-2 | 1272145 | 302356185 | 231200 | 59116183 |
| PDB excluding PSI-2 | 3528980 | 1008622993 | 484770 | 186020919 |
V. Biological Theme Targets & Structures
Calculations in this section updated March 01, 2010
Biological Theme Targets by PSI Center
| Center | Biomedical Targets | Metagenomics Targets | Community Nominated Targets |
| ATCG3D | 7 | 0 | 0 |
| CESG | 931 | 0 | 440 |
| CHTSB | 125 | 0 | 0 |
| ISFI | 0 | 0 | 379 |
| JCSG | 19107 | 4266 | 3740 |
| MCSG | 3377 | 1500 | 1480 |
| NESGC | 7172 | 1142 | 640 |
| NYCOMPS | 0 | 0 | 458 |
| NYSGXRC | 1310 | 485 | 1480 |
| Total | 32029 | 7393 | 8616 |
Biomedical Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| ATCG3D | 7 | 7 | 7 | 7 | 6 | 0 | 6 | 8 |
| CESG | 931 | 723 | 667 | 159 | 25 | 3 | 8 | 7 |
| CHTSB | 125 | 124 | 29 | 10 | 0 | 0 | 1 | 17 |
| JCSG | 19107 | 18380 | 17724 | 17722 | 1643 | 0 | 624 | 665 |
| MCSG | 3377 | 3167 | 2218 | 1112 | 214 | 0 | 238 | 245 |
| NESGC | 7172 | 1406 | 1366 | 422 | 110 | 61 | 95 | 100 |
| NYSGXRC | 1310 | 987 | 892 | 457 | 102 | 0 | 66 | 73 |
| Total | 32029 | 24794 | 22903 | 19889 | 2100 | 64 | 1038 | 1115 |
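The columns of this table follow targets through successive pipeline stages, so the fraction of targets that survives each step can be computed row by row. The sketch below does this for the JCSG biomedical row. The stage ordering and names are an illustrative simplification, and the NMR column is ignored because NMR is a separate route rather than a step after crystallization.

```python
# Ordered pipeline stages, named after the table columns (NMR omitted).
STAGES = ["total", "cloned", "expressed", "purified", "crystallized"]

# JCSG row copied from the biomedical targets table.
jcsg_biomedical = {
    "total": 19107,
    "cloned": 18380,
    "expressed": 17724,
    "purified": 17722,
    "crystallized": 1643,
}


def stage_conversion(row: dict) -> dict:
    """Fraction of targets carried from each stage to the next.

    Returns None for a step whose starting stage has zero targets.
    """
    result = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        key = f"{prev}->{nxt}"
        result[key] = row[nxt] / row[prev] if row[prev] else None
    return result


if __name__ == "__main__":
    for step, fraction in stage_conversion(jcsg_biomedical).items():
        print(f"{step}: {fraction:.3f}")
```

For this row the largest drop is at crystallization, where fewer than one in ten purified targets is recorded as crystallized.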
Metagenomic Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| JCSG | 4266 | 4136 | 4134 | 4134 | 321 | 0 | 72 | 72 |
| MCSG | 1500 | 1430 | 859 | 306 | 64 | 0 | 63 | 63 |
| NESGC | 1142 | 775 | 772 | 258 | 62 | 18 | 39 | 43 |
| NYSGXRC | 485 | 451 | 396 | 207 | 52 | 0 | 30 | 33 |
| Total | 7393 | 6792 | 6161 | 4905 | 499 | 18 | 204 | 211 |
Community Nominated Targets & Structures by PSI Center
| Center | Total Targets | Cloned Targets | Expressed Targets | Purified Targets | Crystallized Targets | NMR Targets | Targets In PDB | Structures In PDB |
| CESG | 440 | 311 | 240 | 69 | 36 | 12 | 29 | 36 |
| ISFI | 379 | 293 | 200 | 137 | 95 | 0 | 35 | 14 |
| JCSG | 3740 | 3615 | 3613 | 3613 | 280 | 0 | 70 | 71 |
| MCSG | 1480 | 1382 | 1068 | 540 | 98 | 0 | 35 | 36 |
| NESGC | 640 | 336 | 322 | 198 | 79 | 29 | 64 | 77 |
| NYCOMPS | 458 | 445 | 83 | 43 | 0 | 0 | 0 | 0 |
| NYSGXRC | 1480 | 1055 | 883 | 439 | 147 | 0 | 102 | 137 |
| Total | 8616 | 7436 | 6409 | 5039 | 735 | 41 | 335 | 371 |
VI. PSI PFAM Domain Family Coverage
Calculations in this section updated February 1, 2010
Number of PFAM families for which PSI provided the first structure representative
| Total PFAM Families | Total PFAM Families in PDB | PSI-1 PFAM Families | PSI-2 PFAM Families |
| 11912 | 5268 | 347 | 519 |
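The PFAM row can be turned into two coverage figures: the share of all PFAM families that have a structure in the PDB, and the share of those families for which PSI provided the first structure. The sketch below uses the values from the table. It assumes the PSI-1 and PSI-2 counts do not overlap, since a family can receive its first structure only once. The variable names are illustrative.

```python
# Values copied from the PFAM coverage table.
total_pfam_families = 11912
pfam_families_in_pdb = 5268
psi1_first_structures = 347
psi2_first_structures = 519

# Assumption: a family is counted under PSI-1 or PSI-2, never both.
psi_first_structures = psi1_first_structures + psi2_first_structures

# Share of all PFAM families with at least one structure in the PDB.
pdb_coverage = pfam_families_in_pdb / total_pfam_families

# Share of structurally covered families whose first structure came from PSI.
psi_share_of_covered = psi_first_structures / pfam_families_in_pdb

if __name__ == "__main__":
    print(f"PSI first-structure families: {psi_first_structures}")
    print(f"PDB coverage of PFAM: {pdb_coverage:.1%}")
    print(f"PSI share of covered families: {psi_share_of_covered:.1%}")
```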