
HUPO Plasma Proteome Project

The pilot phase of the Plasma Proteome Project (PPP), which ran from 2002 to 2005, was a collaborative international comparison of the experimental approaches and protein identifications of the participating laboratories. As a prerequisite to that comparison, all data were collected and stored in a central SQL Server database at the University of Michigan. Phase 2 of the Human Plasma Proteome Project continues (Omenn, Aebersold, & Paik, 2009).

The following links lead to tab-delimited text files containing the data produced by different queries:

1) All Identifications: All protein identifications from MS/MS experiments submitted by participating laboratories with either low or high confidence. There are 9504 proteins in this list. Text  

2) Confirmed Identifications: All protein identifications from MS/MS experiments submitted by participating laboratories with either low or high confidence, but identified with two or more distinct peptides. There are 3020 proteins in this list. Text  

3) List of 889 high-confidence HPPP proteins: A rigorous statistical approach, devised to take into account the length of coding regions in genes and multiple hypothesis testing, yielded a reduced set of 889 proteins with at least 95% confidence in protein identification. Table
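
The relationship between the first two lists can be illustrated with a minimal sketch. It assumes a tab-delimited file with a header row and two hypothetical columns, `protein_id` and `distinct_peptides`; these names are placeholders, not the actual headers of the downloadable files, which should be checked before use. The sample rows are made up for illustration.

```python
import csv
import io


def confirmed_identifications(tsv_text, id_col="protein_id",
                              peptide_col="distinct_peptides",
                              min_peptides=2):
    """Return IDs of proteins supported by at least `min_peptides` distinct peptides.

    The column names are hypothetical placeholders; adjust them to match
    the header row of the file actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    confirmed = []
    for row in reader:
        if int(row[peptide_col]) >= min_peptides:
            confirmed.append(row[id_col])
    return confirmed


# Made-up rows for illustration only; not taken from the PPP data.
sample = (
    "protein_id\tdistinct_peptides\n"
    "EXAMPLE_A\t1\n"
    "EXAMPLE_B\t2\n"
    "EXAMPLE_C\t5\n"
)

print(confirmed_identifications(sample))
```

With a real file, the text could be read with `open(path, encoding="utf-8").read()` and passed to the same function.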
 
Direct Access to the SQL Server database:

The data can also be accessed directly from the SQL Server database through a web application. The principal investigators of all participating labs have accounts for logging in to the application. If you already have an account, please click the 'Login' link below. Once you log in, you will reach a SQL query collection page with a set of public queries. These queries are intended to help you understand the data and the database structure. The Help documentation on the login page has more information about the SQL query pages.

Database Schema PDF


The primary dataset from the HUPO Plasma Proteome Project is the set of 3020 proteins identified on the basis of two or more unique peptides. See:

Omenn GS, States DJ, Adamski MR, Blackwell TW, Menon R, Hermjakob H, Apweiler R, Haab BB, Simpson RJ, Eddes JS, Kapp EA, Moritz RL, Chan DW, Rai AJ, Admon A, Aebersold R, Eng J, Hancock WS, Hefta SA, Meyer H, Paik Y-K, Yoo J-S, Ping P, Pounds J, Adkins J, Qian X, Wang R, Wasinger V, Wu CY, Zhao X, Zeng R, Archakov A, Tsugita A, Beer I, Pandey A, Pisano M, Andrews P, Tammen H, Speicher DW, Hanash SM. Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 2005;5:3226-3245. PMID: 16104056.

The larger 9504-protein dataset is a backup based on one or more peptides. It has proved useful for checking whether different search tools have identified different protein isoforms or homologs that carry different IPI identifiers but are potentially redundant.

In addition, there is a super-stringent 889-protein dataset, published in States DJ, Omenn GS, Blackwell TW, Fermin D, Eng J, Speicher DW, Hanash SM. Challenges in deriving high-confidence protein identifications from data gathered by HUPO plasma proteome collaborative study. Nature Biotechnology 2006;24:333-338. PMID: 16525410.

Individual files from the 18 participating laboratories that submitted data to the HUPO Plasma Proteome Project Pilot Phase are available under that title at www.ebi.ac.uk/PRIDE and in the tabulation here.

Initial datasets from HPPP Phase 2 are also available at PRIDE.

Finally, human plasma proteome data from multiple sources have been re-analyzed from primary spectra and made available at www.peptideatlas.org.

 

Copyright 2009 by The Regents of the University of Michigan