DOWNLOADABLE DATA


  Search results
The data files presented here contain data available in the Human Protein Atlas version 15. A subset of this data can also be downloaded from the Search page, restricted to the genes in the current search result, in three formats: XML, RDF and TAB.
 
  Single entry
Data in XML, RDF and TAB format can be accessed at the single-entry level using the URL structure below:
/ENSG00000106631.xml
/ENSG00000106631.trig
/ENSG00000106631.tab
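
The URL structure above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name `entry_urls` and the default host `https://www.proteinatlas.org` are assumptions, since the listing above only gives the path part of each URL.

```python
# Extensions taken from the three example paths above.
ENTRY_FORMATS = ("xml", "trig", "tab")


def entry_urls(gene_id, base_url="https://www.proteinatlas.org"):
    """Return {extension: URL} for one Ensembl gene identifier.

    base_url is an assumption; replace it with the host you actually use.
    """
    base = base_url.rstrip("/")
    return {ext: f"{base}/{gene_id}.{ext}" for ext in ENTRY_FORMATS}
```

Calling `entry_urls("ENSG00000106631")` returns one URL per format, each ending in the corresponding path shown above.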

 
  Archived data
As of version 13 of the Human Protein Atlas, each archived version of the site can be reached using the URL structure "vXX.proteinatlas.org", where XX is the version number. For example, version 13 of the Human Protein Atlas has the URL v13.proteinatlas.org.
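
As a small sketch of the archive naming rule, the host name for a given version can be built like this. The function name `archived_host` is hypothetical, and rejecting versions below 13 is an interpretation of "as of version 13".

```python
def archived_host(version):
    """Return the archive host name for a Human Protein Atlas version.

    Versions before 13 are rejected because the vXX scheme is only
    described as available from version 13 onwards.
    """
    if version < 13:
        raise ValueError("the vXX.proteinatlas.org scheme starts at version 13")
    return f"v{version}.proteinatlas.org"
```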

 
1 Normal tissue data
Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), the type of annotation (annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only) ("Expression type"), and the reliability or validation of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 15 and Ensembl version 78.38.

normal_tissue.csv.zip
CSV-file, 4.9 MB
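
A minimal sketch for reading the downloaded file with the Python standard library is shown below. It assumes that the zip archive holds a single CSV member, that the file is UTF-8 encoded, and that its header row uses the column names quoted above. The function names are hypothetical.

```python
import csv
import io
import zipfile


def read_zipped_table(source, delimiter=","):
    """Yield one dict per data row from the first member of a zip archive.

    source may be a file path or a binary file object.
    Assumption: the archive contains a single delimited text file.
    """
    with zipfile.ZipFile(source) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text, delimiter=delimiter)


def rows_for_gene(source, gene_id):
    """Collect all rows whose "Gene" column equals gene_id."""
    return [row for row in read_zipped_table(source) if row["Gene"] == gene_id]
```

For example, `rows_for_gene("normal_tissue.csv.zip", "ENSG00000106631")` would return the tissue and cell type annotations for that gene, each as a dict keyed by column name.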
 
2 Cancer tumor data
Staining profiles for proteins in human tumor tissue based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tumor name ("Tumor"), staining value ("Level"), the number of patients with this staining value ("Count patients"), the total number of patients for this tumor type ("Total patients") and the type of annotation, staining ("Expression type"). The data is based on The Human Protein Atlas version 15 and Ensembl version 78.38.

cancer.csv.zip
CSV-file, 5.3 MB
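
Because each row carries both a patient count and a patient total, the share of patients at each staining level can be derived directly. The sketch below is illustrative only. It assumes rows are dicts keyed by the column names above (for instance from `csv.DictReader`) and that both count columns hold integers. The function name is hypothetical.

```python
def staining_fractions(rows):
    """Map (Gene, Tumor, Level) to "Count patients" / "Total patients".

    Rows with a total of zero are skipped to avoid division by zero.
    """
    fractions = {}
    for row in rows:
        total = int(row["Total patients"])
        if total == 0:
            continue
        key = (row["Gene"], row["Tumor"], row["Level"])
        fractions[key] = int(row["Count patients"]) / total
    return fractions
```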
 
3 Subcellular location data
Subcellular localization of proteins based on immunofluorescently stained cells. The comma-separated file includes Ensembl gene identifier ("Gene"), main subcellular location of the protein ("Main location"), other locations ("Other location"), the type of annotation (annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only) ("Expression type"), and the reliability or validation of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 15 and Ensembl version 78.38.

subcellular_location.csv.zip
CSV-file, 131.5 KB
 
4 RNA gene data
RNA levels in 45 cell lines and 32 tissues based on RNA-seq. The comma-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Sample"), fragments per kilobase of transcript per million fragments mapped ("Value" and "Unit"), and abundance class ("Abundance"). The data is based on The Human Protein Atlas version 15 and Ensembl version 78.38.
RNA sequencing data for human tissue
RNA sequencing data for human cell lines

rna_tissue.csv.zip
CSV-file, 3.5 MB
rna_celline.csv.zip
CSV-file, 4.5 MB
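
As an illustration of how the columns fit together, the following sketch collects the expression profile of one gene across samples. It assumes rows are dicts keyed by the column names above and that "Value" parses as a number. The function name is hypothetical.

```python
def expression_profile(rows, gene_id):
    """Return {sample: value} for one gene, with values as floats."""
    profile = {}
    for row in rows:
        if row["Gene"] == gene_id:
            profile[row["Sample"]] = float(row["Value"])
    return profile


def top_samples(profile, n=5):
    """Return the n (sample, value) pairs with the highest values."""
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```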
 
5 RNA isoform data
RNA levels in 45 cell lines and 32 tissues based on RNA-seq. The tab-separated file includes Ensembl gene identifier ("Gene"), Ensembl transcript identifier ("Transcript"), analysed sample ("Sample") and fragments per kilobase of transcript per million fragments mapped ("FPKM"). The data is based on The Human Protein Atlas version 15 and Ensembl version 78.38.



transcript_rna_tissue.tsv.zip
TSV-file, 41.1 MB
transcript_rna_celline.tsv.zip
TSV-file, 29.7 MB
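
One possible use of the isoform table is to find, for each gene and sample, the transcript with the highest FPKM. This is only an example analysis, not something the Atlas prescribes. The sketch reads tab-separated lines one row at a time, which suits the size of these files, and assumes the header row uses the column names above. The function name is hypothetical.

```python
import csv


def dominant_transcripts(tsv_lines):
    """Map (Gene, Sample) to the (Transcript, FPKM) pair with the highest FPKM.

    tsv_lines is any iterable of text lines, such as an open text file.
    """
    best = {}
    for row in csv.DictReader(tsv_lines, delimiter="\t"):
        key = (row["Gene"], row["Sample"])
        fpkm = float(row["FPKM"])
        # Keep the first transcript seen when two FPKM values tie.
        if key not in best or fpkm > best[key][1]:
            best[key] = (row["Transcript"], fpkm)
    return best
```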
 
6 Data from the Human Protein Atlas in XML format
The XML file contains most of the data in the Human Protein Atlas version 15, including protein expression data (in normal and tumor tissues and in cell lines), antigen sequences, Western blot data for antibodies, protein array data for antibodies, RNA-seq data, external references such as UniProt identifiers, and more. The data is based on Ensembl version 78.38. The file structure is described in the XSD schema. This data can also be downloaded for the gene set returned by a search (via the xml link on the result page).
The XML file presented here is compressed with gzip due to its size. It can be uncompressed with an archive program such as 7‑zip.

proteinatlas.xml.gz
XML-file (gzip compressed), 307 MB
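
Given the size of the file, it may be preferable to stream it rather than uncompress and load it in one step. The sketch below counts elements with a given tag using only the Python standard library. The tag name is passed in by the caller because element names are defined in the XSD schema and are not assumed here. If the file declares an XML namespace, the tag must be given in the `{namespace}name` form that ElementTree uses. The function name is hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Count elements named tag in a gzip-compressed XML file.

    source may be a file path or a binary file object.
    """
    count = 0
    with gzip.open(source, "rb") as stream:
        for _event, element in ET.iterparse(stream, events=("end",)):
            if element.tag == tag:
                count += 1
                # Drop the element's children and text once it is processed.
                element.clear()
    return count
```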
 
7 Data from the Human Protein Atlas in RDF format
This file contains a subset of the data in the Human Protein Atlas version 15 corresponding to the tissue annotations at the gene level. This data can also be downloaded for the gene set returned by a search (via the RDF link on the result page). This RDF release is a beta and will be extended and developed in coming releases. We thank Mark Thompson, Rajaram Kaliyaperumal and Eelke van der Horst (LUMC, The Netherlands), and Christine Chichester (SIB, Switzerland) for providing templates for generating the first beta release of HPA nanopublications. Their contribution was made possible by IMI project Open PHACTS and EU FP7 project RD-Connect. This beta was developed within an ELIXIR collaboration.

proteinatlas.trig.gz
RDF trig-file (gzip compressed), 99.6 MB
 
8 Data from the Human Protein Atlas in TAB format
This file contains a subset of the data in the Human Protein Atlas version 15 corresponding to the data seen in the search result. This data can also be downloaded for a resulting gene set when using the search function (via the TAB link on the result page).

proteinatlas.tab.gz
TAB-file (gzip compressed), 1.3 MB
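
The gzip-compressed TAB file can be read without uncompressing it to disk first. This is a minimal sketch assuming the file is tab-separated, UTF-8 encoded, and starts with a header row. Column names are taken from that header, not assumed here. The function name is hypothetical.

```python
import csv
import gzip


def read_gzipped_tab(source):
    """Yield one dict per data row from a gzip-compressed tab-separated file.

    source may be a file path or a binary file object.
    """
    with gzip.open(source, "rt", encoding="utf-8", newline="") as text:
        yield from csv.DictReader(text, delimiter="\t")
```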