The esophagus-specific proteome
The main function of the esophagus is to transport swallowed food and liquids to the stomach. This approximately 25 cm long tube consists of outer layers of striated and smooth muscle, for mechanical propulsion of food, and an inner mucosa lined by non-cornified squamous epithelia. The transcriptome analysis shows that 66% of all human proteins (n=19692) are expressed in the esophagus and 297 of these genes show an elevated expression in esophagus compared to other tissue types.
An analysis of these genes shows that the majority of the corresponding proteins are related to epithelial cell function and are also expressed in other squamous mucosae, including the oral mucosa, vagina and exocervix.
- 48 esophagus enriched genes
- Most of the tissue enriched genes encode proteins involved in epithelial function
- 297 genes defined as elevated in the esophagus
- Over half the group-enriched esophagus genes (n=110) are shared with skin (n=33) or tonsil (n=11), or both (n=19)
Figure 1. The distribution of all genes across the five categories based on transcript abundance in esophagus as well as in all other tissues.
297 genes show elevated expression in the esophagus compared to other tissues. The three categories of genes with elevated expression in esophagus compared to other organs are shown in Table 1.
Table 1. The genes with elevated expression in esophagus
Category || Number of genes || Description
Tissue enriched || 48 || At least five-fold higher mRNA levels in a particular tissue as compared to all other tissues
Group enriched || 110 || At least five-fold higher mRNA levels in a group of 2-7 tissues
Tissue enhanced || 139 || At least five-fold higher mRNA levels in a particular tissue as compared to average levels in all tissues
Total || 297 || Total number of elevated genes in esophagus
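The three five-fold rules in Table 1 can be sketched as a small classifier. This is a minimal illustration, not the pipeline used for the atlas: the function name `elevated_category`, the input format (a mapping from tissue name to mRNA level), and the choice to form candidate groups from the top-ranked tissues are all assumptions made for the example.

```python
from statistics import mean

FOLD = 5.0  # the five-fold threshold from Table 1


def elevated_category(levels, tissue="esophagus", max_group=7):
    """Classify one gene from a {tissue: mRNA level} mapping.

    Returns 'tissue enriched', 'group enriched', 'tissue enhanced' or None.
    """
    target = levels[tissue]
    if target <= 0:
        return None

    # Tissues ranked from highest to lowest expression (assumed grouping rule).
    ranked = sorted(levels, key=levels.get, reverse=True)
    others = [t for t in ranked if t != tissue]

    # Tissue enriched: five-fold above every other tissue.
    if target >= FOLD * levels[others[0]]:
        return "tissue enriched"

    # Group enriched: the top 2-7 tissues, including the target, average
    # five-fold above the highest tissue outside the group.
    for size in range(2, max_group + 1):
        group, rest = ranked[:size], ranked[size:]
        if tissue in group and rest:
            group_mean = mean(levels[t] for t in group)
            if group_mean >= FOLD * max(levels[t] for t in rest):
                return "group enriched"

    # Tissue enhanced: five-fold above the average over all tissues.
    if target >= FOLD * mean(levels.values()):
        return "tissue enhanced"
    return None
```

The categories are checked from most to least specific, so a gene that satisfies several rules is reported under the strictest one.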
The list of tissue enriched genes (n=48) includes previously characterized genes with cellular location and functions well in line with the function of the esophagus, as well as a large number of genes with unknown function and expression pattern.
Table 2. The 12 genes with the highest level of enriched expression in esophagus. "Predicted localization" shows the classification of each gene into three main classes: Secreted, Membrane, and Intracellular, where the latter consists of genes without any predicted membrane or secreted features. "mRNA (tissue)" shows the transcript level as FPKM values. The TS-score (Tissue Specificity score) is calculated as the fold change relative to the tissue with the second highest expression.
||small proline-rich protein 1A
||mucin 21, cell surface associated
||dynactin associated protein
||IGF-like family member 1
||defensin, beta 104B
||defensin, beta 104A
Some of the proteins predicted to be membrane-spanning are intracellular, e.g., in the Golgi or mitochondrial membranes, and some of the proteins predicted to be secreted can potentially be retained in a compartment belonging to the secretory pathway, such as the ER, or remain attached to the outer face of the cell membrane by a GPI anchor.
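The TS-score used to rank the genes in Table 2 can be illustrated with a short sketch. The function names `ts_score` and `top_enriched`, the placeholder gene names, and the FPKM values are hypothetical and are not taken from Table 2; returning infinity when no other tissue expresses the gene is likewise an assumed convention.

```python
def ts_score(levels, tissue="esophagus"):
    """Fold change of `tissue` over the highest-expressing other tissue."""
    runner_up = max(v for t, v in levels.items() if t != tissue)
    if runner_up == 0:
        return float("inf")  # no expression elsewhere; assumed convention
    return levels[tissue] / runner_up


def top_enriched(fpkm_by_gene, n=12, tissue="esophagus"):
    """Return the n genes with the highest TS-score, highest first."""
    scored = {g: ts_score(lv, tissue) for g, lv in fpkm_by_gene.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up FPKM values for placeholder genes, not data from Table 2.
example = {
    "GENE_A": {"esophagus": 200.0, "skin": 40.0, "tonsil": 10.0},
    "GENE_B": {"esophagus": 90.0, "skin": 3.0, "tonsil": 9.0},
    "GENE_C": {"esophagus": 12.0, "skin": 6.0, "tonsil": 1.0},
}
ranking = top_enriched(example, n=2)
```

Here `GENE_B` ranks first because its esophagus level is ten times that of its second highest tissue, even though `GENE_A` has the higher absolute FPKM.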
The esophagus transcriptome
An analysis of the expression levels of each gene made it possible to calculate the relative mRNA pool for each of the categories. The analysis shows that 75% of the mRNA molecules in the esophagus correspond to housekeeping genes and 19% of the mRNA pool corresponds to genes categorized as either esophagus enriched, group enriched, or enhanced in esophagus. Thus, most of the transcriptional activity in the esophagus is related to proteins with presumed housekeeping functions as they are found in all tissues and cells analyzed.
A Gene ontology analysis of the esophagus-enriched genes (n=48) shows that the top categories are related to cell envelope organization, external encapsulating structure organization, keratinization, keratinocyte differentiation and epithelial- and epidermal cell differentiation. Compared to skin, which shares many features with esophagus, Gene ontology analysis shows that the major difference is that the esophagus does not have top-hit gene categories associated with water homeostasis and melanin biosynthesis.
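One way to compute such a relative mRNA pool is to sum the expression values per category and divide by the overall total. The sketch below assumes FPKM values are used as a proxy for the share of mRNA molecules; the function name, gene identifiers, and category labels are hypothetical, and the numbers are chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def mrna_pool_fractions(fpkm_by_gene, category_by_gene):
    """Return {category: fraction of the summed FPKM in the tissue}."""
    totals = defaultdict(float)
    for gene, fpkm in fpkm_by_gene.items():
        totals[category_by_gene[gene]] += fpkm
    grand_total = sum(totals.values())
    return {cat: value / grand_total for cat, value in totals.items()}


# Illustrative values only, not measured data.
fpkm = {"G1": 750.0, "G2": 190.0, "G3": 60.0}
category = {"G1": "expressed in all", "G2": "elevated", "G3": "mixed"}
fractions = mrna_pool_fractions(fpkm, category)
```

Because a few highly expressed genes can dominate the sum, the share of the mRNA pool held by a category can differ a lot from the share of genes in that category.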
Protein expression of genes elevated in esophagus
In-depth analysis of the elevated genes in esophagus using antibody-based proteomics allowed us to create an overview of the localization of the corresponding proteins. A large number of these proteins have functions related to squamous differentiation and are thus often also shared with other tissue types that are composed of squamous epithelia.
Proteins specifically expressed in esophagus
The inner lining of the esophagus is made up of a glycoprotein-rich mucosal squamous epithelium that lacks an outer layer of cornified cells (such as that found in the skin). Like most squamous epithelia, the esophagus expresses a variety of keratin intermediate filament proteins whose function is to provide structural integrity between the cells. Among structural proteins, Keratin 4 (KRT4), -6 (KRT6A, KRT6B and KRT6C), -13 (KRT13), and -32 (KRT32) showed high enrichment together with the Ca2+ binding proteins cornulin (CRNN) and S100A14. KRT13 is primarily expressed in the mucosal epithelia, as is MUC21, which is observed in esophageal epithelial cells. An interesting enriched protein is the alcohol-degrading enzyme ADH7, which is observed in mucinous epithelial cells of the esophagus and stomach.
Proteins specifically expressed in esophageal muscle
Among the genes that show enrichment in the esophagus but do not show protein expression in the epithelial cells are two muscle-specific genes: MYH8 and NKX6-1. Whereas MYH8 is a well-known muscle-specific gene that is group-enriched together with skeletal muscle, the transcription factor NKX6-1 appears to be specifically expressed only by muscles in the esophagus and has not previously been described in this tissue.
Genes shared between esophagus and other tissues
There are 110 group-enriched genes expressed in the esophagus. Group enriched genes are defined as genes showing a 5-fold higher average level of mRNA expression in a group of 2-7 tissues, including esophagus, compared to all other tissues.
In order to illustrate the relation of esophagus tissue to other tissue types, a network plot was generated, displaying the number of commonly expressed genes between different tissue types.
Figure 2. An interactive network plot of the esophagus enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of esophagus enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 4 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.
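The node sizes in such a network correspond to counts of group enriched genes per tissue combination. A minimal sketch of that tally is given below, assuming each group enriched gene is annotated with the set of tissues in its group; the function name, the placeholder gene names, and the tissue sets are hypothetical, and the limit of four tissues follows the figure legend.

```python
from collections import Counter


def shared_gene_counts(groups_by_gene, tissue="esophagus", max_tissues=4):
    """Count group enriched genes per tissue combination containing `tissue`."""
    counts = Counter()
    for gene, tissues in groups_by_gene.items():
        # Keep combinations of 2 to max_tissues tissues, as in the plot.
        if tissue in tissues and 2 <= len(tissues) <= max_tissues:
            counts[frozenset(tissues)] += 1
    return counts


# Placeholder annotations, not genes from the atlas.
groups = {
    "GENE_A": {"esophagus", "skin"},
    "GENE_B": {"esophagus", "skin"},
    "GENE_C": {"esophagus", "tonsil"},
    "GENE_D": {"esophagus", "skin", "tonsil"},
    "GENE_E": {"skin", "tonsil"},
    "GENE_F": {"esophagus", "skin", "tonsil", "vagina", "cervix"},
}
counts = shared_gene_counts(groups)
```

Using `frozenset` keys makes the tally independent of the order in which tissues are listed for each gene.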
The esophagus shares a striking number of transcripts (n=33) with skin, a tissue whose squamous epithelial structure is highly similar to that of the esophagus. Many of these skin/esophagus specific genes belong to gene families known to be important for normal squamous epithelial function. A Gene ontology analysis of these 33 genes shared between esophagus and skin reveals that the top shared categories are related to epidermal- and epithelial development, as well as keratinocyte- and epidermal cell differentiation.
The tonsil also has squamous epithelium components (in addition to its lymphocyte-containing center) and several transcripts are shared between esophagus and tonsil (n=11). Examples of these esophagus/tonsil group-enriched proteins include the cross-linking small proline-rich proteins cornifin-A (SPRR1A) and SPRR3, the Ca2+ binding protein S100A2 and the proteinase and peptidase inhibitors CSTA, CSTB, A2ML1 and SPINK7.
Several genes expressed in both skin and esophagus have previously been well characterized in both tissue types as proteins important for the normal differentiation and function of squamous epithelia, e.g. keratins including keratin 5 (KRT5), -15 (KRT15), and -31 (KRT31), and genes related to cell adhesion and squamous differentiation (e.g. desmoplakin 1 (DSP), envoplakin (EVPL), desmocollin 3 (DSC3), SLURP1 and KLK8). KRT31 and SLURP1 show restricted expression in skin and esophagus, whereas ENDOU shows additional expression in the placenta.
The Keratin 31 (KRT31) protein is a member of the keratin gene family. Keratin 31 is described as a type I hair keratin, an acidic protein which heterodimerizes with type II keratins to form hair and nails.
The secreted LY6/PLAUR domain containing 1 (SLURP1) protein is a member of the Ly6/uPAR family of proteins but lacks a GPI-anchoring signal sequence. SLURP1 is suggested to be a late differentiation protein, predominantly expressed in the granular layer of skin. Moreover, SLURP1 has been identified in several biological fluids such as sweat, saliva, tears, and urine. It is thought that this secreted protein has antitumor activity.
The esophagus is the part of the gastrointestinal canal that connects the mouth with the stomach. In contrast to the rest of the digestive system, the esophagus does not have any absorptive or digestive functions. Anatomically, it is continuous with the back of the oral cavity and pharynx and runs downward through the diaphragm for approximately 20-30 cm until it reaches the stomach. When swallowing, food is pressed from the mouth and pharynx into the esophagus. The swallowing reflex then opens the upper esophageal sphincter muscle to allow entry of food to the esophagus and the epiglottis folds down to prevent food from entering the trachea and respiratory organs. The smooth muscles lining the length of the esophagus then contract rhythmically to help push the food towards the lower esophageal sphincter muscle, which opens to allow entry of food to the stomach. Both the upper and lower sphincter muscles are constricted by default unless swallowing or vomiting. The lower sphincter muscles also protect the esophagus from the acidic contents and digestive enzymes of the stomach.
The esophagus has the same general gross anatomical and histological organization as the rest of the gastrointestinal tract with an outer muscular layer, a submucosa, followed by a muscularis mucosae layer, followed by a lamina propria that surrounds the inner "tubing", which in the case of the esophagus consists of a stratified squamous mucinous epithelium. However, since the esophagus is located outside of the abdominal cavity it has no mesothelial covering. Instead, the outermost layer is an adventitia.
The innermost part is the esophageal epithelium, which has a quite rapid turnover of cells due to the continuous wear and tear of food ingestion. Like most epithelial tissues, cell renewal takes place in the basal part of the epithelium and as new cells are generated, older cells lose contact with the basement membrane and are pushed towards the edge of the epithelium. In the basal cell layer cells appear columnar with round nuclei, but as cells detach and are pushed towards the superficial edge of the epithelium they gradually change appearance and differentiate into flattened and tightly coupled cells.
From the inside and out, the squamous epithelium rests on the lamina propria, which consists of loose connective tissue and focal lymphocytes. After this layer comes the lamina muscularis mucosae, which is composed of smooth muscle cells, followed by the submucosal layer, which is composed of loose connective tissue containing mucus-secreting glands, small blood vessels and lymphocytes. After the submucosa comes the tunica muscularis, which is composed of an inner layer of circular muscles, followed by externally located longitudinal muscle fibers. In the third of the esophagus that is closest to the mouth, the external layer is composed of skeletal muscle; in the middle third it contains a mixture of smooth and skeletal muscle; and in the third closest to the stomach it contains only smooth muscle.
The histology of human esophagus including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.
Here, the protein-coding genes expressed in the esophagus are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize protein expression patterns of proteins that correspond to genes with elevated expression in the esophagus.
Transcript profiling and RNA-data analyses based on normal human tissues have been described previously (Fagerberg et al., 2014). Analysis of mRNA expression covering over 99% of all human protein-coding genes was performed using deep RNA sequencing of 124 individual samples corresponding to 32 different normal human tissue types. RNA sequencing results from 3 fresh frozen tissue samples representing normal esophagus were compared to 121 other tissue samples corresponding to 31 tissue types, in order to determine genes with elevated expression in esophagus. A tissue-specific score, defined as the ratio between the mRNA level in esophagus and the mRNA levels in all other tissues, was used to divide the genes into different categories of expression.
These categories include: genes with elevated expression in esophagus, genes expressed in all tissues, genes with a mixed expression pattern, genes not expressed in esophagus, and genes not expressed in any tissue. Genes with elevated expression in esophagus were further sub-categorized as i) genes with enriched expression in esophagus, ii) genes with group enriched expression including esophagus and iii) genes with enhanced expression in esophagus.
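The five-way split can be sketched as a decision on where a gene is detected. The detection cutoff is not stated in this text, so the `cutoff` value below is purely an assumption, as are the function name and the `elevated` flag, which stands in for the outcome of the five-fold tests described in Table 1.

```python
def expression_category(levels, tissue="esophagus", elevated=False, cutoff=1.0):
    """Assign one of the five categories from a {tissue: mRNA level} mapping.

    `cutoff` is an assumed detection threshold, not a value from the text.
    `elevated` says whether the gene passed one of the five-fold tests.
    """
    detected = {t for t, value in levels.items() if value >= cutoff}
    if not detected:
        return "not expressed in any tissue"
    if tissue not in detected:
        return "not expressed in esophagus"
    if elevated:
        return "elevated expression in esophagus"
    if len(detected) == len(levels):
        return "expressed in all tissues"
    return "mixed expression pattern"
```

For example, a gene detected in esophagus and liver but not in skin, and without elevated expression, would fall into the mixed category under these assumptions.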
Human tissue samples used for protein and mRNA expression analyses were collected and handled in accordance with Swedish laws and regulations and obtained from the Department of Pathology, Uppsala University Hospital, Uppsala, Sweden as part of the sample collection governed by the Uppsala Biobank. All human tissue samples used in the present study were anonymized in accordance with approval and advisory report from the Uppsala Ethical Review Board.
Relevant links and publications
Uhlén et al (2015). Tissue-based map of the human proteome. Science
PubMed: 25613900 DOI: 10.1126/science.1260419
Yu et al (2015). Complementing tissue characterization by integrating transcriptome profiling from the Human Protein Atlas and from the FANTOM5 consortium. Nucleic Acids Res.
PubMed: 26117540 DOI: 10.1093/nar/gkv608
Fagerberg et al (2014). Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics.
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600
Histology dictionary - the esophagus