Diversity of human and mouse homeobox gene expression in development and adult tissues

Diversity of homeobox gene expression in human tissues and organs

To assess which tissues and organs express each homeobox gene, we mapped publicly available RNAseq data to the human genome and calculated FPKM values for every homeobox gene (Additional file 3: Table S2). Figure 1 shows relative gene expression levels (normalised to maximum expression for each gene), clustered according to expression profile (Additional file 4: Figure S2 shows the same, but with gene names). From this analysis, we compiled lists of homeobox genes with similar expression profiles across adult human tissues or preimplantation stages (Additional file 5: Tables S3–S8).
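As a minimal sketch of the two quantities used in this paragraph, the snippet below computes an FPKM value using the standard definition (fragments per kilobase of transcript per million mapped fragments) and rescales one gene's profile to its maximum, as done for Fig. 1. The function names and the toy numbers are hypothetical illustrations, not the pipeline used in this study, and the clustering step is not shown.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def normalise_to_max(profile):
    """Scale one gene's FPKM values so that its highest-expressing tissue equals 1.0."""
    peak = max(profile.values())
    if peak == 0:
        # Gene not detected anywhere: keep an all-zero profile rather than divide by zero.
        return {tissue: 0.0 for tissue in profile}
    return {tissue: value / peak for tissue, value in profile.items()}


# Hypothetical FPKM values for a single gene (illustration only, not data from the study).
toy_profile = {"testis": 40.0, "colon": 10.0, "liver": 0.0}
relative = normalise_to_max(toy_profile)
```

Normalising each gene to its own maximum lets genes with very different absolute expression levels be compared by the shape of their profile, which is the property that clustering by expression profile relies on.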

https://static-content.springer.com/image/art%3A10.1186%2Fs12861-016-0140-y/MediaObjects/12861_2016_140_Fig1_HTML.gif

A clear pattern is that most homeobox genes have moderately specific expression patterns: most genes have one site of maximal expression (shaded in red in Fig. 1) and a few other tissues with high or moderate expression, with the remaining tissues being negative or substantially lower. There are important exceptions, however, and we identify 20 homeobox genes with very widespread expression profiles across a large number of tissues (peach coloured categories in Fig. 1 and Additional file 4: Figure S2; listed in Additional file 5: Table S3). These widely expressed genes include six TALE class genes, among them several (MEIS1, MEIS2, PBX1, PBX3) whose protein products are known to form co-factor complexes with a range of partner transcription factors [21]. This role as common co-factors may explain the widespread expression we detect. Also included in the list of widely expressed homeobox genes are PRRX (PRD class), SIX5 (SINE class), CUX1 (CUT class), three members of the CERS class encoding transmembrane proteins, and eight members of the ZF class. We propose that these genes have general roles in cellular functioning. It is perhaps surprising that POU2F1 is not in the list, since this gene has previously been described as ubiquitously expressed [22]. The reason is that elevated expression in preimplantation stages causes this gene to cluster with preimplantation-specific homeobox genes. It is striking that there are no ANTP class genes in the ‘widespread expression’ category, despite this being the largest homeobox class in humans (101/242 genes in the current analysis). This finding further supports the contention that ANTP class genes are primarily involved in spatial patterning during embryonic development.

Additional file 5: Tables S4 to S7 list sets of homeobox genes that show degrees of tissue specificity; these groupings are generated by expression clustering analysis. Biologically similar tissues, such as ‘neural tissues’ or ‘immune-related tissues’, form distinct groups in the analysis. Additional file 5: Table S4 (blue in Fig. 1 and Additional file 4: Figure S2) comprises genes expressed predominantly in brain and neural tissues, including cerebral cortex, corpus callosum, hippocampus, parietal lobe, amygdala, substantia nigra, foetal brain and tissues of the eye. Different homeobox genes are expressed in distinct subsets of these tissues, as shown in Additional file 5: Table S4. There are no Hox genes in this set, even though numerous studies have examined the role of Hox genes in neural patterning. However, we note that the neural RNAseq data analysed are derived predominantly from anterior brain regions, whereas most studies of vertebrate Hox gene expression reveal spatial expression only in body regions posterior to the middle of the hindbrain [23]. Adult forebrain expression of Hox genes has been reported [10] but is at a relatively low level, which explains why forebrain does not appear as a major Hox gene expression site in our analysis. Although Hox genes do not feature in the ‘neural-enriched’ set, it does include other ANTP class genes, several of which are implicated in specification and patterning of anterior brain regions in other vertebrates: BARHL1, BARHL2, EN1, EN2, TLX3, NKX6-2, NKX2-2, DLX1, DLX2, HMX1, VAX2, GSX2. Among PRD class genes, this dataset includes the retinal gene CRX, the PAX6 gene which is mutated in aniridia, two human Rax genes and the two human Vsx genes.

Additional file 5: Table S5 (yellow in Fig. 1 and Additional file 4: Figure S2) includes homeobox genes predominantly expressed in immune tissues such as B-cells, T-cells, monocytes, neutrophils and bone marrow. These include several homeobox genes known to be associated with immune function, notably PAX5, in which somatic and germline mutations are associated with B-cell precursor acute lymphoblastic leukemia [24]; HLX, which modulates interferon expression in T-cells [25]; SATB1, encoding a chromatin loop-associated homeodomain protein implicated in T-cell development [7]; POU2F2, required for B-cell maturation and survival [26]; and VENTX, involved in macrophage differentiation [27]. The inclusion of PBX2 and PBX4 in this set is more surprising and merits further investigation. We caution, however, that the precise delineation of the ‘immune-enriched’ dataset (unlike most other tissue datasets) is sensitive to changing the FPKM cut-off used for defining expression versus background (not shown).
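The sensitivity to the FPKM cut-off mentioned above can be illustrated with a small sketch. It assumes a simplified, hypothetical rule (a gene counts as tissue-enriched if its highest-expressing sample is a target tissue and that value reaches the cut-off); this is not the clustering procedure used in this study, and the gene names and values are invented for illustration.

```python
def enriched_genes(expression, target_tissues, cutoff):
    """Return genes whose top-expressing tissue is a target tissue and meets the cut-off."""
    selected = set()
    for gene, profile in expression.items():
        top_tissue = max(profile, key=profile.get)
        if top_tissue in target_tissues and profile[top_tissue] >= cutoff:
            selected.add(gene)
    return selected


# Hypothetical FPKM values (illustration only, not data from the study).
toy_expression = {
    "gene_a": {"B-cell": 12.0, "liver": 1.0},
    "gene_b": {"B-cell": 1.5, "liver": 0.2},
    "gene_c": {"B-cell": 0.3, "liver": 9.0},
}
immune_tissues = {"B-cell"}

# Membership of the 'immune-enriched' set at three different cut-offs.
sets_by_cutoff = {
    cutoff: enriched_genes(toy_expression, immune_tissues, cutoff)
    for cutoff in (1.0, 2.0, 5.0)
}
```

In this toy case a weakly expressed gene (`gene_b`) is included at a cut-off of 1 FPKM but drops out at 2 FPKM, which is the kind of boundary effect that makes a low-expression gene set sensitive to the threshold chosen.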

Additional file 5: Table S6 (pink in Fig. 1 and Additional file 4: Figure S2) comprises homeobox genes expressed predominantly in reproductive tissues and early development, specifically testis, placenta, oocyte and preimplantation embryos (zygote, 2-cell, 4-cell, 8-cell, morula, blastocyst). Several homeobox genes have already been described as characteristic of one or more of these tissues or developmental stages, and these are found in our list. Examples include NANOG and POU5F1, which are well-characterised markers of pluripotent cells, and several PRD class genes expressed in totipotent cells that have been the focus of recent functional studies (ARGFX, CPHX1, CPHX2, DPRX, LEUTX, TPRX1, TPRX2, OTX1, OTX2) [19, 20]. Interestingly, many other homeobox genes also cluster in this set on the basis of their expression, including two Hox genes (HOXD1, HOXC13), indicating that they are worthy of further study in this regard (Additional file 5: Table S6). Hoxd1 expression has previously been reported in preimplantation mouse and cow embryos [28–30] but, to our knowledge, Hoxc13 expression has not; however, one of the two hoxc13 duplicates in zebrafish is expressed in early cleavage stages [31].

We refined the analysis to identify homeobox genes that are expressed only in these reproductive tissues and developmental stages (no expression ≥ 2 FPKM in other cell types or tissues); we also added ovary to this set, as it was not grouped with them by the expression clustering. We identify 23 human homeobox genes that are expressed exclusively in reproductive or very early developmental tissues in this analysis (Fig. 2). Over half (13/23) have a clearly defined maximum expression level confined to a small developmental window from the 8-cell to the morula stage of embryo development. Not only is the expression of these genes tightly regulated, but we note that 12 of them (RHOXF2, RHOXF2B, CPHX1, CPHX2, DPRX, LEUTX, TPRX1, TPRX2, ARGFX, NANOGNB, DUXA, DUXB) are phylogenetically restricted to eutherian mammals [19, 32–34]. The correlation between tight expression specificity and similar phylogenetic distribution suggests there may have been selective pressures to co-opt novel homeobox genes to new developmental roles during the evolution of eutherian mammals. The expression peak from 8-cell to morula suggests these genes may combine to prepare the totipotent stages of embryonic development for subsequent cell fate specialisation. Indeed, two recent studies have postulated regulatory roles for several of these genes during early human embryo development [19, 20].

https://static-content.springer.com/image/art%3A10.1186%2Fs12861-016-0140-y/MediaObjects/12861_2016_140_Fig2_HTML.gif
Fig. 2

Heatmap showing gene expression for human homeobox genes expressed specifically in reproductive tissues, development stages and embryonic stem cells. A gene was determined to be ‘embryo or reproductive tissue-specific’ if the FPKM expression level was greater than 2 in one or more stages and below 2 in all examined adult tissues
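The specificity criterion stated in the caption can be written as a simple filter. The following is a minimal sketch under the assumption that each gene has one FPKM value per sample; the function name, gene names and values are hypothetical and serve only to illustrate the rule (above 2 FPKM in at least one reproductive or embryonic sample, below 2 FPKM in every other sample).

```python
THRESHOLD_FPKM = 2.0


def is_reproductive_specific(profile, reproductive_samples, threshold=THRESHOLD_FPKM):
    """Apply the caption's rule to one gene's FPKM profile (sample name -> FPKM)."""
    inside = [v for sample, v in profile.items() if sample in reproductive_samples]
    outside = [v for sample, v in profile.items() if sample not in reproductive_samples]
    # Expressed in at least one reproductive/embryonic sample, and in no other sample.
    return any(v > threshold for v in inside) and all(v < threshold for v in outside)


# Sample names follow the tissues and stages named in the text; values are invented.
reproductive = {"testis", "placenta", "oocyte", "ovary", "8-cell", "morula"}
toy_profiles = {
    "gene_x": {"8-cell": 35.0, "morula": 20.0, "liver": 0.1, "colon": 0.0},
    "gene_y": {"8-cell": 35.0, "morula": 20.0, "liver": 4.0, "colon": 0.0},
    "gene_z": {"8-cell": 1.0, "morula": 0.5, "liver": 0.1, "colon": 0.0},
}
specific = sorted(
    gene for gene, profile in toy_profiles.items()
    if is_reproductive_specific(profile, reproductive)
)
```

Here only `gene_x` passes: `gene_y` is excluded by its expression in liver, and `gene_z` never exceeds the threshold in any reproductive or embryonic sample.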

Additional file 5: Table S7 (light green in Fig. 1 and Additional file 4: Figure S2) lists an assemblage of homeobox genes with predominant expression in particular organ systems; these organs do not necessarily group together in expression clustering. For example, two genes have highest expression in gall bladder (ONECUT1, ONECUT2), several posterior Hox genes plus EVX1 and NKX3-1 associate with colon and prostate, and PDX1 is expressed predominantly in duodenum. Other examples are given in Additional file 5: Table S7.

Additional file 5: Table S8 (dark green in Fig. 1 and Additional file 4: Figure S2) groups homeobox genes that do not have clear expression in the RNAseq datasets under study. Many of these are genes with well-characterised roles in mid to late embryonic development in other vertebrates (e.g. CDX4, EVX2, GSX1, DMBX1, PAX4, PAX7). Their assignment to this category most likely reflects the fact that this analysis used adult human tissues and preimplantation stages, since there are few RNAseq datasets from postimplantation human development. Model species such as mouse are more amenable to studying such developmental stages.