GET THE APP

Comparative Studies of Vertebrate Mitochondrial Carbonic Anhydrase (CA5) Genes and Proteins: Evidence for Gene Duplication in Mammals with CA5A Being Liver Specific and CA5B Broadly Expressed and Located on the X-Chromosome
Journal of Data Mining in Genomics & Proteomics

Journal of Data Mining in Genomics & Proteomics
Open Access

ISSN: 2153-0602

Research Article - (2020) Volume 11, Issue 1

Comparative Studies of Vertebrate Mitochondrial Carbonic Anhydrase (CA5) Genes and Proteins: Evidence for Gene Duplication in Mammals with CA5A Being Liver Specific and CA5B Broadly Expressed and Located on the X-Chromosome

Roger S Holmes*
 
*Correspondence: Roger S Holmes, Griffith Research Institute for Drug Discovery (GRIDD) and School of Environment and Science, Griffith University, Nathan, QLD 4111, Australia, Email:

Author info »

Abstract

At least fifteen families of mammalian carbonic anhydrases (CA) (E.C. 4.2.1.2) catalyse the hydration of carbon dioxide and related functions. CA5A and CA5B genes encode distinct mitochondrial enzymes and perform essential biochemical roles, including ammonia detoxification and glucose metabolism. Bioinformatic methods were used to predict the amino acid sequences, secondary structures and gene locations for CA5A and CA5B genes and proteins using data from vertebrate genome projects. CA5A and CA5B genes usually contained 7 coding exons for each of the vertebrate genomes examined. Human CA5A and CA5B subunits contained 305 and 317 amino acids, respectively, with key amino acid residues including mitochondrial transit peptides; three Zinc binding sites (His130, His132, His155); and a Tyr164 active site. Phylogenetic analyses of vertebrate CA5 gene families suggested that it is an ancient gene in vertebrate evolution which had undergone a gene duplication event in a mammalian ancestral genome forming the CA5A and CA5B gene families in monotreme, marsupial and eutherian mammals. CA5A was predominantly expressed in liver whereas CA5B had a wide tissue distribution profile, was localized on the X-chromosome and was more highly conserved during mammalian evolution.

Keywords

Mitochondrial enzymes; Vertebrate genome; Vertebrate; Carbon dioxide

Introduction

At least fifteen families of mammalian carbonic anhydrases (CA) (E.C. 4.2.1.2) catalyse the hydration of carbon dioxide and related functions, and are involved in a range of biological functions, including respiration, bodily fluid formation, calcification, regulating acid base balance and bone reabsorption [1-5]. CA genes are differentially expressed in the body and have diverse tissue and subcellular distribution profiles [2,4,5]. These include the CA5A and CA5B genes, which encode distinct mitochondrial enzymes and perform essential biochemical roles, including ammonia detoxification and glucose metabolism, and have similar 3D structures with the other CA isozymes [6-20]. Targeted mutagenesis studies have shown that both enzymes play important metabolic roles although Ca5a ‘null’ mice showed more significant deleterious effects than CA5B ‘null’ mice, with Ca5A/Ca5b double knockouts showing more substantial effects [7]. Genetic analyses of human CA5A and CA5B have shown that these enzymes are encoded by distinct genes, with CA5A and CA5B localized on separate chromosomes, chromosome 16 and the X-chromosome, respectively [8,20]. Moreover, genetic variants of CA5A in human populations have caused hyperammonia in early childhood [20] and mutagenesis studies of zebrafish CA5 suggested that this enzyme regulates acid-base homeostasis in this organism [14].

This paper reports the predicted amino acid sequences, gene locations and exon structures for CA5-like vertebrate genes and proteins, including primates (human (Homo sapiens) and baboon (Papio anubis), other eutherian mammals (mouse (Mus musculus), rat (Rattus norvegicus) and cow (Bos Taurus)), a marsupial mammal (opossum) (Monodelphis domestica), a monotreme mammal (platypus) (Ornithorhynchus anatinus)), and representatives of birds (chicken (Gallus gallus)), reptiles [alligator (Alligator mississippiensis)), frogs (Xenopus tropicalis and Xenopus laevis), fish [zebra fish (Danio rerio), medaka (Oryzias latipes) and coelacanth (Latimeria chalumnae)) and elephant shark (Callorhinchus millii). The phylogenetic and evolutionary relationships of these genes and enzymes are described with a hypothesis for a gene duplication event for an ancestral mammalian CA5 gene, generating CA5A and CA5B genes, which are separately localized on mammalian genomes, including the X-chromosome for CA5B, and are differentially expressed in tissues of the body.

Methods

Vertebrate CA5 gene and protein identification

BLAST (Basic Local Alignment Search Tool) studies were undertaken using web tools from the National Center for Biotechnology Information (NCBI) [21]. Protein BLAST analyses used the human and mouse CA5A and CA5B amino acid sequences deduced from reported sequences for these genes [7,8,15]. Nonredundant protein sequence databases for several mammalian and other vertebrate genomes were obtained using the blastp algorithm for the following genome sequences: human (Homo sapiens) [22]; baboon (Papio anubis) [23], cow (Bos Taurus) [24]; mouse (Mus musculus); [25] rat (Rattus norvegicus); [26] opossum (Monodelphis domestica); [27] platypus (Ornithyrinchus anatinus); [28] chicken (Gallus gallus); [29] alligator (Alligator mississippiensis); frog (Xenopus tropicalis); [30] and zebrafish (Danio rerio) [31].

BLAT analyses were subsequently undertaken for each of the predicted CA5-like amino acid sequences using the UC Santa Cruz web browser [32] to obtain the predicted locations for each of the vertebrate CA5-like genes, including exon boundary locations and gene sizes . Structures for human CA5A and CA5B isoforms were obtained using the AceView website to examine predicted gene and protein structures to interrogate this database of human mRNA sequence [33] (Table 1).

Vertebrate Species CA Gene Transcript ID* Exons UNIPROT Amino
    Gene Location   (Strand) ID acids
Human Homo sapiens CA5A 16:87,888,132-87,936,450 L19297 7 (-) P35216 305
    CA5B X:15,750,024-125,782,661 BC028142 7 (+) Q9Y2D0 317
Baboon Papio anubis CA5A 20:69,860,214-69,909,041 *XP_003917341 7 (-) A0A5F7ZRP5 307
    CA5B X:13,133,143-13,166,842 *XP_003917488 7 (+) A0A096N1V1 317
Mouse Mus musculus CA5A 8:121,916,367-121,944,793 BC030174 7 (-) P23589 299
    CA5B X:163,979,194-164,014,944 BC034413 7 (-) Q9QZA0 317
Rat Rattus norvegicus CA5A 19:54,732,112-54,761,585 BC088147 7 (-) P43165 304
    CA5B X:32,244,503-32,290,304 BC081872 7 (+) Q66HG6 317
Cow Bos Taurus CA5A 18:13,345,984-13,368,318 *NP_001179338 7 (-) na 310
    CA5B X:127,727,046-127,743,377 *NP_001074377 7 (-) na 317
Opossum Monodelphis domestica CA5A 1:693,776,094=693,836,235 *XP_007477337 7 (-) F6W8Y6 298
    CA5B 7:24,269,877-24,307,423 *XP_007500932 7 (-) F6XEV7 313
Platypus Ornithorhynchus anatinus CA5A ^DS180954v1:2,170,173-2,206,662 *XP_001508964.3 7 (-) na 307
    CA5B ^DS181337v1:8,449,952-8,469,447 *XP_028935902.1 7 (+) na 315
Chicken Gallus gallus CA5 11:17,993,314-18,005,353 *XP_414195 7 (-) F1N986 314
Alligator Alligator mississippiensis CA5 ^JH738261:154,749-178,224 *XP_019346721 7 (-) A0A151ND60 306
Tropical frog Xenopus tropicalis CA5 4:68,200,520-68,227,829 *XP_012816720 7 (-) Q28BX3 309
Clawed toad Xenopus laevis CA5 4L:54,653,647-54,676,100 *XP_018112094.1 7 (-) Q6NTY3 311
Zebra fish Danio rerio CA5 25:12,798,498-12,808,955 *NP_001104671 7 (-) na 310
Medaka Oryzias latipes CA5 6:19,186,720-19,195,404 *XP_004069790 7 (+) na 314
Coelacanth /em> CA5 ^JH126597:1,194,229-1,249,345 *XP_014340058 7 (-) H3B5R4 310
Shark Callorhinchus milii CA5 ^KI635866:5,082,728-5,091,173 *XP_007887550 7 (+) A0A4W3J095 290

Table 1: Mammalian CA5A and CA5B and other vertebrate CA5 genes and proteins

Predicted structures and properties of vertebrate CA5 subunits

Alignments of predicted CA5-like amino acid sequences and estimates of sequence identities were undertaken using a ClustalW method [34]. Predicted secondary structures for vertebrate CA5 subunits were obtained using alignments with the reported tertiary structures for mouse CA5A9 and CA5B [10]. Predictions of the CA5A, CA5B and CA5 protein N-terminal sequences serving as mitochondrial targeting peptides, and the cleavage site for this peptide, were undertaken using MITOPROT [35].

Human CA5A and CA5B gene expression and predicted gene regulation sites

The human genome browser was used to examine predicted CpG islands [36] and transcription factor binding sites (TFBS) (ORegAnno IDs: Open Regulatory Annotations) [37] for human CA5A and CA5B using the UC Santa Cruz Genome Browser [32]. The GTEx web browser was used to examine the human tissue expression profiles for CA5A, CA5Aps and CA5B [38].

Phylogenetic studies and sequence divergence

Mammalian CA5A and CA5B and other vertebrate CA5 sequences were subjected to phylogenetic analysis using the http://www. phylogeny.fr/ portal to enable alignment (MUSCLE), curation (Gblocks), phylogeny (PhyML) and tree rendering (TreeDyn) to reconstruct phylogenetic relationships [39]. Mammalian sequences were identified as members of the CA5A or CA5B (mitochondrial) groups, whereas non-mammalian vertebrate sequences were identified as members of the CA5 group.

Results and Discussion

Alignments and biochemical features of vertebrate CA5 amino acid sequences

Amino acid sequence alignments for opossum (marsupial) CA5A and CA5B, chicken and zebrafish CA5 amino acid sequences are shown in Figure 1, together with the previously reported sequences for human and mouse CA5A and CA5B [7,8,15]. The vertebrate CA5-like sequences exhibited >50% identities, suggesting that these protein subunits are products of the same gene family (Table 2).

data-mining-genomics-sequences

Figure 1. Amino acid sequence alignments for vertebrate CA5A, CA5B and CA5 sequences
See Table 1 for sources of CA5A, CA5B and CA5 sequences; * identical residues; : 1 or 2 conservative substitutions; . 1 or 2 non-conservative substitutions; active site His residues for binding Zn2+; catalytic active site residues; helices H1, H2 etc; sheets B1 B2 etc; conserved serine and threonine residues are involved in substrate binding; bold underlined font shows predicted exons (numbered) junctions; and mitochondrial transit peptides are shown.

  Human CA5A Mouse CA5A Opossum CA5A Human CA5B Mouse CA5B Opossum CA5B Chicken CA5 Zebra fish CA5
Human CA5A 100 72 67 59 59 61 64 50
Mouse CA5A 72 100 63 57 57 57 63 50
 Opossum CA5A 67 63 100 63 62 67 66 54
Human CA5B 59 59 63 100 89 80 69 54
Mouse CA5B 59 57 62 89 100 79 69 53
Opossum CA5B 61 57 67 80 79 100 74 57
Chicken CA5 64 57 66 69 69 74 100 58
Zebra fish CA5 50 50 54 54 53 57 58 100

Table 2: Percentage identities for mammalian CA5A and CA5B and other vertebrate CA5 amino acid sequences. Numbers show the percentage of amino acid sequence identities. Numbers in bold show higher sequence identities for more closely related CA5 family members.

Amino acid sequences for the mammalian CA5-like proteins examined contained 299-310 (CA5A) and 313-317 (CA6B) residues, whereas the lower vertebrate CA5 sequences examined contained 290-314 residues. The elephant shark CA5 was the smallest among the vertebrate CA5-like proteins examined with 290 amino acid residues (Table 1). The N-termini showed the lowest levels of sequence identity among the sequences examined perhaps due to the presence of the transit peptide for facilitating mitochondrial localization (Figure 1).

X-ray crystallographic studies for mouse CA5A and CA5B have enabled the identification of key structural and catalytic residues among those aligned for vertebrate CA-like sequences examined (Figure 1) [7,15]. These included mouse Tyr94, Tyr158 and Tyr161 which were identified as catalytic residues; His124, His126 and His149, which were responsible for chelating the Zinc residue at the active site; and 229Thr-230Thr, involved in substrate binding. Tyr 94 is conserved for all of the vertebrate CA5 sequences examined, whereas Tyr158 underwent a substitution with phenylalanine for human and mouse CA5B; and Tyr161 underwent a similar phenylalanine substitution in the mammalian CA5B and lower vertebrate CA5 sequences examined. Genetic substitution of mouse CA5A Ser233 with Pro 233 resulted in markedly reduced activity, [20] which suggests a significant role in catalysis for this conserved amino acid.

Predicted gene locations, exon structures and tissue expression for vertebrate CA5 genes

Table 1 and Figure 1 summarize the predicted locations and exon structures for vertebrate CA5-like genes based upon BLAT interrogations of several vertebrate genomes using the sequences for human [6,8] and mouse [7] CA5A and CA5B, and the predicted sequences for other vertebrate CA5 subunits (Table 1) and the UC Santa Cruz Web Browser [32]. Vertebrate CA5 genes contained 7 coding exons with the predicted exon start sites in identical or similar positions. Figure 2 describes the tissue expression profiles for CA5A and CA5B, as well as a CA5A pseudogene (CA5Aps) [38] (Figure 2).

data-mining-genomics-expression

Figure 2. Comparative tissue expression levels for human CA5A, CA5Aps and CA5B
RNA-seq gene expression profiles across 53 selected tissues (or tissue segments) were examined from the public database for human CA5, CA5Aps and CA5B based on expression levels for 175 individuals38 (http://www.gtex.org). Tissues: 1. Adipose-Subcutaneous; 2. Adipose-Visceral (Omentum); 3. Adrenal gland; 4. Artery-Aorta; 5. Artery-Coronary; 6. Artery- Tibial; 7. Bladder; 8. Brain-Amygdala; 9. Brain-Anterior cingulate Cortex (BA24); 10. Brain-Caudate (basal ganglia); 11. Brain-Cerebellar Hemisphere; 12. Brain- Cerebellum; 13. Brain-Cortex; 14. Brain-Frontal Cortex; 15. Brain-Hippocampus; 16. Brain- Hypothalamus; 17. Brain-Nucleus accumbens (basal ganglia); 18. Brain- Putamen (basal ganglia); 19. Brain-Spinal Cord (cervical c-1); 20. Substantia nigra; 21. Breast-Mammary Tissue; 22. Cells-EBV-transformed lymphocytes; 23. Cells- Transformed fibroblasts; 24. Cervix-Ectocervix; 25. Cervix-Endocervix; 26. Colon-Sigmoid; 27. Colon-Transverse; 28. Esophagus-GastroesophagealBrain-Junction; 29. Esophagus-Mucosa; 30. Esophagus-Muscularis; 31. Fallopian Tube; 32. Heart-Atrial Appendage; 33. Heart-Left Ventricle; 34. Kidney-Cortex; 35. Liver; 36. Lung; 37. Minor Salivary Gland; 38. Muscle-Skeletal; 39. Nerve-Tibial; 40. Ovary; 41. Pancreas; 42. Pituitary; 43. Prostate; 44. Skin-Not Sun Exposed (Suprapubic); 45. Skin-Sun Exposed (Lower leg); 46. Small Intestine-Terminal Ileum; 47. Spleen; 48. Stomach; 49. Testis; 50. Thyroid; 51. Uterus; 52. Vagina; 53. Whole Blood. TPM, Transcripts per million, calculated from a gene model with isoforms collapsed to a single gene. Box plots show a median and 25th and 75th percentiles; points are shown as the outliers if they are above or below 1.5 times the interquartile range.

Human CA5A was predominantly expressed in liver, as compared with CA5Aps which was detected at low levels only in human testis, with both genes located on chromosome 16: CA5A covering 48.5kb from 87,970,135-87,921,623 on the reverse strand, while CA5Aps covered 17.5kb from 29,618,785-29,636,328, also on the reverse strand of chromosome 16. The human CA5B gene comprised two consecutive components (CA5BP1 and CA5B) located on the plus strand of the X-chromosome (15,693,048-15,806,528) (Figure 3) and was expressed with a broad tissue distribution profile, with highest levels of expression observed in testis and arteries (Figure 2). Comparative levels of total median expression for these genes showed nearly 6 times higher levels for human CA5B as compared with CA5A, but with human liver showing > 2 times the liver specific expression of the CA5A gene as compared with tissue specific expressions of the CA5B gene.

data-mining-genomics-isoforms

Figure 3. Gene structures for human CA5A and CA5B
From AceView website33 http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/ The major isoforms are shown for the CA5A and CA5B transcripts; note that the CA5B transcript was split into 2 components with CA5B encoding the CA5B enzyme and CA5BP1 encoding a pseudogene; capped 5’- and 3’- ends for the predicted mRNA sequences are identified; a predicted CpG island (CpG47), a gene enhancer regulatory element (GERE)43; and transcription factor binding sites (EGR1,40 HNF4A,41 CEBPA,42 STAT3,44 FOXH1,45 ZEB146) are shown. The numbers of nucleotides separating exons are also shown.

Figure 3 presents the predicted structures of human CA5A and CA5B gene transcripts [33] (Figure 3).

There were 7 coding exons for the CA5A isoform 2 precursor mRNA (RefSeq:NM_001367225.1) sequence which contained several transcription factor binding sites in the 5’ region: an early growth response gene (EGR1), involved in regulating synaptic plasticity [40]; hepatocyte nuclear factor 4A (HNF4A), a nuclear transcription factor which controls the expression of several hepatic genes [41]; and CCAAT enhancer binding protein alpha (CEBPA), a transcription factor that coordinates proliferation arrest and the differentiation of hepatocytes [42]. There were also 7 coding exons for the human CA5B gene on the X-chromosome, which included a gene enhancer regulatory element (GERE) at the 5’end of the gene [43], located near CpG47 and several transcription factor binding sites: STAT3 (a tyrosine phosphorylated transcription factor) [44]; EGR140 (shared with CA5A); FOXH1 (a fork head DNA binding transcription factor which is essential for development of the anterior heart field) [45]; and ZEB1 (Zinc finger E-box-binding homeobox 1, which is required for neural differentiation of human embryonic stem cells [46].

Phylogeny of primate CA5A and CA5B and vertebrate CA5 sequences

A phylogenetic tree (Figure 4) was constructed from alignments of mammalian CA5A and CA5B amino acid sequences with other lower vertebrate CA5 sequences, with representatives of bird (chicken), reptile (alligator), amphibian (frogs), fish and a shark species. The phylogram showed clustering into 2 major groups of the mammalian CA5 sequences consistent with previous reports for mammalian CA5A and CA5B genes and enzymes [7-11], [15-20]. In contrast, evidence was obtained for single copies of CA5 genes and enzymes among lower vertebrates, including the Xenopus laevis (clawed toad) species, for which multiple copies of genes have been reported due to the genome undergoing tetraploidization [47]. The phylogram suggested an ancestral relationship between lower vertebrate CA5 and mammalian CA5, with the latter gene undergoing a gene duplication event resulting in the appearance of CA5A and CA5B genes, in the ancestor leading to the appearance of monotremes, and subsequently marsupial and eutherian mammals. Monotremes have been described as arising from primitive birds which diverged from marsupials and eutherians about 163 to 186 Ma (million years ago) [48]. Platypus and opossum genomic sequences have been reported [27,28] and incorporated into the genome browser, enabling identification of CA5A and CA5B-like genes and enzymes sequences in these species. Moreover, this study of other vertebrate CA5-like genes and enzymes is consistent with CA5 being an ancient gene present throughout vertebrate evolution, including sharks, fish, frogs, reptiles and birds, which has undergone a major gene duplication event during the emergence of mammals, generating the separate CA5A and CA5B evolutionary pathways (Figure 4).

data-mining-genomics-vertebrate

Figure 4. Phylogenetic tree of mammalian CA5A and CA5B sequences and other vertebrate CA5 sequences
The tree is labeled with the CA5 gene name and the name of the vertebrate; note the major clusters include the lower vertebrate CA5 group and two groups for the mammalian CA5A and CA56B enzymes; a gene duplication event generating the mammalian CA5A and CA5B gene families is proposed to have occurred in a mammalian CA5 ancestral gene leading to the formation of the monotreme, marsupial and eutherian mammal groups. A genetic distance scale is shown. The number of times a clade (sequences common to a node or branch) occurred in the bootstrap replicates are shown. Only replicate values of 0.9 or more which are highly significant are shown. 100 bootstrap replicates were performed in each case. Note the higher level of sequence conservation observed for the eutherian mammalian CA5B sequence. Sequences were derived from those reported in Table 1.

It may be noted however that the CA5B gene is consistently located on the X-chromosome in eutherian mammalian genomes but is located on chromosome 7 (an autosome) in the opossum genome (Monodelphis domestica) (Table 1). This may reflect on the evolution of the mammalian X-chromosome which is highly conserved among eutherians due to suppression of recombination between X and Y chromosomes [49,50].

Phylogenetic relationships among primate CA5A (Figure 5 and Table 3) and CA5B (Figure 5 and Table 4) were examined using known and predicted genomic and enzyme sequences for 15 primates which are representative of species separated by > 40 million years of primate evolution [51]. Both phylograms separated into 3 distinct groups, with species representative of Hominidae (great apes, including humans and related species); old world monkeys (including rhesus, baboons and related species); and Cebidae (marmosets and squirrel monkeys).

data-mining-genomics-genetic

Figure 5. Phylogenetic trees of primate CA5A and CA5B sequences
The trees are labeled with the CA5 gene name (CA5A for upper phylogram; and CA5B for lower phylogram) and the name of the primate; note 3 major clusters in each case for primates which are closely related phylogenetically; genetic distance scales are shown for each enzyme, with CA5A showing ~3 times larger genetic distances than for CA5B. The number of times a clade (sequences common to a node or branch) occurred in the bootstrap replicates are shown. Only replicate values of 0.9 or more which are highly significant are shown. 100 bootstrap replicates were performed in each case. Sequences were derived from those reported in Tables 3 and 4.

Primate Species Gene Transcript Exons UNIPROT Amino
    Location ID* (Strand) ID acids
Human Homo sapiens 16:87,888,132-87,936,450 L19297 7 (-) P35216 305
Chimp Pan troglodytes 16:73,578,402-73,625,475 *XP_523486.2 7 (-) H2QBP6 305
Gorilla Gorilla gorilla ^CYUI01015509v1:365,282-412,997 *XP_030858903.1 7 (-) na 304
Orang-utan Pongo abelii 16:66,031,718-66,080,856 *XP_024089455.1 7 (-) H2NRR0 304
Gibbon Nomascus leucogenys 2:161,000,175-161,047,105 *XP_030675794.1 7 (-) na 308
Baboon Papio anubis 20:69,860,214-69,909,041 *XP_003917341.1 7 (-) A0A096N1V1 307
Green monkey Chlorocebus sabaeus 5:73,284,646-73,332,986 *XP_007992511.1 7 (-) na 307
Gelada monkey Theropithecus gelada na *XP_025226548.1 na na 307
Rhesus macaque Macaca mulatta 20:74,907,317-74,958,053 *XP_014982249.2 7 (-) A0A5F7ZRP5 307
Pig-tailed macaque Macaca nemestrina 20:76,285,953-76,343,242 *XP_011751584.1 7 (-) na 307
Crab-eating macaque Macaca fascicularis 20:76,285,953-76,343,288 *XP_005592801.1 7 (-) A0A2K5VV27 307
Golden snub-nosed monkey Rhinopithecus roxellana ^KN299711v1: 1,242,256-1,292,347 *XP_010353882.2 7 (+) A0A2K6RLY9 307
Squirrel monkey Saimiri boliviensis ^JH378111:45,180,556-45,231,845 *XP_003922881.1 7 (-) na 307
Capuchin monkey Cebus capucinus na *XP_017399192.1 na na 308
Marmoset Callithrix jacchus 20:42,330,747-42,381,907 *XP_002761295.2 7 (-) na 305

Table 3: Primate CA5A genes and proteins

    Location ID* (Strand) ID acids
Human Homo sapiens X:15,750,024-15,782,661 BC028142 7 (+) Q9Y2D0 317
Chimp Pan troglodytes X:15,712,460-15,742,580 BC028142 7 (+) H2R4T4 317
Gorilla Gorilla gorilla ^CYUI01014975v1:11,824,638-11,857,316 *XP_005063896.1 7 (+) na 317
Orang-utan Pongo abelii X:12,309,344-12,341,641 *XP_002831460.1 7 (+) H2NRR0 317
Gibbon Nomascus leucogenys X:13,770,923-13,804,629 *XP_030663014.1 7 (+) G1RE91 317
Baboon Papio anubis X:13,133,143-13,166,842 *XP_003917488 7 (+) A0A096NAK3 317
Green monkey Chlorocebus sabaeus X:14,209,163-14,239,996 *XP_007989305.1 7 (+) A0A0D9RQX6 317
Gelada monkey Theropithecus gelada na *XP_025228522.1 na na 317
Rhesus macaque Macaca mulatta X:15,449,497-15,482,570 *XP_014982249.2 7 (+) I0FMW0 317
Pig-tailed macaque Macaca nemestrina 20:76,285,953-76,343,242 *XP_011751584.1 7 (+) A0A2K5WNP4 317
Crab-eating macaque Macaca fascicularis X:13,576,554-13,609,605 EHH60731.1 7 (+) A0A2K5WNT1 317
Golden snub-nosed monkey Rhinopithecus roxellana ^KN296004v1:1268-16370 *XP_030789461.1 7 (+) na 317
Squirrel monkey Saimiri boliviensis ^JH378105:71,210,293-71,210,293 *XP_003920396.1 7 (+) A0A2K6S8K1 317
Capuchin monkey Cebus capucinus na *XP_017374775.1 na A0A2K5RPU0 317
Marmoset Callithrix jacchus X:13,917,420-13,953,682 *XP_002762698.1 7 (+) F7A852 317

Table 4: Primate CA5B genes and proteins

Conclusion

Phylogeny studies examined several vertebrate CA5 subunits and demonstrated that this is an ancient gene in vertebrate evolution which appears to have undergone a gene duplication event in a mammalian ancestral gene prior to the appearance of monotreme, marsupial and eutherian genomes, generating 2 distinct related mitochondrial CA5 genes and enzymes, CA5A and CA5B. These enzymes have been shown to play key roles in ammonia detoxification and glucose metabolism, with similar 3D structures to other CA isozymes.

Acknowledgement

The advice of Dr Laura Cox of the Centre for Precision Medicine, Wake Forest School of Medicine, Winston Salem NC USA is gratefully acknowledged.

Conflict of Interest

The author reports no conflicts of interest.

References

Author Info

Roger S Holmes*
 
Griffith Research Institute for Drug Discovery (GRIDD) and School of Environment and Science, Griffith University, Nathan, QLD 4111, Australia
 

Citation: Holmes RS (2020) Comparative Studies of Vertebrate Mitochondrial Carbonic Anhydrase (CA5) Genes and Proteins: Evidence for Gene Duplication in Mammals with CA5A Being Liver Specific and CA5B Broadly Expressed and Located on the X-Chromosome. J Data Mining Genomics Proteomics 10:223. doi: 10.35248/2165-7556.20.11.223.

Received Date: Jun 04, 2020 / Accepted Date: Jun 19, 2020 / Published Date: Jun 25, 2020

Copyright: © 2020 Holmes RS. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Top