COMPARATIVE GENOMICS ANALYSIS OF GROWTH HORMONE (GH), INSULIN-LIKE GROWTH FACTOR 1 (IGF-1) AND MYOSTATIN (MSTN) GENE SEQUENCES IN CHICKEN, RABBIT AND SHEEP

CHAPTER ONE

1.0 INTRODUCTION

Bioinformatics is the science of storing, extracting, organizing, analyzing, interpreting and utilizing information from biological sequences and molecules (Khalid, 2010). It is often defined as the application of computational techniques to understand and organize the information associated with biological macromolecules (Luscombe et al., 2001). It has been fueled mainly by advances in DNA sequencing and mapping techniques (Khalid, 2010). Over the past few decades, rapid developments in genomic and other molecular research technologies, together with advances in information technology, have produced a tremendous amount of information related to molecular biology. The primary goal of bioinformatics is to increase the understanding of biological processes (Khalid, 2010).

As biology increasingly becomes a technology-driven science, databases have become indispensable for storing not only data, but also the results of experiments generated by research projects around the world (Hey et al., 2009). A biological database is a collection of information or data from a biological system, stored in a computer-readable format. Some databases are also called data repositories if they function as a place where large biological datasets can be stored and retrieved by users. Sharing of data between scientists accelerates discovery and has the potential to greatly advance a scientific field as a whole; this is known as the Fourth Paradigm of Data-Driven Scientific Discovery (Hey et al., 2009). There are two types of biological databases: public databases that are freely accessible online, and private databases that require payment before access (Dutilh and Keșmir, 2016).

The genome of a species encodes genes and other functional elements, interspersed with non-functional nucleotides in a single uninterrupted string of DNA (IHGSC, 2001).

Recognizing protein-coding genes typically relies on finding Open Reading Frames (ORFs), stretches of nucleotides free of stop codons, that are too long to have likely occurred by chance. Since stop codons occur at a frequency of roughly 1 in 20 codons in random sequence, ORFs of at least 60 amino acids will occur frequently by chance (5% under a simple Poisson model), and even ORFs of 150 amino acids will appear by chance in a large genome (0.05%). This poses a huge challenge for higher eukaryotes, in which genes are typically broken into many small exons (on average 125 nucleotides long for internal exons in mammals) (IHGSC, 2001).
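The chance-ORF figures quoted above can be reproduced with a short calculation. The sketch below is illustrative only and is not part of the cited work: the function names are hypothetical, and it assumes a uniform random sequence in which every codon is equally likely. Under the Poisson approximation with a stop-codon rate of 1 in 20, the probability that n consecutive codons contain no stop is exp(-n/20); the exact value for 3 stop codons out of 64 is (61/64)^n.

```python
import math


def poisson_orf_probability(n_codons: int, stop_rate: float = 1 / 20) -> float:
    """Poisson approximation: probability of zero stop codons in n_codons
    random codons, assuming stops occur at `stop_rate` per codon."""
    return math.exp(-n_codons * stop_rate)


def exact_orf_probability(n_codons: int) -> float:
    """Exact probability of zero stop codons in n_codons codons, assuming
    all 64 codons are equally likely and 3 of them are stops."""
    return (61 / 64) ** n_codons


if __name__ == "__main__":
    for n in (60, 150):
        print(
            f"{n} codons: Poisson {poisson_orf_probability(n):.4%}, "
            f"exact {exact_orf_probability(n):.4%}"
        )
```

For 60 codons the Poisson value is exp(-3), about 5%, and for 150 codons it is exp(-7.5), about 0.05%, matching the figures in the text. The exact uniform-codon values are slightly higher because 3/64 is a little less than 1/20.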

Some regions within a protein sequence are more conserved than others during evolution (Dutilh and Keșmir, 2016). These regions are generally important for the function of a protein, the maintenance of its three-dimensional structure, or other features related to its localization or modification. By analyzing the constant and variable properties of groups of similar sequences, it is possible to derive a signature for a protein family or domain that distinguishes its members from unrelated proteins; sequence alignment allows us to discover these signatures (Dutilh and Keșmir, 2016). Sequence alignment is defined as the bioinformatics task of locating equivalent regions of two or more sequences and aligning their nucleotide or amino acid residues side by side to maximize their similarity (Dutilh and Keșmir, 2016). Multiple sequence alignments allow for the identification of conserved sequence regions. This is very useful in designing experiments to test and modify the function of specific proteins, in predicting the function and structure of proteins, and in identifying new members of protein families (Dutilh and Keșmir, 2016).
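To make the definition of sequence alignment concrete, the following is a minimal sketch of global pairwise alignment using the Needleman-Wunsch dynamic programming algorithm. It is an illustration, not the method used in this study: the scoring values (match +1, mismatch -1, gap -1), the function names and the short test sequences are all assumptions chosen for readability.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom, s = needleman_wunsch("ACGT", "AGT")
    print(top, bottom, s, percent_identity(top, bottom))
```

With these toy inputs the two sequences align as ACGT over A-GT, with one gap and three identical columns. In practice, established alignment tools with biologically derived substitution matrices would be used rather than this simplified scoring.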

DNA sequencing is a technique by which the exact order of nucleotides within a DNA molecule is determined (Mayor et al., 2000). Comparative analysis of sequence data provides the opportunity to determine what is shared among species and what is unique to each (Mayor et al., 2000).

Growth in animals is controlled by a complex system in which the somatotropic axis plays a key role. The genes that operate in the somatotropic axis are responsible for postnatal growth, mainly GH, which acts on the growth of bones and muscles through the mediation of IGF-1 (Sellier, 2000). The growth hormone (GH) and insulin-like growth factor 1 (IGF-1) genes are candidate genes for growth in cattle, since they play a key role in growth regulation and development (Hossner et al., 1997; Tuggle and Trenkle, 1996). Effects of GH on growth are observed in several tissues, including bone, muscle and adipose tissue. These effects result both from the direct action of GH on nutrient partitioning and cellular multiplication, and from IGF-1-mediated action that stimulates cell proliferation and the metabolic processes associated with protein deposition (Boyd and Bauman, 1989). IGF-1 stimulates protein metabolism and is important for the function of some organs, being considered a factor of cellular proliferation and differentiation (Andrea et al., 2005). Polymorphisms in the GH gene have been used as genetic markers associated with performance and production traits such as body weight, birth weight and weaning weight in goats (Wickramaratne et al., 2010). The rabbit GH gene was sequenced by Wallis and Wallis (1995) and has been investigated as a gene associated with the market weight of commercial rabbits (Fontanesi et al., 2012). Mutations of the GH gene have been described in goats (Malveiro et al., 2001) and poultry (Feng et al., 1997) that affect important production traits.

In chickens divergently selected for high or low growth rates, IGF-1 mRNA levels were significantly higher in the high growth rate line than in the low growth rate line (Beccavin et al., 2001). The growth hormone-insulin-like growth factor-1 (GH-IGF-1) system, which includes the growth hormone receptor (GHR), controls the number of follicles in animals that are recruited to the rapid growth phase (Roberts et al., 1994; Monget et al., 2002). It is also known that the GH-IGF-1 system has been modified as a result of selection for enhanced growth rate (Ballard et al., 1990; Ge et al., 2001). The insulin-like growth factor 1 gene (IGF1) is a candidate gene for growth, body composition and metabolism, skeletal characteristics, and growth of adipose tissue and fat deposition in chickens (Zhou et al., 2005). Earlier research on GHR, IGF-1 and IGFBP-3 in cattle, goats and chickens showed genetic polymorphisms and their association with production traits (Liu et al., 2010). The IGF1 gene is essential for normal embryonic and postnatal growth in mammals (Bian et al., 2008).

Myostatin (MSTN), previously called growth differentiation factor 8 (GDF8), is a member of the transforming growth factor-β (TGF-β) superfamily (McPherron et al., 1997). It is a negative regulator of both embryonic development and adult homeostasis of skeletal muscle (Tu et al., 2014). It negatively controls the growth of muscle cells by inhibiting the transcriptional activity of MyoD family members, and its expression is negatively correlated with muscle weight (Weber et al., 2005). Mutations in the myostatin gene have also been shown to cause double muscling in humans and other species (Clop et al., 2006). These findings suggest that strategies for inhibiting myostatin function may be applied to improve animal growth. Cattle that are homozygous or heterozygous for mutations in conserved region bases of the MSTN gene exhibit strong muscling, increased birth weight, and obvious double-hip muscle characteristics (Casas et al., 1999). As the candidate gene for double-hip muscle in pigs, the MSTN gene has an important impact on the amount of lean meat and fat deposition (Sonstegard et al., 1998).

The rabbit is a high-quality and efficient meat-producing livestock species as well as a common experimental animal. Therefore, providing information on the genetic basis and regulation mechanism of its skeletal muscle growth and development has important theoretical and practical significance (Qiao, 2014). In an F2 chicken resource population, SNPs of the myostatin gene were associated with increases in abdominal fat weight, abdominal fat percentage, birth weight and breast muscle percentage (Zhiliang et al., 2004). Notably, these data suggest that myostatin could be an ideal molecular marker for marker-assisted selection for skeletal muscle and adipose growth in chicken breeding programs. A TTTTA deletion in the MSTN gene was reported to be unique to goats when compared with sheep, cattle, water buffalo, domestic yak, pigs, and humans (Grisolia et al., 2009; Zhang et al., 2013). Khichar et al. (2016) found an important effect of a 5-base pair (bp) deletion on the early body weight and size of goats.

1.1 Justification

Identification of a candidate gene is a powerful method for understanding the direct genetic basis involved in the expression of quantitative traits and their differences between individuals (Rothschild and Soller, 1997; Nagaraja et al., 2000). Mutations in conserved region bases of the MSTN gene in chicken, rabbit and goat can activate or inhibit the gene expression product, leading to a loss or gain of its muscle-growth-inhibiting function; loss of this function results in excessive muscle development (Lee and McPherron, 1999). Indeed, there have been several recent examples in which comparative sequence data have led to the discovery and functional understanding of previously undefined genes. The complete human/mouse orthologous-sequence dataset proved particularly valuable in the characterization of gene families in humans and mice (Dehal et al., 2001). For instance, a computational comparison of olfactory receptor gene families on human chromosome 19 indicated that humans have approximately 49 olfactory receptor genes, of which only 22 have maintained an open reading frame and appear functional. This contrasts with the homologous mouse genes, the vast majority of which have retained an open reading frame. This finding of reduced olfactory receptor diversity in humans is consistent with the reduced olfactory needs and capabilities of humans relative to rodents (Pennacchio and Rubin, 2003).

The growth hormone (GH) gene, which encodes a single polypeptide produced in the anterior pituitary gland, is a promising candidate gene marker for improving milk and meat production in goats and other farm animals (Min et al., 2005). IGF1 is a mediator of many biological effects: it increases the absorption of glucose, stimulates myogenesis and the production of progesterone, inhibits apoptosis, participates in the activation of cell cycle genes, increases the synthesis of lipids, and intervenes in the synthesis of DNA, RNA and protein, and in cell proliferation (Mohammadi et al., 2011).

The increasing availability of genomic sequence from multiple organisms has provided biomedical scientists with a large dataset for orthologous-sequence comparisons. The rationale for using cross-species sequence comparisons to identify biologically active regions of a genome is based on the observation that sequences that perform important functions are frequently conserved between evolutionarily distant species, distinguishing them from nonfunctional surrounding sequences (Pennacchio and Rubin, 2003). Sequence alignment is a good way of predicting the function of a gene or protein. Moreover, sequences contain much more information, such as the organism from which the gene or protein is derived and the evolutionary relationships of the gene or species with other genes or species. Much of this information can only be discovered by finding homologs of the gene or protein in other species (Dutilh and Keșmir, 2016).
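The idea that functional sequence stands out as conserved across species can be illustrated with a small sketch that scans an existing multiple alignment for runs of fully conserved columns. This is a simplified illustration rather than the procedure of this study: the function name, the minimum block length and the three toy sequences are hypothetical, and the sequences are assumed to be already aligned to equal length with "-" marking gaps.

```python
def conserved_blocks(aligned, min_len=3):
    """Return (start, end) column spans, end exclusive, where every aligned
    sequence carries the same residue and none carries a gap."""
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    blocks = []
    start = None  # first column of the conserved run currently being extended
    for col in range(length):
        residues = {seq[col] for seq in aligned}
        is_conserved = len(residues) == 1 and "-" not in residues
        if is_conserved and start is None:
            start = col
        elif not is_conserved and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    # Close a run that extends to the final column.
    if start is not None and length - start >= min_len:
        blocks.append((start, length))
    return blocks


if __name__ == "__main__":
    # Made-up aligned fragments, not real GH, IGF-1 or MSTN sequences.
    toy_alignment = [
        "ATGCCGTA-A",
        "ATGACGTATA",
        "ATGTCGTAGA",
    ]
    print(conserved_blocks(toy_alignment))
```

Requiring perfect identity in every column is a deliberately strict choice; a real analysis would more likely score conservation over a sliding window and tolerate some substitutions.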

A comparative genomics analysis that assesses the similarities and differences in three growth genes, namely growth hormone (GH), myostatin (MSTN) and insulin-like growth factor-1 (IGF-1), among chicken, rabbit and sheep will help to identify similarities or differences relating to the rate of increase in growth and body size to maturity, final body size at maturity, and body conformation at maturity. The analysis of sequences conserved among these three species will further enrich the available information on biologically active sequences in these species.
