Stevia rebaudiana Bertoni is gaining attention worldwide because of its nutritional importance and its emerging use as a natural sweetener. To better understand the gene regulatory network of this plant, we used high-throughput small RNA (sRNA) sequencing to discover novel microRNAs (miRNAs). miRNAs are a class of short, endogenous, non-coding RNA molecules of about 18-22 nucleotides whose main function is to down-regulate gene expression through translational repression, mRNA cleavage and epigenetic modification. We constructed an S. rebaudiana sRNA library and sequenced it on an Illumina Genome Analyzer II, obtaining a total of 30,472,534 reads representing 2,509,190 distinct sequences. From these reads, 12 novel miRNAs were predicted whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has been described previously in stevia or other plant species. Putative target genes were predicted for most of the novel miRNAs; they mainly include mRNAs encoding enzymes that regulate essential plant metabolic and signaling pathways. Our results increase the number of known miRNAs in stevia and should be useful for further investigation into the biological functions and evolution of miRNAs in stevia and other plant species.
Keywords: Stevia rebaudiana; Small RNA sequencing; miRNAs; Novel; Target prediction
miRNAs are a class of endogenous, small, noncoding, single-stranded RNAs of ~22 nucleotides that act as posttranscriptional regulators in eukaryotes. They have been reported to be located mostly within noncoding regions of genomes and are usually transcribed from RNA polymerase II promoters [2,3]. The generation of a mature miRNA is a complicated enzyme-catalyzed process that proceeds from the initial transcript (pri-miRNA) to the precursor (pre-miRNA) with a characteristic hairpin structure, and then to a miRNA duplex (miRNA:miRNA*). Finally, the mature miRNAs function within large complexes to negatively regulate specific target mRNAs. Perfect complementarity between the miRNA and the mRNA strand generally results in cleavage, as in plants, whereas imperfect base-pairing leads to translational repression [4,5]. miRNA genes represent about 1%-2% of the known eukaryotic genomes and constitute an important class of fine-tuning regulators that are involved in several physiological or disease-associated cellular processes. Plant miRNAs target a large number of genes with functions in a range of developmental processes, including meristem cell identity [7-9], leaf organ morphogenesis and polarity [8,10,11], and floral differentiation and development [12,13]. miRNAs are also reported to be involved in plant responses to biotic and environmental stresses [14-18]. Increasing evidence shows that the miRNA repertoire of a plant or animal species includes a set of conserved (ancient, abundant) miRNAs and a set of non-conserved or novel (species-specific, recently evolved) miRNAs. However, the biggest challenge in plants is to identify novel miRNAs and to understand their mode of action and their role in various metabolic processes. The availability of high-throughput next-generation sequencing (NGS) technologies such as 454 and Illumina has further revolutionized sRNA discovery.
Whereas the analysis of transcribed sequences such as ESTs has led to the identification of only conserved (abundant) miRNAs or miRNAs previously identified in other species, NGS provides high-throughput tools to discover additional species-specific or lowly expressed miRNAs in plants, irrespective of whether their genome has been sequenced, e.g. Arabidopsis [17-19], Oryza sativa, Populus trichocarpa, Triticum aestivum, Brachypodium distachyon, Vitis vinifera, Arachis hypogaea, Citrus trifoliata, Carthamus tinctorius and many more. The NGS strategy for miRNA discovery may be successful not only for plant species with full genomic and sufficient EST data available, but also for those with incomplete genomic information but sufficient EST sequences, as is the case for our model plant Stevia rebaudiana. The latest miRBase release (v20, June 2013) contains 24,521 microRNA loci from 206 species, processed to produce 30,424 mature microRNA products. The rate of deposition of novel microRNAs and the number of researchers involved in their discovery continue to increase, driven largely by sRNA deep sequencing experiments. Stevia rebaudiana Bertoni (family Asteraceae) is a perennial herb that accumulates up to 30% (w/w leaf dry weight) diterpenoid steviol glycosides (SGs) [28,29]. SGs are glucosylated derivatives of the diterpenoid alcohol steviol. Stevioside and rebaudioside A are the major SGs found in stevia. The sweetness indices of these and other related compounds range from 30 to 300 times that of sucrose, and they are used as non-caloric sweeteners in many countries of the world. In addition to being a sweetener, stevia has been suggested to exert beneficial effects on human health and is considered to have anti-hyperglycemic, anti-hypertensive, anti-oxidant, anti-tumor, anti-inflammatory, immunomodulatory, antidiarrheal, antimicrobial and anti-rotavirus activities.
Stevioside has been reported to lower the postprandial blood glucose level of type II diabetes patients and the blood pressure of mildly hypertensive patients [31,32]. Because SGs are the important compounds that impart medicinal properties to stevia, molecular studies on this plant have focused mainly on understanding the biosynthesis of SGs and the regulation of the genes involved. Control over the regulation of these early and late genes can help to manipulate the diterpenoid content. Further, unraveling the sRNA-guided circuitry in stevia enhances the value of its gene and EST information and improves our ability to devise strategies to enhance certain essential features of stevia that are less amenable to functional genomics analysis, leading to enhanced nutritive value. The identification of miRNAs and their targets is important not only to help us learn more about the roles of miRNAs in stevia development and physiology but also to provide a framework for designing RNAi-based experiments for the regulation of gene expression in this species. In our previous study, we generated an sRNA library of S. rebaudiana and identified 100 miRNAs belonging to 34 highly conserved families along with 12 novel miRNAs. Because the genome of S. rebaudiana has not been sequenced, it was not possible to perform an extensive computational search for stevia miRNAs with the limited number of sequences then available. Recently, nucleotide sequence information on stevia has increased with the availability of 29,874 cDNA sequences in the NCBI database. We therefore used our sRNA library to search these new sequences, which led to the identification of 12 further putative novel miRNAs along with their precursors. A scan of the S. rebaudiana EST and nucleotide databases with the novel miRNA sequences revealed their putative targets.
Identification of novel stevia miRNAs
In this study, we predicted 12 potential novel miRNAs by bioinformatic analysis using the stevia EST (5,646) and nt (29,874) sequences available in the NCBI database. Using 36-cycle single-end sequencing on the Genome Analyzer II, a total of 30,472,534 sequences were obtained from the small RNAome of S. rebaudiana. After removal of low-quality and adaptor sequences, 17,295,850 sequences remained, of which 15,327,722 ranged from 16 to 28 nt in length. We then removed the sequence tags that could be annotated as non-coding RNAs such as tRNAs or rRNAs, and filtered out the known miRNA sequences. A total of 6,075,552 redundant sequences survived these filtration steps, and 627,678 unique sequences were analyzed for novel miRNA prediction. These unique sRNA sequences were mapped to the EST and nt sequences using the standalone BLAST program. The characteristics of the resulting BLAST output are given in Table 1. Out of the total of 70,081,886 hits to the database (length: 3,637,171) containing 5,646 EST and 29,874 nucleotide sequences (total: 35,520), the 3,186,710 sequences with an e-value better than 10 were taken into further consideration (Table 1). Of these, the 2,401 sequences showing exact matches were retained for secondary structure prediction. A distinguishing feature of miRNAs is the ability of their pre-miRNA sequences to adopt a canonical stem-loop hairpin structure. The secondary structure of the sequences surrounding each matched sRNA (±150 bp) was therefore predicted using the RNAfold annotation tool provided in the UEA sRNA toolkit. We identified 12 additional novel miRNA sequences that satisfied the secondary structure and other criteria (Table 2). The hairpin structures of all the novel miRNAs are given in supplementary file S1.
An sRNA is considered a potential miRNA candidate only if it meets the following strict criteria defined by Zhang et al.: 1) the sequence can fold into an appropriate stem-loop hairpin secondary structure; 2) the sRNA sits in one arm of the hairpin structure; 3) there are no more than 6 mismatches between the mature miRNA sequence and its opposite miRNA* sequence in the secondary structure; 4) there is no loop or break in the miRNA or miRNA* sequences; 5) the predicted secondary structure has a negative minimal folding free energy; 6) the A+U content is 30-70%, as mature miRNAs contain more A+U nucleotides than G+C; and 7) the sequence usually begins with a 5’ U, which is a characteristic feature of miRNAs.
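The sequence-level criteria above (3, 5, 6 and 7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the opposite strand is assumed to be supplied already aligned position by position to the mature sequence, and the folding free energy is assumed to come from an external folding tool. Criteria 1, 2 and 4 need the predicted hairpin itself and are not covered.

```python
# Base pairs accepted in a duplex: Watson-Crick pairs plus G:U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def normalise(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def au_content(seq: str) -> float:
    """Percentage of A and U bases in a sequence."""
    seq = normalise(seq)
    return 100.0 * sum(base in "AU" for base in seq) / len(seq)


def count_mismatches(mature: str, opposite: str) -> int:
    """Count unpaired positions; opposite[i] is the base facing mature[i]."""
    mature, opposite = normalise(mature), normalise(opposite)
    if len(mature) != len(opposite):
        raise ValueError("aligned strands must have equal length")
    return sum((a, b) not in PAIRS for a, b in zip(mature, opposite))


def check_candidate(mature, opposite, mfe, max_mismatches=6, au_range=(30.0, 70.0)):
    """Evaluate criteria 3, 5, 6 and 7 and report each result separately."""
    mature = normalise(mature)
    au = au_content(mature)
    return {
        "mismatches_ok": count_mismatches(mature, opposite) <= max_mismatches,
        "negative_mfe": mfe < 0,  # mfe is assumed to come from a folding tool
        "au_in_range": au_range[0] <= au <= au_range[1],
        "starts_with_u": mature.startswith("U"),
    }


if __name__ == "__main__":
    # Stv_14 as printed in Table 3 (position 20 down to 1), reversed here on the
    # assumption that the table lists the miRNA from its 3' end to its 5' end.
    stv_14 = "UUUAAGUCCUAGUAAAAAGU"[::-1]
    print(au_content(stv_14), stv_14.startswith("U"))
```

Returning one flag per criterion, rather than a single pass/fail value, makes it easy to see which rule a borderline candidate fails.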
|No. of Sequences||35,520|
|No. of Hits to DB||70,081,886|
|No. of extensions||25,687,690|
|No. of successful extensions||25,687,690|
|No. of sequences better than 10.0||3,186,710|
|No. of HSP's gapped||25,681,316|
|No. of HSP's successfully gapped||25,681,237|
|Length of database||3,637,171|
Table 1: Characteristics of BLAST output file.
|miRNA||Sequence||Length||Clones||EST/nt accession no.||Strand||A+U content||Mismatch||miRNA start||miRNA end||EST/nt start||EST/nt end||e-value||Precursor|
Table 2: Summary of novel miRNA candidates identified in S. rebaudiana. Clones: number of times the miRNA occurs in the library; EST/nt accession no.: accession number of the EST or gene showing complementarity with the miRNA; Strand: whether the miRNA is complementary or reverse complementary to the EST or nt sequence; miRNA start-miRNA end: stretch of the miRNA showing complementarity with stevia ESTs or nt sequences; EST/nt start-EST/nt end: stretch of the EST or gene showing complementarity with the miRNA; Mismatch: number of mismatched nt in the complementarity region between the miRNA and the stevia EST or gene; Precursor: whether a stem-loop secondary structure of the respective miRNA was predicted (YES) or not (NO) using the Vienna RNA package; A+U content: percentage of adenine plus uracil in the miRNA sequence.
|Novel miRNA||Target accession||Score||Target annotation||miRNA: target alignment||Target inhibition||UPE|
|Stv_14||BG525503||3.0||PSRP1 [Citrus sinensis]||miRNA 20 UUUAAGUCCUAGUAAAAAGU 1
::: ::::: ::::::::.:
Target 130 AAAAUCAGGUUCAUUUUUUA 149
|BG526779||3.0||3-hydroxyisobutyryl-CoA hydrolase-like [Glycine max]||miRNA 20 UUUAAGUCCUAGUAAAAAGU 1
Target 638 AAAUUCAUUAUUAUUUUUUA 657
|miRNA 20 UUUAAGUCCUAGUAAAAAGU 1
::::: ::::::::::: ::
Target 279 AAAUUAAGGAUCAUUUUACA 298
|GANE01015186||3.0||Pyridoxal phosphatase [R. communis]||miRNA 20 UUUAAGUCCUAGUAAAAAGU 1
Target 264 AAAUUUAUGAUUAUUUUUCU 283
|Stv_16||BG522304||3.5||Elongation factor Ts [Morus notabilis]||miRNA 20 AGUAGUUAGACUCAGGGAUU 1
:: :::::.::: ::.:::.
Target 439 UCCUCAAUUUGAUUCUCUAG 458
|BG526206||4.0||Magnesium-chelatase subunit chlI family protein [P. trichocarpa]||miRNA 20 AGUAGUUAGACUCAGGGAUU 1
:: :::::::: :::.: ::
Target 270 UCCUCAAUCUGUGUCUCAAA 289
|BG526183||4.0||Zinc finger A20 and AN1 domain-containing [Triticum urartu]||miRNA 21 UAGUAGUUAGACUCAGGGAUU 1
:::::::: . :::::::.:
Target 171 AUCAUCAACUCAAGUCCCUGA 191
|Stv_17||BG524980||3.0||Histone H3.3 [A. thaliana]||miRNA 21 UCUUCAGUAGUGUCUAUGUCU 1
::::: : :::::::::::::
Target 159 AGAAGCC-UCACAGAUACAGA 178
|BG524941||4.0||hypoxanthine-guanine phosphoribosyltransferase, putative [R. communis]||miRNA 20 CUUCAGUAGUGUCUAUGUCU 1
Target 300 GAAGAUAUUGUGGAUACAGG 319
|Stv_18||BG524022||3||hypersensitive-induced response protein
|miRNA 19 CGGUCUUACAGUAGCACAA 1
Target 405 GUUUGAAUGUCCUCGUGUU 424
|BG524910||3||hypersensitive-induced response protein 1- [Oryza brachyantha]||miRNA 19 CGGUCUUACAGUAGCACAA 1
Target 608 GUUUGAAUGUCCUCGUGUU 627
|Stv_19||BG523996||4.0||TMV resistance protein [Prunus persica]||miRNA 21 AUGUACCUACGAAAAUACAUU 1
::.::: .::: : :::::::
Target 342 UAUAUGUGUGCGUGUAUGUAA 362
|BG522961||4.0||lipid transfer protein [S. rebaudiana]||miRNA 20 UGUACCUACGAAAAUACAUU 1
.:: ::::: ::::::::.
Target 24 GCAAGGAUGGUUUUAUGUGU 43
|Stv_21||BG523327||4.0||probable receptor-like protein kinase At5g24010-like [Solanum lycopersicum]||miRNA 22 UCGCUCAAGACACAAGCAAUUA 1
: : ::::.:: ::.::::::
Target 75 AACUAGUUUUGCUUUUGUUAAU 396
|BG525651||4.0||GATA transcription factor[Solanum lycopersicum]||miRNA 22 UCGCUCAAGACACAAGCAAUUA 1
: : ::::.:: ::.::::::
Target 515 AACUAGUUUUGCUUUUGUUAAU 536
|Stv_22||BG523289||4.0||cholinephosphate cytidylyltransferase, putative
|miRNA 20 ACAGCUCUUUAGCAAAGGUC 1
: : ::::::::::::::
Target 232 GGCCAAGAAAUCGUUUCCAA 251
|BG522421||4.0||Elongation factor P family protein [Theobroma cacao]||miRNA 20 ACAGCUCUUUAGCAAAGGUC 1
:::.:::::: : :::: ::
Target 318 UGUUGAGAAAACUUUUCGAG 337
|Stv_24||GANE01025305||3.5||unnamed protein [Vitis vinifera]||miRNA 20 UUCCGAGCAUUCAGGUAGCC 1
Target 784 UGGGCUUGUAGGUUCAUUGG 803
|GANE01015947||3.5||50S ribosomal protein [Vitis vinifera]||miRNA 21 GUUCCGAGCAUUCAGGUAGCC 1
.:.::::: :.. ::::::::
Target 508 UAGGGCUCUUGGUUCCAUCGG 528
|GANE01022382||4.0||Alpha/beta-Hydrolases superfamily protein isoform 1
|miRNA 21 GUUCCGAGCAUUCAGGUAGCC 1
:::::: ::::: .:::.::
Target 267 CAAGGCGCGUAAAGUCAUUGG 287
Table 3: Putative target genes of novel miRNAs identified using psRNATarget and TAPIR programs.
Prediction of novel miRNA targets
miRNA target prediction in plants is facilitated by the high complementarity between miRNAs and their mRNA targets, and the targets of most plant miRNAs possess a single perfect or near-perfect complementary site in the coding region. Assuming this to be generally the case, the stevia transcript library (EST and nucleotide databases) was searched for complementarity with the sequences of the identified novel miRNAs using two target prediction web servers, psRNATarget and TAPIR. Almost all the targets predicted by the psRNATarget server were confirmed by TAPIR, which reported the same targets; TAPIR also predicted additional putative targets. Insight into miRNA targets helps in understanding the range of sRNA expression, regulation and functional importance. Multiple targets were predicted for only 8 of the 12 novel miRNAs using the stevia transcript library. The predicted targets are mainly mRNAs encoding enzymes that regulate essential plant metabolic and signaling pathways, and include structural proteins, elongation factors, protein kinases, heat shock proteins, dehydrogenases, transferases and synthases. With the stevia transcript library as the reference, Stv_14 targets PSRP1 (a ribosome-binding factor), 3-hydroxyisobutyryl-CoA hydrolase, 6-phosphofructokinase and pyridoxal phosphatase, which are involved in various physiological processes (Table 3). Stv_16 targets elongation factor Ts, involved in translation; a magnesium-chelatase subunit chlI family protein; and zinc finger A20 and AN1 domain-containing stress-associated protein 8, involved in various stress responses (Table 3). Further, Stv_17 targets histone H3, involved in DNA structure, and hypoxanthine-guanine phosphoribosyltransferase, and Stv_18 targets hypersensitive-induced response protein, which is responsible for protein histidine kinase binding in certain stress responses (Table 3).
Further, the predicted targets of Stv_19 include a lipid transfer protein and a hypothetical protein of unknown function. Stv_21 is predicted to target a receptor-like protein kinase and a GATA transcription factor, and Stv_22 targets some interesting genes, i.e. cholinephosphate cytidylyltransferase and elongation factor P, which is involved in the translational machinery. Lastly, Stv_24 is predicted to target an alpha/beta-hydrolases superfamily protein, an unnamed protein and a 50S ribosomal protein involved in ribosome biogenesis (Table 3). Each of these novel miRNAs was also found to bind some other genes, but outside their ORF (open reading frame) regions, so these genes were not considered targets. The mode of action of all the novel miRNAs on their targets was found to be through both translational inhibition and cleavage, but predominantly through cleavage, as seen in other plants [37,38]. No targets were predicted for Stv_13, Stv_15, Stv_20 and Stv_23 using the S. rebaudiana transcript library as the reference. Targets were thus predicted for only 8 of the 12 novel miRNAs, suggesting that the remaining miRNAs may be specific to stevia and that further experimental work is needed to locate their gene targets. The information on predicted targets described above could be used to assess the regulatory roles of these novel miRNAs in stevia. Further, if any of these putative novel miRNAs is found to control the regulation of the early and late genes in the SG biosynthetic pathway, as seen for the conserved miRNA 414, it could help to manipulate the diterpenoid content.
Computational approaches have been successful in identifying novel or species-specific miRNAs in many plants, but they require knowledge of the complete genome sequence, which is unavailable for most plant species. This has resulted in a knowledge skew in which most miRNAs have been reported only for species whose genomic sequences are available or for which homologous sequences are known. Only 16% of the miRNAs reported in miRBase are from species whose genome has not been sequenced. Moreover, the majority of these 16% were identified by homology search and include very few species-specific miRNAs. Thus, barring a few model organisms, there is almost negligible miRNA information for most species. However, large amounts of fragmented genomic data in the form of genome survey sequences (GSSs), high-throughput genomic sequences (HTGSs), nonredundant nucleotides (NRs) and ESTs are available for several plant species and can be used to identify novel miRNAs. In our case, although the full genome sequence of S. rebaudiana is not available, the large number of stevia expressed sequence tags and nucleotide sequences is an excellent source for precursor identification. Further, the development of NGS technology has greatly improved the capacity to identify low-abundance or tissue-specific miRNAs, and has enhanced the discovery of conserved, non-conserved and lowly expressed miRNAs through cloning and deep sequencing of sRNA and transcriptome libraries in A. thaliana [40,41], Triticum aestivum (wheat), Solanum lycopersicum (tomato), Oryza sativa (rice) and Manihot esculenta (cassava). A large number of plant species nevertheless remain unexplored, and stevia was one of them until 2012, when in our pioneering study we identified 34 conserved miRNA families and 12 novel miRNA families using deep sequencing.
A broad survey of miRNAs in stevia will provide useful information for elucidating their physiological functions, gene regulation, biogenesis and evolutionary roles in plants. The objective of this work was thus to identify and catalogue more novel miRNAs in S. rebaudiana. The identification of miRNAs and, subsequently, their targets will lay the foundation for unraveling the complex miRNA-mediated regulatory networks controlling development and other physiological processes. In our study, we predicted 12 new miRNAs, whose precursors were potentially generated from stevia EST and nt sequences, together with their targets. None of the novel sequences has been described earlier in other plant species. According to the criteria of Zhang et al. for a potential miRNA candidate, all 12 predicted novel miRNAs (Stv_13 to Stv_24) folded into a perfect stem-loop hairpin secondary structure in which the miRNA sequence was placed on one arm of the hairpin. They also satisfied the other criteria, with no more than 6 mismatches and no break or loop in the miRNA and miRNA* sequences (Table 2 and Supplementary Figure S1). All the predicted novel miRNAs had an A+U content in the range of 42.1-75%, with Stv_14 showing the highest value (75%). Lastly, the criterion that miRNAs usually begin with a 5’ U was met by 5 of the 12 novel miRNAs, i.e. Stv_13, Stv_14, Stv_16, Stv_17 and Stv_19 (Table 2). The predicted novel miRNAs exhibited much lower expression levels, consistent with the notion that non-conserved miRNAs are often expressed at lower levels than conserved miRNAs. In our case, except for Stv_20 with 3 clones, the other 11 novel miRNAs were sequenced only once or twice (Table 2). The low abundance of novel miRNAs might suggest a specific role for these miRNAs under various growth conditions, in specific tissues, or during specific developmental stages.
This suggests that the miRNAs identified in stevia might represent only a small portion of its novel miRNAs, because the sRNA library was constructed from young leaves of plants grown under normal conditions. In general, novel miRNAs are either lineage-specific or species-specific and are expressed at low levels. Whether these low-abundance miRNAs are expressed at higher levels in other tissues and organs, or whether they are regulated by environmental stresses, remains to be investigated. Furthermore, based on BLASTn searches and hairpin structure prediction, we found potential precursors for all 12 novel miRNAs. The lack of genomic information might be the main reason that only 12 novel miRNAs were identified. Their mature sequences were aligned with all the miRNAs registered in miRBase (http://www.mirbase.org/), PMRD (http://bioinformatics.cau.edu.cn/PMRD/) and CSRDB (http://sundarlab.ucdavis.edu/smrnas/), and with the non-redundant sequences in GenBank. No homologs were found, which implies that the 12 miRNAs found in our data are novel, might be stevia-specific and could play species-specific roles. The high degree of sequence complementarity between plant miRNAs and their target mRNAs has facilitated the bioinformatic prediction of miRNA targets. Plant miRNAs have been predicted or confirmed to regulate a wider variety of developmental and physiological processes than animal miRNAs. With this increasing evidence, it can be concluded that the regulatory impact of miRNAs on plants is more pervasive than previously suspected. Target prediction is also simpler in plants because the targets of most plant miRNAs possess a single perfect or near-perfect complementary site in the coding region.
Assuming this to be generally the case, the stevia transcript library (EST and nucleotide databases) was searched for complementarity with the sequences of the identified novel miRNAs using two target prediction web servers, psRNATarget (http://bioinfo3.noble.org/psRNATarget/) and TAPIR (http://bioinformatics.psb.ugent.be/webtools/tapir). Almost all the targets predicted by the psRNATarget server were confirmed by TAPIR, which reported the same targets. In addition, TAPIR and psRNATarget each predicted further putative targets individually. Insight into miRNA targets helps in understanding the range of sRNA expression, regulation and functional importance. Analyzing probable sRNA targets is significant in plants because the complementary sites of a potential sRNA can lie anywhere along the target mRNA, rather than in the 3’ UTR as in animals. Although targets are usually conserved between plant species, to identify the specific targets in our model plant S. rebaudiana we predicted putative target genes for 8 of the 12 stevia-specific or novel miRNAs using the stevia transcript library as the reference. The mode of action of the miRNAs on their targets was found to be predominantly cleavage, as is usually the case in plants [37,38]. Stv_14 targets 3-hydroxyisobutyryl-CoA hydrolase, which participates in three metabolic pathways (valine, leucine and isoleucine degradation; beta-alanine metabolism; and propanoate metabolism), and the ribosome-binding factor PSRP1, which is involved in the translational machinery. Stv_16 targets a zinc finger A20 and AN1 domain-containing stress-associated protein involved in environmental stress responses induced by cold, dehydration and salt. Further, Stv_17 targets an interesting gene, histone H3, which is one of the five main histone proteins involved in the structure of chromatin in eukaryotic cells. Stv_18 and Stv_19 are predicted to target certain stress proteins i.e.
hypersensitive-induced response protein and TMV resistance protein, respectively, pointing to roles of miRNAs in various stress responses. Further, as miRNA regulation is seen at both the transcriptional and translational levels, Stv_21 is found to target GATA transcription factors, a family of transcription factors characterized by their ability to bind the DNA sequence “GATA”, whereas Stv_22 targets elongation factor P, an important component of the translation machinery. Lastly, Stv_24 is also predicted to regulate proteins involved in translation, i.e. a 50S ribosomal protein required for the biogenesis of the ribosomes involved in protein synthesis. The mode of action of all the novel miRNAs on their targets was found to be through both translational inhibition and cleavage, but predominantly through cleavage, as reported in other plants [1,6]. No targets were predicted for Stv_13, Stv_15, Stv_20 and Stv_23 using the S. rebaudiana transcript library as the reference. Targets were predicted for only 8 of the 12 novel miRNAs, suggesting that the remaining miRNAs may be specific to stevia and that further experimental work is needed to locate their gene targets. The information on predicted targets described above could be used to assess the regulatory roles of these novel miRNAs in stevia. Further, if any of these putative novel miRNAs is found to control the regulation of the early and late genes in the SG biosynthetic pathway, it could help to manipulate the diterpenoid content. As an evolutionarily important plant genus, Stevia might contain a number of novel genes, including miRNA genes. As more genomic sequence information on the genus becomes available, more miRNAs are likely to be identified in this species. Cloning and identifying these genes and working out their regulatory relationships would be very helpful for exploiting new genes and regulatory pathways and for understanding their evolution in plants.
The identification of miRNA and their targets is important not only to help us learn more about the roles of miRNAs in stevia development and physiology but also to provide a framework for further designing RNAi based experiments for regulation of gene expression in this species.
Plant Material, RNA isolation, sRNA library preparation and sequencing
Plants of S. rebaudiana were grown and maintained at Punjab University, Chandigarh, India. Young leaves were harvested, snap-frozen and stored at -80 ºC until further use. Total RNA enriched in the sRNA fraction was isolated using the protocol developed by Ghawana et al. combined with the miRNA easy spin kit (Qiagen, Germany), as described in Mandhan et al. (2012). RNA samples were sent to the Microarray and Genomic Core Facility, Huntsman Cancer Institute, University of Utah, Salt Lake City, USA, for preparation of the sRNA library and sequencing on the Genome Analyzer II (Illumina, USA).
Processing of raw sequences: To analyze the sRNA data, the online version of the UEA sRNA toolkit (http://srna-tools.cmp.uea.ac.uk/) was used. The toolkit provides a package of tools for the analysis of high-throughput sRNA data, including a sequence preprocessing tool that converts FASTQ to FASTA format, removes adaptor sequences, extracts sequences in a defined size range and calculates their abundance. The filter tool was then used to remove tRNA and rRNA (non-coding) sequences by comparison with the t/rRNA sequences available in the Rfam database, the genomic tRNA database and EMBL release 95. Finally, sRNAs in the size range of 19 to 24 nt, which constitutes the ideal miRNA size range, were extracted.
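The size selection and abundance calculation described above can be illustrated with a minimal Python sketch. It is not the UEA sRNA toolkit implementation: the function names and the FASTA header format are hypothetical, and the reads are assumed to be already adaptor-trimmed strings.

```python
from collections import Counter


def collapse_reads(reads, min_len=19, max_len=24):
    """Keep reads within the size range and count identical sequences."""
    return Counter(
        read.upper() for read in reads if min_len <= len(read) <= max_len
    )


def to_fasta(counts):
    """Write unique sequences as FASTA, most abundant first.

    The header format (>readN_xCOUNT) is an illustrative convention only.
    """
    lines = []
    for index, (seq, count) in enumerate(counts.most_common(), start=1):
        lines.append(f">read{index}_x{count}")
        lines.append(seq)
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy input: two copies of a 20-nt read, one read that is too short,
    # and one 21-nt read in lower case.
    toy_reads = ["ACGUACGUACGUACGUACGU"] * 2 + ["ACG", "acguacguacguacguacgua"]
    print(to_fasta(collapse_reads(toy_reads)))
```

Collapsing identical reads before any database search reduces the number of queries from the redundant read count to the unique sequence count while keeping the abundance of each sequence.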
Prediction of novel miRNAs: Stevia ESTs (5548) and nt sequences (29,874) were downloaded from the NCBI database. Conserved miRNA sequences were removed from the processed sRNA file using the Filter tool of the UEA sRNA toolkit, and the remaining sequences were used for BLASTN searches against the stevia sequences in order to obtain precursors of potential novel miRNAs. The EST and nt sequences that showed significant complementarity with an sRNA in the dataset were then folded using the RNAfold annotation tool of the UEA sRNA toolkit, which uses the Vienna RNA package to fold miRNA precursors with the minimum free energy algorithm and yields a single optimal structure. An sRNA was considered a novel miRNA if a perfect stem-loop structure was formed with the sRNA sequence on one arm of the stem and the other criteria given by Meyers et al. were also met.
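As an illustration of how a candidate precursor region can be cut out of an EST or nt sequence before folding, the sketch below extracts the matched site plus a ±150 bp flank. The function name is hypothetical, hit coordinates are assumed to be 1-based and inclusive as in BLAST tabular output, and the folding step itself (done here with the RNAfold tool of the UEA sRNA toolkit) is deliberately left out.

```python
def precursor_window(sequence, hit_start, hit_end, flank=150):
    """Return the hit plus flanking sequence and its 1-based coordinates.

    hit_start and hit_end are 1-based inclusive positions of the sRNA match;
    they are swapped if given in reverse order (as for minus-strand hits).
    The window is clipped at the ends of the sequence.
    """
    if hit_start > hit_end:
        hit_start, hit_end = hit_end, hit_start
    left = max(0, hit_start - 1 - flank)        # 0-based slice start
    right = min(len(sequence), hit_end + flank)  # 0-based slice end (exclusive)
    return sequence[left:right], left + 1, right


if __name__ == "__main__":
    toy_est = "A" * 1000  # placeholder sequence, not real stevia data
    window, start, end = precursor_window(toy_est, 400, 420)
    print(len(window), start, end)
```

The returned window would then be passed to a secondary structure predictor; clipping at the sequence ends matters for ESTs, which are often shorter than the full flank on one side.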
Target gene prediction: The potential targets of the novel miRNAs were predicted using the psRNATarget program and then checked with the TAPIR web server, which is designed for the prediction of plant miRNA targets. psRNATarget was run with its default parameters, i.e. a maximum expectation (score) of 3, a length for complementarity scoring of 20, a target accessibility of 25, flanking lengths around the target site of 17 bp upstream and 13 bp downstream, and a range of central mismatch leading to translational inhibition of 9-11. The newly identified miRNA sequences were supplied as custom miRNA sequences and the S. rebaudiana transcript library (EST and nucleotide databases) as the custom plant database. All predicted target genes were scored by the psRNATarget server using the following criteria: each G:U wobble pairing was assigned 0.5 points, each indel 2.0 points, and every other non-Watson-Crick pairing 1.0 point. The total score for an alignment was calculated over 20 nt. When the query exceeded 20 nt, scores for all possible consecutive 20-nt subsequences were computed and the minimum was taken as the total score for the query-subject alignment. Because complementarity to the miRNA 5’ end appears to be critical, mismatches other than G:U wobbles at positions 2-7 from the 5’ end were penalized by a further 0.5 points in the final score. Sequences were considered to be miRNA targets if the total score was less than 3.0 points. For further validation of these targets, TAPIR was run with the FASTA search engine using the default score cutoff (4) and free energy ratio cutoff (0.7). TAPIR reports several values (names, free energy ratio, start position of the duplex on the mRNA, seed and non-seed mismatches, gaps and G-U pairs) together with a full representation of the miRNA-mRNA duplex as an alignment string.
Sensitivity and specificity tests of the TAPIR fast method against the psRNATarget server, using various parameters, showed that TAPIR with the FASTA search engine (score cutoff 4) has a higher rate of true positives while keeping false positives at values similar to those of psRNATarget. Once the potential target mRNA sequences were obtained, BLAST searches were performed with these sequences against the NCBI database to predict the functions of the potential targets.
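The alignment scoring scheme described in this section (0.5 per G:U wobble, 2.0 per indel, 1.0 per other mismatch, plus 0.5 for non-wobble mismatches at positions 2-7 from the miRNA 5’ end) can be sketched as follows. This is a simplified illustration rather than the psRNATarget code: the function name is hypothetical, the two rows are assumed to be given column by column as printed in Table 3 (miRNA from position N down to 1, target in the opposite orientation, "-" for a gap), and the 20-nt sliding window used for longer queries is omitted.

```python
WATSON_CRICK = {"AU", "UA", "GC", "CG"}
WOBBLE = {"GU", "UG"}


def alignment_score(mirna_row: str, target_row: str) -> float:
    """Score a miRNA:target alignment given as two equal-length rows."""
    mirna_row, target_row = mirna_row.upper(), target_row.upper()
    if len(mirna_row) != len(target_row):
        raise ValueError("alignment rows must have equal length")
    score = 0.0
    position = 0  # 1-based position counted from the miRNA 5' end
    # The miRNA row is printed from its last position down to position 1,
    # so walking the columns right to left moves away from the 5' end.
    for m, t in zip(reversed(mirna_row), reversed(target_row)):
        if m != "-":
            position += 1
        pair = m + t
        if "-" in pair:
            score += 2.0          # indel
        elif pair in WATSON_CRICK:
            continue              # perfect pair, no penalty
        elif pair in WOBBLE:
            score += 0.5          # G:U wobble
        else:
            score += 1.0          # any other mismatch
            if 2 <= position <= 7:
                score += 0.5      # extra penalty near the 5' end
    return score


if __name__ == "__main__":
    # Stv_18 and its BG524022 target site, rows copied from Table 3.
    print(alignment_score("CGGUCUUACAGUAGCACAA", "GUUUGAAUGUCCUCGUGUU"))
```

For the Stv_18 rows shown, this sketch gives 3.0 (two wobbles and two mismatches outside positions 2-7), which agrees with the score listed for that pair in Table 3.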
The authors declare no conflict of interests.
The authors are thankful to Dr. Brian Dalley, Director, Microarray and Genomic Core Facility, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA, for providing services for library preparation and GAII sequencing. Vibha Mandhan is thankful to the University Grants Commission, India, for providing a research fellowship.