International Journal of Biomedical Data Mining

Open Access

ISSN: 2090-4924

Abstract

A pipeline for ncRNA sequence reconstruction and structure characterization of potential homologs from BLAST output

Schwarz M and Panek J

The BLAST algorithm is used by many researchers as an exploratory RNA sequence search tool. It is extremely useful, yet its output contains essentially sequence information only, which is not sufficient for characterizing sequence fragments. We have therefore developed a pipeline to identify the complete sequences of the fragments, predict secondary structures of the subject sequences and infer their homology to the query RNA. The pipeline includes several stages: 1) reconstruction of BLAST hits with the anchored LocARNA algorithm, 2) inference of homology to the query RNA with the RSEARCH algorithm, 3) prediction of a secondary structure with the CentroidHomfold algorithm. Our pipeline can be used for characterization of ncRNAs in general by extending the information included in the BLAST output. It can also be very useful when homologs of uncharacterized (for example, newly identified) ncRNAs need to be found, because more sophisticated homology search methods require more information about the RNA as input than is available in such cases.

Searching for similar sequences in a database via BLAST or a similar tool is one of the most common bioinformatics tasks, applied both in general and to non-coding RNAs in particular. However, the results of the search might be difficult to interpret due to the presence of partial matches to the database subject sequences. Here, we present rboAnalyzer, a tool that helps with interpreting sequence search results by (1) extending partial matches into plausible full-length subject sequences, (2) predicting homology of RNAs represented by full-length subject sequences to the query RNA, (3) pooling information across homologous RNAs found in the search results and public databases such as Rfam to predict more reliable secondary structures for all matches, and (4) contextualizing the matches by providing the prediction results and other relevant information in a rich graphical output. Using predicted full-length matches improves secondary structure prediction and makes rboAnalyzer robust with regard to identification of homology. The output of the tool should help the user to reliably characterize non-coding RNAs in BLAST output. The usefulness of rboAnalyzer and its ability to correctly extend partial matches to full length are demonstrated on known homologous RNAs. To allow the user to use custom databases and search options, rboAnalyzer accepts any search results as a text file in the BLAST format. The main output is an interactive HTML page displaying the computed characteristics and other context of the matches. The output can also be exported in appropriate sequence and/or secondary structure formats.
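As an illustration of what extending a partial match to full length can involve, the Python sketch below computes the subject region that would contain the whole query if the unaligned query ends mapped to the subject without insertions or deletions. This is a simplified assumption made for illustration, not necessarily the procedure used by rboAnalyzer; the `Hsp` class, the `extended_subject_window` function and the `flank` parameter are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    """One BLAST match in 1-based inclusive coordinates (hypothetical container)."""
    query_start: int
    query_end: int
    subject_start: int
    subject_end: int  # smaller than subject_start when the match is on the minus strand


def extended_subject_window(hsp, query_length, subject_length, flank=0):
    """Return (start, end, strand) of the subject region expected to hold the
    full-length query, assuming no indels outside the aligned part."""
    missing_5p = hsp.query_start - 1            # query bases before the match
    missing_3p = query_length - hsp.query_end   # query bases after the match

    if hsp.subject_start <= hsp.subject_end:
        # Plus strand: the query 5' end lies at lower subject coordinates.
        start = hsp.subject_start - missing_5p - flank
        end = hsp.subject_end + missing_3p + flank
        strand = "+"
    else:
        # Minus strand: the query 5' end lies at higher subject coordinates.
        start = hsp.subject_end - missing_3p - flank
        end = hsp.subject_start + missing_5p + flank
        strand = "-"

    # Clip the window to the boundaries of the subject sequence.
    return max(1, start), min(subject_length, end), strand


partial = Hsp(query_start=11, query_end=90, subject_start=1011, subject_end=1090)
window = extended_subject_window(partial, query_length=100, subject_length=5000)
print(window)
```

A window computed this way is only a first guess. Realigning the query to the window (for example with a structure-aware aligner) would be needed to place the true sequence ends when the homolog contains insertions or deletions.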

Published Date: 2021-01-28
