- Title
Improvement of detection performance of fusion genes from RNA-seq data by clustering short reads.
- Authors
Sota, Yoshiaki; Seno, Shigeto; Shigeta, Hironori; Osato, Naoki; Shimoda, Masafumi; Noguchi, Shinzaburo; Matsuda, Hideo
- Abstract
Fusion genes are involved in cancer, but their detection from RNA-Seq data is limited by the relatively short read length. We therefore propose a shifted short-read clustering (SSC) method, which focuses on overlapping reads from the same locus and extends them into a representative sequence. To verify its usefulness, we applied the SSC method to RNA-Seq data from four cell lines (BT-474, MCF-7, SKBR-3, and T-47D). As the slide width of the SSC method increased to one, two, five, or ten bases, the read length was extended from 201 bases to 217 (108%), 234 (116%), 282 (140%), or 317 (158%) bases, respectively. Fusion genes were then detected using STAR-Fusion, a fusion gene detection tool, with and without the SSC method. With a shift of one base, the proportion of reads mapped to multiple loci decreased from 9.7% to 4.6%, and the sensitivity of fusion gene detection improved from 47% to 54% on average (BT-474: 48% to 57%; MCF-7: 49% to 53%; SKBR-3: 50% to 57%; T-47D: 43% to 50%) compared with the original data. With larger shifts, the positive predictive value also improved. The SSC method could therefore be effective for fusion gene detection.
- Subjects
GENE fusion; CELL lines
- Publication
Journal of Bioinformatics & Computational Biology, 2019, Vol 17, Issue 3, pN.PAG
- ISSN
0219-7200
- Publication type
Article
- DOI
10.1142/S0219720019400080
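
The abstract describes extending overlapping reads that are shifted by a small slide width into one longer representative sequence. The toy sketch below illustrates only that general idea and is not the authors' SSC implementation: the function name `extend_read`, the parameter `max_shift`, the greedy search order, and the example reads are all assumptions made for illustration.

```python
def extend_read(seed, reads, max_shift=1):
    """Greedily extend `seed` to the right (hypothetical helper, not from the paper).

    A read is appended when its prefix, after dropping the last `shift` bases
    (1 <= shift <= max_shift), matches the current end of the sequence.
    Smaller shifts are tried first.
    """
    contig = seed
    unused = [r for r in reads if r != seed]
    extended = True
    while extended:
        extended = False
        for shift in range(1, max_shift + 1):
            for read in unused:
                overlap = len(read) - shift
                if overlap > 0 and contig.endswith(read[:overlap]):
                    contig += read[overlap:]  # add only the new bases
                    unused.remove(read)
                    extended = True
                    break
            if extended:
                break
    return contig


# Made-up 8-base reads taken at offsets 0, 1, 2, 3 of a made-up sequence.
reads = ["ACGTTGCA", "CGTTGCAA", "GTTGCAAG", "TTGCAAGC"]
representative = extend_read(reads[0], reads, max_shift=1)
print(len(reads[0]), "->", len(representative))
```

With a slide width of one base, each merged read adds a single base, so the four 8-base reads above yield an 11-base sequence. A real pipeline would also need to handle sequencing errors, reverse complements, and ambiguous overlaps, none of which this sketch attempts.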