Item type |
Trans(1) |
Publication date |
2012-04-19 |
Title |
|
|
Title |
A Method for Isoform Prediction from RNA-Seq Data by Iterative Mapping |
Title |
|
|
Language |
en |
|
Title |
A Method for Isoform Prediction from RNA-Seq Data by Iterative Mapping |
Language |
|
|
Language |
eng |
Keywords |
|
|
Subject scheme |
Other |
|
Subject |
[Original Paper] RNA-Seq, alternative splicing, isoform, mapping (Outstanding Paper Award) |
Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_6501 |
|
Resource type |
journal article |
Author affiliation |
|
|
|
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University |
Author affiliation |
|
|
|
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University |
Author affiliation |
|
|
|
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University |
Author affiliation |
|
|
|
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University |
Author name |
Tomoshige, Ohno
Shigeto, Seno
Yoichi, Takenaka
Hideo, Matsuda
|
Abstract |
|
|
Description type |
Other |
|
Description |
Alternative splicing plays an important role in eukaryotic gene expression by producing diverse proteins from a single gene, so predicting how genes are transcribed is of great biological interest. To this end, massively parallel whole-transcriptome sequencing, often referred to as RNA-Seq, is becoming widely used and is revolutionizing the cataloging of isoforms by using a vast number of short mRNA fragments called reads. Conventional RNA-Seq analysis methods typically align reads onto a reference genome (mapping) to determine, from an RNA-Seq dataset, which isoforms each gene yields and how strongly each isoform is expressed. However, a considerable number of reads cannot be mapped uniquely. These so-called multireads, which map to multiple locations because of short read length and similar sequences, increase the uncertainty about how genes are transcribed. This causes inaccurate gene expression estimates and leads to incorrect isoform predictions. To cope with this problem, we propose a method for isoform prediction by iterative mapping. The positions from which multireads originate can be estimated from expression levels, whereas quantifying isoform-level expression requires accurate mapping. Because these two procedures depend on each other, remapping reads is essential. By iterating this cycle, our method estimates gene expression levels more precisely and hence improves predictions of alternative splicing. Our method simultaneously estimates isoform-level expression by using an EM algorithm within each gene to compute how many reads originate from each candidate isoform. To validate the effectiveness of the proposed method, we compared its performance with that of conventional methods using an RNA-Seq dataset derived from a human brain. The proposed method achieved a precision of 66.7% and outperformed conventional methods in terms of the isoform detection rate. |
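The mutual dependence described in the abstract (multiread placement depends on expression levels, and expression levels depend on read placement) can be illustrated with a generic EM loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `em_isoform_abundance`, the isoform labels, and the input format (one set of compatible isoforms per read) are hypothetical, and isoform lengths and the remapping step itself are ignored for brevity.

```python
def em_isoform_abundance(read_compat, isoforms, n_iter=100, tol=1e-9):
    """Estimate relative isoform abundances with a plain EM loop.

    read_compat: list of sets; each set names the candidate isoforms
                 a read is compatible with (a multiread has more than one).
    isoforms:    list of candidate isoform names (hypothetical labels).
    Returns a dict mapping each isoform to its estimated read fraction.
    """
    # Start from a uniform guess over the candidate isoforms.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}

    for _ in range(n_iter):
        counts = dict.fromkeys(isoforms, 0.0)

        # E-step: split each read across its compatible isoforms in
        # proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[iso] for iso in compat)
            if total == 0.0:
                continue  # read is only compatible with zero-abundance isoforms
            for iso in compat:
                counts[iso] += theta[iso] / total

        # M-step: re-estimate abundances from the fractional read counts.
        assigned = sum(counts.values())
        if assigned == 0.0:
            break  # no reads could be assigned; keep the current estimate
        new_theta = {iso: counts[iso] / assigned for iso in isoforms}

        # Stop once the estimates no longer change noticeably.
        delta = max(abs(new_theta[iso] - theta[iso]) for iso in isoforms)
        theta = new_theta
        if delta < tol:
            break

    return theta
```

For example, with three reads unique to isoform `A`, one read unique to isoform `B`, and four multireads compatible with both, the estimates approach 0.75 for `A` and 0.25 for `B`: the multireads are drawn toward the isoform that the unique reads already support.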
Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AA12177013 |
Bibliographic information |
IPSJ Transactions on Bioinformatics (TBIO)
Vol. 5,
p. 27-33,
Issue date 2012-04-19
|
ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
1882-6679 |
Publisher |
|
|
Language |
ja |
|
Publisher |
情報処理学会 (Information Processing Society of Japan) |