Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis

New insight into intron and operon populations

Yutaka Satou, Katsuhiko Mineta, Michio Ogasawara, Yasunori Sasakura, Eiichi Shoguchi, Keisuke Ueno, Lixy Yamada, Jun Matsumoto, Jessica Wasserscheid, Ken Dewar, Graham B. Wiley, Simone L. Macmil, Bruce A. Roe, Robert W. Zeller, Kenneth E M Hastings, Patrick Lemaire, Erika Lindquist, Toshinori Endo, Kohji Hotta, Kazuo Inaba

Research output: Contribution to journal › Article

128 Citations (Scopus)

Abstract

Background: The draft genome sequence of the ascidian Ciona intestinalis, along with associated gene models, has been a valuable research resource. However, recently accumulated expressed sequence tag (EST)/cDNA data have revealed numerous inconsistencies with the gene models, due in part to intrinsic limitations in gene prediction programs and in part to the fragmented nature of the assembly.

Results: We have prepared a less-fragmented assembly on the basis of scaffold-joining guided by paired-end EST and bacterial artificial chromosome (BAC) sequences, and BAC chromosomal in situ hybridization data. The new assembly (115.2 Mb) is similar in length to the initial assembly (116.7 Mb) but contains 1,272 (approximately 50%) fewer scaffolds. The largest scaffold in the new assembly incorporates 95 initial-assembly scaffolds. In conjunction with the new assembly, we have prepared a greatly improved global gene model set strictly correlated with the extensive currently available EST data. The total gene number (15,254) is similar to that of the initial set (15,582), but the new set includes 3,330 models at genomic sites where none were present in the initial set, and 1,779 models that represent fusions of multiple previously incomplete models. In approximately half of the models, 5′-ends were precisely mapped using 5′-full-length ESTs, an important refinement even in otherwise unchanged models.

Conclusion: Using these new resources, we identify a population of non-canonical (non-GT-AG) introns and also find that approximately 20% of Ciona genes reside in operons and that operons contain a high proportion of single-exon genes. Thus, the present dataset provides an opportunity to analyze the Ciona genome much more precisely than ever.

Original language: English
Article number: R152
Journal: Genome Biology
Volume: 9
Issue number: 10
DOI: https://doi.org/10.1186/gb-2008-9-10-r152
Publication status: Published - 2008 Oct 14

Fingerprint

Ciona intestinalis
Chordata
genome assembly
operon
introns
genome
expressed sequence tags
genes
population
bacterial artificial chromosomes
chromosome
Urochordata
Ascidiacea

ASJC Scopus subject areas

  • Genetics
  • Cell Biology
  • Ecology, Evolution, Behavior and Systematics

Cite this

Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis: New insight into intron and operon populations. / Satou, Yutaka; Mineta, Katsuhiko; Ogasawara, Michio; Sasakura, Yasunori; Shoguchi, Eiichi; Ueno, Keisuke; Yamada, Lixy; Matsumoto, Jun; Wasserscheid, Jessica; Dewar, Ken; Wiley, Graham B.; Macmil, Simone L.; Roe, Bruce A.; Zeller, Robert W.; Hastings, Kenneth E M; Lemaire, Patrick; Lindquist, Erika; Endo, Toshinori; Hotta, Kohji; Inaba, Kazuo.

In: Genome Biology, Vol. 9, No. 10, R152, 14.10.2008.

Satou, Y, Mineta, K, Ogasawara, M, Sasakura, Y, Shoguchi, E, Ueno, K, Yamada, L, Matsumoto, J, Wasserscheid, J, Dewar, K, Wiley, GB, Macmil, SL, Roe, BA, Zeller, RW, Hastings, KEM, Lemaire, P, Lindquist, E, Endo, T, Hotta, K & Inaba, K 2008, 'Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis: New insight into intron and operon populations', Genome Biology, vol. 9, no. 10, R152. https://doi.org/10.1186/gb-2008-9-10-r152
@article{e4a2ad31a32140ad8626a97ec5a9ae5e,
title = "Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis: New insight into intron and operon populations",
abstract = "Background: The draft genome sequence of the ascidian Ciona intestinalis, along with associated gene models, has been a valuable research resource. However, recently accumulated expressed sequence tag (EST)/cDNA data have revealed numerous inconsistencies with the gene models due in part to intrinsic limitations in gene prediction programs and in part to the fragmented nature of the assembly. Results: We have prepared a less-fragmented assembly on the basis of scaffold-joining guided by paired-end EST and bacterial artificial chromosome (BAC) sequences, and BAC chromosomal in situ hybridization data. The new assembly (115.2 Mb) is similar in length to the initial assembly (116.7 Mb) but contains 1,272 (approximately 50{\%}) fewer scaffolds. The largest scaffold in the new assembly incorporates 95 initial-assembly scaffolds. In conjunction with the new assembly, we have prepared a greatly improved global gene model set strictly correlated with the extensive currently available EST data. The total gene number (15,254) is similar to that of the initial set (15,582), but the new set includes 3,330 models at genomic sites where none were present in the initial set, and 1,779 models that represent fusions of multiple previously incomplete models. In approximately half, 5′-ends were precisely mapped using 5′-full-length ESTs, an important refinement even in otherwise unchanged models. Conclusion: Using these new resources, we identify a population of non-canonical (non-GT-AG) introns and also find that approximately 20{\%} of Ciona genes reside in operons and that operons contain a high proportion of single-exon genes. Thus, the present dataset provides an opportunity to analyze the Ciona genome much more precisely than ever.",
author = "Yutaka Satou and Katsuhiko Mineta and Michio Ogasawara and Yasunori Sasakura and Eiichi Shoguchi and Keisuke Ueno and Lixy Yamada and Jun Matsumoto and Jessica Wasserscheid and Ken Dewar and Wiley, {Graham B.} and Macmil, {Simone L.} and Roe, {Bruce A.} and Zeller, {Robert W.} and Hastings, {Kenneth E M} and Patrick Lemaire and Erika Lindquist and Toshinori Endo and Kohji Hotta and Kazuo Inaba",
year = "2008",
month = "10",
day = "14",
doi = "10.1186/gb-2008-9-10-r152",
language = "English",
volume = "9",
journal = "Genome Biology",
issn = "1474-7596",
publisher = "BioMed Central",
number = "10",
}

TY - JOUR
T1 - Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis
T2 - New insight into intron and operon populations
AU - Satou, Yutaka
AU - Mineta, Katsuhiko
AU - Ogasawara, Michio
AU - Sasakura, Yasunori
AU - Shoguchi, Eiichi
AU - Ueno, Keisuke
AU - Yamada, Lixy
AU - Matsumoto, Jun
AU - Wasserscheid, Jessica
AU - Dewar, Ken
AU - Wiley, Graham B.
AU - Macmil, Simone L.
AU - Roe, Bruce A.
AU - Zeller, Robert W.
AU - Hastings, Kenneth E M
AU - Lemaire, Patrick
AU - Lindquist, Erika
AU - Endo, Toshinori
AU - Hotta, Kohji
AU - Inaba, Kazuo
PY - 2008/10/14
Y1 - 2008/10/14
N2 - Background: The draft genome sequence of the ascidian Ciona intestinalis, along with associated gene models, has been a valuable research resource. However, recently accumulated expressed sequence tag (EST)/cDNA data have revealed numerous inconsistencies with the gene models due in part to intrinsic limitations in gene prediction programs and in part to the fragmented nature of the assembly. Results: We have prepared a less-fragmented assembly on the basis of scaffold-joining guided by paired-end EST and bacterial artificial chromosome (BAC) sequences, and BAC chromosomal in situ hybridization data. The new assembly (115.2 Mb) is similar in length to the initial assembly (116.7 Mb) but contains 1,272 (approximately 50%) fewer scaffolds. The largest scaffold in the new assembly incorporates 95 initial-assembly scaffolds. In conjunction with the new assembly, we have prepared a greatly improved global gene model set strictly correlated with the extensive currently available EST data. The total gene number (15,254) is similar to that of the initial set (15,582), but the new set includes 3,330 models at genomic sites where none were present in the initial set, and 1,779 models that represent fusions of multiple previously incomplete models. In approximately half, 5′-ends were precisely mapped using 5′-full-length ESTs, an important refinement even in otherwise unchanged models. Conclusion: Using these new resources, we identify a population of non-canonical (non-GT-AG) introns and also find that approximately 20% of Ciona genes reside in operons and that operons contain a high proportion of single-exon genes. Thus, the present dataset provides an opportunity to analyze the Ciona genome much more precisely than ever.

UR - http://www.scopus.com/inward/record.url?scp=55349149407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=55349149407&partnerID=8YFLogxK
U2 - 10.1186/gb-2008-9-10-r152
DO - 10.1186/gb-2008-9-10-r152
M3 - Article
VL - 9
JO - Genome Biology
JF - Genome Biology
SN - 1474-7596
IS - 10
M1 - R152
ER -