Validating the significance of genomic properties of Chi sites from the distribution of all octamers in Escherichia coli

Kazuharu Arakawa, Reina Uno, Yoichi Nakayama, Masaru Tomita

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

Chi sites (5′-GCTGGTGG-3′) are homologous recombinational hotspot octamer sequences, which attenuate the exonuclease activity of RecBCD in Escherichia coli. They are overrepresented in the genome (1008 occurrences), preferentially located within coding regions (98%), oriented in the direction of replication (75%), and occur most commonly on the mRNA-synonymous sense strand of the double helix (79%). Previous statistical studies of the genome sequence suggested that these genomic properties of Chi sites appear to be related to their role in recombinational repair and therefore to replication and transcription. In this study, we employ three mathematical models to predict the properties of Chi sites from single nucleotide and multi-nucleotide compositions, and validate them statistically using the distribution of all octamer sequences in the entire genome, or exclusively within ORFs. The model based on the overall distribution of all octamers provided better predictions than the single nucleotide composition model, and the ORF and sense strand preference of Chi sites were shown to be within the standard deviation of all octamers. In contrast, the orientation bias of the Chi sites in the direction of replication was significant, although the bias was not as pronounced as with the single nucleotide composition model, suggesting a selective pressure related to the role of RecBCD in replication.
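The two kinds of baseline contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-strand counting, and the restriction to octamers that actually occur in the input are simplifying assumptions made for illustration. The first function counts all overlapping octamers, the second gives the count expected if nucleotides were drawn independently at their observed frequencies, and the third places one octamer within the distribution of all observed octamer counts.

```python
from collections import Counter
from math import prod
from statistics import mean, pstdev

CHI = "GCTGGTGG"  # Chi octamer as given in the abstract


def count_kmers(seq, k=8):
    """Count every overlapping k-mer on the given strand only."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(kmer, seq):
    """Expected occurrences under an independent single-nucleotide model.

    Assumption: each position is drawn independently with the base
    frequencies observed in seq.
    """
    n = len(seq)
    freq = {base: seq.count(base) / n for base in "ACGT"}
    windows = n - len(kmer) + 1
    return windows * prod(freq[base] for base in kmer)


def z_score_among_kmers(kmer, counts):
    """Distance of one k-mer's count from the mean over all observed k-mers,
    in units of the population standard deviation."""
    values = list(counts.values())
    sd = pstdev(values)
    return (counts[kmer] - mean(values)) / sd if sd else 0.0
```

On a real genome one would read the sequence from a FASTA file, call `count_kmers` once, and then compare `counts[CHI]` against both `expected_count(CHI, seq)` and `z_score_among_kmers(CHI, counts)`. Handling the reverse strand, ORF boundaries, and replication direction is left out of this sketch.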

Original language: English
Pages (from-to): 239-246
Number of pages: 8
Journal: Gene
Volume: 392
Issue number: 1-2
DOI: 10.1016/j.gene.2006.12.022
Publication status: Published - 2007 May 1

Keywords

  • Bioinformatics
  • Homologous recombination
  • Orientation bias
  • RecBCD
  • Strand bias

ASJC Scopus subject areas

  • Genetics

Cite this

Validating the significance of genomic properties of Chi sites from the distribution of all octamers in Escherichia coli. / Arakawa, Kazuharu; Uno, Reina; Nakayama, Yoichi; Tomita, Masaru.

In: Gene, Vol. 392, No. 1-2, 01.05.2007, p. 239-246.

@article{c29af26a69f940d78e30ecb4281954fa,
title = "Validating the significance of genomic properties of {Chi} sites from the distribution of all octamers in {Escherichia coli}",
abstract = "Chi sites (5′-GCTGGTGG-3′) are homologous recombinational hotspot octamer sequences, which attenuate the exonuclease activity of RecBCD in Escherichia coli. They are overrepresented in the genome (1008 occurrences), preferentially located within coding regions (98{\%}), oriented in the direction of replication (75{\%}), and occur most commonly on the mRNA-synonymous sense strand of the double helix (79{\%}). Previous statistical studies of the genome sequence suggested that these genomic properties of Chi sites appear to be related to their role in recombinational repair and therefore to replication and transcription. In this study, we employ three mathematical models to predict the properties of Chi sites from single nucleotide and multi-nucleotide compositions, and validate them statistically using the distribution of all octamer sequences in the entire genome, or exclusively within ORFs. The model based on the overall distribution of all octamers provided better predictions than the single nucleotide composition model, and the ORF and sense strand preference of Chi sites were shown to be within the standard deviation of all octamers. In contrast, the orientation bias of the Chi sites in the direction of replication was significant, although the bias was not as pronounced as with the single nucleotide composition model, suggesting a selective pressure related to the role of RecBCD in replication.",
keywords = "Bioinformatics, Homologous recombination, Orientation bias, RecBCD, Strand bias",
author = "Kazuharu Arakawa and Reina Uno and Yoichi Nakayama and Masaru Tomita",
year = "2007",
month = "5",
day = "1",
doi = "10.1016/j.gene.2006.12.022",
language = "English",
volume = "392",
pages = "239--246",
journal = "Gene",
issn = "0378-1119",
publisher = "Elsevier",
number = "1--2",
}

TY  - JOUR
T1  - Validating the significance of genomic properties of Chi sites from the distribution of all octamers in Escherichia coli
AU  - Arakawa, Kazuharu
AU  - Uno, Reina
AU  - Nakayama, Yoichi
AU  - Tomita, Masaru
PY  - 2007/5/1
Y1  - 2007/5/1
AB  - Chi sites (5′-GCTGGTGG-3′) are homologous recombinational hotspot octamer sequences, which attenuate the exonuclease activity of RecBCD in Escherichia coli. They are overrepresented in the genome (1008 occurrences), preferentially located within coding regions (98%), oriented in the direction of replication (75%), and occur most commonly on the mRNA-synonymous sense strand of the double helix (79%). Previous statistical studies of the genome sequence suggested that these genomic properties of Chi sites appear to be related to their role in recombinational repair and therefore to replication and transcription. In this study, we employ three mathematical models to predict the properties of Chi sites from single nucleotide and multi-nucleotide compositions, and validate them statistically using the distribution of all octamer sequences in the entire genome, or exclusively within ORFs. The model based on the overall distribution of all octamers provided better predictions than the single nucleotide composition model, and the ORF and sense strand preference of Chi sites were shown to be within the standard deviation of all octamers. In contrast, the orientation bias of the Chi sites in the direction of replication was significant, although the bias was not as pronounced as with the single nucleotide composition model, suggesting a selective pressure related to the role of RecBCD in replication.
KW  - Bioinformatics
KW  - Homologous recombination
KW  - Orientation bias
KW  - RecBCD
KW  - Strand bias
UR  - http://www.scopus.com/inward/record.url?scp=33947587178&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33947587178&partnerID=8YFLogxK
U2  - 10.1016/j.gene.2006.12.022
DO  - 10.1016/j.gene.2006.12.022
M3  - Article
C2  - 17270364
AN  - SCOPUS:33947587178
VL  - 392
SP  - 239
EP  - 246
JO  - Gene
JF  - Gene
SN  - 0378-1119
IS  - 1-2
ER  - 