A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples

Kazumitsu Maehara, Jun Odawara, Akihito Harada, Tomohiko Yoshimi, Koji Nagao, Chikashi Obuse, Koichi Akashi, Taro Tachibana, Toshio Sakata, Yasuyuki Ohkawa

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Deep sequencing approaches, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq), have been successful in detecting transcription factor-binding sites and histone modifications across the whole genome. An approach for comparing two different ChIP-seq data sets would be beneficial for predicting unknown functions of a factor. We propose a model to represent co-localization of two different ChIP-seq data sets. We show that this model can separate a meaningful overlapping signal from a meaningless background signal. We applied the model to compare ChIP-seq data for RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation with a large amount of peak-called data, including ChIP-seq and other deep sequencing data from the Encyclopedia of DNA Elements (ENCODE) project, and then extracted factors that were related to RNA polymerase II CTD serine 2 in HeLa cells. We further analyzed RNA polymerase II CTD serine 7 phosphorylation, whose function is still unclear in HeLa cells. Our results were characterized by the similarity of localization for transcription factors and histone modifications in the ENCODE data set, which suggests that our model is appropriate for understanding ChIP-seq data for factors whose function is unknown.

Original language: English
Pages (from-to): 54-62
Number of pages: 9
Journal: Nucleic Acids Research
Volume: 41
Issue number: 1
DOI: 10.1093/nar/gks1010
Publication status: Published - Jan 1 2013

Fingerprint

Encyclopedias
Chromatin Immunoprecipitation
RNA Polymerase III
RNA Polymerase II
DNA
Histone Code
Serine
High-Throughput Nucleotide Sequencing
HeLa Cells
Transcription Factors
Phosphorylation
Datasets
Binding Sites
Genome

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples. / Maehara, Kazumitsu; Odawara, Jun; Harada, Akihito; Yoshimi, Tomohiko; Nagao, Koji; Obuse, Chikashi; Akashi, Koichi; Tachibana, Taro; Sakata, Toshio; Ohkawa, Yasuyuki.

In: Nucleic Acids Research, Vol. 41, No. 1, 01.01.2013, p. 54-62.

Maehara, Kazumitsu ; Odawara, Jun ; Harada, Akihito ; Yoshimi, Tomohiko ; Nagao, Koji ; Obuse, Chikashi ; Akashi, Koichi ; Tachibana, Taro ; Sakata, Toshio ; Ohkawa, Yasuyuki. / A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples. In: Nucleic Acids Research. 2013 ; Vol. 41, No. 1. pp. 54-62.
@article{8efee605f61048c387f8752c4036b6b3,
title = "A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples",
abstract = "Deep sequencing approaches, such as chromatin immunoprecipitation by sequencing (ChIP-seq), have been successful in detecting transcription factor-binding sites and histone modification in the whole genome. An approach for comparing two different ChIP-seq data would be beneficial for predicting unknown functions of a factor. We propose a model to represent co-localization of two different ChIP-seq data. We showed that a meaningful overlapping signal and a meaningless background signal can be separated by this model. We applied this model to compare ChIP-seq data of RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation with a large amount of peak-called data, including ChIP-seq and other deep sequencing data in the Encyclopedia of DNA Elements (ENCODE) project, and then extracted factors that were related to RNA polymerase II CTD serine 2 in HeLa cells. We further analyzed RNA polymerase II CTD serine 7 phosphorylation, of which their function is still unclear in HeLa cells. Our results were characterized by the similarity of localization for transcription factor/histone modification in the ENCODE data set, and this suggests that our model is appropriate for understanding ChIP-seq data for factors where their function is unknown.",
author = "Kazumitsu Maehara and Jun Odawara and Akihito Harada and Tomohiko Yoshimi and Koji Nagao and Chikashi Obuse and Koichi Akashi and Taro Tachibana and Toshio Sakata and Yasuyuki Ohkawa",
year = "2013",
month = "1",
day = "1",
doi = "10.1093/nar/gks1010",
language = "English",
volume = "41",
pages = "54--62",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "1",
}

TY - JOUR
T1 - A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples
AU - Maehara, Kazumitsu
AU - Odawara, Jun
AU - Harada, Akihito
AU - Yoshimi, Tomohiko
AU - Nagao, Koji
AU - Obuse, Chikashi
AU - Akashi, Koichi
AU - Tachibana, Taro
AU - Sakata, Toshio
AU - Ohkawa, Yasuyuki
PY - 2013/1/1
Y1 - 2013/1/1
N2 - Deep sequencing approaches, such as chromatin immunoprecipitation by sequencing (ChIP-seq), have been successful in detecting transcription factor-binding sites and histone modification in the whole genome. An approach for comparing two different ChIP-seq data would be beneficial for predicting unknown functions of a factor. We propose a model to represent co-localization of two different ChIP-seq data. We showed that a meaningful overlapping signal and a meaningless background signal can be separated by this model. We applied this model to compare ChIP-seq data of RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation with a large amount of peak-called data, including ChIP-seq and other deep sequencing data in the Encyclopedia of DNA Elements (ENCODE) project, and then extracted factors that were related to RNA polymerase II CTD serine 2 in HeLa cells. We further analyzed RNA polymerase II CTD serine 7 phosphorylation, of which their function is still unclear in HeLa cells. Our results were characterized by the similarity of localization for transcription factor/histone modification in the ENCODE data set, and this suggests that our model is appropriate for understanding ChIP-seq data for factors where their function is unknown.

AB - Deep sequencing approaches, such as chromatin immunoprecipitation by sequencing (ChIP-seq), have been successful in detecting transcription factor-binding sites and histone modification in the whole genome. An approach for comparing two different ChIP-seq data would be beneficial for predicting unknown functions of a factor. We propose a model to represent co-localization of two different ChIP-seq data. We showed that a meaningful overlapping signal and a meaningless background signal can be separated by this model. We applied this model to compare ChIP-seq data of RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation with a large amount of peak-called data, including ChIP-seq and other deep sequencing data in the Encyclopedia of DNA Elements (ENCODE) project, and then extracted factors that were related to RNA polymerase II CTD serine 2 in HeLa cells. We further analyzed RNA polymerase II CTD serine 7 phosphorylation, of which their function is still unclear in HeLa cells. Our results were characterized by the similarity of localization for transcription factor/histone modification in the ENCODE data set, and this suggests that our model is appropriate for understanding ChIP-seq data for factors where their function is unknown.
UR - http://www.scopus.com/inward/record.url?scp=84871798675&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871798675&partnerID=8YFLogxK
U2 - 10.1093/nar/gks1010
DO - 10.1093/nar/gks1010
M3 - Article
C2 - 23125363
AN - SCOPUS:84871798675
VL - 41
SP - 54
EP - 62
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - 1
ER -