A co-localization model of paired ChIP-seq data using a large ENCODE data set enables comparison of multiple samples

Kazumitsu Maehara, Jun Odawara, Akihito Harada, Tomohiko Yoshimi, Koji Nagao, Chikashi Obuse, Koichi Akashi, Taro Tachibana, Toshio Sakata, Yasuyuki Ohkawa

Research output: Contribution to journal (article)

7 Citations (Scopus)


Deep sequencing approaches, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq), have been successful in detecting transcription factor-binding sites and histone modifications across the whole genome. An approach for comparing two different ChIP-seq data sets would be beneficial for predicting unknown functions of a factor. We propose a model to represent co-localization of two different ChIP-seq data sets. We showed that this model can separate a meaningful overlapping signal from a meaningless background signal. We applied the model to compare ChIP-seq data for RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation with a large amount of peak-called data, including ChIP-seq and other deep sequencing data from the Encyclopedia of DNA Elements (ENCODE) project, and then extracted factors that were related to RNA polymerase II CTD serine 2 in HeLa cells. We further analyzed RNA polymerase II CTD serine 7 phosphorylation, whose function is still unclear in HeLa cells. Our results were characterized by the similarity of localization to transcription factors and histone modifications in the ENCODE data set, which suggests that our model is appropriate for understanding ChIP-seq data for factors whose function is unknown.
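
The abstract does not spell out the model itself, so the following is only a minimal sketch of the kind of raw comparison such a model starts from, not the authors' method. It assumes peaks are given as integer summit positions on a single chromosome, and it uses a fixed distance cutoff (`max_distance`, a hypothetical parameter) where the paper instead separates overlapping signal from background statistically. The function names are illustrative.

```python
from bisect import bisect_left


def nearest_distances(peaks_a, peaks_b):
    """For each position in peaks_a, return the distance to the nearest position in peaks_b."""
    b_sorted = sorted(peaks_b)
    if not b_sorted:
        raise ValueError("peaks_b must not be empty")
    distances = []
    for pos in peaks_a:
        # Index of the first peak in b_sorted at or to the right of pos.
        i = bisect_left(b_sorted, pos)
        candidates = []
        if i < len(b_sorted):
            candidates.append(b_sorted[i] - pos)      # right-hand neighbour
        if i > 0:
            candidates.append(pos - b_sorted[i - 1])  # left-hand neighbour
        distances.append(min(candidates))
    return distances


def colocalized_fraction(peaks_a, peaks_b, max_distance=100):
    """Fraction of peaks_a lying within max_distance of some peak in peaks_b.

    The fixed cutoff is an assumption made for illustration only.
    """
    distances = nearest_distances(peaks_a, peaks_b)
    if not distances:
        return 0.0
    return sum(1 for d in distances if d <= max_distance) / len(distances)


if __name__ == "__main__":
    # Toy summit positions, invented for the example.
    sample_a = [100, 500, 2000]
    sample_b = [120, 1500, 5000]
    print(nearest_distances(sample_a, sample_b))
    print(colocalized_fraction(sample_a, sample_b))
```

A fixed cutoff like this cannot distinguish genuine co-localization from chance proximity in peak-dense regions, which is the gap a model of overlapping signal versus background is meant to address.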

Journal: Nucleic Acids Research
Publication status: Published - Jan 1 2013


All Science Journal Classification (ASJC) codes

  • Genetics