Extracting the author of web pages

Yoshikiyo Kato, Daisuke Kawahara, Kentaro Inui, Sadao Kurohashi, Tomohide Shibata

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts the author name candidates from a Web page based on linguistic features, and rank the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75% precision when evaluated with candidates ranked within top five.

Original language: English
Title of host publication: Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Pages: 35-41
Number of pages: 7
DOI: https://doi.org/10.1145/1458527.1458537
Publication status: Published - 2008
Externally published: Yes
Event: 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: Oct 26 2008 to Oct 30 2008

Other

Other: 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Country: United States
City: Napa Valley, CA
Period: 10/26/08 to 10/30/08

Fingerprint

World Wide Web
Evaluation

All Science Journal Classification (ASJC) codes

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

Cite this

Kato, Y., Kawahara, D., Inui, K., Kurohashi, S., & Shibata, T. (2008). Extracting the author of web pages. In Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08 (pp. 35-41) https://doi.org/10.1145/1458527.1458537

Extracting the author of web pages. / Kato, Yoshikiyo; Kawahara, Daisuke; Inui, Kentaro; Kurohashi, Sadao; Shibata, Tomohide.

Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08. 2008. p. 35-41.

Kato, Y, Kawahara, D, Inui, K, Kurohashi, S & Shibata, T 2008, Extracting the author of web pages. in Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08. pp. 35-41, 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08, Napa Valley, CA, United States, 10/26/08. https://doi.org/10.1145/1458527.1458537
Kato Y, Kawahara D, Inui K, Kurohashi S, Shibata T. Extracting the author of web pages. In Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08. 2008. p. 35-41 https://doi.org/10.1145/1458527.1458537
Kato, Yoshikiyo ; Kawahara, Daisuke ; Inui, Kentaro ; Kurohashi, Sadao ; Shibata, Tomohide. / Extracting the author of web pages. Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08. 2008. pp. 35-41
@inproceedings{335c0abe704e44e5b453ec49c9462e4e,
title = "Extracting the author of web pages",
abstract = "In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts the author name candidates from a Web page based on linguistic features, and rank the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75{\%} precision when evaluated with candidates ranked within top five.",
author = "Yoshikiyo Kato and Daisuke Kawahara and Kentaro Inui and Sadao Kurohashi and Tomohide Shibata",
year = "2008",
doi = "10.1145/1458527.1458537",
language = "English",
isbn = "9781605582597",
pages = "35--41",
booktitle = "Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08"
}

TY  - GEN
T1  - Extracting the author of web pages
AU  - Kato, Yoshikiyo
AU  - Kawahara, Daisuke
AU  - Inui, Kentaro
AU  - Kurohashi, Sadao
AU  - Shibata, Tomohide
PY  - 2008
Y1  - 2008
N2  - In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts the author name candidates from a Web page based on linguistic features, and rank the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75% precision when evaluated with candidates ranked within top five.
AB  - In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts the author name candidates from a Web page based on linguistic features, and rank the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75% precision when evaluated with candidates ranked within top five.
UR  - http://www.scopus.com/inward/record.url?scp=70349231362&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=70349231362&partnerID=8YFLogxK
U2  - 10.1145/1458527.1458537
DO  - 10.1145/1458527.1458537
M3  - Conference contribution
SN  - 9781605582597
SP  - 35
EP  - 41
BT  - Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08
ER  -