Detecting academic papers on the web

Emi Ishita, Teru Agata, Atsushi Ikeuchi, Miyata Yosuke, Shuichi Ueda

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Our research goal is to develop a search engine for open access to academic papers. English and Japanese test sets were built for detection of academic papers from 20,000 PDF files in each language using five annotators. Six classifiers were trained using similar features for each language. We report F1 of 0.74 for English and 0.54 for Japanese and argue that similar features could easily be generated for other languages as well.
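For readers unfamiliar with the reported metric, F1 is the harmonic mean of precision and recall for the positive class (here, "is an academic paper"). The sketch below is only an illustration of that definition: the function name `f1_score` and the toy labels are hypothetical and are not taken from the paper, and the abstract does not say how the authors aggregated scores across classifiers or annotators.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Return F1 for the positive class from two equal-length label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # With no true positives, precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): 1 = academic paper, 0 = other PDF.
true_labels = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_labels = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(true_labels, predicted_labels)
print(f"F1 = {score:.2f}")
```

On this toy data there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and F1 is 0.75.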

Original language: English
Title of host publication: JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries
Pages: 413-414
Number of pages: 2
DOIs: https://doi.org/10.1145/1998076.1998161
Publication status: Published - Jul 25 2011
Event: 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11 - Ottawa, ON, Canada
Duration: Jun 13 2011 to Jun 17 2011

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Other

Other: 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11
Country: Canada
City: Ottawa, ON
Period: 6/13/11 to 6/17/11

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Ishita, E., Agata, T., Ikeuchi, A., Yosuke, M., & Ueda, S. (2011). Detecting academic papers on the web. In JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries (pp. 413-414). (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). https://doi.org/10.1145/1998076.1998161