Tag recommendation for open government data by multi-label classification and particular noun phrase extraction

Yasuhiro Yamada, Tetsuya Nakatoh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Open government data (OGD) is statistical data produced and published by governments. Administrators often assign tags to the metadata of OGD. A tag, which consists of one or more words, describes the data. Tags help users understand the data without reading it and also help them search for OGD. However, administrators must understand the data in detail in order to assign tags. We take two different approaches to assigning appropriate tags to OGD. First, we use a multi-label classification technique to assign tags to OGD from the tags in the training data. Second, we extract particular noun phrases from the metadata of OGD by calculating the difference between the frequency of a noun phrase and the frequencies of the single words within the noun phrase. Experiments using 196,587 datasets on Data.gov show that the prediction accuracy of the multi-label classification method is sufficient to develop a tag recommendation system. The experiments also show that our particular noun phrase extraction method extracts some infrequent tags of the datasets.
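The second approach compares how often a noun phrase occurs with how often its individual words occur. The abstract does not give the exact formula, so the following is only a minimal sketch under stated assumptions: the score (mean single-word frequency minus phrase frequency) is a hypothetical stand-in for the authors' measure, the function names `count_ngrams` and `phrase_score` are invented for illustration, and plain whitespace tokenization replaces real noun phrase detection.

```python
from collections import Counter


def count_ngrams(documents, max_len=3):
    """Count every contiguous word sequence of length 1..max_len."""
    counts = Counter()
    for text in documents:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def phrase_score(phrase, counts):
    """Hypothetical score, not the paper's formula.

    Returns the mean frequency of the single words in the phrase minus
    the frequency of the phrase itself. A small value means the words
    seldom occur outside the phrase, so the phrase behaves like one unit.
    """
    words = tuple(phrase.lower().split())
    word_mean = sum(counts[(w,)] for w in words) / len(words)
    return word_mean - counts[words]


# Toy metadata titles (made up for illustration, not taken from Data.gov).
docs = [
    "air quality index by county",
    "air quality index report",
    "water quality report by county",
]
counts = count_ngrams(docs)

for candidate in ("air quality index", "quality report"):
    print(candidate, round(phrase_score(candidate, counts), 3))
```

On this toy input, "air quality index" gets a lower score than "quality report", because its words almost always appear together. A fuller version would restrict candidates to noun phrases found by a part-of-speech tagger and would need a threshold chosen on real data.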

Original language: English
Title of host publication: IC3K 2018 - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Editors: Jorge Bernardino, Ana Carolina Salgado, Joaquim Filipe
Publisher: SciTePress
Pages: 83-91
Number of pages: 9
ISBN (electronic): 9789897583308
DOI: https://doi.org/10.5220/0006937800830091
Publication status: Published - 2018
Event: 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2018 - Seville, Spain
Duration: Sep 18, 2018 to Sep 20, 2018

Publication series

Name: IC3K 2018 - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Volume: 3


All Science Journal Classification (ASJC) codes

  • Software


  • Cite this

    Yamada, Y., & Nakatoh, T. (2018). Tag recommendation for open government data by multi-label classification and particular noun phrase extraction. In J. Bernardino, A. C. Salgado, & J. Filipe (Eds.), IC3K 2018 - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (pp. 83-91). (IC3K 2018 - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management; Vol. 3). SciTePress. https://doi.org/10.5220/0006937800830091