Identification of Unnatural Subsets in Statistical Data

Takahiko Suzuki, Tsukasa Kamimasu, Tetsuya Nakatoh, Sachio Hirokawa

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    Benford's law describes the frequency distribution of first significant digits in naturally occurring numerical data. The unnaturalness of a data set can be measured by how far the frequency distribution of its leading digits diverges from Benford's distribution. However, this measure alone cannot identify precisely which part of the data is unnatural. In this study, we focus on the fact that statistical data is generally provided in tabular form. We specify a subset of the target data by using the row and column item names that define each cell of the table, or words appearing in the table title. By measuring the degree of divergence of each subset from Benford's distribution, we can identify unnatural subsets. We applied this method to agriculture-related data from the China Statistical Yearbook and succeeded in identifying unnatural subsets.
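
The approach outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Benford's law gives the expected probability of leading digit d as log10(1 + 1/d). The abstract does not say which divergence measure the paper uses, so the mean absolute deviation used here is an assumption, as are the `(labels, value)` cell layout and the function names `leading_digit`, `benford_divergence`, and `rank_subsets`.

```python
import math
from collections import Counter

# Benford's expected probability for leading digit d: log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of x, or None if there is none (zero, inf, nan)."""
    # The first non-zero digit of the printed number is its first significant digit.
    for ch in str(abs(x)):
        if ch in "123456789":
            return int(ch)
    return None


def benford_divergence(values):
    """Mean absolute deviation between observed and Benford leading-digit frequencies.

    The choice of measure is an assumption made for this sketch.
    """
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        return None
    counts = Counter(digits)
    n = len(digits)
    return sum(abs(counts.get(d, 0) / n - BENFORD[d]) for d in range(1, 10)) / 9


def rank_subsets(cells, min_size=1):
    """Group cell values by label word and rank the groups by divergence, largest first.

    cells: iterable of (labels, value) pairs, where labels is a set of words taken
    from the row name, column name, and table title (hypothetical layout).
    """
    groups = {}
    for labels, value in cells:
        for label in labels:
            groups.setdefault(label, []).append(value)
    scored = []
    for label, vals in groups.items():
        if len(vals) < min_size:
            continue  # skip subsets too small to score
        score = benford_divergence(vals)
        if score is not None:
            scored.append((label, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Calling `rank_subsets(cells)` returns label words ordered from most to least divergent, so the labels at the top of the list name the candidate unnatural subsets. Small subsets produce noisy frequencies, so raising `min_size` is advisable on real tables.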

    Original language: English
    Title of host publication: Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 74-80
    Number of pages: 7
    ISBN (Electronic): 9781538674475
    DOI
    Publication status: Published - Jul 2 2018
    Event: 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018 - Yonago, Japan
    Duration: Jul 8 2018 to Jul 13 2018

    Publication series

    Name: Proceedings - 2018 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018

    Conference

    Conference: 7th International Congress on Advanced Applied Informatics, IIAI-AAI 2018
    Country/Territory: Japan
    City: Yonago
    Period: 7/8/18 to 7/13/18

    All Science Journal Classification (ASJC) codes

    • Computer Networks and Communications
    • Communication
    • Information Systems
    • Information Systems and Management
    • Education
