J-Medic: A Japanese disease name dictionary based on real clinical usage

Kaoru Ito, Hiroyuki Nagai, Taro Okahisa, Shoko Wakamiya, Tomohide Iwao, Eiji Aramaki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Medical texts such as electronic health records are necessary for medical AI development. Nevertheless, such data are difficult to use directly because medical texts are written mostly in natural language, so natural language processing (NLP) for medical texts is required. To boost the fundamental accuracy of medical NLP, a high-coverage dictionary is required, especially one that fills the gap separating standard medical names and real clinical words. This study developed a Japanese disease name dictionary called “J-MeDic” to fill this gap. The names that comprise the dictionary were collected from approximately 45,000 manually annotated real clinical case reports. We allocated standard disease codes (ICD-10) to them with manual, semi-automatic, or automatic methods, in accordance with their frequency. J-MeDic covers 7,683 concepts (in ICD-10) and 51,784 written forms. Among the names covered by J-MeDic, 55.3% (6,391/11,562) were covered by standard disease names (SDNs); 44.7% (5,171/11,562) were covered by names added from the case report (CR) corpus. Among the latter, 8.4% (436/5,171) were basically coded by humans, and 91.6% (4,735/5,171) were basically coded automatically. We investigated the coverage of this resource using discharge summaries from a hospital; 66.2% of the names were matched with the entries, revealing the practical feasibility of our dictionary.
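The coverage figure reported in the abstract can be illustrated with a minimal sketch. It assumes a dictionary that maps written forms to ICD-10 codes and uses exact string matching; the abstract does not describe the actual file format or matching procedure of J-MeDic, so the names `TOY_DICTIONARY`, `lookup`, and `coverage` are hypothetical, and the three entries are toy examples rather than data taken from the resource.

```python
from typing import Dict, Iterable, Optional

# Toy entries for illustration only; these are NOT taken from J-MeDic.
# Several written forms may map to the same ICD-10 concept.
TOY_DICTIONARY: Dict[str, str] = {
    "糖尿病": "E14",
    "高血圧": "I10",
    "高血圧症": "I10",
}


def lookup(name: str, dictionary: Dict[str, str]) -> Optional[str]:
    """Return the ICD-10 code for a written form, or None if it is not listed.

    Exact matching after trimming whitespace is an assumption of this sketch.
    """
    return dictionary.get(name.strip())


def coverage(names: Iterable[str], dictionary: Dict[str, str]) -> float:
    """Fraction of disease-name mentions that match a dictionary entry."""
    mentions = list(names)
    if not mentions:
        return 0.0
    matched = sum(1 for n in mentions if lookup(n, dictionary) is not None)
    return matched / len(mentions)


if __name__ == "__main__":
    # Hypothetical mentions, standing in for names found in discharge summaries.
    mentions = ["糖尿病", "高血圧症", "未知の病名"]
    print(f"coverage: {coverage(mentions, TOY_DICTIONARY):.1%}")
```

Under these assumptions, the 66.2% reported above corresponds to `coverage` evaluated over the disease names found in the hospital's discharge summaries against the full dictionary.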

Original language: English
Title of host publication: LREC 2018 - 11th International Conference on Language Resources and Evaluation
Editors: Hitoshi Isahara, Bente Maegaard, Stelios Piperidis, Christopher Cieri, Thierry Declerck, Koiti Hasida, Helene Mazo, Khalid Choukri, Sara Goggi, Joseph Mariani, Asuncion Moreno, Nicoletta Calzolari, Jan Odijk, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Pages: 2365-2369
Number of pages: 5
ISBN (Electronic): 9791095546009
Publication status: Published - 2019
Externally published: Yes
Event: 11th International Conference on Language Resources and Evaluation, LREC 2018 - Miyazaki, Japan
Duration: May 7, 2018 to May 12, 2018

Publication series

Name: LREC 2018 - 11th International Conference on Language Resources and Evaluation

Conference

Conference: 11th International Conference on Language Resources and Evaluation, LREC 2018
Country/Territory: Japan
City: Miyazaki
Period: 5/7/18 to 5/12/18

All Science Journal Classification (ASJC) codes

  • Linguistics and Language
  • Education
  • Library and Information Sciences
  • Language and Linguistics
