Safe semi-supervised learning based on weighted likelihood

Masanori Kawakita, Junnichi Takeuchi

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all previous semi-supervised methods require additional assumptions (not only unlabeled data) to improve on supervised learning. If such assumptions are not met, the methods can perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n′ → ∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wider range of situations as long as n ≤ n′. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′ < n by modifying the weight. Numerical experiments illustrate the performance of these methods.
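To make the weighted-likelihood idea in the abstract concrete: for discrete covariates, Sokolovska et al. (2008) weight each labeled example's log-likelihood by an estimate of the ratio between the covariate marginal under the unlabeled sample and under the labeled sample. The sketch below illustrates that baseline scheme for a simple logistic model; the function names, the logistic model, and the plain empirical-frequency estimates are illustrative assumptions, not the authors' implementation (the paper's contribution is a modified weight, which is not reproduced here).

import numpy as np
from collections import Counter
from scipy.optimize import minimize

def density_ratio_weights(x_labeled, x_unlabeled):
    # w(x) = q_hat(x) / p_hat(x): empirical frequency of the covariate value
    # in the unlabeled sample over its frequency in the labeled sample.
    p_hat = Counter(x_labeled)
    q_hat = Counter(x_unlabeled)
    n, m = len(x_labeled), len(x_unlabeled)
    return np.array([(q_hat[x] / m) / (p_hat[x] / n) for x in x_labeled])

def negative_weighted_log_likelihood(theta, x, y, w):
    # -sum_i w(x_i) * log p(y_i | x_i; theta) for a logistic model, y_i in {0, 1}.
    z = theta[0] + theta[1] * np.asarray(x, dtype=float)
    log_p = y * z - np.logaddexp(0.0, z)  # log-likelihood under the sigmoid model
    return -np.sum(w * log_p)

# Toy usage: a binary covariate, few labeled points, many unlabeled points.
x_lab = [0, 0, 1, 1, 1]
y_lab = np.array([0, 1, 1, 1, 0])
x_unl = [0] * 60 + [1] * 40
w = density_ratio_weights(x_lab, x_unl)
fit = minimize(negative_weighted_log_likelihood, x0=np.zeros(2),
               args=(x_lab, y_lab, w))
print(fit.x)  # weighted-MLE estimate of (intercept, slope)

When the model is correctly specified the weights cost nothing asymptotically, which is the intuition behind the "safe" guarantee discussed in the abstract.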

Original language: English
Pages (from-to): 146-164
Number of pages: 19
Journal: Neural Networks
Volume: 53
DOI: 10.1016/j.neunet.2014.01.016
Publication status: Published - Jan 1 2014


All Science Journal Classification (ASJC) codes

  • Cognitive Neuroscience
  • Artificial Intelligence

Cite this

Safe semi-supervised learning based on weighted likelihood. / Kawakita, Masanori; Takeuchi, Junnichi.

In: Neural Networks, Vol. 53, 01.01.2014, p. 146-164.

Research output: Contribution to journal › Article

@article{a73294eb832c4420b0c577321cc228a4,
title = "Safe semi-supervised learning based on weighted likelihood",
abstract = "We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all previous semi-supervised methods require additional assumptions (not only unlabeled data) to improve on supervised learning. If such assumptions are not met, the methods can perform worse than supervised learning. Sokolovska, Capp{\'e}, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n′ → ∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wider range of situations as long as n ≤ n′. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′ < n by modifying the weight. Numerical experiments illustrate the performance of these methods.",
author = "Masanori Kawakita and Junnichi Takeuchi",
year = "2014",
month = "1",
day = "1",
doi = "10.1016/j.neunet.2014.01.016",
language = "English",
volume = "53",
pages = "146--164",
journal = "Neural Networks",
issn = "0893-6080",
publisher = "Elsevier Limited",
}

TY - JOUR

T1 - Safe semi-supervised learning based on weighted likelihood

AU - Kawakita, Masanori

AU - Takeuchi, Junnichi

PY - 2014/1/1

Y1 - 2014/1/1

N2 - We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all previous semi-supervised methods require additional assumptions (not only unlabeled data) to improve on supervised learning. If such assumptions are not met, the methods can perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n′ → ∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wider range of situations as long as n ≤ n′. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′ < n by modifying the weight. Numerical experiments illustrate the performance of these methods.

AB - We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all previous semi-supervised methods require additional assumptions (not only unlabeled data) to improve on supervised learning. If such assumptions are not met, the methods can perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n′ → ∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wider range of situations as long as n ≤ n′. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′ < n by modifying the weight. Numerical experiments illustrate the performance of these methods.

UR - http://www.scopus.com/inward/record.url?scp=84896083313&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84896083313&partnerID=8YFLogxK

U2 - 10.1016/j.neunet.2014.01.016

DO - 10.1016/j.neunet.2014.01.016

M3 - Article

VL - 53

SP - 146

EP - 164

JO - Neural Networks

JF - Neural Networks

SN - 0893-6080

ER -