Query by committee in a heterogeneous environment

Hao Shao, Bin Tong, Einoshin Suzuki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

In real applications of inductive learning, labeled instances are often scarce. The usual countermeasure is either to ask experts to label informative instances (active learning) or to borrow useful information from abundant labeled instances in a source domain (transfer learning). Because querying experts is costly, it is promising to integrate the two methodologies into a more robust and reliable classification framework that compensates for the disadvantages of each. Recently, a few studies have integrated the two methods, an approach called transfer active learning. However, when there are unrelated domains with different distributions or label assignments, negative transfer inevitably occurs and degrades performance. How to avoid selecting irrelevant samples to query also remains an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning that adopts a divergence measure to quantify the similarity between domains, so that the negative effects can be alleviated. To avoid querying irrelevant instances, we also present an adaptive strategy that eliminates unnecessary instances in the input space and unnecessary models in the model space. Extensive experiments on both synthetic and real data sets show that our algorithm queries fewer instances and converges faster than state-of-the-art methods.
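
The general idea of combining a committee with a domain-similarity measure can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the abstract does not say which divergence measure or disagreement score is used, so Jensen-Shannon divergence over discrete histograms and weighted vote entropy are assumptions made here, and all function names are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0 here.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def domain_weights(target_hist, source_hists):
    """Assumed weighting: 1 - JSD, so source domains unlike the target count less."""
    return [1.0 - js_divergence(target_hist, h) for h in source_hists]


def weighted_vote_entropy(votes, weights):
    """Entropy of the weighted committee vote on one instance (higher = more disagreement)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = {}
    for label, w in zip(votes, weights):
        mass[label] = mass.get(label, 0.0) + w
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)


def select_query(pool_votes, weights):
    """Index of the unlabeled instance on which the weighted committee disagrees most."""
    return max(range(len(pool_votes)),
               key=lambda i: weighted_vote_entropy(pool_votes[i], weights))


# Three committee members, one per source domain; the second domain is
# assumed to be nearly unrelated to the target and gets a small weight.
pool_votes = [["a", "a", "a"], ["a", "b", "a"], ["a", "a", "b"]]
weights = [1.0, 0.1, 1.0]
chosen = select_query(pool_votes, weights)
```

With these weights, the disagreement caused only by the down-weighted member (second instance) counts for little, so the third instance, where two trusted members disagree, is the one selected for querying. This is one simple way a similarity measure could dampen the influence of unrelated domains.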

Original language: English
Title of host publication: Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings
Pages: 186-198
Number of pages: 13
DOI: https://doi.org/10.1007/978-3-642-35527-1_16
Publication status: Published - Dec 1 2012
Event: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012 - Nanjing, China
Duration: Dec 15 2012 - Dec 18 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7713 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
Country: China
City: Nanjing
Period: 12/15/12 - 12/18/12

Fingerprint

Transfer Learning
Heterogeneous Environment
Active Learning
Query
Labels
Integrate
Inductive Learning
Divergence Measure
Adaptive Strategies
Countermeasures
Hybrid Algorithm
Assignment
Eliminate
Converge
Methodology
Costs
Model
Experiment
Problem-Based Learning
Experiments

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Shao, H., Tong, B., & Suzuki, E. (2012). Query by committee in a heterogeneous environment. In Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings (pp. 186-198). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7713 LNAI). https://doi.org/10.1007/978-3-642-35527-1_16

@inproceedings{6db0aa7cc9ff446fb45ade9f3f9ec558,
title = "Query by committee in a heterogeneous environment",
author = "Hao Shao and Bin Tong and Einoshin Suzuki",
year = "2012",
month = "12",
day = "1",
doi = "10.1007/978-3-642-35527-1_16",
language = "English",
isbn = "9783642355264",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "186--198",
booktitle = "Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings",

}

TY - GEN

T1 - Query by committee in a heterogeneous environment

AU - Shao, Hao

AU - Tong, Bin

AU - Suzuki, Einoshin

PY - 2012/12/1

Y1 - 2012/12/1

UR - http://www.scopus.com/inward/record.url?scp=84872707023&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872707023&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-35527-1_16

DO - 10.1007/978-3-642-35527-1_16

M3 - Conference contribution

AN - SCOPUS:84872707023

SN - 9783642355264

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 186

EP - 198

BT - Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings

ER -