TY - GEN
T1 - Query by committee in a heterogeneous environment
AU - Shao, Hao
AU - Tong, Bin
AU - Suzuki, Einoshin
PY - 2012
Y1 - 2012
N2 - In real applications of inductive learning, labeled instances are often scarce. The usual remedy is either to ask experts to label informative instances, as in active learning, or to borrow useful information from abundant labeled instances in a source domain, as in transfer learning. Because querying experts is costly, it is promising to integrate the two methodologies into a more robust and reliable classification framework that compensates for the disadvantages of each. Recently, a few studies have investigated integrating the two methods, an approach called transfer active learning. However, when there exist unrelated domains with different distributions or label assignments, negative transfer inevitably occurs, leading to degraded performance. Also, how to avoid selecting irrelevant samples to query is still an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning that adopts a divergence measure to assess the similarities between different domains, so that the negative effects can be alleviated. To avoid querying irrelevant instances, we also present an adaptive strategy that is able to eliminate unnecessary instances in the input space and unnecessary models in the model space. Extensive experiments on both synthetic and real data sets show that our algorithm queries fewer instances and converges faster than state-of-the-art methods.
AB - In real applications of inductive learning, labeled instances are often scarce. The usual remedy is either to ask experts to label informative instances, as in active learning, or to borrow useful information from abundant labeled instances in a source domain, as in transfer learning. Because querying experts is costly, it is promising to integrate the two methodologies into a more robust and reliable classification framework that compensates for the disadvantages of each. Recently, a few studies have investigated integrating the two methods, an approach called transfer active learning. However, when there exist unrelated domains with different distributions or label assignments, negative transfer inevitably occurs, leading to degraded performance. Also, how to avoid selecting irrelevant samples to query is still an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning that adopts a divergence measure to assess the similarities between different domains, so that the negative effects can be alleviated. To avoid querying irrelevant instances, we also present an adaptive strategy that is able to eliminate unnecessary instances in the input space and unnecessary models in the model space. Extensive experiments on both synthetic and real data sets show that our algorithm queries fewer instances and converges faster than state-of-the-art methods.
UR - http://www.scopus.com/inward/record.url?scp=84872707023&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872707023&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-35527-1_16
DO - 10.1007/978-3-642-35527-1_16
M3 - Conference contribution
AN - SCOPUS:84872707023
SN - 9783642355264
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 186
EP - 198
BT - Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings
T2 - 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
Y2 - 15 December 2012 through 18 December 2012
ER -