`Lob-pass' problem and an on-line learning model of rational choice

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

We consider an on-line learning model of rational choice, in which the goal of an agent is to choose its actions so as to maximize the number of successes, while learning about its reacting environment through those very actions. In particular, we consider a model of tennis play, in which the only actions that the player can take are a `pass' and a `lob,' and the opponent is modeled by two linear (probabilistic) functions f_L(r) = a_1 r + b_1 and f_P(r) = a_2 r + b_2, specifying the probability that a lob (and a pass, respectively) will win a point when the proportion of lobs in the past trials is r. We measure the performance of a player in this model by its expected regret, namely how many fewer points it expects to win than the ideal player (one that knows the two probabilistic functions), as a function of t, the total number of trials, which is unknown to the player a priori. Assuming that the probabilistic functions satisfy the matching shoulder condition, i.e. f_L(0) = f_P(1), we obtain a variety of upper bounds under assumptions and restrictions of varying degrees, ranging from O(log t), O(t^{1/3}), O(t^{1/2}), O(t^{3/5}), O(t^{2/3}) to O(t^{5/7}), as well as a matching lower bound of order Ω(log t) for the most restrictive case. When the total number of trials t is given to the player in advance, the upper bounds can be improved significantly.
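The model in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' algorithm: the names `Opponent` and `expected_wins`, the example coefficients, the fixed-target-ratio player, and the convention that r = 0 before the first trial are all assumptions introduced here. It computes the expected number of points (the sum of win probabilities) for a player that tries to hold its lob proportion at a given target, so that different targets can be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opponent:
    """Linear opponent model: f_L(r) = a1*r + b1 and f_P(r) = a2*r + b2."""
    a1: float
    b1: float
    a2: float
    b2: float

    def f_lob(self, r: float) -> float:
        # Probability that a lob wins when the past lob proportion is r.
        return self.a1 * r + self.b1

    def f_pass(self, r: float) -> float:
        # Probability that a pass wins when the past lob proportion is r.
        return self.a2 * r + self.b2

    def matching_shoulder(self, tol: float = 1e-9) -> bool:
        # Matching shoulder condition from the abstract: f_L(0) = f_P(1).
        return abs(self.f_lob(0.0) - self.f_pass(1.0)) <= tol


def expected_wins(opp: Opponent, target_ratio: float, trials: int) -> float:
    """Expected points for a (hypothetical) player that lobs whenever doing so
    keeps its overall lob proportion at or below target_ratio."""
    lobs = 0
    total = 0.0
    for n in range(trials):
        # Assumption: with no past trials, the lob proportion is taken as 0.
        r = lobs / n if n > 0 else 0.0
        if lobs + 1 <= target_ratio * (n + 1):
            total += opp.f_lob(r)
            lobs += 1
        else:
            total += opp.f_pass(r)
    return total


if __name__ == "__main__":
    # Made-up coefficients chosen so that f_L(0) = f_P(1) = 0.8.
    opp = Opponent(a1=-0.5, b1=0.8, a2=0.5, b2=0.3)
    print("matching shoulder:", opp.matching_shoulder())
    for target in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(target, round(expected_wins(opp, target, 1000), 2))
```

With these example coefficients, always passing or always lobbing earns about 0.3 points per trial in the long run, while mixing the two does noticeably better. The difference in expected points between a reference player and a learning player is the kind of quantity the regret bounds above describe, though the reference player used in the paper is not reproduced here.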

Original language: English
Title of host publication: Proc 6 Annu ACM Conf Comput Learn Theory
Publisher: Publ by ACM
Pages: 422-428
Number of pages: 7
ISBN (Print): 0897916115, 9780897916110
DOI: https://doi.org/10.1145/168304.168389
Publication status: Published - 1993
Event: Proceedings of the 6th Annual ACM Conference on Computational Learning Theory - Santa Cruz, CA, USA
Duration: 26 Jul 1993 to 28 Jul 1993

Publication series

Name: Proc 6 Annu ACM Conf Comput Learn Theory

Other

Other: Proceedings of the 6th Annual ACM Conference on Computational Learning Theory
Santa Cruz, CA, USA
Period: 7/26/93 to 7/28/93

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Abe, N., & Takeuchi, J. I. (1993). `Lob-pass' problem and an on-line learning model of rational choice. In Proc 6 Annu ACM Conf Comput Learn Theory (pp. 422-428). (Proc 6 Annu ACM Conf Comput Learn Theory). Publ by ACM. https://doi.org/10.1145/168304.168389