Online portfolio selection is a sequential decision-making problem in which a learner repeatedly selects a portfolio over a set of assets, aiming to maximize long-term return. In this paper, we study the problem under a cardinality constraint that restricts the number of assets in a portfolio to at most k, and consider two scenarios: (i) in the full-feedback setting, the learner observes price relatives (rates of return to cost) for all assets, and (ii) in the bandit-feedback setting, the learner observes price relatives only for the assets it invested in. We propose efficient algorithms for these scenarios, which achieve sublinear regret. We also provide (statistical) regret lower bounds for both scenarios, which nearly match the upper bounds when k is a constant. In addition, we give a computational lower bound, which implies that no algorithm can achieve both computational efficiency and a small regret upper bound.
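To make the two feedback settings concrete, the following is a minimal sketch of a single round of play, not the algorithms proposed in the paper. The function names `check_portfolio` and `play_round`, the dictionary representation of a portfolio, and the use of log growth as the per-round reward are all illustrative assumptions.

```python
import math


def check_portfolio(weights, k):
    """Validate a portfolio given as {asset_index: fraction of wealth}.

    Hypothetical helper: requires nonnegative weights summing to 1 and
    at most k assets with a positive weight (the cardinality constraint).
    Returns the sorted list of invested assets.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    support = sorted(i for i, w in weights.items() if w > 0)
    if len(support) > k:
        raise ValueError("portfolio holds more than k assets")
    return support


def play_round(weights, price_relatives, k, bandit):
    """Play one round and return (log growth, observed price relatives).

    Assumption: wealth is multiplied by the weighted sum of price
    relatives, so the log of that sum is used as the round's reward.
    """
    support = check_portfolio(weights, k)
    growth = sum(weights[i] * price_relatives[i] for i in support)
    if bandit:
        # Bandit feedback: only the invested assets are revealed.
        observed = {i: price_relatives[i] for i in support}
    else:
        # Full feedback: every asset's price relative is revealed.
        observed = dict(enumerate(price_relatives))
    return math.log(growth), observed
```

With three assets and k = 2, a portfolio split evenly between assets 0 and 2 reveals two price relatives under bandit feedback and all three under full feedback, while a portfolio spread over all three assets is rejected.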
|Journal||Advances in Neural Information Processing Systems|
|Publication status||Published - 2018|
|Event||32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada|
Duration: 2018 Dec 2 → 2018 Dec 8
ASJC Scopus subject areas
- Computer Networks and Communications