Exploration-exploitation trade-off for continuous-time episodic reinforcement learning with linear-convex models
L Szpruch, T Treetanthiploet, Y Zhang
arXiv preprint arXiv:2112.10264, 2021
Cited by: 23

Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning
L Szpruch, T Treetanthiploet, Y Zhang
SIAM Journal on Control and Optimization 62 (1), 135-166, 2024
Cited by: 13

Asymptotic Randomised Control with applications to bandits
SN Cohen, T Treetanthiploet
arXiv preprint arXiv:2010.07252, 2020
Cited by: 8

Gittins’ theorem under uncertainty
SN Cohen, T Treetanthiploet
Electronic Journal of Probability 27, 1-48, 2022
Cited by: 6

Correlated bandits for dynamic pricing via the arc algorithm
SN Cohen, T Treetanthiploet
arXiv preprint arXiv:2102.04263 12, 2021
Cited by: 5

Insurance pricing on price comparison websites via reinforcement learning
T Treetanthiploet, Y Zhang, L Szpruch, I Bowers-Barnard, H Ridley, ...
arXiv preprint arXiv:2308.06935, 2023
Cited by: 3

Pricing and hedging of decentralised lending contracts
L Szpruch, MS Vidales, T Treetanthiploet, Y Zhang
arXiv preprint arXiv:2409.04233, 2024
Cited by: 1

Logarithmic regret in the ergodic Avellaneda-Stoikov market making model
J Cao, D Šiška, L Szpruch, T Treetanthiploet
arXiv preprint arXiv:2409.02025, 2024
Cited by: 1

Generalised correlated batched bandits via the ARC algorithm with application to dynamic pricing
S Cohen, T Treetanthiploet
arXiv preprint arXiv:2102.04263, 2021
Cited by: 1

ε-Policy Gradient for Online Pricing
L Szpruch, T Treetanthiploet, Y Zhang
arXiv preprint arXiv:2405.03624, 2024

Competitive Insurance Pricing Using Model-Based Bandits
L Sliwinski, T Treetanthiploet, D Siska, L Szpruch
Available at SSRN 4755027, 2024

Correlated Bandits for Dynamic Pricing Via the Arc Algorithm
T Treetanthiploet, SN Cohen
Available at SSRN 3781766, 2021

Stochastic control approach to the multi-armed bandit problems
T Treetanthiploet
University of Oxford, 2021