Deep combinatorial optimisation for optimal stopping time problems: application to swing options pricing.
MathematicS In Action, Volume 11 (2022) no. 1, pp. 243-258.

A new method for stochastic control, based on neural networks and on the randomisation of discrete random variables, is proposed and applied to optimal stopping time problems. The method models the policy directly and requires neither the derivation of a dynamic programming principle nor a backward stochastic differential equation. In contrast to continuous optimisation, where automatic differentiation is used directly, the gradient is computed with a likelihood ratio method. Numerical tests are carried out on the pricing of American and swing options. The proposed algorithm succeeds in pricing high-dimensional American and swing options in a reasonable computation time, which is not possible with classical algorithms.
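
As a rough illustration of the likelihood ratio idea mentioned in the abstract, the sketch below estimates the gradient of an expected payoff with respect to the parameter of a single randomised stop/continue decision. It is not the authors' algorithm: the one-step setting, the constant payoffs, the sigmoid parameterisation and all function names (`likelihood_ratio_gradient`, `exact_gradient`) are assumptions made only for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def likelihood_ratio_gradient(theta, payoff_stop, payoff_continue, n_samples, rng):
    """Monte Carlo estimate of d/dtheta E[payoff(B)], with B ~ Bernoulli(sigmoid(theta)).

    Hypothetical one-step setting: B = 1 means "stop", B = 0 means "continue".
    """
    p = sigmoid(theta)
    stop = rng.random(n_samples) < p                       # sampled discrete decisions
    payoff = np.where(stop, payoff_stop, payoff_continue)  # payoff attached to each decision
    score = stop.astype(float) - p                         # d/dtheta log P(B) for a logit parameter
    return float(np.mean(payoff * score))


def exact_gradient(theta, payoff_stop, payoff_continue):
    """Closed-form gradient of p * payoff_stop + (1 - p) * payoff_continue."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (payoff_stop - payoff_continue)


rng = np.random.default_rng(0)
estimate = likelihood_ratio_gradient(0.3, 2.0, 0.5, 200_000, rng)
exact = exact_gradient(0.3, 2.0, 0.5)
print(f"likelihood ratio estimate: {estimate:.4f}, exact gradient: {exact:.4f}")
```

The sampled decision is discrete, so the payoff cannot be differentiated through the sample itself. The estimator instead weights each payoff by the derivative of the log-probability of the decision that was drawn, which is what makes a gradient step on the policy parameter possible in this toy setting.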

DOI: 10.5802/msia.26
Classification: 91G60, 60G40, 90C27, 97R40
Keywords: Optimal stopping, American option, Swing option, Combinatorial optimisation, Neural network, Artificial intelligence.
Thomas Deschatre 1; Joseph Mikael 1

1 EDF R&D & FiME, Laboratoire de Finance des Marchés de l’Energie
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
@article{MSIA_2022__11_1_243_0,
     author = {Thomas Deschatre and Joseph Mikael},
     title = {Deep combinatorial optimisation for optimal stopping time problems: application to swing options pricing.},
     journal = {MathematicS In Action},
     pages = {243--258},
     publisher = {Soci\'et\'e de Math\'ematiques Appliqu\'ees et Industrielles},
     volume = {11},
     number = {1},
     year = {2022},
     doi = {10.5802/msia.26},
     language = {en},
     url = {https://msia.centre-mersenne.org/articles/10.5802/msia.26/}
}
Thomas Deschatre; Joseph Mikael. Deep combinatorial optimisation for optimal stopping time problems: application to swing options pricing. MathematicS In Action, Volume 11 (2022) no. 1, pp. 243-258. doi: 10.5802/msia.26. https://msia.centre-mersenne.org/articles/10.5802/msia.26/
