We propose a new method for stochastic control, based on neural networks and on the randomisation of discrete random variables, and apply it to optimal stopping time problems. The method models the policy directly and requires neither the derivation of a dynamic programming principle nor a backward stochastic differential equation. In continuous optimisation, automatic differentiation can be applied directly; here the decision variables are discrete, so we compute gradients with a likelihood ratio method instead. Numerical tests are carried out on the pricing of American and swing options. The proposed algorithm prices high-dimensional American and swing options in a reasonable computation time, which is not possible with classical algorithms.
Keywords: Optimal stopping, American option, Swing option, Combinatorial optimisation, Neural network, Artificial intelligence.
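The likelihood ratio gradient mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a single asset following geometric Brownian motion, a Bermudan put payoff, and a logistic exercise probability per date in place of a neural network. All function names and parameter values (`simulate_paths`, `lr_gradient`, `strike`, and so on) are hypothetical. The exercise decision is drawn as a Bernoulli variable, and the gradient of the expected payoff is estimated as the average of (payoff minus baseline) times the gradient of the log-probability of the decisions actually taken.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate_paths(rng, n_paths, n_steps, s0=1.0, r=0.05, sigma=0.2, T=1.0):
    """Geometric Brownian motion observed on n_steps exercise dates (assumed model)."""
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    log_incr = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    return s0 * np.exp(np.cumsum(log_incr, axis=1))


def lr_gradient(theta, paths, rng, strike=1.0, r=0.05, T=1.0):
    """Likelihood ratio (score function) estimate of the gradient of the
    expected discounted put payoff with respect to theta.

    theta has shape (n_steps, 2): one (slope, bias) pair per exercise date.
    Returns (gradient, Monte Carlo value estimate).
    """
    n_paths, n_steps = paths.shape
    dt = T / n_steps
    x = strike - paths                           # feature: distance to the strike
    p = sigmoid(theta[:, 0] * x + theta[:, 1])   # exercise probability per date
    p[:, -1] = 1.0                               # exercise is forced at maturity
    d = rng.random((n_paths, n_steps)) < p       # randomised discrete decisions
    tau = np.argmax(d, axis=1)                   # first date where the decision is 1

    idx = np.arange(n_paths)
    payoff = np.maximum(strike - paths[idx, tau], 0.0)
    reward = np.exp(-r * (tau + 1) * dt) * payoff

    # Only decisions taken up to and including the stopping date count.
    alive = np.arange(n_steps)[None, :] <= tau[:, None]
    # For a Bernoulli variable with logistic probability, d(log prob)/d(logit) = d - p.
    score = (d.astype(float) - p) * alive

    # Subtracting the sample mean is a simple variance reduction baseline.
    adv = reward - reward.mean()
    w = adv[:, None] * score
    grad = np.stack([(w * x).mean(axis=0), w.mean(axis=0)], axis=1)
    return grad, reward.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.zeros((10, 2))
    for it in range(200):                        # plain stochastic gradient ascent
        paths = simulate_paths(rng, 4096, 10)
        grad, value = lr_gradient(theta, paths, rng)
        theta += 5.0 * grad
    print("estimated value under the learned policy:", value)
```

The payoff is never differentiated: only the log-probability of the sampled decisions is, which is why the estimator remains usable when the control is discrete. The last row of the gradient is zero by construction, because exercise at maturity is forced and carries no parameter dependence.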
@article{MSIA_2022__11_1_243_0,
  author = {Thomas Deschatre and Joseph Mikael},
  title = {Deep combinatorial optimisation for optimal stopping time problems: application to swing options pricing.},
  journal = {MathematicS In Action},
  pages = {243--258},
  publisher = {Soci\'et\'e de Math\'ematiques Appliqu\'ees et Industrielles},
  volume = {11},
  number = {1},
  year = {2022},
  doi = {10.5802/msia.26},
  language = {en},
  url = {https://msia.centre-mersenne.org/articles/10.5802/msia.26/}
}
Thomas Deschatre; Joseph Mikael. Deep combinatorial optimisation for optimal stopping time problems: application to swing options pricing. MathematicS In Action, Special issue Maths and Industry, Volume 11 (2022) no. 1, pp. 243-258. doi: 10.5802/msia.26. https://msia.centre-mersenne.org/articles/10.5802/msia.26/