@INPROCEEDINGS{DBLP:conf/qest/HenriquesMZPC12,
  pdf       = {pub/smcmdp.pdf},
  slides    = {pub/smcmdp-slides.pdf},
  study     = {https://www.cs.cmu.edu/~jmartins/QEST12.zip},

  author    = {David Henriques and
               Jo{\~a}o G. Martins and
               Paolo Zuliani and
               Andr{\'e} Platzer and
               Edmund M. Clarke},
  title     = {Statistical Model Checking for
               {Markov} Decision Processes},
  booktitle = {QEST},
  year      = {2012},
  pages     = {84--93},
  doi       = {10.1109/QEST.2012.19},
  longbooktitle = {Ninth International Conference on
               Quantitative Evaluation of Systems, QEST
               2012, London, UK, 17-20 September, 2012},
  publisher = {IEEE Computer Society},
  keywords  = {statistical model checking,
               Markov decision processes,
               reinforcement learning},
  abstract  = {
    Statistical Model Checking (SMC) is a computationally
    very efficient verification technique based on
    selective system sampling. One well-identified
    shortcoming of SMC is that, unlike probabilistic model
    checking, it cannot be applied to systems featuring
    nondeterminism, such as Markov Decision Processes
    (MDP). We address this limitation by developing an
    algorithm that resolves nondeterminism
    probabilistically, and then uses multiple rounds of
    sampling and Reinforcement Learning to provably
    improve resolutions of nondeterminism with respect to
    satisfying a Bounded Linear Temporal Logic (BLTL)
    property. Our algorithm thus reduces an MDP to a fully
    probabilistic Markov chain on which SMC may be applied
    to give an approximate solution to the problem of
    checking the probabilistic BLTL property. We integrate
    our algorithm in a parallelised modification of the
    PRISM simulation framework. Extensive validation with
    both new and PRISM benchmarks demonstrates that the
    approach scales very well in scenarios where symbolic
    algorithms fail to do so.}
}