Med Decis Making. 2010 Jul-Aug;30(4):474-83. doi: 10.1177/0272989X09353194. Epub 2009 Dec 31.

Markov decision processes: a tool for sequential decision making under uncertainty

Oguzhan Alagoz et al. Med Decis Making. 2010 Jul-Aug.

Abstract

We provide a tutorial on the construction and evaluation of Markov decision processes (MDPs), powerful analytical tools for sequential decision making under uncertainty that have been applied widely in industrial and manufacturing settings but remain underutilized in medical decision making (MDM). We demonstrate the use of an MDP to solve a sequential clinical treatment problem under uncertainty. Markov decision processes generalize standard Markov models by embedding a decision process in the model, with multiple decisions made over time, and they offer significant advantages over standard decision analysis. We compare MDPs with standard Markov-based simulation models by solving the problem of the optimal timing of living-donor liver transplantation using both methods. Both models yield the same optimal transplantation policy and the same total life expectancies for the same patient and living donor, but the computation time for solving the MDP model is significantly smaller than that for the Markov model. We briefly describe the growing literature on MDPs applied to medical decisions.
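
The abstract does not write out the model, but in the notation introduced in the figure captions below (r(h) is the expected posttransplant life days in health state h, p(h'|h) the transition probabilities, and 1 the reward for each additional day lived without transplantation), the transplant-timing MDP can be summarized by the optimality equation

    V(h) = \max\{\, r(h),\; 1 + \sum_{h'} p(h' \mid h)\, V(h') \,\}

where death contributes no further reward. Its solution gives, in every health state, both the maximal expected total life days V(h) and the optimal action: transplant when the first term attains the maximum, wait otherwise.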


Figures

Figure 1. State transition diagram for a Markov model. The circles labeled state 1 (Model for End-stage Liver Disease [MELD] 6–7) through state 18 (MELD 40) represent the possible health states for the patient (i.e., the patient is in one of the 18 states in any time period). Each state represents 2 adjacent MELD scores; for instance, state 1 represents MELD scores 6 and 7, state 2 represents MELD scores 8 and 9, and so on. In each time period, the patient may transition to another MELD state or die. At each state, a Boolean node directs the patient to accept the living-donor organ if the MELD score is greater than some value (h*); the optimal transplant MELD is found by solving the model over all possible values of h*. Note that the transition probabilities between pretransplant states depend on the policy; this dependency is suppressed in the figure for clarity of presentation, so the transition probabilities shown are those for h* = 41 (i.e., the policy that never transplants).
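
As a rough sketch of this threshold search (not the authors' code: the transition matrix and posttransplant rewards below are invented placeholders, whereas the paper estimates them from clinical data), each candidate threshold can be evaluated by iterating the policy-evaluation equations and keeping the best one:

    import numpy as np

    n = 18                                    # MELD health states 1..18 (pairs of MELD scores)
    rng = np.random.default_rng(0)
    raw = rng.random((n, n + 1))
    P = raw / raw.sum(axis=1, keepdims=True)  # hypothetical daily p(h'|h); last column is death
    r = np.linspace(4000.0, 1500.0, n)        # hypothetical posttransplant life days, decreasing in MELD

    def life_days(h_star, start_state=0):
        """Expected total life days under 'transplant once the state index reaches h_star'."""
        V = np.zeros(n)
        for _ in range(200_000):
            wait = 1.0 + P[:, :n] @ V                                 # live one more day, then transition
            V_new = np.where(np.arange(1, n + 1) >= h_star, r, wait)  # transplant at or above the threshold
            if np.max(np.abs(V_new - V)) < 1e-9:
                break
            V = V_new
        return V[start_state]

    best = max(range(1, n + 2), key=life_days)  # h_star = n + 1 plays the role of h* = 41 (never transplant)
    print("best threshold (state index):", best)

This fixed-policy evaluation is what the standard Markov model does for one threshold at a time; the MDP of Figure 2 folds the choice into the optimization itself.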
Figure 2. State transition diagram of the Markov decision process model. The state space is identical to that in Figure 1. In each health state, the patient can take 1 of 2 actions: have the transplant in the current time period (represented by “T”) or wait for one more time period (represented by “W”). When the patient chooses to transplant, he or she moves with probability 1 to the “Transplant” state, which is absorbing, and receives a reward of r(h), the expected posttransplant life days of a patient whose current health state is h. If the patient chooses to wait for one more decision period, he or she stays in the current health state with probability p(h|h), moves to some other health state h' with probability p(h'|h), or dies at the beginning of the next decision epoch with probability p(D|h). In each of these cases, the patient receives a reward of 1, corresponding to the additional day lived without transplantation. The transition occurs randomly once the patient takes the action “Wait”. These “Transplant”/“Wait” decisions are available in every health state.
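
A minimal value-iteration sketch of this optimal-stopping MDP follows, again with invented placeholder values for p(h'|h) and r(h) (the paper derives them from clinical data). Rather than enumerating thresholds, it solves directly for the optimal “Transplant”/“Wait” action in every health state:

    import numpy as np

    n = 18                                    # MELD health states 1..18
    rng = np.random.default_rng(0)
    raw = rng.random((n, n + 1))
    P = raw / raw.sum(axis=1, keepdims=True)  # hypothetical p(h'|h); last column is p(D|h)
    r = np.linspace(4000.0, 1500.0, n)        # hypothetical r(h): expected posttransplant life days

    V = np.zeros(n)
    for _ in range(200_000):
        wait = 1.0 + P[:, :n] @ V             # action "W": one more life day, then a random transition
        V_new = np.maximum(r, wait)           # optimality equation: best of "T" (reward r(h)) and "W"
        if np.max(np.abs(V_new - V)) < 1e-9:  # converges because every state has p(D|h) > 0
            break
        V = V_new

    policy = np.where(r >= 1.0 + P[:, :n] @ V, "T", "W")
    print("optimal action in each MELD state:", policy)

With the clinical data used in the paper, this MDP and the threshold search of Figure 1 yield the same optimal transplantation policy, as noted in the abstract.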
