Med Decis Making. 2010 Jul-Aug;30(4):474-83. doi: 10.1177/0272989X09353194. Epub 2009 Dec 31.

Markov decision processes: a tool for sequential decision making under uncertainty

Oguzhan Alagoz et al. Med Decis Making. 2010 Jul-Aug.

Abstract

We provide a tutorial on the construction and evaluation of Markov decision processes (MDPs), powerful analytical tools for sequential decision making under uncertainty that have been applied widely in industrial and manufacturing settings but remain underutilized in medical decision making (MDM). We demonstrate the use of an MDP to solve a sequential clinical treatment problem under uncertainty. Markov decision processes generalize standard Markov models by embedding a decision process in the model, with multiple decisions made over time, and they offer significant advantages over standard decision analysis. We compare MDPs with standard Markov-based simulation models by solving the problem of the optimal timing of living-donor liver transplantation using both methods. Both models yield the same optimal transplantation policy and the same total life expectancies for the same patient and living donor, but the computation time for solving the MDP model is significantly smaller than that for the Markov model. We briefly describe the growing literature on MDPs applied to medical decisions.
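
The abstract does not write out the model, but in the notation introduced in the figure captions below (r(h) is the expected posttransplant life days in health state h, p(h'|h) the transition probabilities, and 1 the reward for each additional day lived without transplantation), the transplant-timing MDP can be summarized by the optimality equation

    V(h) = \max\{\, r(h),\; 1 + \sum_{h'} p(h' \mid h)\, V(h') \,\}

where death contributes no further reward. Its solution gives, in every health state, both the maximal expected total life days V(h) and the optimal action: transplant when the first term attains the maximum, wait otherwise.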


Figures

Figure 1. State transition diagram for a Markov model. The circles labeled state 1 (Model for End-stage Liver Disease [MELD] 6–7) through state 18 (MELD 40) represent the possible health states for the patient (i.e., the patient is in one of the 18 states in any time period). Each state represents 2 adjacent MELD scores; for instance, state 1 represents MELD scores 6 and 7, state 2 represents MELD scores 8 and 9, and so on. In each time period, the patient may transition to another MELD state or die. At each state, a Boolean node directs the patient to accept the living-donor organ if the MELD score is greater than some value (h*); the optimal transplant MELD is found by solving the model over all possible values of h*. Note that the transition probabilities between pretransplant states depend on the policy; this dependency is suppressed in the figure for clarity of presentation, so the transition probabilities shown are those for h* = 41 (i.e., the policy that never transplants).
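
As a rough sketch of this threshold search (not the authors' code: the transition matrix and posttransplant rewards below are invented placeholders, whereas the paper estimates them from clinical data), each candidate threshold can be evaluated by iterating the policy-evaluation equations and keeping the best one:

    import numpy as np

    n = 18                                    # MELD health states 1..18 (pairs of MELD scores)
    rng = np.random.default_rng(0)
    raw = rng.random((n, n + 1))
    P = raw / raw.sum(axis=1, keepdims=True)  # hypothetical daily p(h'|h); last column is death
    r = np.linspace(4000.0, 1500.0, n)        # hypothetical posttransplant life days, decreasing in MELD

    def life_days(h_star, start_state=0):
        """Expected total life days under 'transplant once the state index reaches h_star'."""
        V = np.zeros(n)
        for _ in range(200_000):
            wait = 1.0 + P[:, :n] @ V                                 # live one more day, then transition
            V_new = np.where(np.arange(1, n + 1) >= h_star, r, wait)  # transplant at or above the threshold
            if np.max(np.abs(V_new - V)) < 1e-9:
                break
            V = V_new
        return V[start_state]

    best = max(range(1, n + 2), key=life_days)  # h_star = n + 1 plays the role of h* = 41 (never transplant)
    print("best threshold (state index):", best)

This fixed-policy evaluation is what the standard Markov model does for one threshold at a time; the MDP of Figure 2 folds the choice into the optimization itself.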
Figure 2. State transition diagram of the Markov decision process model. The state space is identical to that in Figure 1. In each health state, the patient can take 1 of 2 actions: have the transplant in the current time period (represented by “T”) or wait for one more time period (represented by “W”). When the patient chooses to transplant, he or she moves with probability 1 to the “Transplant” state, which is absorbing, and receives a reward of r(h), the expected posttransplant life days of a patient whose current health state is h. If the patient chooses to wait for one more decision period, he or she stays in the current health state with probability p(h|h), moves to some other health state h' with probability p(h'|h), or dies at the beginning of the next decision epoch with probability p(D|h). In each of these cases, the patient receives a reward of 1, corresponding to the additional day lived without transplantation. The transition occurs randomly once the patient takes the action “Wait”. These “Transplant”/“Wait” decisions are available in every health state.
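
A minimal value-iteration sketch of this optimal-stopping MDP follows, again with invented placeholder values for p(h'|h) and r(h) (the paper derives them from clinical data). Rather than enumerating thresholds, it solves directly for the optimal “Transplant”/“Wait” action in every health state:

    import numpy as np

    n = 18                                    # MELD health states 1..18
    rng = np.random.default_rng(0)
    raw = rng.random((n, n + 1))
    P = raw / raw.sum(axis=1, keepdims=True)  # hypothetical p(h'|h); last column is p(D|h)
    r = np.linspace(4000.0, 1500.0, n)        # hypothetical r(h): expected posttransplant life days

    V = np.zeros(n)
    for _ in range(200_000):
        wait = 1.0 + P[:, :n] @ V             # action "W": one more life day, then a random transition
        V_new = np.maximum(r, wait)           # optimality equation: best of "T" (reward r(h)) and "W"
        if np.max(np.abs(V_new - V)) < 1e-9:  # converges because every state has p(D|h) > 0
            break
        V = V_new

    policy = np.where(r >= 1.0 + P[:, :n] @ V, "T", "W")
    print("optimal action in each MELD state:", policy)

With the clinical data used in the paper, this MDP and the threshold search of Figure 1 yield the same optimal transplantation policy, as noted in the abstract.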
