Biomedicines. 2021 Feb 19;9(2):214.
doi: 10.3390/biomedicines9020214.

Reinforcement Learning for Radiotherapy Dose Fractioning Automation

Grégoire Moreau et al. Biomedicines. 2021.

Abstract

External beam radiotherapy cancer treatment aims to deliver dose fractions that slowly destroy a tumor while avoiding severe side effects in the surrounding healthy tissues. To automate dose fraction schedules, this paper investigates how deep reinforcement learning approaches (based on deep Q network and deep deterministic policy gradient) can learn from a model of a mixture of tumor and healthy cells. A 2D tumor growth simulation is used to simulate radiation effects on tissues and thus to train an agent to automatically optimize dose fractionation. Results show that initiating treatment with a large dose per fraction, and then gradually reducing it, is preferred to the standard approach of using a constant dose per fraction.
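
To make the contrast in the last sentence concrete, the following minimal sketch builds two schedules that deliver the same total dose: a constant one and one that starts high and decreases geometrically. The function names, the total dose, the number of fractions, and the decay ratio are illustrative assumptions and are not taken from the paper; the sketch only shows the shape of the two schedules, not their clinical effect.

```python
def constant_schedule(total_dose, n_fractions):
    """Baseline shape: the same dose at every fraction."""
    return [total_dose / n_fractions] * n_fractions


def decreasing_schedule(total_dose, n_fractions, ratio=0.9):
    """Front-loaded shape: each dose is `ratio` times the previous one.

    `ratio` is a hypothetical parameter chosen for illustration only.
    The doses are rescaled so that they sum to `total_dose`.
    """
    weights = [ratio ** i for i in range(n_fractions)]
    scale = total_dose / sum(weights)
    return [scale * w for w in weights]


if __name__ == "__main__":
    # Assumed values for illustration: 60 dose units over 30 fractions.
    baseline = constant_schedule(60.0, 30)
    front_loaded = decreasing_schedule(60.0, 30, ratio=0.9)
    for i, (b, f) in enumerate(zip(baseline, front_loaded), start=1):
        print(f"fraction {i:2d}: constant={b:.2f}  decreasing={f:.2f}")
```

Both lists sum to the same total, so any difference in outcome between them would come from how the dose is distributed over time rather than from how much is delivered.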

Keywords: automatic treatment planning; cellular simulation; reinforcement learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Agent–environment interaction where the environment is an in silico simulation of a tumor within its surrounding tissues. At each time step $t$, the agent takes an action $a_t$, which represents the dose, and the environment transitions from state $s_t$ to $s_{t+1}$ while providing a reward $r_t$ and an observation $\omega_{t+1}$.
Figure 2
Observation of the 2D model where green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes. The intensities are proportional to the density of cells.
Figure 3
Shape of treatments found by the agent for the collapsed toy environment (in gray) compared with the baseline treatment (in black). Each point represents a dose delivered to the patients.
Figure 4
Shape of treatments found by the agents for the 2D environment (in gray) using the DQN algorithm (top) and the DDPG algorithm (bottom). The baseline is shown (in black) for comparison. Each point represents a dose delivered to the patients.
Figure 5
Effects of the treatments prescribed by the DQN agents on the 2D simulation using the two studied reward functions (K and KD). Green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes.
Figure 6
Effects of the treatments prescribed by the DDPG agents on the 2D simulation using the two studied reward functions (K and KD). Green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes.
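
The agent–environment loop described in the Figure 1 caption can be sketched as below. This is a toy stand-in and not the paper's 2D simulation: the class `ToyTissueEnv`, its kill rates, the regrowth factor, the reward (tumor cells killed minus healthy cells killed), and the random policy are all hypothetical choices made for illustration. A trained DQN or DDPG agent would replace `random_policy`, and here the observation is simply the full state.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyTissueEnv:
    """Hypothetical two-population environment (not the paper's model)."""
    tumor: float = 100.0
    healthy: float = 1000.0
    max_steps: int = 30
    t: int = 0

    def reset(self):
        self.tumor, self.healthy, self.t = 100.0, 1000.0, 0
        return (self.tumor, self.healthy)

    def step(self, dose):
        # Assumed kill fractions: tumor cells are more sensitive to dose.
        tumor_killed = self.tumor * min(1.0, 0.10 * dose)
        healthy_killed = self.healthy * min(1.0, 0.02 * dose)
        self.tumor -= tumor_killed
        self.healthy -= healthy_killed
        self.tumor *= 1.05  # assumed tumor regrowth between fractions
        self.t += 1
        reward = tumor_killed - healthy_killed  # hypothetical reward
        done = self.tumor < 1.0 or self.t >= self.max_steps
        observation = (self.tumor, self.healthy)
        return observation, reward, done


def run_episode(env, policy):
    """Run one treatment: observe, choose a dose, step, collect reward."""
    observation = env.reset()
    doses, total_reward, done = [], 0.0, False
    while not done:
        action = policy(observation)  # a_t: the dose for this fraction
        observation, reward, done = env.step(action)
        doses.append(action)
        total_reward += reward
    return doses, total_reward


if __name__ == "__main__":
    rng = random.Random(0)

    def random_policy(observation):
        # Placeholder for a learned agent: pick a dose from 1 to 5 units.
        return rng.choice([1, 2, 3, 4, 5])

    doses, total = run_episode(ToyTissueEnv(), random_policy)
    print(f"{len(doses)} fractions, total reward {total:.1f}")
```

The list of doses returned by `run_episode` plays the role of the treatment shapes plotted in Figures 3 and 4, where each point is one delivered dose.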
