Reinforcement Learning for Radiotherapy Dose Fractioning Automation

Grégoire Moreau¹, Vincent François-Lavet¹, Paul Desbordes¹, Benoît Macq¹

PMID: 33669816
PMCID: PMC7922060
DOI: 10.3390/biomedicines9020214

Grégoire Moreau et al. Biomedicines. 2021 Feb 19;9(2):214.

Affiliation

¹ Institute of Information and Communication Technologies, Electronics and Applied Mathematics, UCLouvain, 1348 Louvain-la-Neuve, Belgium.

Abstract

External beam radiotherapy cancer treatment aims to deliver dose fractions that slowly destroy a tumor while avoiding severe side effects in surrounding healthy tissues. To automate dose fraction schedules, this paper investigates how deep reinforcement learning approaches (based on the deep Q-network, DQN, and deep deterministic policy gradient, DDPG, algorithms) can learn from a model of a mixture of tumor and healthy cells. A 2D tumor growth simulation is used to simulate radiation effects on tissues and thus to train an agent to automatically optimize dose fractionation. Results show that initiating treatment with a large dose per fraction, and then gradually reducing it, is preferred to the standard approach of using a constant dose per fraction.

Keywords: automatic treatment planning; cellular simulation; reinforcement learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Agent–environment interaction where the environment is an in silico simulation of a tumor within its surrounding tissues. At each time step $t$, the agent takes an action $a_{t}$, which represents the dose, and the environment transitions from $s_{t}$ to $s_{t+1}$ while providing a reward $r_{t}$ as well as an observation $\omega_{t+1}$.
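
The interaction loop described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulation: the class `ToyTissueEnv`, its cell counts, kill rates, regrowth factor, and reward are all hypothetical placeholders chosen to make the loop runnable, and `constant_policy` stands in for a trained agent.

```python
from dataclasses import dataclass


@dataclass
class ToyTissueEnv:
    """Hypothetical stand-in for the in silico simulation (not the paper's 2D model)."""
    tumor: float = 1000.0    # assumed initial number of tumor cells
    healthy: float = 5000.0  # assumed initial number of healthy cells

    def observe(self):
        # Observation omega_t: here simply the two cell counts.
        return (self.tumor, self.healthy)

    def step(self, dose):
        """Apply one fraction of `dose` (arbitrary units); return (obs, reward, done)."""
        tumor_before, healthy_before = self.tumor, self.healthy
        # Assumed dynamics: survival decays geometrically with dose, and
        # tumor cells are assumed to be more radiosensitive than healthy cells.
        self.tumor *= 0.80 ** dose
        self.healthy *= 0.95 ** dose
        # Assumed tumor regrowth between two fractions.
        self.tumor *= 1.05
        # Assumed reward r_t: net tumor cells removed minus healthy cells killed.
        reward = (tumor_before - self.tumor) - (healthy_before - self.healthy)
        done = self.tumor < 1.0
        return self.observe(), reward, done


def constant_policy(observation, t):
    """Placeholder agent: always prescribes the same dose (a constant-dose schedule)."""
    return 2.0


def run_episode(env, policy, max_steps=40):
    """Run the loop: observe, choose a_t, receive r_t and omega_{t+1}."""
    observation = env.observe()
    doses, total_reward = [], 0.0
    for t in range(max_steps):
        dose = policy(observation, t)                # action a_t
        observation, reward, done = env.step(dose)   # s_t -> s_{t+1}
        doses.append(dose)
        total_reward += reward
        if done:
            break
    return doses, total_reward


if __name__ == "__main__":
    doses, total_reward = run_episode(ToyTissueEnv(), constant_policy)
    print(len(doses), "fractions delivered, total reward", round(total_reward, 1))
```

A learning agent would replace `constant_policy` with a function whose parameters are updated from the observed rewards; the loop itself stays the same.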

**Figure 2**
Observation of the 2D model where green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes. The intensities are proportional to the density of cells.

**Figure 3**
Shape of treatments found by the agent for the collapsed toy environment (in gray) compared with the baseline treatment (in black). Each point represents a dose delivered to the patients.

**Figure 4**
Shape of treatments found by the agents for the 2D environment (in gray) using the DQN algorithm (**top**) and the DDPG algorithm (**bottom**). The baseline is shown (in black) for comparison. Each point represents a dose delivered to the patients.

**Figure 5**
Effects of the treatments prescribed by the DQN agents on the 2D simulation using the two studied reward functions (K and KD). Green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes.

**Figure 6**
Effects of the treatments prescribed by the DDPG agents on the 2D simulation using the two studied reward functions (K and KD). Green pixels are healthy cells, red pixels are tumor cells, and black pixels are empty volumes.

Grants and funding

protherwall/Gouvernement Wallon
