Mean first-passage time control policy versus reinforcement-learning control policy in gene regulatory networks

Probabilistic Boolean networks are rule-based models for gene regulatory networks. They are used to design intervention strategies in translational genomics, such as cancer treatment. Methods for finding control policies with the highest effect on the steady-state distributions of probabilistic Boolean networks have been proposed previously. These methods were derived using the theory of infinite-horizon stochastic control. It is well known that the direct application of optimal control methods is problematic owing to their high computational complexity and the fact that they require the inference of the system model. To bypass the impediment of model estimation, two algorithms for approximating the optimal control policy have been introduced. These algorithms are based on reinforcement learning and mean first-passage times. In this work, the performance of these two methods is compared using both a melanoma-related network and randomly generated networks. It is shown that the mean-first-passage-time-based algorithm outperforms the reinforcement-learning-based algorithm for smaller amounts of training data, which corresponds better to feasible experimental conditions. In contrast to the reinforcement-learning-based algorithm, the mean-first-passage-time-based algorithm does not require the application of control during its learning period. Intervention in biological systems during the learning phase may induce undesirable side effects. © 2008 AACC.

Vahedi, Golnaz||Faryabi, Babak||Chamberland, Jean-Francois||Datta, Aniruddha||Dougherty, Edward R

Mean first-passage time control policy versus reinforcement-learning control policy in gene regulatory networks Conference Paper