In search of a coherent theoretical framework for stochastic gene regulation
Vastola, John Joseph
Gene regulatory networks play a significant role in controlling cell fate decisions and other kinds of cellular decision-making (e.g., whether or not to proceed with growth or division). One big-picture dream of quantitative biology is to understand these networks well enough to manually control cellular decision-making precisely and efficiently; this would enable enterprises like regenerative medicine, which have the potential to greatly improve human health and wellness. For now, understanding even very small gene networks, and the single-cell data associated with them, is extremely challenging. One reason is that we do not yet have a well-developed theoretical framework for thinking about gene regulation or cell fate decisions. In this thesis, I advance the idea that the chemical master equation (CME) can in principle provide such a framework. The CME includes all other known modeling approaches as special cases, and it has attractive features as a fundamental modeling approach (e.g., models are highly constrained and can be rigorously justified by appeal to the physics of interacting molecules). But in order for the CME to play the foundational role in gene regulation that (for example) the Schrödinger equation plays in quantum mechanics, we must understand it much better. In particular, we should understand (i) the qualitative consequences of every possible gene network model being in principle describable by a CME model, (ii) how to approximate the CME, and (iii) how to solve the CME. I explore each of these throughout the thesis. The main scientific contribution of this thesis is its technical content related to solving the CME. Because the CME behaves mathematically much like the Schrödinger equation, many tools originally developed for studying quantum mechanics (e.g., path integrals and ladder operators) can be adapted to study gene networks. We use these tools to derive the most general known exact solutions of the CME. Furthermore, we show how these analytic insights can greatly speed up model fitting and parameter inference for realistic models of RNA transcription and splicing.
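
As a concrete illustration of what "solving the CME" means, the following is a minimal sketch for the simplest textbook case, which is not taken from the thesis itself: a single RNA species produced at a constant rate and degraded at a rate proportional to its copy number. The rate names `k` and `gamma`, the truncation level `n_max`, and both helper functions are assumptions introduced here for illustration. The sketch builds the truncated CME generator, solves for the steady state numerically, and compares it with the known Poisson solution with mean k/gamma.

```python
import numpy as np
from math import exp, factorial


def birth_death_generator(k, gamma, n_max):
    """Truncated CME generator A for 0 -> X (rate k) and X -> 0 (rate gamma * n).

    The probability vector P over copy numbers 0..n_max obeys dP/dt = A @ P.
    Truncating at n_max is an approximation: production out of n_max is dropped.
    """
    A = np.zeros((n_max + 1, n_max + 1))
    for n in range(n_max + 1):
        if n < n_max:
            A[n + 1, n] += k          # production: n -> n + 1
            A[n, n] -= k
        if n > 0:
            A[n - 1, n] += gamma * n  # degradation: n -> n - 1
            A[n, n] -= gamma * n
    return A


def steady_state(A):
    """Solve A @ P = 0 subject to sum(P) = 1 by replacing one balance equation."""
    M = A.copy()
    M[-1, :] = 1.0                    # last row now enforces normalization
    b = np.zeros(A.shape[0])
    b[-1] = 1.0
    return np.linalg.solve(M, b)


# Hypothetical parameter values chosen only for the demonstration.
k, gamma, n_max = 5.0, 1.0, 60
P = steady_state(birth_death_generator(k, gamma, n_max))

# Known exact steady state of this model: Poisson with mean k / gamma.
mean = k / gamma
poisson = np.array([exp(-mean) * mean**n / factorial(n) for n in range(n_max + 1)])
max_err = float(np.max(np.abs(P - poisson)))
print(f"max |numerical - Poisson| = {max_err:.2e}")
```

Brute-force linear algebra of this kind scales poorly as the number of species grows, which is one reason exact analytic solutions of the CME are valuable for model fitting and parameter inference.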