Title | Multi-resolution modeling of a discrete stochastic process identifies causes of cancer |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Yaari, A Uri, Sherman, M, Priebe, O Clarke, Loh, P-R, Katz, B, Barbu, A, Berger, B |
Conference Name | International Conference on Learning Representations |
Date Published | 09/2020 |
Abstract | Detection of cancer-causing mutations within the vast and mostly unexplored human genome is a major challenge. Doing so requires modeling the background mutation rate, a highly non-stationary stochastic process, across regions of interest varying in size from one to millions of positions. Here, we present the split-Poisson-Gamma (SPG) distribution, an extension of the classical Poisson-Gamma formulation, to model a discrete stochastic process at multiple resolutions. We demonstrate that the probability model has a closed-form posterior, enabling efficient and accurate linear-time prediction over any length scale after the parameters of the model have been inferred a single time. We apply our framework to model mutation rates in tumors and show that model parameters can be accurately inferred from high-dimensional epigenetic data using a convolutional neural network, Gaussian process, and maximum-likelihood estimation. Our method is both more accurate and more efficient than existing models over a large range of length scales. We demonstrate the usefulness of multi-resolution modeling by detecting genomic elements that drive tumor emergence and are of vastly differing sizes. |
URL | https://openreview.net/forum?id=KtH8W3S_RE |
CBMM Relationship:
- CBMM Funded