Variational Inference on Phylogenetic Time Trees using COVID-19 Datasets
Project with Lloyd Elliott
This project involves advancing computational methods in phylogenetics and variational inference. Variational inference is a statistical method that efficiently approximates complex probability distributions, making it well suited to analyzing the tree structure of large-scale genetic datasets. While variational inference has been applied to certain kinds of phylogenetic trees, it has not been broadly applied to trees that explicitly incorporate time (also known as ultrametric trees). The goal of this study is to apply variational inference to ultrametric phylogenetic trees, with a particular focus on COVID-19 datasets. The project includes:
- Downloading and preprocessing COVID-19 datasets from GISAID or VirusSeq.
- Visualizing trees and producing publication-ready plots of experimental results.
- Conducting simulations to infer posterior trees using BEAST and variational inference software.
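
To make the notion of a tree that explicitly incorporates time more concrete, the sketch below checks whether a toy tree is ultrametric, meaning every leaf lies at the same distance from the root. The nested-list tree representation and the names `leaf_depths` and `is_ultrametric` are assumptions made for illustration only; they are not taken from BEAST or from any variational inference package, and the sketch assumes all sequences are sampled at the same time.

```python
import math

# Hypothetical toy representation (an assumption for this sketch):
# a leaf is a string name; an internal node is a list of
# (child, branch_length) pairs.

def leaf_depths(node, depth=0.0):
    """Return a dict mapping each leaf name to its distance from the root."""
    if isinstance(node, str):
        return {node: depth}
    depths = {}
    for child, length in node:
        depths.update(leaf_depths(child, depth + length))
    return depths

def is_ultrametric(tree, tol=1e-9):
    """True if all leaves are equally far from the root, up to `tol`."""
    depths = list(leaf_depths(tree).values())
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)

# Leaves A, B, and C all sit at distance 3.0 from the root.
time_tree = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]

# Leaf B sits at distance 4.5, so the leaf depths disagree.
non_time_tree = [([("A", 1.0), ("B", 2.5)], 2.0), ("C", 3.0)]

print(is_ultrametric(time_tree))
print(is_ultrametric(non_time_tree))
```

In this toy setting, branch lengths can be read as elapsed time, which is why equal root-to-leaf distances correspond to sequences sampled at the same moment.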