A Modified Random Forest Kernel for Highly Nonstationary Gaussian Process Regression with Application to Clinical Data
VanHouten, Jacob Paul
Nonstationary Gaussian process regression can be used to transform irregularly episodic and noisy measurements into continuous probability densities to make them more compatible with standard machine learning algorithms. However, current inference algorithms are time-consuming or have difficulty with the highly bursty, extremely nonstationary data that are common in the medical domain. One efficient and flexible solution uses a partition kernel based on random forests, but its current embodiment produces undesirable pathologies rooted in the piecewise-constant nature of its inferred posteriors. I present a modified random forest kernel that adds new sources of randomness to the trees, which overcomes existing pathologies and produces good results for highly bursty, extremely nonstationary clinical laboratory measurements.
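As a rough illustration of the general partition-kernel idea mentioned above, the sketch below estimates a kernel value for two time points as the fraction of random partitions in which both points fall in the same cell. It is not the modified kernel presented in this work, and it does not train random forests: uniformly drawn cut points stand in for tree splits, and the function name `random_partition_kernel` and its parameters `n_partitions`, `n_cuts`, and `seed` are hypothetical choices made only for this example.

```python
import numpy as np


def random_partition_kernel(x, n_partitions=200, n_cuts=5, seed=0):
    """Estimate a partition kernel on 1-D inputs (illustrative assumption only).

    K[i, j] is the fraction of random partitions in which x[i] and x[j]
    land in the same cell. Random cut points are a simplified stand-in
    for the splits of a random forest.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    K = np.zeros((x.size, x.size))
    for _ in range(n_partitions):
        # Draw one random partition of [lo, hi] into n_cuts + 1 cells.
        cuts = np.sort(rng.uniform(lo, hi, size=n_cuts))
        # Cell index of each point under this partition.
        cells = np.searchsorted(cuts, x)
        # Count 1 for every pair of points sharing a cell.
        K += cells[:, None] == cells[None, :]
    return K / n_partitions


# Bursty, irregular measurement times (made-up values for illustration).
times = np.array([0.0, 0.1, 0.2, 5.0, 9.0, 9.1])
K = random_partition_kernel(times)
```

Each individual partition contributes a block-structured indicator matrix, so any single partition yields a piecewise-constant similarity; averaging over many partitions gives a symmetric, positive semidefinite matrix in which nearby measurements are more similar than distant ones.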