A Comparison of State-of-the-Art Algorithms for Learning Bayesian Network Structure from Continuous Data

Fu, Lawrence Dachen

Persistent Link: https://etd.library.vanderbilt.edu/etd-12022005-171510
http://hdl.handle.net/1803/14997

Date: 2005-12-19

Abstract

In biomedical and biological domains, researchers typically study continuous data sets. In these domains, an increasingly popular tool for understanding the relationships among variables is Bayesian network structure learning. There are three general approaches to learning Bayesian network structure from continuous data. The most popular is to discretize the data before structure learning. The alternatives are to integrate discretization with structure learning or to learn directly from the continuous data. It is not known which approach is best, because no unified study has compared all three. The purpose of this work was to perform an extensive experimental evaluation of them. For large data sets consisting of originally discrete variables, discretization-based approaches learned the most accurate structures. With smaller sample sizes, or with data lacking an underlying discrete mechanism, a method that learns directly from continuous data performed best. For some quality metrics, the integrated methods provided no improvement over simple discretization methods. In terms of time efficiency, the integrated approaches were the most computationally intensive, while methods from the other two categories were the least intensive.
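
To make the "discretize first" approach concrete, here is a minimal sketch of two common binning schemes, equal-width and equal-frequency. The abstract does not say which discretization methods the thesis evaluated, so these schemes, the function names, and the sample values are all assumptions chosen for illustration. Only the preprocessing step is shown; the structure learning algorithm that would consume the binned data is not.

```python
from bisect import bisect_right


def equal_width_cutpoints(values, n_bins):
    """Return n_bins - 1 cut points splitting [min, max] into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def equal_frequency_cutpoints(values, n_bins):
    """Return n_bins - 1 cut points so each bin holds roughly the same number of samples."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_bins] for i in range(1, n_bins)]


def discretize(values, cutpoints):
    """Map each continuous value to the index of the bin it falls into."""
    return [bisect_right(cutpoints, v) for v in values]


# Hypothetical measurements of one continuous variable (illustration only).
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0]

width_bins = discretize(samples, equal_width_cutpoints(samples, 3))
freq_bins = discretize(samples, equal_frequency_cutpoints(samples, 3))

print(width_bins)
print(freq_bins)
```

The two schemes assign different bins to the same skewed sample, which suggests one way the choice of discretization could influence the dependencies a structure learner later detects.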

Files in this item

Name: ms_thesis.pdf
Size: 358.2 KB
Format: PDF

This item appears in the following collection(s):

Electronic Theses and Dissertations