Pebl Introduction
Pebl is a Python library and command-line application for learning the structure of a Bayesian network given prior knowledge and observations. Pebl includes the following features:
- Can learn with observational and interventional data
- Handles missing values and hidden variables using exact and heuristic methods
- Provides several learning algorithms; makes creating new ones simple
- Has facilities for transparent parallel execution
- Calculates edge marginals and consensus networks
- Presents results in a variety of formats
Availability
Pebl is licensed under a permissive MIT-style license and can be downloaded from its Google Code site or from the Python Package Index.
Concepts
Every Pebl analysis includes data, a learner, and a result. An analysis may also include prior models and task controllers.
- Data
- This is the set of observations used to score a given network. The data can include missing values and hidden (unobserved) variables, and observations can be marked as the result of specific interventions. Data can be read from a file or created programmatically.
- Learner
- A learner implements a specific learning algorithm. It is given data, a prior model, and a stopping criterion, and returns a result object.
- Result
- A result object contains a list of the top-scoring networks found during a learner run and some statistics about the analysis. Results from different learning runs with the same data can be merged and visualized in various formats.
- Prior Models
- A key strength of Bayesian analysis is the ability to integrate knowledge with observations. A Pebl prior model specifies the prior belief about the set of possible networks and can include hard and soft constraints.
- Task Controllers
- Pebl uses task controllers to run analyses in parallel. Analyses can run on multiple CPU cores or computational clusters without the user managing the details of parallel programming.
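The way data, a prior constraint, a learner, and a result fit together can be illustrated with a small, self-contained sketch. It does not use Pebl: every name in it (`is_acyclic`, `bic_score`, `greedy_learn`, `prohibited`) is hypothetical, and the BIC score and steepest-ascent search are stand-ins chosen for brevity, not necessarily the scoring function or algorithms that Pebl implements. The dataset is synthetic: variable 1 copies variable 0 with 90% agreement, and variable 2 is independent of both.

```python
import itertools
import math
from collections import Counter

# Toy stand-ins for the concepts above. None of these names are Pebl's API.

def is_acyclic(edges, n_vars):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = [0] * n_vars
    for _, child in edges:
        indegree[child] += 1
    ready = [v for v in range(n_vars) if indegree[v] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return visited == n_vars

def bic_score(rows, edges, n_vars):
    """BIC score (higher is better) of a network over binary variables."""
    score = 0.0
    for child in range(n_vars):
        parents = sorted(p for p, c in edges if c == child)
        joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
        parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
        # Maximum-likelihood log-likelihood of the child given its parents.
        for (parent_values, _), count in joint.items():
            score += count * math.log(count / parent_counts[parent_values])
        # Penalty: one free parameter per parent configuration.
        score -= 0.5 * math.log(len(rows)) * (2 ** len(parents))
    return score

def greedy_learn(rows, n_vars, prohibited=frozenset(), max_iterations=50, keep=5):
    """Learner: steepest-ascent search over single-edge additions/removals.

    `prohibited` acts as a hard prior constraint (edges that may never appear);
    `max_iterations` is the stopping criterion. The returned list plays the
    role of a result: the top-scoring networks as (edges, score) pairs.
    """
    candidates = [e for e in itertools.permutations(range(n_vars), 2)
                  if e not in prohibited]
    current = frozenset()
    scored = {current: bic_score(rows, current, n_vars)}
    for _ in range(max_iterations):
        neighbours = [current ^ {edge} for edge in candidates]  # toggle one edge
        for net in neighbours:
            if net not in scored and is_acyclic(net, n_vars):
                scored[net] = bic_score(rows, net, n_vars)
        best = max((net for net in neighbours if net in scored),
                   key=scored.get, default=current)
        if scored[best] <= scored[current]:
            break  # local optimum reached
        current = best
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:keep]

# Data: 200 synthetic observations of three binary variables.
cells = {(0, 0): 90, (0, 1): 10, (1, 0): 10, (1, 1): 90}
rows = [(a, b, i % 2) for (a, b), count in cells.items() for i in range(count)]

# Prior knowledge: variable 1 cannot be a cause of variable 0.
result = greedy_learn(rows, n_vars=3, prohibited=frozenset({(1, 0)}))
best_edges, best_score = result[0]
print(sorted(best_edges), round(best_score, 2))
```

In this sketch the networks 0 → 1 and 1 → 0 would receive the same score from the data alone, so the hard constraint is what decides the direction of the edge. This is the role the prior model plays in the description above.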