emily's picture
From emily rss RSS  subscribe Subscribe

A Genetic Algorithm for Learning Bayesian Network Adjacency Matrices from Data 



Bayesian Network
Definitions and examples
Inference and learning
Genetic Algorithms
Structure Learning Background
Problem
K2 algorithm
Sparse Candidate
Improving K2: Permutation Genetic Algorithm (GASLEAK)
Shortcoming: greedy, sensitive to ordering
Permutation GA
Master’s thesis: Adjacency Matrix GA (SLAM GA)
Rationale

Evaluation with Known Bayesian Networks

 

 
 
Tags:  Bayesian  Network  Structure 
Views:  4740
Downloads:  110
Published:  August 12, 2007
 
3
download

Share plick with friends Share
save to favorite
Report Abuse Report Abuse
 
Related Plicks
Estimation of Distribution Algorithms Tutorial

Estimation of Distribution Algorithms Tutorial

From: anon-280179
Views: 681 Comments: 0

 
Estimation of Distribution Algorithms Tutorial

Estimation of Distribution Algorithms Tutorial

From: backmae
Views: 180 Comments: 0

 
Bayesian Field Theory

Bayesian Field Theory

From: anon-392689
Views: 229 Comments: 0
Bayesian Field Theory ,texas public library, the library nightclub santa barbara, ann arbor library district, tcu library
 
Bayesian Nonparametrics

Bayesian Nonparametrics

From: anon-389206
Views: 251 Comments: 0
Bayesian Nonparametrics ,turn your iphone into an e-book, huntinton library california, microsoft office 12.0 object library download, alien wars ebook by sam harrison
 
See all 
 
More from this user
Java One 2005 Technical

Java One 2005 Technical

From: emily
Views: 3209
Comments: 0

NSDI - Poland

NSDI - Poland

From: emily
Views: 2240
Comments: 0

Welcome to the Minnesota SharePoint User Group

Welcome to the Minnesota SharePoint User Group

From: emily
Views: 6132
Comments: 0

Java One 2002 Overview

Java One 2002 Overview

From: emily
Views: 2238
Comments: 0

SQL Server 2005

SQL Server 2005

From: emily
Views: 3834
Comments: 1

CATPDG Quick Start Demo

CATPDG Quick Start Demo

From: emily
Views: 1401
Comments: 0

See all 
 
 
 URL:          AddThis Social Bookmark Button
Embed Thin Player: (fits in most blogs)
Embed Full Player :
 
 

Name

Email (will NOT be shown to other users)

 

 
 
Comments: (watch)
 
 
Notes:
 
Slide 1: Ben Perry – M.S. Thesis Defense A Genetic Algorithm for Learning Bayesian Network Adjacency Matrices from Data Benjamin B. Perry Laboratory for Knowledge Discovery in Databases Kansas State University http://www.kddresearch.org http://www.cis.ksu.edu/~bbp9857 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 2: Overview • Bayesian Network – Definitions and examples – Inference and learning • • Genetic Algorithms Structure Learning Background – Problem – K2 algorithm – Sparse Candidate • Improving K2: Permutation Genetic Algorithm (GASLEAK) – Shortcoming: greedy, sensitive to ordering – Permutation GA • Master’s thesis: Adjacency Matrix GA (SLAM GA) – Rationale • • Evaluation with Known Bayesian Networks Summary Kansas State University Department of Computing and Information Sciences Ben Perry – M.S. thesis defense
Slide 3: Bayesian Belief Networks (BBNS): Definition • Bayesian Network – – – – Directed acyclic graph Vertices (nodes): denote events, or states of affairs (each a random variable) Edges (arcs, links): denote conditional dependencies, causalities Model of conditional dependence assertions (or CI assumptions) • Example (“Ben’s Presentation” BBN) Sleep: Narcoleptic Well X1 Bad All-nighter X2 (sprinkler) Ben is nervous: Extremely, Yes, No X4 X5 Ben’s presentation: Good, Not so good, Failed miserably Appearance: Good, Bad X3 Memory: Elephant, Good, Bad, None • P(Well, Good, Good, No, General Product (Chain)Good) =for BBNs` W) · P(G | W) · P(N | G, G) · P(G | N) Rule P(G) · P(G | n i 1 P  X 1 , X 2 ,  , X n    P  X i | parents  X i   Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 4: Graphical Models of Probability Distributions • Idea – Want: model that can be used to perform inference – Desired properties • Correlations among variables • Ability to represent functional, logical, stochastic relationships • Probability of certain events • Inference: Decision Support Problems – Diagnosis (medical, equipment) – Pattern recognition (image, speech) – Prediction • Want to Learn: Most Likely Model that Generates Observed Data – Under certain assumptions (Causal Markovity), it has been shown that we can do it – Given: data D (tuples or vectors containing observed values of variables) – Return: directed graph (V, E) expressing target CPTs – NEXT: Genetic algorithms Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 5: Genetic Algorithms • Idea – Emulate natural process of survival of the fittest (Example: Roaches adapt) – Each generation has many diverse individuals – Each individual competes for the chance to survive – Most common approach: best individuals live to the next generation and mate – Produce children with traits from both parents – If parents are strong, children might be stronger • Major components (operators) – Fitness function – Chromosome manipulation – Cross-over (Not the “John Edward” type!), mutation • From (Educated?) Guess to Gold – Initial population typically random or not much better than random – bad scores – Performs well with a non-deceptive search space and good genetic operators – Ability to escape local optima with mutations. – Not guaranteed to get the best answer, but usually gets close Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 6: Learning Structure: K2 Algorithm • Algorithm Learn-BBN-Structure-K2 (D, Max-Parents) FOR i  1 to n DO // arbitrary ordering of variables {x1, x2, …, xn} // find best candidate parent // max Dirichlet score WHILE (Parents[xi].Size < Max-Parents) DO Best  argmaxj>i (P(D | xj  Parents[xi]) RETURN ({Parents[xi] | i  {1, 2, …, n}}) IF (Parents[xi] + Best).Score > Parents[xi].Score) THEN Parents[xi] += Best • A Logical Alarm Reduction Mechanism [Beinlich et al, 1989] – BBN model for patient monitoring in surgical anesthesia – Vertices (37): findings (e.g., esophageal intubation), intermediates, observables – K2: found BBN different in only 1 edge21 10 22 from gold standard (elicited from expert) 13 19 6 17 28 25 1 18 26 3 7 8 30 Kansas State University Department of Computing and Information Sciences 20 27 11 29 31 32 12 15 34 23 36 35 24 16 5 4 3 7 9 33 14 2 Ben Perry – M.S. thesis defense
Slide 7: Learning Structure: K2 downfalls • • • • • Greedy (may fall into local maxima) Highly dependent upon node ordering Optimal node ordering must be given If optimal order is already known, an expert could probably create the network Number of orderings consistent with DAGs is exponential (n!) Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 8: Learning Structure: Sparse Candidate • General Idea: – Inspect k-best parent candidates at a time. (K2 only inspects one) – k is typically very small ~ 5 ≤ k ≤ 15 – Exponential to the order of k • Algorithm: Loop until no improvements or iteration limit exceeds: For each node, select the top k parent candidates (mutual information or m_disc) [Restrict] Build a network by manipulating parents (add, remove, reverse from candidate set for each node) . Only accept changes that maximizes the network score (Minimum Descriptor Length) [Maximize phase] • Must handle cycles.. expensive. – K2 gives this to us for free – Next: Improving K2 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 9: GASLEAK: A Permutation GA for Variable Ordering Dtrain (Structure Learning) D: Training Data [2] Representation Evaluator for Bayesian Network Structure Learning Problems Dval (Inference) Ie : Evidence Specification f(α) Ordering Fitness α Candidate Ordering [1] Permutation Genetic Algorithm Genetic Algorithm for Structure Learning from Evidence, AIS, and K2 Ben Perry – M.S. thesis defense ˆ α Optimized Ordering Kansas State University Department of Computing and Information Sciences
Slide 10: Properties of the Genetic Algorithm • • Elitist Chromosome representation – Integer permutation ordering – Sample chromosome in a BBN of 5 nodes might look like: 3 1 2 0 4 • • Seeding – Random shuffle Operators – Order crossover – Swap mutation • • Fitness – RMSE Job farm – Java-based; Utilize many machines regardless of OS Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 11: GASLEAK results • Not encouraging – Bad fitness function or bad evidence b.v. – Many graph errors 0.996 0.982 0.969 0.955 0.941 0.927 0.913 0.899 0.885 0.871 0.858 0.844 0.830 0.816 0.802 0 200 400 600 800 1000 1200 1400 Histogram of estimated fitness for all 8! = 40320 permutations of Asia variables. Frequency of Validation Set Fitness Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 12: Master’s Thesis: SLAM GA • • SLAM GA – Structure Learning Adjacency Matrix Genetic Algorithm Initial population- tried several approaches: – Completely Random Bayesian Networks (Box-Muller, Max parents) – Many illegal structures; wrote fixCycles algorithm. – Random networks generated from parents pre-selected by the Restrict phase of Sparse Candidate – Performed better than random – Aggregate of k learned networks from K2 given random orderings (cycles eliminated) – Best approach Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 13: Aggregator Instantiater Training Data K2 Manager D Random Order 1 K2 BBN Random Order K2 2 . . . . BBN Aggregator BBN Aggregate BBN Random Order K2 k BBN For small networks, k=1 is best. For larger networks, k=2 is best. Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 14: SLAM GA • Chromosome representation – Edge matrix – n^2 bits – Each bit represents a parent edge to node. – 1 = parent, 0 = not parent • Operators – Crossover: Swap parents, fix cycles. Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 15: SLAM GA: Crossover Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 16: SLAM GA • Chromosome representation – Edge matrix – n^2 – Each bit represents a parent edge to node. – 1 = parent, 0 = not parent • Operators – Crossover: Swap parents, fix cycles. – Mutation: Reverse, delete, or add a random number of edges. Fix cycles. • Fitness – Total Bayesian Dirichlet equivalence score for each node Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 17: Results - Asia Best of first network Learned generation Actual 15 Graph Error 1 Errors Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 18: Results – Asia Best fitness per generation 3750 3700 3650 3600 3550 3500 3450 3400 3350 3300 29 37 41 45 49 57 77 81 85 89 69 13 17 21 25 33 53 61 65 73 93 Generation 97 1 5 9 K2x1 K2x2 Rnd Ben Perry – M.S. thesis defense Best Fitness of Generation Kansas State University Department of Computing and Information Sciences
Slide 19: Results - Poker Best of first network Learned generation Actual 11Graph Errors 2 Graph Errors Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 20: Results - Poker Best fitness per generation 2500 2000 Best Fitness of Generation 1500 K2x1 K2x2 Rnd 1000 500 0 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 Generations 97 1 5 9 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 21: Results - Golf Best of first generation Learned network Actual 11Graph Errors 4 Graph Errors Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 22: Results - Golf Best fitness per generation 3500 3000 Best Fitness of Generation 2500 2000 K2x1 K2x2 Rnd 1500 1000 500 0 21 53 65 85 Generation 97 33 13 17 25 29 37 41 45 49 57 61 69 73 77 81 89 93 1 5 9 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 23: Results – Boerlage92 Learned network Initial Actual Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 24: Results - Boerlage92 Boerlage92 1600 1400 1200 1000 K2x1 800 600 400 200 0 1 6 K2x2 Rnd Best Fitness of Generation 16 21 26 31 41 46 51 56 66 71 76 81 86 91 Generation Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences 96 11 36 61
Slide 25: Results - Alarm Best network per generation 8000 7000 6000 5000 K2x1 4000 3000 2000 1000 0 13 21 25 29 33 41 49 53 57 85 89 93 Generation 97 61 65 69 17 37 45 73 77 81 1 5 9 K2x2 Rnd Ben Perry – M.S. thesis defense Best Fitness of Generation Kansas State University Department of Computing and Information Sciences
Slide 26: Final Fitness Values K2x1 K2x2 Random Asia 3722.084 3720.6069 3722.249 Poker Golf Boerlage92 1999.395 3081.16 1228.621 2011.54 3220.985 1429.355 2001.884 3214.614 1459.587 Alarm 5006.827 7095.658 6861.285 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 27: K2 vs. SLAM GA • K2: – Very good if ordering is known – Ordering is often not known – Greedy, very dependent on ordering. • SLAM GA – Stochastic; falls out of local optima trap – Can improve on bad structures learned by K2 – Takes much longer than K2 Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 28: GASLEAK vs. SLAM GA • GASLEAK: – Gold network never recovered – Much more computationally-expensive – K2 is run on each [new] individual each generation – Each chromosome must be scored – Final network has many graph errors • SLAM GA – For small networks, gold standard network often recovered. – Relatively few graph errors for final network. – Less computationally intensive – Initial population most expensive – Each chromosome must be scored Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 29: SLAM GA: Ramifications • • Effective structure learning algorithm – Ideal for small networks Improvement over GASLEAK – SLAM GA faster in spite of same GA parameters – SLAM GA more accurate • • • Improvement over K2 Aggregate algorithm produces better initial population Parent-swapping crossover technique effective – Diversifies search space while retaining past information Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 30: SLAM GA: Future Work • • Parameter tweaking Better fitness function – Several ‘bad’ structures score better than gold standard – GA works fine • • ‘Intelligent’ mutation operator – Add edges from pre-qualified set of candidate parents New instantiation methods – Use GASLEAK – Other structure-learning algorithms • Scalability – Job farm Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences
Slide 31: Summary • • • • • Bayesian Network Genetic Algorithms Learning Structure: K2, Sparse Candidate GASLEAK SLAM GA Ben Perry – M.S. thesis defense Kansas State University Department of Computing and Information Sciences

   
Time on Slide Time on Plick
Slides per Visit Slide Views Views by Location