Learning essential graphs

In [1]:
from pylab import *


import pyagrum as gum
import pyagrum.lib.notebook as gnb

Compare learning algorithms

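MIIC computes the essential graph (CPDAG) directly from data. An essential graph is a PDAG (Partially Directed Acyclic Graph) that represents a whole Markov equivalence class of DAGs: an arc is directed only when every DAG in the class orients it the same way, and it is left undirected otherwise.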
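To make the notion of equivalence class concrete, here is a minimal pure-Python sketch (it does not use pyAgrum) of the classical criterion: two DAGs over the same nodes are Markov equivalent when they share the same skeleton and the same v-structures. The function names and the toy graphs are hypothetical and serve only as an illustration; DAGs are assumed to be given as sets of `(tail, head)` pairs.

```python
from itertools import combinations


def skeleton(arcs):
    """Undirected version of a DAG: each arc becomes an unordered pair."""
    return {frozenset(arc) for arc in arcs}


def v_structures(arcs):
    """Triples (a, c, b) such that a -> c <- b and a, b are not adjacent."""
    skel = skeleton(arcs)
    parents = {}
    for tail, head in arcs:
        parents.setdefault(head, set()).add(tail)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(arcs1, arcs2):
    """Same skeleton and same v-structures (node sets assumed identical)."""
    return (skeleton(arcs1) == skeleton(arcs2)
            and v_structures(arcs1) == v_structures(arcs2))


# Hypothetical toy DAGs over the nodes A, B, C.
chain = {("A", "B"), ("B", "C")}           # A -> B -> C
reversed_chain = {("C", "B"), ("B", "A")}  # A <- B <- C
collider = {("A", "B"), ("C", "B")}        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains belong to the same class, so their essential graph leaves both edges undirected. The collider is alone in its class, so its essential graph keeps both arcs directed.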

In [2]:
learner = gum.BNLearner("res/sample_asia.csv")
learner.useMIIC()
learner.useNMLCorrection()
print(learner)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : MIIC
Correction     : NML
Prior          : -

In [3]:
gemiic = learner.learnEssentialGraph()
gnb.show(gemiic)
[Figure: essential graph learned by MIIC]

For the other methods, the essential graph can be obtained from the learned BN.

In [4]:
learner = gum.BNLearner("res/sample_asia.csv")
learner.useGreedyHillClimbing()
bnHC = learner.learnBN()
print(learner)
geHC = gum.EssentialGraph(bnHC)
gnb.sideBySide(bnHC, geHC)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Greedy Hill Climbing
Score          : BDeu
Prior          : -

[Figure: Bayesian network learned by greedy hill climbing (left) and its essential graph (right)]

In [5]:
learner = gum.BNLearner("res/sample_asia.csv")
learner.useLocalSearchWithTabuList()
print(learner)
bnTL = learner.learnBN()
geTL = gum.EssentialGraph(bnTL)
gnb.sideBySide(bnTL, geTL)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Local Search with Tabu List
Tabu list size : 2
Score          : BDeu
Prior          : -

[Figure: Bayesian network learned by local search with tabu list (left) and its essential graph (right)]

We can now compare the results of the three algorithms.
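Beyond visual inspection, such a comparison can be quantified. The sketch below is a pure-Python illustration, not a pyAgrum API: it counts the node pairs whose connection differs between two PDAGs (absent, undirected, or directed one way or the other), a simple form of structural Hamming distance. The helper names and the toy graphs are hypothetical. Each PDAG is assumed to be given as a pair `(arcs, edges)` of sets of node pairs, which is presumably also how the arcs and edges of a learned essential graph could be fed in.

```python
def connection_types(arcs, edges):
    """Map each adjacent pair {a, b} to 'undirected' or to its (tail, head)."""
    marks = {frozenset(edge): "undirected" for edge in edges}
    for tail, head in arcs:
        marks[frozenset((tail, head))] = (tail, head)
    return marks


def structural_hamming_distance(pdag1, pdag2):
    """Number of node pairs whose connection differs between the two PDAGs."""
    marks1 = connection_types(*pdag1)
    marks2 = connection_types(*pdag2)
    pairs = marks1.keys() | marks2.keys()
    return sum(1 for pair in pairs if marks1.get(pair) != marks2.get(pair))


# Hypothetical toy PDAGs, each written as (arcs, edges).
pdag_a = ({("A", "C"), ("B", "C")}, {("C", "D")})
pdag_b = ({("A", "C"), ("B", "C"), ("C", "D")}, set())
pdag_c = (set(), {("A", "C"), ("C", "D")})

print(structural_hamming_distance(pdag_a, pdag_a))  # 0
print(structural_hamming_distance(pdag_a, pdag_b))  # 1
print(structural_hamming_distance(pdag_a, pdag_c))  # 2
```

A distance of 0 between two essential graphs means that the corresponding learned structures are Markov equivalent, even if the Bayesian networks themselves orient some arcs differently.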

In [6]:
(
  gnb.flow.clear()
  .add(gemiic, "Essential graph from miic")
  .add(bnHC, "BayesNet from GHC")
  .add(geHC, "Essential graph from GHC")
  .add(bnTL, "BayesNet from TabuList")
  .add(geTL, "Essential graph from TabuList")
  .display()
)
[Figure: five graphs displayed in a row, captioned "Essential graph from miic", "BayesNet from GHC", "Essential graph from GHC", "BayesNet from TabuList" and "Essential graph from TabuList"]