Subgraph Mining

DS 352 Syllabus

last updated 15-Oct-2018

Introduction of frequent subgraph mining

Extends association analysis to finding frequent subgraphs

Useful for

Some graph definitions

A graph is isomorphic if it is topologically equivalent to another graph


Topological Equivalence

 


Mapping transactions to graphs

Each transaction is a clique of nodes

Graphs as transactions

Counting Support


Challenges

Node may contain duplicate labels

Support and confidence measures

Additional constraints imposed by pattern structure

Apriori-like approach: Use frequent k-subgraphs to generate frequent (k+1) subgraphs

What is k?

Support: number of graphs that contain a particular subgraph

Apriori principle still holds

Level-wise (Apriori-like) approach:


Vertex growing


Edge growing

 


Apriori-like algorithm

Find frequent 1-subgraphs

Repeat:

Candidate generation

Candidate pruning

Support counting

Eliminate candidate k-subgraphs that are infrequent

Example dataset

In Apriori: Merging two frequent k-itemsets will produce a candidate (k+1)-itemset

In frequent subgraph mining (vertex/edge growing)

Merging two frequent k-subgraphs may produce more than one candidate (k+1)-subgraph

 

 


Multiplicity of candidates (vertices)


Multiplicity of candidates (edges)

Case 1: identical vertex labels

Case 2: Core contains identical labels

Case 3: Core multiplicity

 


Candidate Generation by edge growing

Case 1: if a ≠ c and b ≠ d Case 2: if a = c and b ≠ d Case 3: if a ≠ c and b = d

Case 4: if a = c and b = d