Class VoltageClusterer<V,E>
- java.lang.Object
-
- edu.uci.ics.jung.algorithms.cluster.VoltageClusterer<V,E>
-
public class VoltageClusterer<V,E> extends java.lang.Object
Clusters vertices of a
Graph
based on their ranks as calculated byVoltageScorer
. This algorithm is based on, but not identical with, the method described in the paper below. The primary difference is that Wu and Huberman assume a priori that the clusters are of approximately the same size, and therefore use a more complex method than k-means (which is used here) for determining cluster membership based on co-occurrence data.The algorithm proceeds as follows:
-
first, generate a set of candidate clusters as follows:
-
pick (widely separated) vertex pair, run VoltageScorer
group the vertices in two clusters according to their voltages
store resulting candidate clusters
-
pick a vertex v as a cluster 'seed'
(Wu/Huberman: most frequent vertex in candidate clusters) calculate co-occurrence over all candidate clusters of v with each other vertex separate co-occurrence counts into high/low; high vertices constitute a cluster remove v's vertices from candidate clusters; continueNOTE: Depending on how the co-occurrence data splits the data into clusters, the number of clusters returned by this algorithm may be less than the number of clusters requested. The number of clusters will never be more than the number requested, however.
- See Also:
- "'Finding communities in linear time: a physics approach', Fang Wu and Bernardo Huberman, http://www.hpl.hp.com/research/idl/papers/linear/",
VoltageScorer
,KMeansClusterer
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description protected class
VoltageClusterer.MapValueArrayComparator
-
Field Summary
Fields Modifier and Type Field Description protected edu.uci.ics.jung.graph.Graph<V,E>
g
protected KMeansClusterer<V>
kmc
protected int
num_candidates
protected java.util.Random
rand
-
Constructor Summary
Constructors Constructor Description VoltageClusterer(edu.uci.ics.jung.graph.Graph<V,E> g, int num_candidates)
Creates an instance of a VoltageCluster with the specified parameters.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected void
addOneCandidateCluster(java.util.LinkedList<java.util.Set<V>> candidates, java.util.Map<V,double[]> voltage_ranks)
alternative to addTwoCandidateClusters(): cluster vertices by voltages into 2 clusters.protected void
addTwoCandidateClusters(java.util.LinkedList<java.util.Set<V>> candidates, java.util.Map<V,double[]> voltage_ranks)
Do k-means with three intervals and pick the smaller two clusters (presumed to be on the ends); this is closer to the Wu-Huberman method.java.util.Collection<java.util.Set<V>>
cluster(int num_clusters)
Clusters the vertices ofg
intonum_clusters
clusters, based on their connectivity.protected java.util.Collection<java.util.Set<V>>
cluster_internal(V origin, int num_clusters)
Does the work ofgetCommunity
andcluster
.java.util.Collection<java.util.Set<V>>
getCommunity(V v)
Returns a community (cluster) centered aroundv
.protected java.util.Map<V,double[]>
getObjectCounts(java.util.Collection<java.util.Set<V>> candidates, V seed)
protected java.util.List<V>
getSeedCandidates(java.util.Collection<java.util.Set<V>> candidates)
Returns an array of cluster seeds, ranked in decreasing order of number of appearances in the specified collection of candidate clusters.protected void
setRandomSeed(int random_seed)
-
-
-
Field Detail
-
num_candidates
protected int num_candidates
-
kmc
protected KMeansClusterer<V> kmc
-
rand
protected java.util.Random rand
-
-
Constructor Detail
-
VoltageClusterer
public VoltageClusterer(edu.uci.ics.jung.graph.Graph<V,E> g, int num_candidates)
Creates an instance of a VoltageCluster with the specified parameters. These are mostly parameters that are passed directly to VoltageScorer and KMeansClusterer.- Parameters:
num_candidates
- the number of candidate clusters to create
-
-
Method Detail
-
setRandomSeed
protected void setRandomSeed(int random_seed)
-
getCommunity
public java.util.Collection<java.util.Set<V>> getCommunity(V v)
Returns a community (cluster) centered aroundv
.- Parameters:
v
- the vertex whose community we wish to discover
-
cluster
public java.util.Collection<java.util.Set<V>> cluster(int num_clusters)
Clusters the vertices ofg
intonum_clusters
clusters, based on their connectivity.- Parameters:
num_clusters
- the number of clusters to identify
-
cluster_internal
protected java.util.Collection<java.util.Set<V>> cluster_internal(V origin, int num_clusters)
Does the work ofgetCommunity
andcluster
.- Parameters:
origin
- the vertex around which clustering is to be donenum_clusters
- the (maximum) number of clusters to find
-
addTwoCandidateClusters
protected void addTwoCandidateClusters(java.util.LinkedList<java.util.Set<V>> candidates, java.util.Map<V,double[]> voltage_ranks)
Do k-means with three intervals and pick the smaller two clusters (presumed to be on the ends); this is closer to the Wu-Huberman method.- Parameters:
candidates
-voltage_ranks
-
-
addOneCandidateCluster
protected void addOneCandidateCluster(java.util.LinkedList<java.util.Set<V>> candidates, java.util.Map<V,double[]> voltage_ranks)
alternative to addTwoCandidateClusters(): cluster vertices by voltages into 2 clusters. We only consider the smaller of the two clusters returned by k-means to be a 'true' cluster candidate; the other is a garbage cluster.- Parameters:
candidates
-voltage_ranks
-
-
getSeedCandidates
protected java.util.List<V> getSeedCandidates(java.util.Collection<java.util.Set<V>> candidates)
Returns an array of cluster seeds, ranked in decreasing order of number of appearances in the specified collection of candidate clusters.- Parameters:
candidates
-
-
-