Clustering

Distance matrix

class satyr.Distances

satyr.Distances - class representing distances between objects

Usage:

satyr.Distances(m, n) - creates an m-by-n distance matrix

satyr.Distances([threads], m, dist_type=DISTANCE_LEVENSHTEIN) - compares the first m threads in the list with the others

dist_type (optional): DISTANCE_LEVENSHTEIN, DISTANCE_JACCARD or DISTANCE_DAMERAU_LEVENSHTEIN
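To give a feel for what the dist_type options measure, here is a minimal pure-Python sketch of Levenshtein and Jaccard distances over lists of frame names. The functions levenshtein and jaccard_distance and the sample threads are illustrative assumptions, not satyr code; satyr may normalize or weight its distances differently. Damerau-Levenshtein additionally counts a swap of two adjacent items as a single edit.

```python
def levenshtein(a, b):
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x by y
        prev = curr
    return prev[-1]


def jaccard_distance(a, b):
    """One minus the ratio of shared to total distinct items."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


# Hypothetical threads, reduced to lists of function names.
t1 = ["main", "parse", "read", "abort"]
t2 = ["main", "parse", "write", "abort"]
t3 = ["main", "abort"]

print(levenshtein(t1, t2), levenshtein(t1, t3))
print(jaccard_distance(t1, t2), jaccard_distance(t1, t3))
```

Levenshtein is sensitive to the order of frames, while Jaccard only looks at which names occur, so the two can rank the same pair of threads differently.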

dup()

Usage: distances.dup()

Returns: satyr.Distances - a new clone of the distances

Clones the distances object. All new structures are independent of the original object.

get_distance()

Usage: distances.get_distance(i, j)

Returns: non-negative float - distance between objects i and j

get_size()

Usage: distances.get_size()

Returns: (m, n) - size of the distance matrix

set_distance()

Usage: distances.set_distance(i, j, d)

Sets the distance between objects i and j to d.
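The methods above can be combined as in the following sketch. So that it runs without satyr installed, it uses StandInDistances, a hypothetical stand-in that only mirrors the documented method names; its symmetric storage and its default of 0.0 for unset pairs are assumptions, not documented satyr behaviour. The helper closest_pair is likewise illustrative.

```python
class StandInDistances:
    """Hypothetical stand-in mirroring the documented methods; not satyr code."""

    def __init__(self, m, n):
        self._m, self._n = m, n
        self._d = {}

    def get_size(self):
        return (self._m, self._n)

    def set_distance(self, i, j, d):
        # Assumption: distances are symmetric, so store under a sorted key.
        self._d[(min(i, j), max(i, j))] = float(d)

    def get_distance(self, i, j):
        # Assumption: unset pairs read back as 0.0.
        return self._d.get((min(i, j), max(i, j)), 0.0)

    def dup(self):
        clone = StandInDistances(self._m, self._n)
        clone._d = dict(self._d)
        return clone


def closest_pair(distances):
    """Return (i, j, d) with the smallest distance among pairs i < j."""
    m, n = distances.get_size()
    best = None
    for i in range(m):
        for j in range(i + 1, n):
            d = distances.get_distance(i, j)
            if best is None or d < best[2]:
                best = (i, j, d)
    return best


# With satyr installed, this line would read satyr.Distances(3, 3).
distances = StandInDistances(3, 3)
distances.set_distance(0, 1, 0.4)
distances.set_distance(0, 2, 0.5)
distances.set_distance(1, 2, 0.9)

backup = distances.dup()            # independent copy of the matrix
distances.set_distance(1, 2, 0.1)   # does not affect the copy

print(closest_pair(distances))
print(closest_pair(backup))
```

Because closest_pair only calls get_size() and get_distance(), it should accept any object that offers those two methods.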

Dendrogram

class satyr.Dendrogram

satyr.Dendrogram - a dendrogram created by a clustering algorithm

Usage: satyr.Dendrogram(distances) - creates a new dendrogram from a distance matrix

cut()

Usage: dendrogram.cut(level, min_size)

Returns: list of clusters (lists of objects) that contain at least min_size objects and were merged at a distance of at most level

get_merge_level()

Usage: dendrogram.get_merge_level(i)

Returns: float - merge level between clusters at positions i and i + 1

get_object()

Usage: dendrogram.get_object(i)

Returns: integer - index of the object at position i

get_size()

Usage: dendrogram.get_size()

Returns: integer - number of objects in the dendrogram
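The accessors above describe the dendrogram as an ordered row of objects with a merge level between each pair of neighbours. The sketch below shows one way a cut could be rebuilt from them, assuming that neighbouring positions belong to the same cluster exactly when their merge level is at most level. StandInDendrogram and cut_by_level are hypothetical names used so the example runs without satyr; the real cut() may differ in details.

```python
class StandInDendrogram:
    """Hypothetical stand-in exposing the documented accessors; not satyr code."""

    def __init__(self, order, merge_levels):
        assert len(merge_levels) == len(order) - 1
        self._order = list(order)
        self._levels = list(merge_levels)

    def get_size(self):
        return len(self._order)

    def get_object(self, i):
        return self._order[i]

    def get_merge_level(self, i):
        return self._levels[i]


def cut_by_level(dendrogram, level, min_size):
    """Group neighbouring positions whose merge level is at most `level`."""
    size = dendrogram.get_size()
    if size == 0:
        return []
    clusters = []
    current = [dendrogram.get_object(0)]
    for i in range(size - 1):
        if dendrogram.get_merge_level(i) <= level:
            # Positions i and i + 1 were merged early enough: same cluster.
            current.append(dendrogram.get_object(i + 1))
        else:
            # Merge happened above the cut: start a new cluster.
            clusters.append(current)
            current = [dendrogram.get_object(i + 1)]
    clusters.append(current)
    # Drop clusters smaller than min_size.
    return [c for c in clusters if len(c) >= min_size]


# Made-up ordering and merge levels for five objects.
dendrogram = StandInDendrogram(order=[2, 0, 3, 1, 4],
                               merge_levels=[0.1, 0.8, 0.3, 0.6])
clusters = cut_by_level(dendrogram, 0.5, 2)
print(clusters)
```

Raising level merges more neighbours into one cluster, and raising min_size filters out small clusters such as the lone object at the last position.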
