concrete.clustering.ttypes module

class concrete.clustering.ttypes.Cluster(clusterMemberIndexList=None, confidenceList=None, childIndexList=None)

Bases: object

A set of items which are alike in some way. Has an implicit id which is the
index of this Cluster in its parent Clustering’s ‘clusterList’.

- clusterMemberIndexList: The items in this cluster. Values are indices into the
‘clusterMemberList’ of the Clustering which contains this Cluster.
- confidenceList: Co-indexed with ‘clusterMemberIndexList’. The i-th value represents the
confidence that the item clusterMemberIndexList[i] belongs to this cluster.
- childIndexList: A set of clusters (implicit ids/indices) from which this cluster was
created. This cluster should represent the union of all the items in all
of the child clusters. (For hierarchical clustering only).
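
The co-indexing of ‘clusterMemberIndexList’ and ‘confidenceList’ can be
illustrated with a minimal sketch. The dataclass below is a stand-in that only
mirrors the documented field names; it is not the generated Thrift class, and
the helper ‘members_with_confidence’ is a hypothetical name introduced for this
example. Plain strings stand in for the items of a Clustering’s
‘clusterMemberList’.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


# Stand-in mirroring the documented fields; NOT the generated Thrift class.
@dataclass
class Cluster:
    clusterMemberIndexList: Optional[List[int]] = None
    confidenceList: Optional[List[float]] = None
    childIndexList: Optional[List[int]] = None


def members_with_confidence(
    cluster: Cluster, cluster_member_list: List[Any]
) -> List[Tuple[Any, Optional[float]]]:
    """Pair each item of `cluster` with its confidence (None if no list is set)."""
    indices = cluster.clusterMemberIndexList or []
    confidences = cluster.confidenceList
    if confidences is None:
        # Assumption of this sketch: a missing confidenceList means "unknown".
        confidences = [None] * len(indices)
    if len(confidences) != len(indices):
        raise ValueError(
            "confidenceList must be co-indexed with clusterMemberIndexList"
        )
    # Each index points into the parent Clustering's clusterMemberList.
    return [(cluster_member_list[i], c) for i, c in zip(indices, confidences)]


# Plain strings stand in for the parent Clustering's clusterMemberList.
items = ["mention-a", "mention-b", "mention-c"]
cluster = Cluster(clusterMemberIndexList=[0, 2], confidenceList=[0.9, 0.6])
pairs = members_with_confidence(cluster, items)
```

Here ‘pairs’ associates items 0 and 2 with the confidences 0.9 and 0.6
respectively; item 1 is not a member of this cluster.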

class concrete.clustering.ttypes.ClusterMember(communicationId=None, setId=None, elementId=None)

Bases: object

An item being clustered. Does not designate cluster _membership_, as in
“item x belongs to cluster C”, but rather just the item (“x” in this
example). Membership is indicated through Cluster objects. An item may be an
Entity, EntityMention, Situation, SituationMention, or technically anything
with a UUID.

- communicationId: UUID of the Communication which contains the item specified by ‘elementId’.
This is ancillary info assuming UUIDs are indeed universally unique.
- setId: UUID of the Entity|Situation(Mention)Set which contains the item specified by ‘elementId’.
This is ancillary info assuming UUIDs are indeed universally unique.
- elementId: UUID of the EntityMention, Entity, SituationMention, or Situation that
this item represents. This is the characteristic field.
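
Because ‘elementId’ is the characteristic field, one way to look up an item’s
position in a ‘clusterMemberList’ is to index by it. The following is a sketch
under stated assumptions: the dataclass is a stand-in mirroring the documented
fields (not the generated Thrift class), UUIDs are represented as plain
strings, and ‘index_by_element_id’ is a hypothetical helper. Rejecting
duplicate ‘elementId’ values is a choice made for this sketch, not something
the description above requires.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Stand-in mirroring the documented fields; NOT the generated Thrift class.
# UUIDs are modelled as plain strings for illustration only.
@dataclass
class ClusterMember:
    communicationId: Optional[str] = None
    setId: Optional[str] = None
    elementId: Optional[str] = None


def index_by_element_id(cluster_member_list: List[ClusterMember]) -> Dict[str, int]:
    """Map each elementId to its position in the member list."""
    index: Dict[str, int] = {}
    for position, member in enumerate(cluster_member_list):
        if member.elementId in index:
            # Sketch-level assumption: treat repeated elementIds as an error.
            raise ValueError(f"duplicate elementId: {member.elementId}")
        index[member.elementId] = position
    return index


members = [
    ClusterMember("comm-1", "set-1", "mention-a"),
    ClusterMember("comm-2", "set-2", "mention-b"),
]
element_index = index_by_element_id(members)
```

The resulting positions are the values a Cluster would store in its
‘clusterMemberIndexList’.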

class concrete.clustering.ttypes.Clustering(uuid=None, metadata=None, clusterMemberList=None, clusterList=None, rootClusterIndexList=None)

Bases: object

An (optionally) hierarchical clustering of items appearing across a set of
Communications (intra-Communication clusterings are encoded by Entities and
Situations). An item may be an Entity, EntityMention, Situation,
SituationMention, or technically anything with a UUID.

- uuid: UUID for this Clustering object.
- metadata: Metadata for this Clustering object.
- clusterMemberList: The set of items being clustered.
- clusterList: Clusters of items. If this is a hierarchical clustering, this may contain
clusters which are unions of smaller clusters.
Clusters may not “overlap”, meaning (for all clusters X, Y):
X \cap Y \neq \emptyset \implies X \subset Y \vee Y \subset X
- rootClusterIndexList: A set of disjoint clusters (indices in ‘clusterList’) which cover all
items in ‘clusterMemberList’. This list must be specified for hierarchical
clusterings and should not be specified for flat clusterings.
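
The two structural constraints above (clusters that share an item must be
nested, and root clusters must be disjoint and cover every item) can be checked
mechanically. The sketch below is illustrative only: it works on plain Python
sets of item indices rather than on Cluster objects, the function names are
hypothetical, and it treats two identical clusters as nested (subset-or-equal),
which is slightly more permissive than a strict reading of the subset symbol.

```python
from itertools import combinations
from typing import List, Set


def is_laminar(cluster_item_sets: List[Set[int]]) -> bool:
    """True if every pair of clusters that shares an item is nested."""
    for x, y in combinations(cluster_item_sets, 2):
        if x & y and not (x <= y or y <= x):
            return False
    return True


def roots_partition_items(
    cluster_item_sets: List[Set[int]], root_indices: List[int], num_items: int
) -> bool:
    """True if the root clusters are pairwise disjoint and cover items 0..num_items-1."""
    seen: Set[int] = set()
    for root in root_indices:
        items = cluster_item_sets[root]
        if seen & items:
            return False  # two roots share an item
        seen |= items
    return seen == set(range(num_items))


# Four items. Clusters 0 and 1 are leaves, cluster 2 is their union,
# and cluster 3 is a singleton that is not merged with anything.
item_sets = [{0, 1}, {2}, {0, 1, 2}, {3}]
ok_laminar = is_laminar(item_sets)
ok_roots = roots_partition_items(item_sets, [2, 3], 4)

# {0, 1} and {1, 2} share item 1 but neither contains the other.
bad_laminar = is_laminar([{0, 1}, {1, 2}])
```

In this example clusters 2 and 3 play the role of ‘rootClusterIndexList’;
choosing roots [0, 2] instead would fail the check because both contain
items 0 and 1.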
