concrete.util.comm_container module

Communication Containers - mapping Communication IDs to Communications

Classes that behave like a read-only dictionary (implementing Python’s collections.abc.Mapping interface) and map Communication ID strings to Communications.

The classes abstract away the storage backend. If you need to optimize for performance, you may not want to use a dictionary abstraction that retrieves one Communication at a time.
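As a rough illustration of what "behaving like a read-only dictionary" means, the sketch below implements collections.abc.Mapping with only the standard library. LazyContainer and fake_loader are hypothetical names invented for this example and are not part of concrete; the loader stands in for whatever storage backend a real container would query.

```python
from collections.abc import Mapping


class LazyContainer(Mapping):
    """Hypothetical read-only mapping that loads each value on demand."""

    def __init__(self, ids, loader):
        self._ids = list(ids)    # IDs known when the container is built
        self._loader = loader    # callable: ID -> value (stand-in for a backend)

    def __getitem__(self, comm_id):
        if comm_id not in self._ids:
            raise KeyError(comm_id)
        return self._loader(comm_id)  # one backend call per lookup

    def __iter__(self):
        return iter(self._ids)

    def __len__(self):
        return len(self._ids)


calls = []  # records every backend call so laziness is observable


def fake_loader(comm_id):
    calls.append(comm_id)
    return {"id": comm_id}


container = LazyContainer(["doc_a", "doc_b"], fake_loader)
first = container["doc_a"]  # triggers exactly one load
```

Defining __getitem__, __iter__ and __len__ is enough for Mapping to supply keys(), items(), get() and the in operator. Because every lookup is a separate backend call, looping over many IDs costs one round trip per item, which is the performance caveat noted above.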

class concrete.util.comm_container.DirectoryBackedCommunicationContainer(directory_path, comm_extensions=['.comm', '.concrete', '.gz'], add_references=True)

Bases: collections.abc.Mapping

Maps Comm IDs to Comms, retrieving Comms from the filesystem

DirectoryBackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs to Communications. Communications are lazily retrieved from the filesystem.

Upon initialization, a DirectoryBackedCommunicationContainer instance will (recursively) search directory_path for any files that end with the specified comm_extensions. Files with matching extensions are assumed to be Communication files whose filename (sans extension) is the file’s Communication ID. So, for example, a file named ‘XIN_ENG_20101212.0120.concrete’ is assumed to be a Communication file with a Communication ID of ‘XIN_ENG_20101212.0120’.

Files with the extension .gz will be decompressed using gzip.

A DirectoryBackedCommunicationContainer will not find files that are added to directory_path after the container was initialized.
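The filename convention described above can be sketched with the standard library alone. index_comm_files is a hypothetical helper written for this example, not concrete's implementation; it assumes that only the final extension is stripped to obtain the Communication ID, and it does not read or decompress any file.

```python
import os
import tempfile


def index_comm_files(directory_path,
                     comm_extensions=('.comm', '.concrete', '.gz')):
    """Map assumed Communication IDs to file paths (illustrative only)."""
    id_to_path = {}
    for root, _dirs, files in os.walk(directory_path):  # recursive search
        for name in files:
            comm_id, ext = os.path.splitext(name)  # strips the last extension
            if ext in comm_extensions:
                id_to_path[comm_id] = os.path.join(root, name)
    return id_to_path


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "XIN_ENG_20101212.0120.concrete"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()  # ignored: wrong extension

    index = index_comm_files(tmp)  # snapshot taken here

    # A file created after the scan is not in the snapshot.
    open(os.path.join(tmp, "LATE_DOC.comm"), "w").close()

print(sorted(index))  # ['XIN_ENG_20101212.0120']
```

Because the index is built once, the sketch shows the same limitation as the container: LATE_DOC is never seen unless the directory is scanned again.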

Parameters:
class concrete.util.comm_container.FetchBackedCommunicationContainer(host, port)

Bases: collections.abc.Mapping

Maps Comm IDs to Comms, retrieving Comms from a FetchCommunicationService server

FetchBackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs to Communications. Communications are lazily retrieved from a FetchCommunicationService.

If you need to retrieve large amounts of data from a FetchCommunicationService, then you SHOULD NOT USE THIS CLASS. This class retrieves one Communication at a time using FetchCommunicationService.

Parameters:
class concrete.util.comm_container.MemoryBackedCommunicationContainer(communications_file, max_file_size=1073741824, add_references=True)

Bases: collections.abc.Mapping

Maps Comm IDs to Comms by loading all Comms in file into memory

MemoryBackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs to Communications. All Communications in communications_file will be read into memory using a CommunicationReader instance.

Parameters:
class concrete.util.comm_container.RedisHashBackedCommunicationContainer(redis_db, key, add_references=True)

Bases: collections.abc.Mapping

Provides access to Communications stored in a Redis hash, assuming the key of each communication is its Communication id.

RedisHashBackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs to Communications. Communications are lazily retrieved from a Redis hash.

Parameters:
class concrete.util.comm_container.S3BackedCommunicationContainer(bucket, prefix_len=4, add_references=True)

Bases: collections.abc.Mapping

Provides access to Communications stored in an AWS S3 bucket, assuming the key of each communication is its Communication id (optionally prefixed with a fixed-length, random-looking but deterministic hash to improve performance).

S3BackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs (with or without prefixes) to Communications. Communications are lazily retrieved from an S3 bucket.

References

http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html

Parameters:
  • bucket (boto.s3.bucket.Bucket) – S3 bucket object
  • prefix_len (int) – length of prefix in each Communication’s key in the bucket. This number of characters will be removed from the beginning of the key to determine the Communication id (without incurring the cost of fetching and deserializing the Communication). A prefix enables S3 to better partition the bucket contents, yielding higher performance and a lower chance of getting rate-limited by AWS.
  • add_references (bool) – If True, calls concrete.util.references.add_references_to_communication() on any retrieved Communication
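The effect of prefix_len can be sketched without touching S3. The source does not specify how the prefix is generated, so prefixed_key below is a hypothetical scheme (the leading hex characters of a SHA-256 digest of the ID) chosen only because it is fixed-length, deterministic and random-looking. comm_id_from_key shows the stripping step the parameter description refers to.

```python
import hashlib


def prefixed_key(comm_id, prefix_len=4):
    """Hypothetical key scheme: hash-derived prefix followed by the ID."""
    digest = hashlib.sha256(comm_id.encode("utf-8")).hexdigest()
    return digest[:prefix_len] + comm_id


def comm_id_from_key(key, prefix_len=4):
    """Recover the Communication ID by dropping the first prefix_len characters."""
    return key[prefix_len:]


key = prefixed_key("XIN_ENG_20101212.0120")
recovered = comm_id_from_key(key)
```

Since the ID is recovered from the key string alone, listing the bucket's keys is enough to enumerate IDs; no Communication has to be fetched or deserialized for that.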
class concrete.util.comm_container.ZipFileBackedCommunicationContainer(zipfile_path, comm_extensions=['.comm', '.concrete'], add_references=True)

Bases: collections.abc.Mapping

Maps Comm IDs to Comms, retrieving Comms from a Zip file

ZipFileBackedCommunicationContainer instances behave as dict-like data structures that map Communication IDs to Communications. Communications are lazily retrieved from a Zip file.
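Lazy retrieval from a Zip file might look like the sketch below, which uses only the standard zipfile module. index_zip_members is a hypothetical helper, and the assumption that member filenames (minus extension) serve as Communication IDs is borrowed from the directory-backed class rather than stated for this one. The payloads are plain bytes here, not serialized Communications.

```python
import os
import tempfile
import zipfile


def index_zip_members(zipfile_path, comm_extensions=('.comm', '.concrete')):
    """Map assumed Communication IDs to Zip member names (illustrative only)."""
    id_to_member = {}
    with zipfile.ZipFile(zipfile_path) as zf:
        for member in zf.namelist():  # reads the archive directory only
            comm_id, ext = os.path.splitext(os.path.basename(member))
            if ext in comm_extensions:
                id_to_member[comm_id] = member
    return id_to_member


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "comms.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("data/doc_1.comm", b"payload-1")
        zf.writestr("README.txt", b"not a Communication")

    members = index_zip_members(path)

    # Only the requested member is decompressed, and only when asked for.
    with zipfile.ZipFile(path) as zf:
        payload = zf.read(members["doc_1"])
```

Building the index touches only the archive's table of contents, so the cost of decompressing a member is deferred until that ID is looked up.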

Parameters: