concrete.search package

class concrete.search.ttypes.SearchCapability(type=None, lang=None)

Bases: object


A search provider describes its capabilities with a list of search type and language pairs.

Attributes:
- type: A type of search supported by the search provider
- lang: Language that the search provider supports.
Use ISO 639-2/T three letter codes.


read(iprot)
validate()
write(oprot)
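
A minimal sketch of how a client might check a provider's advertised capabilities before sending a query. The `Capability` namedtuple and the `supports` helper are hypothetical stand-ins that mirror the documented `type` and `lang` fields, so the example runs without the concrete package installed. They are not part of the library.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the documented SearchCapability fields.
Capability = namedtuple("Capability", ["type", "lang"])

# Integer values copied from the SearchType listing on this page.
COMMUNICATIONS = 0
SENTENCES = 2


def supports(capabilities, search_type, lang):
    """Return True if any advertised pair matches the requested type and language."""
    return any(c.type == search_type and c.lang == lang for c in capabilities)


# Languages are ISO 639-2/T three letter codes, as the attribute notes require.
advertised = [Capability(COMMUNICATIONS, "eng"), Capability(SENTENCES, "spa")]
print(supports(advertised, SENTENCES, "spa"))
print(supports(advertised, SENTENCES, "eng"))
```

Because capabilities are pairs, support for a search type in one language says nothing about that type in another language.
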
class concrete.search.ttypes.SearchFeedback

Bases: object


Feedback values


NEGATIVE = -1
NONE = 0
POSITIVE = 1
class concrete.search.ttypes.SearchQuery(terms=None, questions=None, communicationId=None, tokens=None, rawQuery=None, auths=None, userId=None, name=None, labels=None, type=None, lang=None, corpus=None, k=None, communication=None)

Bases: object


Wrapper for information relevant to a (possibly structured) search.

Attributes:
- terms: Individual words, or multiword phrases, e.g., ‘dog’, ‘blue
cheese’. It is the responsibility of the implementation of
Search* to tokenize multiword phrases, if so desired. Further,
an implementation may choose to support advanced features such as
wildcards, e.g., ‘blue*’. This specification makes no
commitment as to the internal structure of keywords and their
semantics: that is the responsibility of the individual
implementation.
- questions: Natural-language questions, e.g., “what is the capital of spain?”

questions is a list so that alternative phrasings of the same
question can be included, e.g., “what is the name of spain’s
capital?”
- communicationId: Refers to an optional communication that can provide context for the query.
- tokens: Refers to a sequence of tokens in the communication referenced by communicationId.
- rawQuery: The input from the user provided in the search box, unmodified
- auths: optional authorization mechanism
- userId: Identifies the user who submitted the search query
- name: Human readable name of the query.
- labels: Properties of the query or user.
These labels can be used to group queries and results by a domain or group of
users for training. An example usage would be assigning the geographical region
as a label (“spain”). User labels could be based on organizational units (“hltcoe”).
- type: The type of data to search over (communications, sentences, entities)
- lang: The language of the corpus that the user wants to search.
Use ISO 639-2/T three letter codes.
- corpus: An identifier of the corpus that the search is to be performed over.
- k: The maximum number of candidates the search service should return.
- communication: An optional communication used as context for the query.
If both this field and communicationId are populated, then it is
assumed the ID of the communication is the same as communicationId.


read(iprot)
validate()
write(oprot)
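
The assumption relating `communication` and `communicationId` can be checked before a query is sent. The sketch below is illustrative only: `context_is_consistent` is a hypothetical helper, the objects are `SimpleNamespace` stand-ins rather than real `SearchQuery` instances, and the `id` attribute on the communication is an assumption of this example.

```python
from types import SimpleNamespace


def context_is_consistent(query):
    """Check the documented assumption relating communication and communicationId.

    Returns True when at most one of the two fields is set, or when both are
    set and the embedded communication's id equals communicationId.
    """
    if query.communication is None or query.communicationId is None:
        return True
    return query.communication.id == query.communicationId


# Stand-ins for a Communication and two SearchQuery objects (assumed shapes).
comm = SimpleNamespace(id="doc-17")
ok = SimpleNamespace(terms=["blue cheese"], communicationId="doc-17", communication=comm)
bad = SimpleNamespace(terms=["blue cheese"], communicationId="doc-99", communication=comm)

print(context_is_consistent(ok))
print(context_is_consistent(bad))
```
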
class concrete.search.ttypes.SearchResult(uuid=None, searchQuery=None, searchResultItems=None, metadata=None, lang=None)

Bases: object


Single wrapper for results from all the various Search* services.

Attributes:
- uuid: Unique identifier for the results of this search.
- searchQuery: The query that led to this result.
Useful for capturing feedback or building training data.
- searchResultItems: The list is assumed to be sorted from best to worst, which
should be reflected by the values contained in the score field of each
SearchResultItem, if that field is populated.
- metadata: The system that provided the response. The likely use case
for populating this field is building training data. Presumably
a system will not need/want to return this object in live use.
- lang: The dominant language of the search results.
Use ISO 639-2/T three letter codes.
Search providers should set this when possible to support downstream processing.
Do not set if it is not known.
If multilingual, use the string “multilingual”.


read(iprot)
validate()
write(oprot)
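
The ordering assumption on `searchResultItems` (best to worst, with higher scores being better) can be verified on the client side. This is a sketch under stated assumptions: `is_sorted_best_first` is a hypothetical helper, the items are `SimpleNamespace` stand-ins, and skipping items without a score is a choice made for this example, not documented behavior.

```python
from types import SimpleNamespace


def is_sorted_best_first(items):
    """Return True if the populated scores never increase down the list."""
    # Items with no score are skipped, since the score field is optional.
    scores = [item.score for item in items if item.score is not None]
    return all(a >= b for a, b in zip(scores, scores[1:]))


# Scores are not restricted to [0, 1]; only their relative order matters.
ranked = [SimpleNamespace(score=12.5), SimpleNamespace(score=None), SimpleNamespace(score=-1.2)]
shuffled = [SimpleNamespace(score=0.2), SimpleNamespace(score=0.9)]

print(is_sorted_best_first(ranked))
print(is_sorted_best_first(shuffled))
```
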
class concrete.search.ttypes.SearchResultItem(communicationId=None, sentenceId=None, score=None, tokens=None, entity=None)

Bases: object


An individual element returned from a search. Most, if not all, methods
will return a communicationId, possibly with an associated score.
Further fields depend on the target element type of the search: for
example, if it is Sentence, then the sentenceId field should be populated.

Attributes:
- communicationId
- sentenceId: The UUID of the returned sentence, which appears in the
communication referenced by communicationId.
- score: Values are not restricted in range (e.g., do not have to be
within [0,1]). Higher is better.

- tokens: If SearchType=ENTITY_MENTIONS then this field should be populated.
Otherwise, this field may be optionally populated in order to
provide a hint to the client as to where to center a
visualization, or the extraction of context, etc.
- entity: If SearchType=ENTITIES then this field should be populated.


read(iprot)
validate()
write(oprot)
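
The attribute notes above tie certain fields to the search type. The sketch below collects those rules in one lookup table; `EXPECTED_FIELD` and `missing_expected_field` are hypothetical names introduced for illustration, and the item is a `SimpleNamespace` stand-in rather than a real `SearchResultItem`.

```python
from types import SimpleNamespace

# Integer values copied from the SearchType listing on this page.
COMMUNICATIONS, SENTENCES, ENTITIES, ENTITY_MENTIONS = 0, 2, 3, 4

# Field the documentation says should be populated for each search type.
EXPECTED_FIELD = {
    SENTENCES: "sentenceId",
    ENTITIES: "entity",
    ENTITY_MENTIONS: "tokens",
}


def missing_expected_field(item, search_type):
    """Return the name of the expected field if it is unset, else None."""
    field = EXPECTED_FIELD.get(search_type)
    if field is not None and getattr(item, field, None) is None:
        return field
    return None


item = SimpleNamespace(communicationId="doc-17", sentenceId=None,
                       score=0.4, tokens=None, entity=None)
print(missing_expected_field(item, SENTENCES))
print(missing_expected_field(item, COMMUNICATIONS))
```

Search types with no documented field requirement, such as COMMUNICATIONS, are treated here as having nothing to check.
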
class concrete.search.ttypes.SearchType

Bases: object


The kind of data being searched over.


COMMUNICATIONS = 0
ENTITIES = 3
ENTITY_MENTIONS = 4
SECTIONS = 1
SENTENCES = 2
SITUATIONS = 5
SITUATION_MENTIONS = 6