API Documentation
Below is the documentation for the public classes and functions of elasticsearch_dsl.
Search
-
class
elasticsearch_dsl.
Search
(**kwargs)¶ Search request to elasticsearch.
Parameters: - using – Elasticsearch instance to use
- index – limit the search to index
- doc_type – only query this type.
All the parameters supplied (or omitted) at creation time can be later overridden by methods (using, index and doc_type respectively).
-
count
()¶ Return the number of hits matching the query and filters. Note that only the actual number is returned.
-
delete
() Executes the query by delegating to delete_by_query().
-
execute
(ignore_cache=False) Execute the search and return an instance of Response wrapping all the data.
Parameters: response_class – optional subclass of Response to use instead.
-
classmethod
from_dict
(d)¶ Construct a new Search instance from a raw dict containing the search body. Useful when migrating from raw dictionaries.
Example:
s = Search.from_dict({
    "query": {
        "bool": {
            "must": [...]
        }
    },
    "aggs": {...}
})
s = s.filter('term', published=True)
-
highlight
(*fields, **kwargs) Request highlighting of some fields. All keyword arguments passed in will be used as parameters for all the fields in the fields parameter. Example:
Search().highlight('title', 'body', fragment_size=50)
will produce the equivalent of:
{
    "highlight": {
        "fields": {
            "body": {"fragment_size": 50},
            "title": {"fragment_size": 50}
        }
    }
}
If you want to have different options for different fields you can call highlight twice:
Search().highlight('title', fragment_size=50).highlight('body', fragment_size=100)
which will produce:
{
    "highlight": {
        "fields": {
            "body": {"fragment_size": 100},
            "title": {"fragment_size": 50}
        }
    }
}
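The way repeated highlight calls accumulate per-field options can be modelled in plain Python. This is only a sketch of the behaviour described above, not the library's implementation; add_highlight is a hypothetical helper that operates on a raw request body.

```python
def add_highlight(body, *fields, **options):
    """Return a copy of ``body`` with ``options`` recorded for each named field.

    Hypothetical helper: it only mimics the documented result of
    Search.highlight() on a plain dictionary.
    """
    highlight = dict(body.get("highlight", {}))
    highlight["fields"] = dict(highlight.get("fields", {}))
    for field in fields:
        # every field named in this call receives the same options
        highlight["fields"][field] = dict(options)
    return {**body, "highlight": highlight}


# one call, shared options
shared = add_highlight({}, "title", "body", fragment_size=50)

# two calls, different options per field
body = add_highlight({}, "title", fragment_size=50)
body = add_highlight(body, "body", fragment_size=100)
print(body)
```

Each call returns a new dictionary, which mirrors the chaining style used by Search.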
-
highlight_options
(**kwargs)¶ Update the global highlighting options used for this request. For example:
s = Search()
s = s.highlight_options(order='score')
-
response_class
(cls)¶ Override the default wrapper used for the response.
-
scan
()¶ Turn the search into a scan search and return a generator that will iterate over all the documents matching the query.
Use the params method to specify any additional arguments you wish to pass to the underlying scan helper from elasticsearch-py - https://elasticsearch-py.readthedocs.io/en/master/helpers.html#elasticsearch.helpers.scan
-
script_fields
(**kwargs)¶ Define script fields to be calculated on hits. See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html for more details.
Example:
s = Search()
s = s.script_fields(times_two="doc['field'].value * 2")
s = s.script_fields(
    times_three={
        'script': {
            'inline': "doc['field'].value * params.n",
            'params': {'n': 3}
        }
    }
)
-
sort
(*keys) Add sorting information to the search request. If called without arguments it will remove all sort requirements. Otherwise it will replace them. Acceptable arguments are:
'some.field'
'-some.other.field'
{'different.field': {'any': 'dict'}}
so for example:
s = Search().sort(
    'category',
    '-title',
    {"price" : {"order" : "asc", "mode" : "avg"}}
)
will sort by category, title (in descending order) and price in ascending order using the avg mode.
The API returns a copy of the Search object and can thus be chained.
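The three accepted argument forms can be illustrated with a small plain-Python sketch. It assumes the leading '-' is shorthand for an explicit descending order clause; normalize_sort_keys is a hypothetical helper, not part of elasticsearch_dsl.

```python
def normalize_sort_keys(*keys):
    """Expand the '-field' shorthand into an explicit descending clause.

    Plain field names and dictionaries are passed through untouched.
    """
    normalized = []
    for key in keys:
        if isinstance(key, str) and key.startswith("-"):
            key = {key[1:]: {"order": "desc"}}
        normalized.append(key)
    return normalized


sort_clause = normalize_sort_keys(
    "category",
    "-title",
    {"price": {"order": "asc", "mode": "avg"}},
)
print(sort_clause)
```

Calling the helper without arguments yields an empty list, which corresponds to removing all sort requirements.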
-
source
(fields=None, **kwargs) Selectively control how the _source field is returned.
Parameters: fields – wildcard string, array of wildcards, or dictionary of includes and excludes
If fields is None, the entire document will be returned for each hit. If fields is a dictionary with keys of ‘include’ and/or ‘exclude’ the fields will be either included or excluded appropriately.
Calling this multiple times with the same named parameter will override the previous values with the new ones.
Example:
s = Search()
s = s.source(include=['obj1.*'], exclude=["*.description"])

s = Search()
s = s.source(include=['obj1.*']).source(exclude=["*.description"])
-
suggest
(name, text, **kwargs)¶ Add a suggestions request to the search.
Parameters: - name – name of the suggestion
- text – text to suggest on
All keyword arguments will be added to the suggestions body. For example:
s = Search()
s = s.suggest('suggestion-1', 'Elasticsearch', term={'field': 'body'})
-
to_dict
(count=False, **kwargs)¶ Serialize the search into the dictionary that will be sent over as the request’s body.
Parameters: count – a flag to specify we are interested in a body for count - no aggregations, no pagination bounds etc.
All additional keyword arguments will be included into the dictionary.
-
update_from_dict
(d) Apply options from a serialized body to the current instance. Modifies the object in-place. Used mostly by from_dict.
-
class
elasticsearch_dsl.
MultiSearch
(**kwargs) Combine multiple Search objects into a single request.
-
add
(search) Adds a new Search object to the request:
ms = MultiSearch(index='my-index')
ms = ms.add(Search(doc_type=Category).filter('term', category='python'))
ms = ms.add(Search(doc_type=Blog))
-
execute
(ignore_cache=False, raise_on_error=True)¶ Execute the multi search request and return a list of search results.
-
Document
-
class
elasticsearch_dsl.
Document
(meta=None, **kwargs)¶ Model-like class for persisting documents in elasticsearch.
-
delete
(using=None, index=None, **kwargs) Delete the instance in elasticsearch.
Parameters: - index – elasticsearch index to use, if the Document is associated with an index this can be omitted.
- using – connection alias to use, defaults to 'default'
Any additional keyword arguments will be passed to Elasticsearch.delete unchanged.
-
classmethod
get
(id, using=None, index=None, **kwargs) Retrieve a single document from elasticsearch using its id.
Parameters: - id – id of the document to be retrieved
- index – elasticsearch index to use, if the Document is associated with an index this can be omitted.
- using – connection alias to use, defaults to 'default'
Any additional keyword arguments will be passed to Elasticsearch.get unchanged.
-
classmethod
init
(index=None, using=None)¶ Create the index and populate the mappings in elasticsearch.
-
classmethod
mget
(docs, using=None, index=None, raise_on_error=True, missing='none', **kwargs) Retrieve multiple documents by their ids. Returns a list of instances in the same order as requested.
Parameters: - docs – list of ids of the documents to be retrieved or a list of document specifications as per https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-multi-get.html
- index – elasticsearch index to use, if the Document is associated with an index this can be omitted.
- using – connection alias to use, defaults to 'default'
- missing – what to do when one of the documents requested is not found. Valid options are 'none' (use None), 'raise' (raise NotFoundError) or 'skip' (ignore the missing document).
Any additional keyword arguments will be passed to Elasticsearch.mget unchanged.
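The three values of missing can be compared side by side with a plain-Python sketch. It does not talk to elasticsearch; resolve_missing and the local NotFoundError class are hypothetical stand-ins that only model the behaviour described above.

```python
class NotFoundError(Exception):
    """Local stand-in so the sketch runs without the elasticsearch client."""


def resolve_missing(ids, found, missing="none"):
    """Order ``found`` documents by ``ids`` and handle absent ones.

    ``found`` maps id -> document; ``missing`` is 'none', 'raise' or 'skip'.
    """
    if missing not in ("none", "raise", "skip"):
        raise ValueError("missing must be 'none', 'raise' or 'skip'")
    results = []
    for doc_id in ids:
        if doc_id in found:
            results.append(found[doc_id])
        elif missing == "raise":
            raise NotFoundError(doc_id)
        elif missing == "none":
            results.append(None)
        # missing == 'skip': the absent document is left out entirely
    return results


found = {"1": "doc-1", "3": "doc-3"}
print(resolve_missing(["1", "2", "3"], found))
print(resolve_missing(["1", "2", "3"], found, missing="skip"))
```

With 'none' the result keeps one slot per requested id, so positions still line up with the request; with 'skip' the result may be shorter than the request.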
-
save
(using=None, index=None, validate=True, **kwargs) Save the document into elasticsearch. If the document doesn’t exist it is created; otherwise it is overwritten. Returns True if this operation resulted in a new document being created.
Parameters: - index – elasticsearch index to use, if the Document is associated with an index this can be omitted.
- using – connection alias to use, defaults to 'default'
- validate – set to False to skip validating the document
Any additional keyword arguments will be passed to Elasticsearch.index unchanged.
-
classmethod
search
(using=None, index=None) Create a Search instance that will search over this Document.
-
to_dict
(include_meta=False, skip_empty=True) Serialize the instance into a dictionary so that it can be saved in elasticsearch.
Parameters: - include_meta – if set to True will include all the metadata (_index, _type, _id etc). Otherwise just the document’s data is serialized. This is useful when passing multiple instances into elasticsearch.helpers.bulk.
- skip_empty – if set to False will cause empty values (None, [], {}) to be left on the document. Those values will be stripped out otherwise as they make no difference in elasticsearch.
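The effect of skip_empty can be pictured with a plain-Python sketch. It is an illustration only: strip_empty is a hypothetical helper that works on the top level of an ordinary dictionary, whereas the library applies the rule to its own document objects.

```python
def strip_empty(data, skip_empty=True):
    """Drop keys whose value is None, [] or {} when ``skip_empty`` is true.

    Falsy values that still carry meaning (0, False, '') are kept.
    """
    if not skip_empty:
        return dict(data)
    return {
        key: value
        for key, value in data.items()
        if value is not None and value != [] and value != {}
    }


doc = {"title": "Hello", "tags": [], "meta": {}, "views": 0, "author": None}
print(strip_empty(doc))
print(strip_empty(doc, skip_empty=False))
```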
-
update
(using=None, index=None, detect_noop=True, doc_as_upsert=False, refresh=False, **fields) Partial update of the document. Specify the fields you wish to update and both the instance and the document in elasticsearch will be updated:
doc = MyDocument(title='Document Title!')
doc.save()
doc.update(title='New Document Title!')
Parameters: - index – elasticsearch index to use, if the Document is associated with an index this can be omitted.
- using – connection alias to use, defaults to 'default'
Any additional keyword arguments will be passed to Elasticsearch.update unchanged.
-
Index
-
class
elasticsearch_dsl.
Index
(name, doc_type='doc', using='default')¶ Parameters: - name – name of the index
- using – connection alias to use, defaults to
'default'
-
aliases
(**kwargs)¶ Add aliases to the index definition:
i = Index('blog-v2')
i.aliases(blog={}, published={'filter': Q('term', published=True)})
-
analyze
(using=None, **kwargs)¶ Perform the analysis process on a text and return the tokens breakdown of the text.
Any additional keyword arguments will be passed to
Elasticsearch.indices.analyze
unchanged.
-
analyzer
(analyzer)¶ Explicitly add an analyzer to an index. Note that all custom analyzers defined in mappings will also be created. This is useful for search analyzers.
Example:
from elasticsearch_dsl import analyzer, tokenizer

my_analyzer = analyzer('my_analyzer',
    tokenizer=tokenizer('trigram', 'nGram', min_gram=3, max_gram=3),
    filter=['lowercase']
)

i = Index('blog')
i.analyzer(my_analyzer)
-
clear_cache
(using=None, **kwargs) Clear all caches or specific caches associated with the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.clear_cache
unchanged.
-
clone
(name=None, doc_type=None, using=None)¶ Create a copy of the instance with another name or connection alias. Useful for creating multiple indices with shared configuration:
i = Index('base-index')
i.settings(number_of_shards=1)
i.create()

i2 = i.clone('other-index')
i2.create()
Parameters: - name – name of the index
- using – connection alias to use, defaults to
'default'
-
close
(using=None, **kwargs)¶ Closes the index in elasticsearch.
Any additional keyword arguments will be passed to
Elasticsearch.indices.close
unchanged.
-
create
(using=None, **kwargs)¶ Creates the index in elasticsearch.
Any additional keyword arguments will be passed to
Elasticsearch.indices.create
unchanged.
-
delete
(using=None, **kwargs)¶ Deletes the index in elasticsearch.
Any additional keyword arguments will be passed to
Elasticsearch.indices.delete
unchanged.
-
delete_alias
(using=None, **kwargs)¶ Delete specific alias.
Any additional keyword arguments will be passed to
Elasticsearch.indices.delete_alias
unchanged.
-
doc_type
(document) Associate a Document subclass with an index. This means that, when this index is created, it will contain the mappings for the Document. If the Document class doesn’t have a default index yet (by defining class Index), this instance will be used. Can be used as a decorator:
i = Index('blog')

@i.document
class Post(Document):
    title = Text()

# create the index, including Post mappings
i.create()

# .search() will now return a Search object that will return
# properly deserialized Post instances
s = i.search()
-
document
(document) Associate a Document subclass with an index. This means that, when this index is created, it will contain the mappings for the Document. If the Document class doesn’t have a default index yet (by defining class Index), this instance will be used. Can be used as a decorator:
i = Index('blog')

@i.document
class Post(Document):
    title = Text()

# create the index, including Post mappings
i.create()

# .search() will now return a Search object that will return
# properly deserialized Post instances
s = i.search()
-
exists
(using=None, **kwargs) Returns True if the index already exists in elasticsearch.
Any additional keyword arguments will be passed to Elasticsearch.indices.exists unchanged.
-
exists_alias
(using=None, **kwargs)¶ Return a boolean indicating whether given alias exists for this index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.exists_alias
unchanged.
-
exists_type
(using=None, **kwargs)¶ Check if a type/types exists in the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.exists_type
unchanged.
-
flush
(using=None, **kwargs) Performs a flush operation on the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.flush
unchanged.
-
flush_synced
(using=None, **kwargs)¶ Perform a normal flush, then add a generated unique marker (sync_id) to all shards.
Any additional keyword arguments will be passed to
Elasticsearch.indices.flush_synced
unchanged.
-
forcemerge
(using=None, **kwargs) The force merge API allows you to force the merging of the index. The merge relates to the number of segments a Lucene index holds within each shard. The force merge operation reduces the number of segments by merging them.
This call will block until the merge is complete. If the http connection is lost, the request will continue in the background, and any new requests will block until the previous force merge is complete.
Any additional keyword arguments will be passed to
Elasticsearch.indices.forcemerge
unchanged.
-
get
(using=None, **kwargs)¶ The get index API allows to retrieve information about the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get
unchanged.
-
get_alias
(using=None, **kwargs)¶ Retrieve a specified alias.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get_alias
unchanged.
-
get_field_mapping
(using=None, **kwargs)¶ Retrieve mapping definition of a specific field.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get_field_mapping
unchanged.
-
get_mapping
(using=None, **kwargs)¶ Retrieve specific mapping definition for a specific type.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get_mapping
unchanged.
-
get_settings
(using=None, **kwargs)¶ Retrieve settings for the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get_settings
unchanged.
-
get_upgrade
(using=None, **kwargs)¶ Monitor how much of the index is upgraded.
Any additional keyword arguments will be passed to
Elasticsearch.indices.get_upgrade
unchanged.
-
mapping
(mapping) Associate a mapping (an instance of Mapping) with this index. This means that, when this index is created, it will contain the mappings for the document type defined by those mappings.
-
open
(using=None, **kwargs)¶ Opens the index in elasticsearch.
Any additional keyword arguments will be passed to
Elasticsearch.indices.open
unchanged.
-
put_alias
(using=None, **kwargs)¶ Create an alias for the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.put_alias
unchanged.
-
put_mapping
(using=None, **kwargs)¶ Register specific mapping definition for a specific type.
Any additional keyword arguments will be passed to
Elasticsearch.indices.put_mapping
unchanged.
-
put_settings
(using=None, **kwargs)¶ Change specific index level settings in real time.
Any additional keyword arguments will be passed to
Elasticsearch.indices.put_settings
unchanged.
-
recovery
(using=None, **kwargs)¶ The indices recovery API provides insight into on-going shard recoveries for the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.recovery
unchanged.
-
refresh
(using=None, **kwargs) Performs a refresh operation on the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.refresh
unchanged.
-
save
(using=None)¶ Sync the index definition with elasticsearch, creating the index if it doesn’t exist and updating its settings and mappings if it does.
Note some settings and mapping changes cannot be done on an open index (or at all on an existing index) and for those this method will fail with the underlying exception.
-
search
(using=None) Return a Search object searching over the index (or all the indices belonging to this template) and its Documents.
-
segments
(using=None, **kwargs)¶ Provide low level segments information that a Lucene index (shard level) is built with.
Any additional keyword arguments will be passed to
Elasticsearch.indices.segments
unchanged.
-
settings
(**kwargs)¶ Add settings to the index:
i = Index('i')
i.settings(number_of_shards=1, number_of_replicas=0)
Multiple calls to settings will merge the keys, later overriding the earlier.
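The merge rule can be shown with ordinary dictionaries. This is a sketch of the documented semantics only; merge_settings is a hypothetical helper and does not touch an Index object.

```python
def merge_settings(current, **kwargs):
    """Return ``current`` updated with ``kwargs``; later keys win."""
    merged = dict(current)
    merged.update(kwargs)
    return merged


settings = merge_settings({}, number_of_shards=1, number_of_replicas=0)
# a second call keeps number_of_shards and overrides number_of_replicas
settings = merge_settings(settings, number_of_replicas=2)
print(settings)
```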
-
shard_stores
(using=None, **kwargs)¶ Provides store information for shard copies of the index. Store information reports on which nodes shard copies exist, the shard copy version, indicating how recent they are, and any exceptions encountered while opening the shard index or from earlier engine failure.
Any additional keyword arguments will be passed to
Elasticsearch.indices.shard_stores
unchanged.
-
shrink
(using=None, **kwargs)¶ The shrink index API allows you to shrink an existing index into a new index with fewer primary shards. The number of primary shards in the target index must be a factor of the shards in the source index. For example an index with 8 primary shards can be shrunk into 4, 2 or 1 primary shards or an index with 15 primary shards can be shrunk into 5, 3 or 1. If the number of shards in the index is a prime number it can only be shrunk into a single primary shard. Before shrinking, a (primary or replica) copy of every shard in the index must be present on the same node.
Any additional keyword arguments will be passed to
Elasticsearch.indices.shrink
unchanged.
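The factor rule can be checked before calling shrink. The sketch below is plain arithmetic and does not contact elasticsearch; valid_shrink_targets is a hypothetical helper name.

```python
def valid_shrink_targets(source_shards):
    """List the primary shard counts a source index can be shrunk into.

    A target is valid when it is smaller than the source count and
    divides it evenly.
    """
    if source_shards < 1:
        raise ValueError("an index has at least one primary shard")
    return [
        target
        for target in range(source_shards - 1, 0, -1)
        if source_shards % target == 0
    ]


print(valid_shrink_targets(8))
print(valid_shrink_targets(15))
print(valid_shrink_targets(7))
```

For a prime shard count the only entry is 1, matching the rule that such an index can only be shrunk into a single primary shard.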
-
stats
(using=None, **kwargs)¶ Retrieve statistics on different operations happening on the index.
Any additional keyword arguments will be passed to
Elasticsearch.indices.stats
unchanged.
-
upgrade
(using=None, **kwargs)¶ Upgrade the index to the latest format.
Any additional keyword arguments will be passed to
Elasticsearch.indices.upgrade
unchanged.
-
validate_query
(using=None, **kwargs)¶ Validate a potentially expensive query without executing it.
Any additional keyword arguments will be passed to
Elasticsearch.indices.validate_query
unchanged.
Faceted Search
-
class
elasticsearch_dsl.
FacetedSearch
(query=None, filters={}, sort=())¶ Abstraction for creating faceted navigation searches that takes care of composing the queries, aggregations and filters as needed as well as presenting the results in an easy-to-consume fashion:
class BlogSearch(FacetedSearch):
    index = 'blogs'
    doc_types = [Blog, Post]
    fields = ['title^5', 'category', 'description', 'body']

    facets = {
        'type': TermsFacet(field='_type'),
        'category': TermsFacet(field='category'),
        'weekly_posts': DateHistogramFacet(field='published_from', interval='week')
    }

    def search(self):
        ' Override search to add your own filters '
        s = super(BlogSearch, self).search()
        return s.filter('term', published=True)

# when using:
blog_search = BlogSearch("web framework", filters={"category": "python"})

# supports pagination
blog_search[10:20]

response = blog_search.execute()

# easy access to aggregation results:
for category, hit_count, is_selected in response.facets.category:
    print(
        "Category %s has %d hits%s." % (
            category,
            hit_count,
            ' and is chosen' if is_selected else ''
        )
    )
Parameters: - query – the text to search for
- filters – facet values to filter
- sort – sort information to be passed to
Search
-
add_filter
(name, filter_values)¶ Add a filter for a facet.
-
aggregate
(search)¶ Add aggregations representing the facets selected, including potential filters.
-
build_search
()¶ Construct the
Search
object.
-
execute
()¶ Execute the search and return the response.
-
filter
(search) Add a post_filter to the search request narrowing the results based on the facet filters.
-
highlight
(search)¶ Add highlighting for all the fields
-
query
(search, query) Add query part to search.
Override this if you wish to customize the query used.
-
search
()¶ Returns the base Search object to which the facets are added.
You can customize the query by overriding this method and returning a modified search object.
-
sort
(search)¶ Add sorting information to the request.
Mappings
If you wish to create mappings manually you can use the Mapping class. For more advanced use cases, however, we recommend you use the Document abstraction in combination with Index (or IndexTemplate) to define index-level settings and properties. The mapping definition follows a similar pattern to the query dsl:
from elasticsearch_dsl import Date, Keyword, Mapping, Nested, Text
# name your type
m = Mapping('my-type')
# add fields
m.field('title', 'text')
# you can use multi-fields easily
m.field('category', 'text', fields={'raw': Keyword()})
# you can also create a field manually
comment = Nested(
    properties={
        'author': Text(),
        'created_at': Date()
    })
# and attach it to the mapping
m.field('comments', comment)
# you can also define mappings for the meta fields
m.meta('_all', enabled=False)
# save the mapping into index 'my-index'
m.save('my-index')
Note
By default all fields (with the exception of Nested) will expect single values. You can always override this expectation during the field creation/definition by passing multi=True into the constructor (m.field('tags', Keyword(multi=True))). Then the value of the field, even if the field hasn’t been set, will be an empty list enabling you to write doc.tags.append('search').
Especially if you are using dynamic mappings it might be useful to update the mapping based on an existing type in Elasticsearch, or create the mapping directly from an existing type:
# get the mapping from our production cluster
m = Mapping.from_es('my-index', 'my-type', using='prod')
# update based on data in QA cluster
m.update_from_es('my-index', using='qa')
# update the mapping on production
m.save('my-index', using='prod')
Common field options:
multi
- If set to True the field’s value will be set to [] at first access.
required
- Indicates if a field requires a value for the document to be valid.