Update By Query

The Update By Query object

The Update By Query object enables the use of the _update_by_query endpoint to perform an update on documents that match a search query.

The object is implemented as a modification of the Search object, containing a subset of its query methods, as well as a script method, which is used to make updates.

The Update By Query object implements the following Search query types:

  • queries

  • filters

  • excludes

For more information on queries, see the Search DSL chapter.

Like the Search object, the API is designed to be chainable. The Update By Query object is immutable: every change returns a shallow copy that contains the change and leaves the original untouched (the one exception is update_from_dict, described below, which alters the instance in place). You can therefore safely pass an Update By Query object to foreign code without fear of it modifying your objects, as long as that code sticks to the Update By Query object APIs.

You can define your client in a number of ways, but the preferred method is to use a global configuration. For more information on defining a client, see the Configuration chapter.

Once your client is defined, you can instantiate the Update By Query object as shown below:

from elasticsearch_dsl import UpdateByQuery

ubq = UpdateByQuery().using(client)
# or
ubq = UpdateByQuery(using=client)

Note

All methods return a copy of the object, making it safe to pass to outside code.

The API is chainable, allowing you to combine multiple method calls in one statement:

ubq = UpdateByQuery().using(client).query("match", title="python")

To send the request to Elasticsearch:

response = ubq.execute()

Note that chaining has a limit when it comes to the script method: calling script multiple times overwrites the previous value. Only a single script can be sent with a request, so an attempt to use two scripts results in only the second one being stored.

Consider the following example:

ubq = UpdateByQuery().using(client).script(source="ctx._source.likes++").script(source="ctx._source.likes+=2")

Here the script stored on the object is 'source': 'ctx._source.likes+=2'; the script from the earlier call is discarded.

For debugging purposes you can serialize the Update By Query object to a dict explicitly:

print(ubq.to_dict())

To use variables in a script, pass them through params. Because script returns a copy, assign the result:

ubq = ubq.script(
  source="ctx._source.messages.removeIf(x -> x.somefield == params.some_var)",
  params={
    'some_var': 'some_string_val'
  }
)

Serialization and Deserialization

The Update By Query object can be serialized into a dictionary by using the .to_dict() method.

You can also create an Update By Query object from a dict using the from_dict class method. This will create a new Update By Query object and populate it using the data from the dict:

ubq = UpdateByQuery.from_dict({"query": {"match": {"title": "python"}}})

If you wish to modify an existing Update By Query object, overriding its properties, use the update_from_dict method instead, which alters the instance in place:

ubq = UpdateByQuery(index='i')
ubq.update_from_dict({"query": {"match": {"title": "python"}}, "size": 42})

Extra properties and parameters

To set extra properties of the update by query request, use the .extra() method. This can be used to define keys in the body that cannot be defined via a specific API method, such as explain:

ubq = ubq.extra(explain=True)

To set query parameters, use the .params() method:

ubq = ubq.params(routing="42")

Response

You can execute your request by calling the .execute() method, which returns a Response object. The Response object gives you access to any key from the response dictionary via attribute access. It also provides some convenient helpers:

response = ubq.execute()

print(response.success())
# True

print(response.took)
# 12

If you want to inspect the contents of the response object, use its to_dict method to get access to the raw data for pretty printing.
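
One way to pretty-print the raw data is with the standard json module. The sketch below uses a hand-written dict as a stand-in for the value returned by response.to_dict(), so it runs without a cluster; the keys follow the update by query response format, but the values are made up.

```python
import json

# Illustrative stand-in for response.to_dict(); the values are made up.
raw = {
    "took": 12,
    "timed_out": False,
    "total": 3,
    "updated": 3,
    "version_conflicts": 0,
    "failures": [],
}

# Indent the output and sort the keys so repeated runs are easy to compare.
pretty = json.dumps(raw, indent=2, sort_keys=True)
print(pretty)
```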