ES|QL support #233
Changes from 16 commits
a307db2
9c35f22
6f99055
086a592
e30e0f9
7746c14
76303d8
1fb29f7
5d47f2f
c291e24
22e72e9
af6e24a
4ce6fa4
4ed69ff
0725f98
cfb36f3
a92a71e
d4f559d
65eb675
789f467
fefe6a0
e108c87
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -230,6 +230,106 @@ The next scheduled run: | |
| * uses {ref}/point-in-time-api.html#point-in-time-api[Point in time (PIT)] + {ref}/paginate-search-results.html#search-after[Search after] to paginate through all the data, and | ||
| * updates the value of the field at the end of the pagination. | ||
|
|
||
| [id="plugins-{type}s-{plugin}-esql"] | ||
| ==== ES|QL support | ||
| {es} Query Language (ES|QL) provides a SQL-like interface for querying your {es} data. | ||
|
|
||
| To utilize the ES|QL feature with this plugin, the following version requirements must be met: | ||
| [cols="1,2",options="header"] | ||
| |=== | ||
| |Component |Minimum version | ||
| |{es} |8.11.0 or newer | ||
| |{ls} |8.17.4 or newer | ||
| |This plugin |4.23.0+ (4.x series) or 5.2.0+ (5.x series) | ||
|
||
| |=== | ||
|
|
||
| To configure an ES|QL query in the plugin, set `response_type` to `esql` and provide your ES|QL query in the `query` parameter. | ||
|
||
|
|
||
| IMPORTANT: We recommend understanding https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-limitations.html[ES|QL current limitations] before using it in production environments. | ||
|
||
|
|
||
| The following is a basic scheduled ES|QL query that runs hourly: | ||
| [source, ruby] | ||
| input { | ||
| elasticsearch { | ||
| id => hourly_cron_job | ||
| hosts => [ 'https://..'] | ||
| api_key => '....' | ||
| response_type => 'esql' | ||
| query => ' | ||
| FROM food-index | ||
| | WHERE spicy_level == "hot" AND @timestamp > NOW() - 1 hour | ||
| | LIMIT 500 | ||
| ' | ||
| schedule => '0 * * * *' # every hour at min 0 | ||
| } | ||
| } | ||
|
|
||
| Set `config.support_escapes: true` in `logstash.yml` if you need to escape special characters in the query. | ||
|
|
||
| NOTE: With an ES|QL query, {ls} doesn't generate `event.original`. | ||
|
|
||
| [id="plugins-{type}s-{plugin}-esql-event-mapping"] | ||
| ===== Mapping ES|QL result to {ls} event | ||
| ES|QL returns query results in a structured tabular format, where data is organized into _columns_ (fields) and _values_ (entries). | ||
| The plugin maps each value entry to an event, populating corresponding fields. | ||
| For example, a query might produce a table like: | ||
|
|
||
| [cols="2,1,1,1,2",options="header"] | ||
| |=== | ||
| |`timestamp` |`user_id` | `action` | `status.code` | `status.desc` | ||
|
|
||
| |2025-04-10T12:00:00 |123 |login |200 | Success | ||
| |2025-04-10T12:05:00 |456 |purchase |403 | Forbidden (unauthorized user) | ||
| |=== | ||
|
|
||
| For this case, the plugin emits two events that look like this: | ||
| [source, json] | ||
| [ | ||
| { | ||
| "timestamp": "2025-04-10T12:00:00", | ||
| "user_id": 123, | ||
| "action": "login", | ||
| "status": { | ||
| "code": 200, | ||
| "desc": "Success" | ||
| } | ||
| }, | ||
| { | ||
| "timestamp": "2025-04-10T12:05:00", | ||
| "user_id": 456, | ||
| "action": "purchase", | ||
| "status": { | ||
| "code": 403, | ||
| "desc": "Forbidden (unauthorized user)" | ||
| } | ||
| } | ||
| ] | ||
|
|
||
| NOTE: If your index has a mapping with sub-objects where `status.code` and `status.desc` are actually dotted fields, they appear in {ls} events as a nested structure. | ||
|
|
||
| [id="plugins-{type}s-{plugin}-esql-multifields"] | ||
| ===== Conflict on multi-fields | ||
|
||
|
|
||
| An ES|QL query fetches all parent fields and sub-fields if your {es} index has https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/multi-fields[multi-fields] or https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/subobjects[subobjects]. | ||
| Since {ls} events cannot contain a parent field's concrete value and sub-field values together, the plugin cannot map the result to a {ls} event and produces a failed event tagged with `_elasticsearch_input_failure`. | ||
| We recommend using the `RENAME` (or `DROP`) keyword in your ES|QL query to explicitly rename (or drop) the conflicting fields. | ||
| To illustrate with an example, assume your mapping has a `time` field with `time.min` and `time.max` sub-fields as follows: | ||
| [source, ruby] | ||
| "properties": { | ||
| "time": { "type": "long" }, | ||
| "time.min": { "type": "long" }, | ||
| "time.max": { "type": "long" } | ||
| } | ||
|
|
||
| The ES|QL result will contain all three fields, but the plugin cannot map them into a {ls} event. | ||
|
||
| To avoid this, you can use the `RENAME` keyword to rename the `time` parent field so that all three fields have unique names. | ||
| [source, ruby] | ||
| ... | ||
| query => 'FROM my-index | RENAME time AS time.current' | ||
| ... | ||
|
|
||
| For comprehensive ES|QL syntax reference and best practices, see the https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-syntax.html[{es} ES|QL documentation]. | ||
|
|
||
| [id="plugins-{type}s-{plugin}-options"] | ||
| ==== Elasticsearch Input configuration options | ||
|
|
||
|
|
@@ -257,7 +357,7 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details. | |
| | <<plugins-{type}s-{plugin}-password>> |<<password,password>>|No | ||
| | <<plugins-{type}s-{plugin}-proxy>> |<<uri,uri>>|No | ||
| | <<plugins-{type}s-{plugin}-query>> |<<string,string>>|No | ||
| | <<plugins-{type}s-{plugin}-response_type>> |<<string,string>>, one of `["hits","aggregations"]`|No | ||
| | <<plugins-{type}s-{plugin}-response_type>> |<<string,string>>, one of `["hits","aggregations","esql"]`|No | ||
| | <<plugins-{type}s-{plugin}-request_timeout_seconds>> | <<number,number>>|No | ||
| | <<plugins-{type}s-{plugin}-schedule>> |<<string,string>>|No | ||
| | <<plugins-{type}s-{plugin}-schedule_overlap>> |<<boolean,boolean>>|No | ||
|
|
@@ -498,26 +598,32 @@ environment variables e.g. `proxy => '${LS_PROXY:}'`. | |
| * Value type is <<string,string>> | ||
| * Default value is `'{ "sort": [ "_doc" ] }'` | ||
|
|
||
| The query to be executed. Read the {ref}/query-dsl.html[Elasticsearch query DSL | ||
| documentation] for more information. | ||
| The query to be executed. | ||
| The accepted query shape is DSL or ES|QL (when `response_type => 'esql'`). | ||
| Read the {ref}/query-dsl.html[{es} query DSL documentation] or {ref}/esql.html[{es} ES|QL documentation] for more information. | ||
|
|
||
| When <<plugins-{type}s-{plugin}-search_api>> resolves to `search_after` and the query does not specify `sort`, | ||
| the default sort `'{ "sort": { "_shard_doc": "asc" } }'` will be added to the query. Please refer to the {ref}/paginate-search-results.html#search-after[Elasticsearch search_after] parameter to know more. | ||
|
|
||
| [id="plugins-{type}s-{plugin}-response_type"] | ||
| ===== `response_type` | ||
|
|
||
| * Value can be any of: `hits`, `aggregations` | ||
| * Value can be any of: `hits`, `aggregations`, `esql` | ||
| * Default value is `hits` | ||
|
|
||
| Which part of the result to transform into Logstash events when processing the | ||
| response from the query. | ||
|
|
||
| The default `hits` will generate one event per returned document (i.e. "hit"). | ||
|
|
||
| When set to `aggregations`, a single Logstash event will be generated with the | ||
| contents of the `aggregations` object of the query's response. In this case the | ||
| `hits` object will be ignored. The parameter `size` will be always be set to | ||
| 0 regardless of the default or user-defined value set in this plugin. | ||
|
|
||
| When using the `esql` setting, the query must be a valid ES|QL string. | ||
| When this setting is active, `target`, `size`, `slices` and `search_api` parameters are ignored. | ||
|
||
|
|
||
| [id="plugins-{type}s-{plugin}-request_timeout_seconds"] | ||
| ===== `request_timeout_seconds` | ||
|
|
||
|
|
||
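The event mapping described in the docs diff above (each ES|QL value row becomes one event, and dotted column names become nested fields) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual implementation: `set_nested` and `esql_response_to_events` are hypothetical names, the input is assumed to be the parsed ES|QL JSON response with `columns` and `values` keys, and the sketch raises an `ArgumentError` on a parent/sub-field conflict where the plugin instead emits an event tagged `_elasticsearch_input_failure`.

```ruby
# Hypothetical helpers for illustration only; not the plugin's actual code.

# Writes `value` into `event` at the nested path named by a dotted column
# such as "status.code". Raises when a parent already holds a concrete value
# or when the leaf position already holds nested sub-fields.
def set_nested(event, dotted_name, value)
  *parents, leaf = dotted_name.split('.')
  node = parents.reduce(event) do |current, key|
    current[key] = {} unless current.key?(key)
    unless current[key].is_a?(Hash)
      raise ArgumentError, "`#{key}` already has a concrete value; cannot nest `#{dotted_name}`"
    end
    current[key]
  end
  if node[leaf].is_a?(Hash)
    raise ArgumentError, "`#{dotted_name}` already has sub-fields; cannot set a concrete value"
  end
  node[leaf] = value
  event
end

# Turns a parsed ES|QL response ({ 'columns' => [...], 'values' => [...] })
# into one Hash per value row.
def esql_response_to_events(response)
  names = response.fetch('columns').map { |column| column.fetch('name') }
  response.fetch('values').map do |row|
    names.zip(row).each_with_object({}) do |(name, value), event|
      set_nested(event, name, value)
    end
  end
end
```

With the `time`, `time.min`, `time.max` mapping from the multi-fields section, this sketch raises on the first row, which is why the docs suggest a query such as `RENAME time AS time.current` so that every column maps to a unique nested path.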
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -74,6 +74,7 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base | |||||||
| require 'logstash/inputs/elasticsearch/paginated_search' | ||||||||
| require 'logstash/inputs/elasticsearch/aggregation' | ||||||||
| require 'logstash/inputs/elasticsearch/cursor_tracker' | ||||||||
| require 'logstash/inputs/elasticsearch/esql' | ||||||||
|
|
||||||||
| include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1, :v8 => :v1) | ||||||||
| include LogStash::PluginMixins::ECSCompatibilitySupport::TargetCheck | ||||||||
|
|
@@ -96,15 +97,18 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base | |||||||
| # The index or alias to search. | ||||||||
| config :index, :validate => :string, :default => "logstash-*" | ||||||||
|
|
||||||||
| # The query to be executed. Read the Elasticsearch query DSL documentation | ||||||||
| # for more info | ||||||||
| # https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html | ||||||||
| # The query to be executed. Both DSL and ES|QL (when `response_type => 'esql'`) query shapes are accepted. | ||||||||
| # Read the following documentation for more info: | ||||||||
| # Query DSL: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html | ||||||||
| # ES|QL: https://www.elastic.co/guide/en/elasticsearch/reference/current/esql.html | ||||||||
| config :query, :validate => :string, :default => '{ "sort": [ "_doc" ] }' | ||||||||
|
|
||||||||
| # This allows you to speccify the response type: either hits or aggregations | ||||||||
| # where hits: normal search request | ||||||||
| # aggregations: aggregation request | ||||||||
| config :response_type, :validate => ['hits', 'aggregations'], :default => 'hits' | ||||||||
| # This allows you to specify the response type: one of [hits, aggregations, esql] | ||||||||
| # where | ||||||||
| # hits: normal search request | ||||||||
| # aggregations: aggregation request | ||||||||
| # esql: ES|QL request | ||||||||
| config :response_type, :validate => %w[hits aggregations esql], :default => 'hits' | ||||||||
|
||||||||
| config :response_type, :validate => %w[hits aggregations esql], :default => 'hits' | |
| config :response_type, :validate => %w[hits aggregations], :deprecated => "use `query_type`" | |
| config :query_type, :validate => %w[hits aggregations esql] # default depends on query shape |
def register
+ @query_type = normalize_config("query_type") do |normalizer|
+ normalizer.with_deprecated_alias("response_type")
+ end || (@query.start_with?('{') ? 'hits' : 'esql')
I was thinking to add the deprecation right after this ES|QL change.
One thing we need to agree on is naming. I personally do not like having `hits` and `aggregations` alongside `esql`. They indicate different contexts. The options I had in mind were `dsl_search`, `dsl_aggregation` and `esql`.
Please let me know your opinion: I can either apply the change now if we quickly reach agreement, or create a follow-up issue right after this PR.
> I was thinking to add the deprecation right after this ES|QL change.

If someone starts using this feature, I would rather that their never-possible-before configuration feels "stable" and doesn't require them to go back and deal with deprecation warnings for things that we knew about before shipping the feature.

> They indicate different contexts

This is a very good point.
The current response_type only makes sense in the context of DSL-based queries.
So: what if we were to keep response_type, but constrain its use to query_type => dsl?
This would mean:
- `query_type => dsl`: allows use of `response_type`
- `query_type => esql`: prohibits use of `response_type`
- unspecified `query_type` could have a sensible default based on the shape of `query`:
  - if it looks like JSON, then it's `dsl`
  - if it looks like ES|QL then it's `esql`
  - else we error helpfully
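To make the proposed defaulting rule concrete, here is a rough sketch of how an unspecified `query_type` might be inferred from the shape of `query`, and how `response_type` could be restricted to DSL queries. Everything here is hypothetical and only illustrates the proposal: `infer_query_type`, `json_object?`, `validate_response_type!` and the `ESQL_SOURCE_COMMANDS` list are assumptions, not code from this PR, and the list of ES|QL source commands is deliberately small.

```ruby
require 'json'

# Assumption: a query starting with one of these ES|QL source commands
# "looks like ES|QL". The real set of source commands may be larger.
ESQL_SOURCE_COMMANDS = %w[FROM ROW SHOW].freeze

# True when the text parses as a JSON object, i.e. "looks like" query DSL.
def json_object?(text)
  text.start_with?('{') && JSON.parse(text).is_a?(Hash)
rescue JSON::ParserError
  false
end

# Hypothetical default for an unspecified `query_type`.
def infer_query_type(query)
  text = query.to_s.strip
  return 'dsl' if json_object?(text)

  first_word = text[/\A[A-Za-z]+/]
  return 'esql' if first_word && ESQL_SOURCE_COMMANDS.include?(first_word.upcase)

  raise ArgumentError, 'cannot infer `query_type` from `query`; set it to `dsl` or `esql`'
end

# Hypothetical check: `response_type` only makes sense for DSL queries.
def validate_response_type!(query_type, response_type)
  return if response_type.nil? || query_type == 'dsl'

  raise ArgumentError, '`response_type` is only allowed with `query_type => dsl`'
end
```

Parsing the JSON instead of only checking for a leading `{` keeps a malformed DSL query from being silently classified, at the cost of parsing the query once more at registration time.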
Introducing `query_type` while keeping `response_type` was my initial design. @jsvd and I were wondering whether we could simplify without introducing a new param, and in our 1:1 we agreed to support ES|QL with `response_type` and deprecate it in the future.
However, considering the behavior and user experience, I also strongly support this structure: `query_type` at the top level, with the other params following from it (for example, which response shape is going to be parsed).
I have applied it with this commit.
FYI: the current CI snapshot unit test steps are broken (CI runs with release versions are fine) because the core openssl.jar and the uri gem are missing, but I ran the unit and integration tests locally against a local LS to verify the change.
review note: moved to private area