jdbc-input bug: clean_run: true and/or record_last_run: false doesn't work #121
Comments
According to the documentation, this is how it should be used so that :sql_last_value reverts to 0 or the zero timestamp (1970...) on each pipeline start: clean_run => true. But in the code it seems that clean_run has an effect only when record_last_run is true.
This patch also makes the file update conditional on record_last_run, but this is NOT NEEDED! There is also a redundancy in this code: the if is useless.
So after some digging, it turns out that sql_last_value is cached: the ValueTracking class is instantiated once, and the value is not re-read from last_run_metadata_path unless Logstash is restarted.
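To make the caching effect concrete, here is a minimal sketch in Ruby. `CachedTracker` is a hypothetical stand-in, not the plugin's actual ValueTracking code, and it assumes the metadata file holds a plain integer (an assumption of this sketch only). It reads the file once at construction, so a later reset on disk is not seen by the live object.

```ruby
require "tempfile"

# Hypothetical stand-in for the plugin's value tracker; NOT the real plugin code.
class CachedTracker
  attr_reader :value

  def initialize(path)
    @path = path
    # The file is read exactly once, when the object is constructed.
    @value = File.exist?(path) ? File.read(path).strip.to_i : 0
  end

  # Each query run updates only the in-memory copy in this sketch.
  def set_value(new_value)
    @value = new_value
  end
end

file = Tempfile.new("last_run")
file.write("0")
file.flush

tracker = CachedTracker.new(file.path)
tracker.set_value(42)           # a query run advances the tracked value

# Resetting the file on disk does not affect the live tracker ...
File.write(file.path, "0")
stale_value = tracker.value     # still the cached in-memory value

# ... only a new instance (comparable to a restart) sees the reset.
fresh_value = CachedTracker.new(file.path).value
```

In this sketch the only way to pick up the reset value is to build a new tracker, which mirrors the observation that a restart is needed.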
@s137 for cursor pagination with tracking_column => "id" (or any unique column), I think the solution would be: when the pipeline exits because 0 rows were retrieved, and clean_run is true, write 0 or 1970... to the last_run_metadata_path file.
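A rough Ruby sketch of that proposal, for a numeric tracking column only. The helper names `value_to_persist` and `persist` are hypothetical and do not exist in the plugin, and the sketch assumes the metadata file stores a plain integer.

```ruby
require "tempfile"

# Hypothetical helper, not part of the plugin: decide what to store
# after a query run, following the proposal above.
def value_to_persist(rows_retrieved, clean_run, last_seen_value)
  # Zero rows means pagination is exhausted; with clean_run the
  # next pass should start from the beginning again.
  return 0 if rows_retrieved.zero? && clean_run
  last_seen_value
end

# Write the chosen value to the metadata file (plain integer in this sketch).
def persist(path, rows_retrieved, clean_run, last_seen_value)
  File.write(path, value_to_persist(rows_retrieved, clean_run, last_seen_value).to_s)
end

metadata = Tempfile.new("last_run")
persist(metadata.path, 250, true, 1250)  # rows came back: keep the cursor
persist(metadata.path, 0, true, 1250)    # no rows left: reset for the next pass
```

With clean_run disabled, the last seen value would be kept even when no rows come back.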
Update: the scheduler spoils everything by running just one query at a time. Without the scheduler, the query is repeated until there are no more rows to ingest.
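The "repeat until there are no more rows" behaviour can be sketched in Ruby against an in-memory table. `fetch_page` is a hypothetical stand-in for an assumed cursor-style query of the form `WHERE id > :sql_last_value ORDER BY id LIMIT :size`; it is not the statement from my pipeline and not plugin code.

```ruby
# Hypothetical stand-in for: WHERE id > :sql_last_value ORDER BY id LIMIT :size
def fetch_page(rows, last_value, size)
  rows.select { |row| row[:id] > last_value }
      .sort_by { |row| row[:id] }
      .first(size)
end

# Repeat the query until a page comes back empty, advancing the cursor each time.
def ingest_all(rows, size)
  last_value = 0
  pages = []
  loop do
    page = fetch_page(rows, last_value, size)
    break if page.empty?
    pages << page.map { |row| row[:id] }
    last_value = page.last[:id]  # the cursor moves to the last id seen
  end
  pages
end

# In-memory stand-in for a table with a unique, increasing id column.
rows = (1..7).map { |id| { id: id } }
pages = ingest_all(rows, 3)
# pages == [[1, 2, 3], [4, 5, 6], [7]]
```

A scheduler that fires one query per tick corresponds to running only a single iteration of this loop each time.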
So the final solution was to not use clean_run => true at all. Instead, I created two pipelines, one with offset pagination and one with cursor pagination. The offset-paginated statement filters and pages with:
WHERE updated_at IS NOT NULL AND updated_at > DATE_SUB(NOW(), INTERVAL 2 MINUTE) LIMIT :size OFFSET :offset
This will execute each statement once, then restart Logstash and execute them again until Logstash is stopped. If I want to re-ingest everything, I just have to manually delete the files from last_run_metadata_path (stop Logstash, delete the index, create the mappings, and then restart Logstash). The Logstash scheduler is not compatible with cursor pagination and :sql_last_value.
Logstash information:
JVM (e.g. java -version): Bundled JDK:
openjdk version "17.0.4" 2022-07-19
OpenJDK Runtime Environment Temurin-17.0.4+8 (build 17.0.4+8)
OpenJDK 64-Bit Server VM Temurin-17.0.4+8 (build 17.0.4+8, mixed mode, sharing)
OS version: Windows 10
Description of the problem including expected versus actual behavior:
According to the docs (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#_state), setting clean_run to true should set the value of :sql_last_value to 0 or '1970-01-01 00:00:00', if it's a datetime value, for every execution. But it only works for the first execution; after that it updates the value to the last execution time, even if I also set record_last_run to false.
Steps to reproduce:
You can reproduce the issue with this input:
This is the same issue that @palin first encountered and put up on the old jdbc-input-plugin repository, see here for more details:
logstash-plugins/logstash-input-jdbc#373