
in_cloudwatch_logs missing logs when start/end specified #234

Closed
codertao opened this issue May 7, 2021 · 1 comment

codertao commented May 7, 2021

Problem

Due to other problems, we tried running the cloudwatch_logs input plugin over a specific start/end timeframe to get some missing logs.

When we did that, we only got the most recent bits of logs.

I looked into it, and I think the problem comes down to a missing option on the get_log_events API call. From the notes on start-from-head:

--start-from-head | --no-start-from-head (boolean)

    If the value is true, the earliest log events are returned first. If the value is false, the latest log events are returned first. The default value is false.

    If you are using nextToken in this operation, you must specify true for startFromHead .

Adding start-from-head=true seemed to fix our issue. From the wording, it sounds like it should be set whenever nextToken is used, so this might be related to #158?
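To illustrate what the fix implies, here is a minimal Ruby sketch of the forward-pagination loop the plugin would need: pass start_from_head: true on every call and keep feeding back the returned forward token until it stops advancing. The FakeLogsClient below is a stand-in that mimics the documented get_log_events paging behavior, not the real Aws::CloudWatchLogs::Client, and the method names on it are assumptions for the sketch.

```ruby
FakeEvent = Struct.new(:timestamp, :message)

# Stand-in client: returns events oldest-first in fixed-size pages, the way
# the API behaves when startFromHead is true. The next_forward_token repeats
# itself once the stream is exhausted, matching the documented behavior.
class FakeLogsClient
  def initialize(events, page_size: 2)
    @events = events.sort_by(&:timestamp)
    @page_size = page_size
  end

  def get_log_events(start_from_head:, next_token: nil)
    raise ArgumentError, "startFromHead must be true with nextToken" if next_token && !start_from_head
    offset = next_token.to_i
    page = @events[offset, @page_size] || []
    { events: page, next_forward_token: (offset + page.size).to_s }
  end
end

# Collect all events by looping until the forward token stops changing.
def fetch_all_events(client)
  events = []
  token = nil
  loop do
    resp = client.get_log_events(start_from_head: true, next_token: token)
    events.concat(resp[:events])
    break if resp[:next_forward_token] == token
    token = resp[:next_forward_token]
  end
  events
end
```

Without start_from_head: true, the first page comes from the tail of the stream, which would explain seeing only the most recent slice of the requested range.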

Steps to replicate

<source>
  @type cloudwatch_logs
  tag a_tag.etc-
  log_group_name /aws/containerinsights/removed
  log_stream_name removed-
  use_log_stream_name_prefix true
  region us-east-1
  start_time "2020-05-01 00:00:00"
  end_time "2020-05-13 23:59:59"
  use_aws_timestamp true
  <storage>
    @type local
    path /data/state.json
  </storage>
</source>
<match a_tag.**>
  @type file
  path /data/logs
  <buffer>
    @type memory
    timekey 3600
    chunk_limit_size 1M
    flush_at_shutdown true
  </buffer>
</match>

On running this, we get a few thousand log lines from the end of the time range, plus (for some reason) logs from today, and none of the older logs in the range.

Expected Behavior or What you need to ask

All logs from the specified time range should be returned.

Using Fluentd and CloudWatchLogs plugin versions

Nix OS, running in Docker via fluent/fluentd:v1.7-1 image.

/ $ fluentd --version
fluentd 1.12.3

/ $ fluent-gem list

*** LOCAL GEMS ***

async (1.23.0)
async-http (0.46.3)
async-io (1.27.0)
aws-eventstream (1.1.1)
aws-partitions (1.452.0)
aws-sdk-cloudwatchlogs (1.40.0)
aws-sdk-core (3.114.0)
aws-sigv4 (1.2.3)
bigdecimal (1.3.5)
bundler (2.2.17)
cmath (default: 1.0.0)
concurrent-ruby (1.1.5)
console (1.6.0)
cool.io (1.5.4)
csv (default: 1.0.0)
date (default: 1.0.0)
dig_rb (1.0.1)
elasticsearch (7.12.0)
elasticsearch-api (7.12.0)
elasticsearch-transport (7.12.0)
etc (default: 1.0.0)
excon (0.81.0)
faraday (1.4.1)
faraday-excon (1.1.0)
faraday-net_http (1.0.1)
faraday-net_http_persistent (1.1.0)
fcntl (default: 1.0.0)
fileutils (default: 1.0.2)
fluent-config-regexp-type (1.0.0)
fluent-plugin-cloudwatch-logs (0.13.4)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (5.0.3)
fluent-plugin-filter_typecast (0.0.3)
fluent-plugin-grok-parser (2.6.2)
fluent-plugin-prometheus (2.0.1)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluent-plugin-route (1.0.0)
fluentd (1.12.3, 1.7.4)
http_parser.rb (0.6.0)
ipaddr (default: 1.2.0)
jmespath (1.4.0)
json (2.2.0)
msgpack (1.3.1)
multi_json (1.15.0)
multipart-post (2.1.1)
nio4r (2.5.2)
oj (3.3.10)
openssl (default: 2.1.2)
prometheus-client (2.1.0)
protocol-hpack (1.4.1)
protocol-http (0.8.1)
protocol-http1 (0.8.3)
protocol-http2 (0.9.7)
psych (default: 3.0.2)
ruby2_keywords (0.0.4)
scanf (default: 1.0.0)
serverengine (2.2.3, 2.2.0)
sigdump (0.2.4)
stringio (default: 0.0.1)
strptime (0.2.3)
strscan (default: 1.0.0)
timers (4.3.0)
tzinfo (2.0.0)
tzinfo-data (1.2019.3)
webrick (default: 1.4.2)
yajl-ruby (1.4.1)
zlib (default: 1.0.0)

@cosmo0920
Member

This should be done by #235.
Closing.
