Failing to process tweets - multiple issues #114

Open
tomakun opened this issue Jan 21, 2023 · 18 comments

@tomakun

tomakun commented Jan 21, 2023

Hi @robertoszek,
Posting from Twitter to Mastodon here. I am using a Twitter dev token with Elevated access.

I have 3 users, two of which are working fine. Whenever I try to gather tweets for the third one, I get the following error:

Error log
ℹ 2023-01-21 14:36:18,416 - pleroma_bot - INFO - config path: /home/mastodon/pleroma-bot/config.yml 
ℹ 2023-01-21 14:36:18,416 - pleroma_bot - INFO - tweets temp folder: /home/mastodon/pleroma-bot/tweets 
ℹ 2023-01-21 14:36:18,422 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 14:36:18,422 - pleroma_bot - INFO - Processing user:	user1 (up and running)
✖ 2023-01-21 14:36:19,315 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717) 
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 577, in main
    raise Exception(
Exception: Invalid forceDate format, use "YYYY-mm-dd"
ℹ 2023-01-21 14:36:19,315 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 14:36:19,316 - pleroma_bot - INFO - Processing user:	user2 (up and running)
✖ 2023-01-21 14:36:20,066 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717) 
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 577, in main
    raise Exception(
Exception: Invalid forceDate format, use "YYYY-mm-dd"
ℹ 2023-01-21 14:36:20,067 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 14:36:20,067 - pleroma_bot - INFO - Processing user:	problematic new user 
ℹ 2023-01-21 14:36:21,980 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2023-01-21 14:36:21,980 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
2022-10-01
⚠ 2023-01-21 14:36:30,552 - pleroma_bot - WARNING - Raising max_tweets to the maximum allowed value (_utils.py:606) 
Gathering tweets... 1207
ℹ 2023-01-21 14:36:47,105 - pleroma_bot - INFO - tweets gathered: 	 1207 
Processing tweets... :   0%|                                                                                             | 0/1207 [00:00<?, ?it/s]
✖ 2023-01-21 14:36:47,549 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 103, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 280, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 654, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

I can't locate the reason for this failure. One difference is that the third user has include_rts set to true, but it also failed when I tried again with include_rts: false for this user. Here's my (partially redacted) config.yml:

Config
# Global Mapping
#
pleroma_base_url: XXX
max_tweets: 40
twitter_token: XXX
delay_post: 1

# User mapping

users:
  - twitter_username: XXX
    pleroma_username: XXX
    pleroma_token: XXX
    signature: false
    include_rts: false
    include_replies: false
    include_quotes: true
    visibility: "unlisted"
    avoid_duplicates: true
    media_upload: true
    twitter_bio: true
    bio_text: "\U0001F916 BEEP BOOP \U0001F916 \nI'm a bot that mirrors\
    \ {{ twitter_username }} Twitter's account. \nAny issues please\
    \ contact @XXX \n \n "
  - twitter_username: XXX
    pleroma_username: XXX
    pleroma_token: XXX
    signature: false
    include_rts: false
    include_replies: false
    include_quotes: true
    visibility: "unlisted"
    avoid_duplicates: true
    media_upload: true
    twitter_bio: true
    bio_text: "\U0001F916 BEEP BOOP \U0001F916 \nI'm a bot that mirrors\
    \ {{ twitter_username }} Twitter's account. \nAny issues please\
    \ contact @XXX \n \n "
  - twitter_username: XXX
    pleroma_username: XXX
    pleroma_token: XXX
    signature: false
    include_rts: true
    include_replies: false
    include_quotes: true
    visibility: "unlisted"
    avoid_duplicates: true
    media_upload: true
    twitter_bio: true
    bio_text: "\U0001F916 BEEP BOOP \U0001F916 \nI'm a bot that mirrors\
    \ {{ twitter_username }} Twitter's account. \nAny issues please\
    \ contact @XXX \n \n "

Any assistance you could provide is appreciated.

Best,
Thomas

@robertoszek
Owner

Hi!

Hmm, I'm thinking perhaps there's a tweet for that user with "referenced_tweets" that has no text field somehow?

Does trying this version make any difference?
pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc4

I'm also curious what the error message was when you tried with include_rts: false. Was it the same one?

Oh, and running the bot in verbose mode could maybe help us pin down what the issue is:

$ pleroma-bot -v

@robertoszek robertoszek self-assigned this Jan 21, 2023
@robertoszek robertoszek added the "further info needed" label (Further information is needed for assigning category) Jan 21, 2023
@robertoszek
Owner

Never mind my last comment about the missing text field.
Reading the traceback again, it seems the id field in referenced_tweets may be a list instead.

Could you check whether that's the issue by running 1.2.1rc5?
pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc5

@tomakun
Author

tomakun commented Jan 21, 2023

Thanks @robertoszek for looking into this!
I ran the bot using verbose - A LOT of things got printed. The DEBUG lines I got before the error are as follows:

Error log
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1553362097135706115?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 200 1100
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): t.co:443
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1549790616996831232?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94

The rest of the error, printed two times this time.

Processing tweets... :   0%|                                                                                             | 0/2427 [01:09<?, ?it/s]
✖ 2023-01-21 15:51:17,990 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 131, in process_tweets
    _get_rt_media_url(self, tweet, media)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 309, in _get_rt_media_url
    tweet_rt = self._get_tweets("v2", tweet_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 654, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable
ERROR:pleroma_bot:Exception occurred for user, skipping...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 131, in process_tweets
    _get_rt_media_url(self, tweet, media)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 309, in _get_rt_media_url
    tweet_rt = self._get_tweets("v2", tweet_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 654, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

Seems like I'm getting a 429 response at some point before the error?

Will try on 1.2.1rc5 and get back to you.

@tomakun
Author

tomakun commented Jan 21, 2023

@robertoszek 1.2.1rc5 gives me the same error:

Error log
1.2.1rc5
mastodon@ip-XXX:~/pleroma-bot$ ../.local/bin/pleroma-bot --forceDate problematic_user

                        `^y6gB@@BBQA{,
                      :fB@@@@@@BBBBBQgU"
                    `f@@@@@@@@BBBBQgg80H~
                    H@@B@BB@BBBB#Qgg&0RNT
                   z@@&B@BBBBBBQgg80RD6HK
                  ;@@@QB@BBBB#Qgg&0RN6WqS
                  q@@@@@BBBBQgg80RN6HAqSo          _             _
                 z@@@@BBBB#Qg8&0RN6WqSUhr         | |           | |
               -H@@@@BBBBQQg80RD6HAqSKh(       ___| |_ ___  _ __| | __
              rB@@@BBBB#6Lm00DN6WqSUhfv       / __| __/ _ \| '__| |/ /
             f@@@@BBBBf= |0RD6HAqSKhfv        \__ \ || (_) | |  |   <
           =g@@@BBBBF=  "RDN6WqSUhff{         |___/\__\___/|_|  |_|\_|
          c@@@@BBgu_   ~WD9HAqSKhfkl`
        _6@@@BBNr     'qN6WqSUhhfXI'     .                           .       .
       rB@@@B0r      `S6HAqSKhfkoCr  ,-. |  ,-. ,-. ,-. ,-,-. ,-.    |-. ,-. |-
     `X@@@BQx       `I6WASShhfXFIy_  | | |  |-' |   | | | | | ,-| -- | | | | |
    _g@@@Q\`        JHAqSKhfXoCwJz_  |-' `' `-' '   `-' ' ' ' `-^    `-' `-' `'
   rB@@#x`         }WASShhfXsIyzuu,  |
 `y@@&|          .IAqSKhfXoCwJzu1lr  '
`D@&|           :KqSUhffXsIyzuu1llc,
ff=            `==:::""",,,,________


ℹ 2023-01-21 16:12:12,427 - pleroma_bot - INFO - config path: /home/mastodon/pleroma-bot/config.yml 
ℹ 2023-01-21 16:12:12,427 - pleroma_bot - INFO - tweets temp folder: /home/mastodon/pleroma-bot/tweets 
ℹ 2023-01-21 16:12:12,433 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 16:12:12,433 - pleroma_bot - INFO - Processing user:	user1 (up and running)  
✖ 2023-01-21 16:12:13,492 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 578, in main
    raise Exception(
Exception: Invalid forceDate format, use "YYYY-mm-dd"
ℹ 2023-01-21 16:12:13,492 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 16:12:13,492 - pleroma_bot - INFO - Processing user:	user2 (up and running) 
✖ 2023-01-21 16:12:14,205 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 578, in main
    raise Exception(
Exception: Invalid forceDate format, use "YYYY-mm-dd"
ℹ 2023-01-21 16:12:14,206 - pleroma_bot - INFO - ====================================== 
ℹ 2023-01-21 16:12:14,206 - pleroma_bot - INFO - Processing user:	problematic new user
ℹ 2023-01-21 16:12:16,206 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2023-01-21 16:12:16,206 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
2022-07-01
⚠ 2023-01-21 16:12:24,287 - pleroma_bot - WARNING - Raising max_tweets to the maximum allowed value (_utils.py:606) 
Gathering tweets... 2427
ℹ 2023-01-21 16:12:55,219 - pleroma_bot - INFO - tweets gathered: 	 2427 
Processing tweets... :   0%|                                                                                                                                                                                                        | 0/2427 [01:19<?, ?it/s]
✖ 2023-01-21 16:14:14,317 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 110, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 292, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 655, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

Changing the config mapping for that user to include_rts: false:

Error log
✖ 2023-01-21 16:18:04,778 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 110, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 292, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 655, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

@tomakun
Author

tomakun commented Jan 21, 2023

@robertoszek On another note, the very first time I tried to run the bot with that new user I got an additional WARNING before the error message, like so:

WARNING - Media possibly geoblocked? ...

The time range was also different in that run, though. I figured that retrieving fewer tweets would fix the issue, but it didn't...

Error log
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]
2020-01-01
⚠ 2023-01-21 14:04:49,704 - pleroma_bot - WARNING - Raising max_tweets to the maximum allowed value (_utils.py:606) 
Gathering tweets... 3209
ℹ 2023-01-21 14:05:34,145 - pleroma_bot - INFO - tweets gathered: 	 3209 
Processing tweets... :   0%|                                                                                                                 | 0/3209 [00:00<?, ?it/s]
⚠ 2023-01-21 14:05:42,670 - pleroma_bot - WARNING - Media possibly geoblocked? (403) Skipping... 1521136520803254272 - https://video.twimg.com/amplify_video/1520676951798681600/vid/720x900/QxOYqlJYPAG675pB.mp4?tag=14  (_processing.py:429) 
Processing tweets... :   0%|                                                                                                                 | 0/3209 [01:13<?, ?it/s]
✖ 2023-01-21 14:06:47,773 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 103, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 280, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 548, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 662, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 83, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 654, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

I appreciate your patience and help on this case.

@tomakun tomakun changed the title from "Failing to retrieve tweets for no apparent reason" to "Failing to process tweets for no apparent reason" Jan 21, 2023
@robertoszek
Owner

robertoszek commented Jan 21, 2023

Ah, thank you for the verbose output.
Looks like I was looking in completely the wrong place.

It seems you're hitting a rate limit (429), but then the logger crashes while trying to display the message that tells you when the limit will reset (the bot should then wait until that time).
I'm thinking the headers for the Twitter rate limits may have changed their format, or perhaps their content:
X-Rate-Limit-Remaining
X-Rate-Limit-Reset
X-Rate-Limit-Limit

They have been doing sweeping changes on their APIs haphazardly lately, so I wouldn't put it past them.

I've added some extra debug statements to 1.2.1rc7 (and changed a few lines hoping for a more meaningful error message):
pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc7

Could you try again with that version and report back with the traceback/verbose output?
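
As an aside on the TypeError itself: the traceback points at the `logger.info(_(` call. One way such a line can raise "'list' object is not callable" is if the name `_`, commonly used as a gettext-style translation function, has been rebound to a list somewhere in scope. That is only an assumption here, not something confirmed from the bot's source. The minimal sketch below reproduces the symptom in isolation; `log_rate_limit` and the stand-in `_` are hypothetical names and not the bot's actual code.

```python
import logging

logger = logging.getLogger("demo")


def _(message):
    # Hypothetical stand-in for a gettext-style translation function
    return message


def log_rate_limit(remaining, limit):
    # Looks up the global name `_` at call time, then calls it
    logger.info(_("Rate limit exceeded. {} out of {} requests remaining").format(remaining, limit))


log_rate_limit(0, 300)  # works: `_` is still a function

# Rebinding `_` (for example as a throwaway variable) shadows the translator
_ = ["some", "list"]

try:
    log_rate_limit(0, 300)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # 'list' object is not callable
```

If this were the cause, it would also explain why the crash only shows up on the rate-limit path: the message is only built once a 429 is received.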

@tomakun
Copy link
Author

tomakun commented Jan 22, 2023

Hi @robertoszek,
I ran 1.2.1rc7 in verbose mode, and I can see your new debug statements:

Error log
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1542840797925900288?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:pleroma_bot:x-rate-limit-remaining: 0
x-rate-limit-reset: 1674358688
x-rate-limit-limit: 300
reset_time: 2023-01-22 03:38:08
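
The debug values above can be checked by hand. The sketch below assumes the headers carry plain integers and that x-rate-limit-reset is a Unix timestamp in seconds (which the logged reset_time is consistent with). `parse_rate_limit` is a hypothetical helper, not part of pleroma-bot, and the "now" value is only approximated from the timestamp of the error line in the log.

```python
from datetime import datetime, timezone


def parse_rate_limit(headers, now=None):
    """Return (remaining, limit, reset_time, seconds_to_wait)."""
    # Header names are case-insensitive, so normalise the keys of a plain dict
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered["x-rate-limit-remaining"])
    limit = int(lowered["x-rate-limit-limit"])
    reset_time = datetime.fromtimestamp(
        int(lowered["x-rate-limit-reset"]), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    # Clamp at zero in case the reset time has already passed
    wait = max(0, int((reset_time - now).total_seconds()))
    return remaining, limit, reset_time, wait


# Values copied from the debug output above
headers = {
    "x-rate-limit-remaining": "0",
    "x-rate-limit-reset": "1674358688",
    "x-rate-limit-limit": "300",
}
# Approximate time of the 429 response, taken from the log timestamps
now = datetime(2023, 1, 22, 3, 25, 14, tzinfo=timezone.utc)

remaining, limit, reset_time, wait = parse_rate_limit(headers, now)
print(reset_time.isoformat(), wait)  # 2023-01-22T03:38:08+00:00 774
```

So the headers themselves look well formed: the reset is about 13 minutes after the 429, which fits a 15-minute window.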

Here's the full output

Error log
ℹ 2023-01-22 03:25:14,172 - pleroma_bot - INFO - tweets gathered: 	 2440 
INFO:pleroma_bot:tweets gathered: 	 2440
Processing tweets... :   0%|                                                                | 0/2440 [00:00<?, ?it/s]DEBUG:pleroma_bot:1542674272744644608
DEBUG:pleroma_bot:1542853398198091777
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:pleroma_bot:1542674735200235520
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:pleroma_bot:1542853439449088000
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1542840797925900288?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:pleroma_bot:x-rate-limit-remaining: 0
x-rate-limit-reset: 1674358688
x-rate-limit-limit: 300
reset_time: 2023-01-22 03:38:08
Processing tweets... :   0%|                                                                | 0/2440 [00:00<?, ?it/s]
✖ 2023-01-22 03:25:14,872 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 110, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 292, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 555, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 669, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 90, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 655, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable
ERROR:pleroma_bot:Exception occurred for user, skipping...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 110, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 292, in _get_rt_text
    tweet_ref = self._get_tweets("v2", tweet_ref_id)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 555, in _get_tweets
    tweets_v2 = self._get_tweets_v2(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 669, in _get_tweets_v2
    response = self.twitter_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_twitter.py", line 90, in twitter_api_request
    logger.info(_(
TypeError: 'list' object is not callable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 655, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
TypeError: 'list' object is not callable

@robertoszek Why do you think I would be reaching a rate limit in that case? Is there anything that can be done?

@robertoszek
Owner

I see, I think we're close to cracking the root cause of the bug.
Can you test with 1.2.1rc8?:

 pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc8

@tomakun
Author

tomakun commented Jan 22, 2023

@robertoszek I looked into the rate limit issue that I was running into.

DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1542840797925900288?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:pleroma_bot:x-rate-limit-remaining: 0
x-rate-limit-reset: 1674358688
x-rate-limit-limit: 300

Putting the conclusion first: I was able to proceed and get past that rate limit after commenting out the twitter_token line in my config.yml. Here's my reasoning for trying without the twitter_token:

  1. In the verbose output of my failing run I noticed the v2 tweets endpoint is being called exactly 301 times, with a 429 response on the 301st call.

  2. I looked into the Twitter API rate limit docs for v2 and it said:

Tweet lookup: Per app 300 | Per user 900

Since the Twitter token in config.yml is the app token, I figured I might be hitting the per-app cap and that my only recourse was to test as a guest, without the Twitter app token. Indeed, this worked, probably because as a guest, from the API's point of view, my new rate limit was 900.
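
To put the per-app cap in perspective, here is a rough back-of-the-envelope estimate. It assumes, purely for illustration, one tweet lookup per gathered tweet and a 15-minute window; the real number of lookups depends on how many tweets reference other tweets. `estimate_waits` is a hypothetical helper.

```python
import math


def estimate_waits(lookups, per_window=300, window_minutes=15):
    """Return (windows needed, worst-case minutes spent waiting for resets)."""
    windows = math.ceil(lookups / per_window)
    # The first window needs no wait; each later window waits at most one reset
    waits = max(0, windows - 1)
    return windows, waits * window_minutes


# 2440 is the "tweets gathered" count from the log above
windows, max_wait_minutes = estimate_waits(2440)
print(windows, max_wait_minutes)  # 9 120
```

Under those assumptions a large backfill cannot finish inside a single window with the app token, so the bot has to either wait for resets or fail.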

Now, this is a bit confusing to me. What is the point of using a twitter_token if we are going to hit rate limits faster than with a guest token? Is there really a bug here, or is it just a rate limit issue?

> I see, I think we're close to cracking the root cause of the bug.
> Can you test with 1.2.1rc8?:

I will retry with this version and put the twitter_token back in so that it gives us the same test. I have a process running right now; I'll get back to you as soon as it's done.

@robertoszek
Owner

Right, you're indeed correct, the 300 per app lookup limit is listed under the "Requests per 15-minute window".

There's a bug in the sense the bot should not crash and burn when a rate limit is hit using your Twitter token.

The expected behavior is for the bot to gracefully note when the rate limit will reset (within 15 minutes), wait until then, and resume where it left off.
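
A minimal sketch of that wait-and-resume behavior, assuming the reset time is read from the `x-rate-limit-reset` response header (a Unix timestamp, as in the debug output earlier in this thread). The helper names `seconds_until_reset` and `wait_for_reset` are hypothetical and are not taken from the bot's code:

```python
import math
import time


def seconds_until_reset(headers, now=None):
    """Return how many whole seconds remain until the rate limit resets."""
    # x-rate-limit-reset is a Unix timestamp (seconds since the epoch, UTC).
    reset_at = int(headers["x-rate-limit-reset"])
    if now is None:
        now = time.time()
    # Round up so we never wake early; never return a negative wait.
    return max(0, math.ceil(reset_at - now))


def wait_for_reset(headers, sleep=time.sleep):
    """Sleep until the reset time, then let the caller retry the request."""
    delay = seconds_until_reset(headers)
    sleep(delay)
    return delay
```

The caller would retry the same request after `wait_for_reset` returns, so no work already done is lost.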

@tomakun
Author

tomakun commented Jan 22, 2023

That would actually be fantastic if it could do that yeah. I'm assuming this was an already known issue, apologies for bothering you with that. I will post the results of 1.2.1rc8 soon.

@tomakun
Author

tomakun commented Jan 23, 2023

Ran the 1.2.1rc8 bot with the same settings, using my twitter_token in the config.yml. The bot behaved according to your comment above:

Debug log
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/... HTTP/1.1" 429 94
DEBUG:pleroma_bot:x-rate-limit-remaining: 0
x-rate-limit-reset: 1674438829
x-rate-limit-limit: 300
reset_time: 2023-01-23 01:53:49
ℹ 2023-01-23 01:39:50,989 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-23 01:53:49 UTC 
INFO:pleroma_bot:Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-23 01:53:49 UTC
ℹ 2023-01-23 01:39:50,990 - pleroma_bot - INFO - Sleeping for 840s... 
INFO:pleroma_bot:Sleeping for 840s...

I did several runs and I think the sleeping behavior is working well. 👍

However, I ran into a new issue a couple of times already: The bot appears to be crashing when getting a 504 response from the media endpoint:

Error log
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): t.co:443
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FX1qTjBaUAA3aB8.jpg HTTP/1.1" 200 133790
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FXxQ_9-acAAY3Lq.jpg HTTP/1.1" 504 357
Processing tweets... :   0%|                                                            | 0/2485 [06:06<?, ?it/s]
✖ 2023-01-23 05:09:50,492 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:719) 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 427, in _download_media
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: https://pbs.twimg.com/media/FXxQ_9-acAAY3Lq.jpg

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 145, in process_tweets
    _download_media(self, media, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 446, in _download_media
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: https://pbs.twimg.com/media/FXxQ_9-acAAY3Lq.jpg
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 655, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: https://pbs.twimg.com/media/FXxQ_9-acAAY3Lq.jpg

Since I will be doing quite a few imports like this, I guess we will be able to clear up a few edge cases like this one. Do you think you can take a look? I'll continue running tests with any new version you provide, if that helps you as well.

@robertoszek
Owner

Interesting, perhaps we can mitigate the 504s when downloading media by reusing the custom session adapter we create for Twitter's API queries (which includes additional retries and handles other error codes as well).
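
As a rough illustration of the retry idea, here is a simplified, library-free sketch of retrying with exponential backoff on a set of status codes. It is not the bot's session adapter: `fetch_with_retries`, the `fetch` callable, and the default status list are all assumptions made for the example:

```python
import time


def fetch_with_retries(fetch, url, retry_statuses=(429, 500, 502, 503, 504),
                       attempts=3, backoff=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns a status outside retry_statuses.

    fetch is any callable returning a (status_code, body) tuple; it is a
    hypothetical stand-in for the real HTTP call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch(url)
        if status not in retry_statuses:
            return status, body
        if attempt < attempts - 1:
            # Exponential backoff: backoff, 2 * backoff, 4 * backoff, ...
            sleep(backoff * (2 ** attempt))
    # Out of attempts: hand the last response back to the caller.
    return status, body
```

With this shape, a single transient 504 on one media file costs a short delay instead of aborting the whole batch.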

Would you mind giving 1.2.1rc11 a try and seeing if you notice any improvement?

 pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc11

@tomakun
Author

tomakun commented Jan 24, 2023

Hi @robertoszek,
It looks like I am not getting the 504 anymore with 1.2.1rc11, so that may already be improved with this release! I'll keep watch to see if it happens again. Thank you.

Meanwhile I have been getting another error, which seems to occur while posting tweets: I am getting a 404 on the GET request to the Mastodon media API, along with a read timeout.

Error log
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): myinstanceurl:443
DEBUG:urllib3.connectionpool:https://myinstanceurl:443 "GET /api/v1/media HTTP/1.1" 404 None
✖ 2023-01-24 06:24:10,803 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:722)

The bot crashes here with the following traceback:

Error log
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1374, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.10/ssl.py", line 1274, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.10/ssl.py", line 1130, in read
    return self._sslobj.read(len, buffer)
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 532, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 719, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 447, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 336, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='myinstanceurl', port=443): Read timed out. (read timeout=30)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 689, in main
    post_id = user.post(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 792, in post
    post_id = self.post_pleroma(tweet, poll, sensitive, media, cw=cw)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_pleroma.py", line 267, in post_pleroma
    response = pleroma_api_request(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_pleroma.py", line 40, in pleroma_api_request
    response = session.request(
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='myinstanceurl', port=443): Read timed out. (read timeout=30)

Any ideas you might have?
Again, I appreciate your time and support on this; it helps a lot. Thank you.

@tomakun changed the title from "Failing to process tweets for no apparent reason" to "Failing to process tweets - multiple issues" on Jan 24, 2023
@tomakun
Author

tomakun commented Jan 24, 2023

Following up on my comment above, I think there's something up with this version. The 504 doesn't occur anymore, but it takes forever to process the tweets, although that's probably expected with 3200 tweets ([3200/300] = 10 batches with a 15 min wait in between, so 150 min of dead time T_T).
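
The dead-time estimate can be written out as a small calculation. This is only an upper-bound sketch: it assumes every tweet costs exactly one lookup against the 300-per-15-minutes cap and that nothing else uses the quota, and `rate_limit_dead_time` is a hypothetical helper name:

```python
import math


def rate_limit_dead_time(lookups, limit=300, window_minutes=15):
    """Estimate minutes spent sleeping on rate limits for a run."""
    # Each window allows `limit` lookups; a wait is needed between windows,
    # but not after the last one.
    windows = math.ceil(lookups / limit)
    waits = max(0, windows - 1)
    return waits * window_minutes


print(rate_limit_dead_time(3200))  # prints 150
```

In practice the number of waits depends on how many tweets actually trigger a lookup, so a real run may sleep less often than this.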

The [00:00<?, ?it/s] counter isn't moving like it does on the stable release, the sleeping info message is duplicated or tripled, and when it's time to post the tweets I'm getting the read error mentioned above. It hurts lol.

In my use case I expect to bring over quite a few users with 3200 tweets. I don't mind the Twitter API wait time, but it's not ideal if it crashes after the processing.

Error log
2022-01-01
⚠ 2023-01-24 07:19:16,167 - pleroma_bot - WARNING - Raising max_tweets to the maximum allowed value (_utils.py:615) 
Gathering tweets... 3209
ℹ 2023-01-24 07:20:01,858 - pleroma_bot - INFO - tweets gathered: 	 3209 
Processing tweets... :   0%|                                                                                    | 0/3209 [00:00<?, ?it/s]ℹ 2023-01-24 07:21:15,346 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:35:02 UTC 
ℹ 2023-01-24 07:21:15,347 - pleroma_bot - INFO - Sleeping for 829s... 
ℹ 2023-01-24 07:21:15,656 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:35:02 UTC 
ℹ 2023-01-24 07:21:15,656 - pleroma_bot - INFO - Sleeping for 828s... 
ℹ 2023-01-24 07:21:15,750 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:35:02 UTC 
ℹ 2023-01-24 07:21:15,750 - pleroma_bot - INFO - Sleeping for 828s... 
ℹ 2023-01-24 07:21:28,082 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:35:02 UTC 
ℹ 2023-01-24 07:21:28,082 - pleroma_bot - INFO - Sleeping for 816s... 
ℹ 2023-01-24 07:36:03,626 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:50:04 UTC 
ℹ 2023-01-24 07:36:03,627 - pleroma_bot - INFO - Sleeping for 842s... 
ℹ 2023-01-24 07:36:04,035 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:50:04 UTC 
ℹ 2023-01-24 07:36:04,036 - pleroma_bot - INFO - Sleeping for 842s... 
ℹ 2023-01-24 07:36:04,693 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:50:04 UTC 
ℹ 2023-01-24 07:36:04,693 - pleroma_bot - INFO - Sleeping for 841s... 
ℹ 2023-01-24 07:36:04,737 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 07:50:04 UTC 
ℹ 2023-01-24 07:36:04,737 - pleroma_bot - INFO - Sleeping for 841s... 
ℹ 2023-01-24 07:51:00,880 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:05:06 UTC 
ℹ 2023-01-24 07:51:00,880 - pleroma_bot - INFO - Sleeping for 847s... 
ℹ 2023-01-24 07:51:01,172 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:05:06 UTC 
ℹ 2023-01-24 07:51:01,172 - pleroma_bot - INFO - Sleeping for 847s... 
ℹ 2023-01-24 07:51:01,290 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:05:06 UTC 
ℹ 2023-01-24 07:51:01,290 - pleroma_bot - INFO - Sleeping for 847s... 
ℹ 2023-01-24 07:51:04,408 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:05:06 UTC 
ℹ 2023-01-24 07:51:04,409 - pleroma_bot - INFO - Sleeping for 844s... 
ℹ 2023-01-24 08:06:06,751 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:20:08 UTC 
ℹ 2023-01-24 08:06:06,751 - pleroma_bot - INFO - Sleeping for 843s... 
ℹ 2023-01-24 08:06:06,896 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:20:08 UTC 
ℹ 2023-01-24 08:06:06,897 - pleroma_bot - INFO - Sleeping for 843s... 
ℹ 2023-01-24 08:06:07,341 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:20:08 UTC 
ℹ 2023-01-24 08:06:07,341 - pleroma_bot - INFO - Sleeping for 843s... 
ℹ 2023-01-24 08:06:08,136 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:20:08 UTC 
ℹ 2023-01-24 08:06:08,136 - pleroma_bot - INFO - Sleeping for 842s... 
ℹ 2023-01-24 08:20:54,980 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:35:10 UTC 
ℹ 2023-01-24 08:20:54,980 - pleroma_bot - INFO - Sleeping for 857s... 
ℹ 2023-01-24 08:20:55,049 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:35:10 UTC 
ℹ 2023-01-24 08:20:55,049 - pleroma_bot - INFO - Sleeping for 857s... 
ℹ 2023-01-24 08:20:55,431 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:35:10 UTC 
ℹ 2023-01-24 08:20:55,431 - pleroma_bot - INFO - Sleeping for 857s... 
ℹ 2023-01-24 08:20:58,658 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:35:10 UTC 
ℹ 2023-01-24 08:20:58,658 - pleroma_bot - INFO - Sleeping for 853s... 
ℹ 2023-01-24 08:36:12,944 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:50:12 UTC 
ℹ 2023-01-24 08:36:12,944 - pleroma_bot - INFO - Sleeping for 841s... 
ℹ 2023-01-24 08:36:13,033 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:50:12 UTC 
ℹ 2023-01-24 08:36:13,033 - pleroma_bot - INFO - Sleeping for 841s... 
ℹ 2023-01-24 08:36:17,755 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:50:12 UTC 
ℹ 2023-01-24 08:36:17,755 - pleroma_bot - INFO - Sleeping for 836s... 
ℹ 2023-01-24 08:36:20,916 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 08:50:12 UTC 
ℹ 2023-01-24 08:36:20,916 - pleroma_bot - INFO - Sleeping for 833s... 
ℹ 2023-01-24 08:51:07,559 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:05:14 UTC 
ℹ 2023-01-24 08:51:07,560 - pleroma_bot - INFO - Sleeping for 848s... 
ℹ 2023-01-24 08:51:07,726 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:05:14 UTC 
ℹ 2023-01-24 08:51:07,727 - pleroma_bot - INFO - Sleeping for 848s... 
ℹ 2023-01-24 08:51:08,119 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:05:14 UTC 
ℹ 2023-01-24 08:51:08,119 - pleroma_bot - INFO - Sleeping for 848s... 
ℹ 2023-01-24 08:51:12,048 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:05:14 UTC 
ℹ 2023-01-24 08:51:12,048 - pleroma_bot - INFO - Sleeping for 844s... 
Processing tweets... :  25%|█████████████████▌                                                    | 803/3209 [1:45:50<5:17:08,  7.91s/it]ℹ 2023-01-24 09:06:06,018 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:20:16 UTC 
ℹ 2023-01-24 09:06:06,019 - pleroma_bot - INFO - Sleeping for 852s... 
ℹ 2023-01-24 09:06:06,214 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:20:16 UTC 
ℹ 2023-01-24 09:06:06,214 - pleroma_bot - INFO - Sleeping for 852s... 
ℹ 2023-01-24 09:06:06,622 - pleroma_bot - INFO - Rate limit exceeded. 0 out of 300 requests remaining until 2023-01-24 09:20:16 UTC 
ℹ 2023-01-24 09:06:06,622 - pleroma_bot - INFO - Sleeping for 851s... 
Processing tweets... : 100%|███████████████████████████████████████████████████████████████████████| 3209/3209 [2:01:27<00:00,  2.27s/it]
ℹ 2023-01-24 09:21:29,642 - pleroma_bot - INFO - tweets to post: 	 2411 
Posting tweets... :   0%|                                                                                       | 0/2411 [00:00<?, ?it/s]✖ 2023-01-24 

✖ 2023-01-24 09:22:00,131 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:722) 

@robertoszek
Owner

Regarding the multiple lines of "Rate limit exceeded"/"Sleeping":
By default the bot splits the work among as many logical threads as half of your processor cores.

cores / 2 if cores > 4 else 4
If you have more than 4 cores, take half. If not, split it into 4 threads by default.

That way the bot can work in parallel making requests and processing tweets (regex matching, substituting, etc) and take only a portion of the time it would have taken if it did it single-threaded instead.
If you hit a rate limit and have, let's say 4 parallel threads making requests, they all will fail at roughly the same time and have to wait until the reset time.
Hence the multiple messages at once (I may try to make this more friendly/elegant, I haven't figured out yet exactly how).
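
The rule above can be sketched in a few lines. The function name `worker_count` is hypothetical, and using integer division is an assumption, since the rule as written uses `cores / 2`:

```python
import os


def worker_count(cores=None):
    """Number of parallel workers following the rule quoted above."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        cores = os.cpu_count() or 1
    # Integer division is an assumption; the rule is written as `cores / 2`.
    return cores // 2 if cores > 4 else 4
```

With 8 cores this gives 4 workers, which would explain seeing the "Rate limit exceeded" and "Sleeping" messages four times at each reset window.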

Ah, I think I see what may have happened here:
DEBUG:urllib3.connectionpool:https://myinstanceurl:443 "GET /api/v1/media HTTP/1.1" 404 None

A rogue debug GET statement wasn't commented out, most likely.

Any better luck with 1.2.1rc17?:
pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc17

@tomakun
Author

tomakun commented Mar 29, 2023

Hi @robertoszek I hope you are doing well.
I took a break after learning about the API changes brought by Twitter; however, I am noticing that the bot seems to be working fine in most cases with a few hundred imports.

However, I am still running into some crashes with bigger batches. I have not encountered the 404 issue mentioned above since then, but I will let you know if I do.

Here's the log for my most recent crash; I would greatly appreciate it if you could look into it. This is on a batch of 2000 tweets, during the Processing step, at about 50% progress:

Context:

DEBUG:pleroma_bot:1637850100155420673
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FrrP-3XaEAMQayU.jpg HTTP/1.1" 200 201479
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1640223930597400576?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 200 615
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FsM--CMaUAA1dpe.jpg HTTP/1.1" 200 207210
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FrrP-3XaQAAiJFy.jpg HTTP/1.1" 200 209466
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FsM--CIaQAAoWd5.jpg HTTP/1.1" 200 184088
DEBUG:pleroma_bot:1640733812996071424
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://pbs.twimg.com:443 "GET /media/FrrP-3WaYAIT6bp.jpg HTTP/1.1" 200 180186
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pbs.twimg.com:443
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1640730666861223937?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics%2Calt_text&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 200 203
Processing tweets... :  50%|██████████████████████████                          | 1164/2327 [2:16:14<2:16:07,  7.02s/it]
✖ 2023-03-29 03:06:47,570 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:721)

Error log

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 110, in process_tweets
    tweet["text"] = _get_rt_text(self, tweet)
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_processing.py", line 298, in _get_rt_text
    text = f"{prefix} {tweet_ref['data']['text']}"
KeyError: 'data'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/cli.py", line 657, in main
    tweets_to_post = process_parallel(
  File "/home/mastodon/.local/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 120, in process_parallel
    for idx, res in enumerate(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
KeyError: 'data'

Thank you in advance for your help.
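
For reference, the traceback shows `tweet_ref['data']` being indexed without checking that the key exists. One plausible cause, not confirmed anywhere in this thread, is a lookup response that carries only an `errors` entry (for example, for a deleted or protected referenced tweet). A defensive sketch under that assumption, with a hypothetical function name and fallback argument:

```python
def get_rt_text(prefix, tweet_ref, fallback_text):
    """Build retweet text, tolerating a lookup response without 'data'."""
    data = tweet_ref.get("data")
    if not data or "text" not in data:
        # Assumption: the lookup returned only an "errors" payload, so fall
        # back to the text already available on the original tweet.
        return fallback_text
    return f"{prefix} {data['text']}"
```

Falling back to the existing text keeps one unavailable referenced tweet from aborting the whole batch.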

@tomakun
Author

tomakun commented Apr 12, 2023

Hi @robertoszek, hope you are doing well. I hope you can follow up whenever you get a chance!
