[Self-Host] fetch engine does not support proxy settings #1035

Open
mschfh opened this issue Jan 3, 2025 · 1 comment

mschfh commented Jan 3, 2025

Describe the Issue
The fetch engine ignores the proxy settings, so requests that fall back from playwright to fetch bypass the configured proxy.

Expected Behavior
The fetch scraper should use the same proxy settings as the playwright-service:

PROXY_SERVER=
PROXY_USERNAME=
PROXY_PASSWORD=

Environment (please complete the following information):

  • N/A

Logs

worker-1              | 2025-01-03 16:30:37 info [ScrapeURL:]: Scraping via playwright...
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: An unexpected error happened while scraping with playwright.
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: Scraping via fetch...
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: Scrape via fetch deemed successful.

Configuration
N/A

Additional Context

This should be fixable by adding a ProxyAgent and passing it via the dispatcher parameter here:
https://github.com/mendableai/firecrawl/blob/87757d9b8e6bacc658b48832deb47c51eaf7412a/apps/api/src/scraper/scrapeURL/engines/fetch/index.ts#L17C7-L20
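
A minimal sketch of that idea, assuming the fetch engine uses undici (whose `fetch` accepts a `dispatcher` option and whose `ProxyAgent` takes `uri` and `token` options). The helper name `proxyOptionsFromEnv` is hypothetical, and treating a scheme-less `PROXY_SERVER` as an HTTP proxy is an assumption, not something the playwright-service is confirmed to do:

```typescript
// Subset of the options accepted by undici's ProxyAgent constructor.
interface ProxyOptions {
  uri: string;
  token?: string;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns undefined when no proxy is configured,
// so the caller can keep using a direct connection.
function proxyOptionsFromEnv(env: Env): ProxyOptions | undefined {
  const server = env.PROXY_SERVER?.trim();
  if (!server) return undefined;

  // Assumption: a bare "host:port" value refers to an HTTP proxy.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(server);
  const uri = hasScheme ? server : `http://${server}`;

  const username = env.PROXY_USERNAME;
  if (!username) return { uri };

  // The token is sent as the Proxy-Authorization header.
  // btoa only handles Latin-1 input; other credentials need a different encoder.
  const password = env.PROXY_PASSWORD ?? "";
  return { uri, token: `Basic ${btoa(`${username}:${password}`)}` };
}

// Wiring into the fetch call (requires the undici package, so shown as comments):
//   import { ProxyAgent, fetch } from "undici";
//   const opts = proxyOptionsFromEnv(process.env);
//   const dispatcher = opts ? new ProxyAgent(opts) : undefined;
//   const response = await fetch(url, { dispatcher });
```

Leaving `dispatcher` undefined when `PROXY_SERVER` is empty keeps the current behavior for setups without a proxy.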

mr-v-v-v commented Jan 21, 2025

Yes, I've been waiting for this feature for a long time. There's also an issue here:
#925
