Releases: mendableai/firecrawl

Extract Improvements - v1.4.1

24 Jan 22:50
fa5544a

We've significantly enhanced our data extraction capabilities with several key updates:

  • Extract now returns substantially more data, thanks to a new re-ranker system
  • Improved infrastructure reliability
  • Migrated from Cheerio to a high-performance Rust-based parser for faster and more memory-efficient parsing
  • Enhanced crawl cancellation functionality for better control over running jobs

What's Changed

Full Changelog: v1.4.0...1.4.1

Introducing /extract - v.1.4.0

20 Jan 14:17

Get structured web data with /extract

We’re excited to announce the release of /extract - get data from any website with just a prompt. With /extract, you can retrieve any information from anywhere on a website without being limited by scraping roadblocks or the typical context constraints of LLMs.

No more manual copy-pasting, broken scraping scripts, or debugging LLM calls. It’s never been easier to enrich your data, create datasets, or power AI applications with clean, structured data from any website.

Companies are already using extract to:

  • Enrich CRM data
  • Streamline KYB processes
  • Monitor competitors
  • Supercharge onboarding experiences
  • Build targeted prospecting lists

Instead of spending hours manually researching, fixing broken scrapers, or piecing together data from multiple sources, simply specify what information you need and the target website, and let Firecrawl handle the entire retrieval process.

Specifically, you can:

  • Extract structured data from entire websites using URL wildcards (https://example.com/*)
  • Define custom schemas to capture exactly what you need—from simple product details to complex organizational structures
  • Guide the extraction with custom prompts to ensure the LLM focuses on your target information
  • Deploy anywhere with comprehensive support for Python, Node, cURL, and other popular tools. For no-code workflows, just connect via Zapier or use our API to set up integrations with other tools.

This versatility translates into a wide range of real-world applications—enabling you to enrich web data for just about any use case.
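
As a rough illustration of the capabilities listed above, the sketch below assembles a request body that combines a wildcard URL, a custom schema, and a guiding prompt. The helper `build_extract_request` and the field names `urls`, `prompt`, and `schema` are assumptions made for this example, not confirmed API details; the Extract Beta documentation is the authority on the real parameters.

```python
import json
from typing import Any


def build_extract_request(urls: list[str], prompt: str, schema: dict[str, Any]) -> dict[str, Any]:
    """Assemble a JSON-serializable body for an /extract call.

    The field names ("urls", "prompt", "schema") are assumptions made for
    illustration; check the Extract Beta documentation for the real ones.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    return {"urls": urls, "prompt": prompt, "schema": schema}


# A JSON Schema describing the structured data we want back.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "string"},
    },
    "required": ["name"],
}

body = build_extract_request(
    urls=["https://example.com/*"],  # trailing wildcard targets the whole site
    prompt="Extract the name and price of each product.",
    schema=product_schema,
)
print(json.dumps(body, indent=2))
```

The body is plain JSON, so it could be sent with any HTTP client or passed to an SDK. No network call is made here.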

Limitations (and the road ahead)

Let's be honest: while /extract is pretty awesome at grabbing web data, it's not perfect yet. Here's what we're still working on:

  • Big sites are tricky - It can't (yet!) grab every single product on Amazon in one go
  • Complex searches need work - Things like "find all posts from 2025" aren't quite there
  • Sometimes, it's a bit quirky - Results can vary between runs, though it usually gets what you need

But here's the exciting part: we're seeing the future of web scraping take shape.

Try it out

Curious to try /extract out for yourself?

  • Visit our playground to try out /extract - you get 500,000 tokens for free
  • Dive into our Extract Beta documentation for detailed technical guidance and API reference
  • Want a no-code solution? Connect /extract to thousands of applications through our enhanced Zapier integration

That's all for now! Happy Extracting from the whole Firecrawl team 🔥

Full Changelog: v.1.3.0...v1.4.0

v1.3 - /extract improvements

14 Jan 22:40

What's Changed

Full Changelog: v1.2.1...v.1.3.0

v1.2.1 - /extract Beta Improvements

10 Jan 17:54

What's Changed

/extract (beta) changes

  • The /extract endpoint is now asynchronous. A request to /extract returns an ID that you can use to check the status of your extract job. If you use our SDKs, no code changes are required, but please update the SDKs to the latest versions as soon as possible.

  • For those using the API directly, the change is backwards compatible. However, you have 10 days to update your implementation to the new asynchronous model.

  • For more details about the parameters, refer to the docs sent to you.
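
For API users moving to the asynchronous model, the submit-then-poll flow described above might look roughly like the sketch below. The `get_status` callable is hypothetical: it stands in for whatever status request your client makes with the returned ID. The `status` key and the values `processing` and `completed` are also assumptions rather than documented behavior.

```python
import time
from typing import Any, Callable


def wait_for_extract(
    job_id: str,
    get_status: Callable[[str], dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> dict[str, Any]:
    """Poll a job until it leaves the in-progress state or the timeout expires.

    get_status is a hypothetical callable that performs the status request
    for job_id and returns the decoded JSON. The "status" key and its values
    ("processing", "completed", ...) are assumed, not taken from the docs.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = get_status(job_id)
        if result.get("status") != "processing":
            return result  # finished, failed, or cancelled: the caller decides
        if time.monotonic() >= deadline:
            raise TimeoutError(f"extract job {job_id} still processing after {timeout}s")
        time.sleep(poll_interval)


# Stand-in for the real status call: reports "processing" twice, then completes.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "data": {"name": "Example"}},
])

final = wait_for_extract("job-123", lambda _id: next(_responses), poll_interval=0.0)
print(final["status"])
```

Passing the status call in as a function keeps the polling logic independent of any particular HTTP client, which also makes it easy to test without network access.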

New Contributors

Full Changelog: v1.2.0...v1.2.1

Changelog: https://www.firecrawl.dev/changelog#/extract-changes

v1.2.0 - v1/search is now available!

02 Jan 23:24

/v1/search

The search endpoint combines web search with Firecrawl’s scraping capabilities to return full page content for any query.

Include scrapeOptions with formats: ["markdown"] to get the complete markdown content of each search result. Otherwise, the endpoint defaults to returning SERP results (url, title, description).

More info in the /v1/search docs.
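
To make the two modes concrete, here is a minimal sketch that builds a search body with and without scrapeOptions. The names `scrapeOptions` and `formats` come from the note above; the `query` field and the helper `build_search_request` are assumptions made for illustration.

```python
from typing import Any, Optional


def build_search_request(query: str, formats: Optional[list[str]] = None) -> dict[str, Any]:
    """Build a body for /v1/search.

    "query" is an assumed field name; scrapeOptions and formats are the
    names used in the release note.
    """
    body: dict[str, Any] = {"query": query}
    if formats:
        # Only attach scrapeOptions when page content is wanted.
        body["scrapeOptions"] = {"formats": formats}
    return body


# No scrapeOptions: expect SERP-style results (url, title, description).
serp_only = build_search_request("firecrawl release notes")

# With formats: ["markdown"], each result should also carry markdown content.
with_content = build_search_request("firecrawl release notes", formats=["markdown"])

print(serp_only)
print(with_content)
```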

What's Changed

Full Changelog: v1.1.1...v1.2.0

v1.1.1

30 Dec 15:30

What's Changed

New Contributors

Full Changelog: v1.1.0...v1.1.1

v1.1.0

27 Dec 18:34
c5b6495

Starting today, we will post weekly releases here and on firecrawl.dev/changelog. This release summarizes all the improvements and fixes we have pushed since the v1 release. Thank you all for the contributions!

Changelog Highlights

Feature Enhancements

  • New Features:
    • Geolocation, mobile scraping, 4x faster parsing, and better webhooks.
    • Credit packs, auto-recharges, and batch scraping support.
    • Iframe support and query parameter differentiation for URLs.
    • Similar URL deduplication.
    • Enhanced map ranking and sitemap fetching.

Performance Improvements

  • Faster crawl status filtering and improved map ranking algorithm.
  • Optimized Kubernetes setup and simplified build processes.
  • Improved sitemap discoverability and performance.

Bug Fixes

  • Resolved issues:
    • Badly formatted JSON, scrolling actions, and encoding errors.
    • Crawl limits, relative URLs, and missing error handlers.
  • Fixed self-hosted crawling inconsistencies and schema errors.

SDK Updates

  • Added dynamic WebSocket imports with fallback support.
  • Optional API keys for self-hosted instances.
  • Improved error handling across SDKs.

Documentation Updates

  • Improved API docs and examples.
  • Updated self-hosting URLs and added Kubernetes optimizations.
  • Added articles: mastering /scrape and /crawl.

Miscellaneous

  • Added new Firecrawl examples.
  • Enhanced metadata handling for webhooks and improved sitemap fetching.
  • Updated blocklist and streamlined error messages.

What's Changed

Welcome to v1 - A more reliable and developer friendly API

05 Sep 20:28
554a050

Firecrawl V1 is here! It introduces a more reliable and developer-friendly API.

August 29th, 2024

Here is what’s new:

  • Output Formats for /scrape. Choose what formats you want your output in.
  • New /map endpoint for getting most of the URLs of a webpage.
  • Developer-friendly API for /crawl/{id} status.
  • 2x Rate Limits for all plans.
  • Go SDK and Rust SDK.
  • Teams support.
  • API Key Management in the dashboard.
  • onlyMainContent now defaults to true.
  • /crawl webhooks and websocket support.
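
As one way to picture the output-format and onlyMainContent changes listed above, the sketch below builds a /scrape body. The helper `build_scrape_request`, the `url` and `formats` field names, and the format value `html` are assumptions for illustration; only `onlyMainContent`, its new default of true, and the `markdown` format appear in these release notes.

```python
from typing import Any


def build_scrape_request(
    url: str,
    formats: list[str],
    only_main_content: bool = True,
) -> dict[str, Any]:
    """Build a body for a v1 /scrape call.

    "url" and "formats" are assumed field names. onlyMainContent is only
    sent when overriding the v1 default of true.
    """
    if not formats:
        raise ValueError("choose at least one output format")
    body: dict[str, Any] = {"url": url, "formats": formats}
    if not only_main_content:
        body["onlyMainContent"] = False
    return body


# Relies on the v1 default (main content only), asking for two formats.
default_body = build_scrape_request("https://example.com", ["markdown", "html"])

# Explicitly opts back in to the full page, including navigation and footers.
full_page = build_scrape_request("https://example.com", ["markdown"], only_main_content=False)

print(default_body)
print(full_page)
```

Callers who relied on the old behavior would need the explicit override shown in the second call.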

Learn more about it here

Start using v1 right away at https://firecrawl.dev

What's Changed (including v0 + v1)