Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[7.3.0] - 2022.12.22

Changed

To speed up URL parsing, we no longer parse URLs that contain userinfo (the part before "@") in the authority (see URL syntax guide for more details)
- Our reasoning is that userinfo is rarely present
- If you have concerns about this change or would like to see it added back in (it could be optionally enabled), please raise an issue
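
If you want to know ahead of time which URLs in your input are affected by this change, one option is to check for userinfo yourself before parsing. The sketch below uses only Python's standard library; has_userinfo is a hypothetical helper for illustration and is not part of ioc-finder.

```python
from urllib.parse import urlsplit


def has_userinfo(url: str) -> bool:
    """Return True if the URL's authority contains a userinfo component.

    Hypothetical helper (not part of ioc-finder).
    """
    # urlsplit puts everything between "//" and the next "/", "?" or "#"
    # into netloc, so an "@" there separates userinfo from the host.
    return "@" in urlsplit(url).netloc


print(has_userinfo("https://user:pass@example.com/path"))
print(has_userinfo("https://example.com/a@b"))
```

An "@" that appears only in the path or query is not counted, since it is not part of the authority.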

[7.2.4] - 2022.08.25

Fixed

URL boundary detection now better respects the conventions of human language regarding quotation marks and parentheses (#130)

[7.2.3] - 2022.07.14

Fixed

Updated the required version of ioc-fanger, which fixes issues with non-http(s) URL schemes (#255)

[7.2.2] - 2022.07.08

Fixed

Poorly designed grammars which were SIGNIFICANTLY slowing down this project (#250)
- 🎉 This update improves mean run-times by ≈70%!
- Thanks to @ptmcg for his contribution!

[7.2.1] - 2022.07.05

Fixed

Removed duplicative function calls

[7.2.0] - 2022.06.20

Changed

Possible breaking change: Update required pyparsing version to v3
- Although there are no public API changes associated with this version, this may be a breaking change if you are using ioc-finder and have pyparsing pinned to a version less than v3
- I've chosen to release this as a new minor version because I think requirement version updates with no API changes and no system requirement changes constitute a minor version change
Updated parsing of Google Analytics Tracker IDs so that matches must be all lower-cased or all upper-cased (e.g. ua-... and UA-... will be matched, but uA-... will not); this makes the parsing consistent with how Google Adsense Publisher IDs are parsed
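
The case rule for Google Analytics Tracker IDs can be illustrated with a small check on the prefix alone. This is only a sketch of the rule as described above, not the grammar ioc-finder uses; has_consistent_case_prefix is a hypothetical name and the digits in the sample IDs are made up.

```python
def has_consistent_case_prefix(tracker_id: str) -> bool:
    """Return True if the text before the first hyphen is exactly "ua" or "UA".

    Hypothetical helper (not part of ioc-finder); it checks only the prefix
    case and does not validate the rest of the ID.
    """
    prefix = tracker_id.split("-", 1)[0]
    return prefix in ("ua", "UA")


for sample in ("UA-12345-1", "ua-12345-1", "uA-12345-1"):
    print(sample, has_consistent_case_prefix(sample))
```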

[7.1.0] - 2022.06.13

Added

included_ioc_types option to only parse specified IOC types (#218)

Changed

Imphashes are no longer parsed as md5s even when parse_imphashes is False (#231)
Authentihashes are no longer parsed as sha256s even when parse_authentihashes is False (#231)

[7.0.0] - 2022.05.27

Added

Support for Python 3.10 (#188)

Removed

Phone number parsing (#155)
Support for Python 3.6 (#187)

[6.0.1] - 2021.06.09

Fixed

ASN grammar improved to reduce false positives by not matching on lower-case "as " (#136)
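
To see why lower-case "as " causes false positives, consider ordinary prose such as "as 100 people agreed". The sketch below uses a case-sensitive regular expression as a stand-in for the real grammar; the pattern (an "AS" or "ASN" prefix, optional whitespace, then digits) is an assumption made for illustration, and find_asns is a hypothetical name.

```python
import re

# Assumed shape of an ASN for this sketch: "AS" or "ASN", optional
# whitespace, then digits. Matching is case-sensitive on purpose.
ASN_PATTERN = re.compile(r"\bASN?\s?(\d+)\b")


def find_asns(text: str) -> list:
    """Return the numeric part of each upper-case ASN found in text."""
    return ASN_PATTERN.findall(text)


print(find_asns("Traffic from AS12345 and ASN 678"))
print(find_asns("as 100 people agreed"))
```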

[6.0.0] - 2021.05.20

Changed

Made all boolean arguments keyword-only arguments (#108)
Converting data from lists to tuples (#110)
Made _prepare_text function public (prepare_text) (#114)
Renamed no_urls_without_schemes to parse_urls_without_scheme (#109)
Moved from MIT License to GNU Lesser General Public License v3.0 (#113)
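
For anyone updating call sites, the sketch below shows what keyword-only boolean arguments and tuple return values look like in practice. find_example is a hypothetical stand-in written for this illustration; it is not ioc-finder's implementation, and only the argument name parse_urls_without_scheme is taken from the entries above.

```python
from typing import Tuple


def find_example(text: str, *, parse_urls_without_scheme: bool = True) -> Tuple[str, ...]:
    """Hypothetical stand-in: options after "*" can only be passed by keyword."""
    words = text.split()
    if not parse_urls_without_scheme:
        # Keep only tokens that carry an explicit scheme.
        words = [w for w in words if "://" in w]
    # Return an immutable tuple rather than a list.
    return tuple(words)


print(find_example("https://a.example b.example", parse_urls_without_scheme=False))

try:
    find_example("https://a.example", False)  # positional boolean is rejected
except TypeError as error:
    print("TypeError:", error)
```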

Fixed

Unquoting URLs appropriately (#104)
Pinned specific ioc-fanger version (this prevents an error where ioc-fanger was removing a URL in the query parameter of another URL - see #104)

[5.0.3] - 2021.04.09

Fixed

Unquoting URLs appropriately (#104)
Pinned specific ioc-fanger version (this prevents an error where ioc-fanger was removing a URL in the query parameter of another URL - see #104)

[5.0.2] - 2021.04.02

Changed

Improved URL grammar

Fixed

CIDR ranges are no longer detected as URLs when parse_urls_without_scheme=True (see #91)
Parse observables from URL path when parse_domain_from_url=False and parse_from_url_path=True (see #90)
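
The CIDR fix addresses an ambiguity: a string like 192.168.0.0/24 looks like a host followed by a path when URLs without a scheme are allowed. One way to tell the two apart, sketched here with Python's standard ipaddress module, is to test whether the whole token parses as a network. This is not how ioc-finder resolves the ambiguity internally; is_cidr_range is a hypothetical helper.

```python
import ipaddress


def is_cidr_range(token: str) -> bool:
    """Return True if token is an IP network written in CIDR notation.

    Hypothetical helper (not part of ioc-finder).
    """
    if "/" not in token:
        return False
    try:
        # strict=False accepts ranges whose host bits are set, e.g. 192.168.0.5/24.
        ipaddress.ip_network(token, strict=False)
    except ValueError:
        return False
    return True


for sample in ("192.168.0.0/24", "example.com/24", "10.0.0.1"):
    print(sample, is_cidr_range(sample))
```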

[5.0.1] - 2021.01.11

Changed

Improved word boundary (specifically of MAC address and IP address grammars)

[5.0.0] - 2020.09.25

Removed

Concurrency (through the use of concurrent.futures)

[4.0.2] - 2020.09.18

Added

Parsing of Monero addresses (see #94)

Changed

Simplifying _remove_url_paths (a function used behind the scenes by the ioc finder - see #70)
Created a function to update top level domains (see #10)
Updating top level domains (which are used in grammars to find network observables)

[4.0.1] - 2020.09.11

Changed

You can now ingest text using the CLI. For example, this now works: cat foo.text | ioc-finder
We now have 100% code coverage!!!
Added more keywords so this package is easier to find on PyPI

[4.0.0] - 2020.09.09

Changed

We are now parsing observables from URL paths by default (see #87). If you would like to disable this functionality, you may do so by setting the parse_from_url_path keyword argument to False when calling the find_iocs function (e.g. parse_from_url_path=False).

<= 3.1.2 - 2020.08.29

The change log was added for version 3.1.2