Home
Manal Shaikh edited this page Oct 27, 2020
Welcome to the WPExtractor wiki!
WPExtractor is a Python-based tool for building datasets for Artificial Intelligence projects. It collects data from WordPress blogs, which can then be used to train a bot.
- Automatically extracts all posts from a WordPress website within seconds.
- Saves the data to a JSON file in the directory for you.
- Can bypass certain user-agent restrictions by sending a custom default user-agent.
- Easily understandable JSON format to make your life easier :D
- Responsive developers. Just make an issue, we'll fix it for you :)
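To give a rough idea of what the extraction and user-agent features involve, here is a minimal sketch. It is not WPExtractor's actual source: it assumes the site exposes the standard WordPress REST API routes (`/wp-json/wp/v2/posts` and `/wp-json/wp/v2/pages`), and the names `build_request`, `fetch_all`, `save_json` and the `DEFAULT_USER_AGENT` string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical default; the user-agent WPExtractor really sends is not documented here.
DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; dataset-collector/0.1)"


def build_request(site_url, kind="posts", page=1, per_page=100,
                  user_agent=DEFAULT_USER_AGENT):
    """Build a GET request for one page of results from the WordPress REST API."""
    query = urllib.parse.urlencode({"per_page": per_page, "page": page})
    url = f"{site_url.rstrip('/')}/wp-json/wp/v2/{kind}?{query}"
    # Some sites reject the default Python user-agent, so send a custom one.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_all(site_url, kind="posts"):
    """Collect every item of the given kind by walking the paginated API."""
    items, page, total_pages = [], 1, 1
    while page <= total_pages:
        request = build_request(site_url, kind, page)
        with urllib.request.urlopen(request, timeout=30) as resp:
            # WordPress reports the number of pages in the X-WP-TotalPages header.
            total_pages = int(resp.headers.get("X-WP-TotalPages", "1"))
            items.extend(json.load(resp))
        page += 1
    return items


def save_json(items, path):
    """Write the collected items to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(items, fh, ensure_ascii=False, indent=2)
```

With these helpers, `save_json(fetch_all("https://example.com"), "posts.json")` would download the posts and store them; `example.com` and `posts.json` are placeholders, so only run it against a site whose license or author allows it.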
Usage:
Downloading posts from a WordPress website -
python main.py -u https://shadowhosting.net/blog
Downloading pages from a WordPress website -
python main.py -u https://itsfoss.com/wp-json -p
Note - The URLs above are only examples for this wiki. Do not copy content from those sites unless the license or author specifically allows it.
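For readers curious how a command line like the ones above can be handled, here is a small sketch using Python's standard `argparse` module. It is an illustration only and may differ from what `main.py` really does: the short flags `-u` and `-p` come from the usage examples, while the long names `--url` and `--pages` and the function `parse_args` are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Extract WordPress content to JSON.")
    # -u appears in the wiki; the long name --url is a hypothetical alias.
    parser.add_argument("-u", "--url", required=True,
                        help="address of the WordPress site")
    # -p appears in the wiki; the long name --pages is a hypothetical alias.
    parser.add_argument("-p", "--pages", action="store_true",
                        help="download pages instead of posts")
    return parser.parse_args(argv)


# Placeholder arguments standing in for a real command line.
args = parse_args(["-u", "https://example.com", "-p"])
kind = "pages" if args.pages else "posts"
```

Making `-p` a plain on/off switch keeps posts as the default, which matches the two usage examples: the first has no `-p` and downloads posts, the second adds `-p` and downloads pages.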
WPExtractor is licensed under GPL v3.0.