Scrape all languages of a category by default #27

Open
octopusinvitro opened this issue Aug 10, 2016 · 1 comment

Comments

@octopusinvitro

octopusinvitro commented Aug 10, 2016

For Wikidata scrapers that scrape a category (for example, Taiwan), it would be nice if they automatically scraped every available language version of that category, so that we pick up politicians who are listed in some language versions but not in others.

For example, in the scraper linked above, Lee Ching-hua was removed from the Chinese page for the Category of Members of the 8th Legislative Yuan, but he was still listed on the English version of the page. He did serve in that term.

Because we were only scraping the Chinese version of that category at the time of writing this issue, we lost him. Had we also been scraping the English version, we wouldn't have.
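
To make the idea concrete, here is a rough Python sketch of one way it could work against the public MediaWiki API. It is not the scraper's actual code. The function names are made up for illustration, and it assumes each langlink code matches the Wikipedia subdomain, which does not hold for every language edition. It looks up the other language versions of a category through `prop=langlinks`, lists the members of each version, and merges them by Wikidata item ID so the same person is not counted twice.

```python
import json
import urllib.parse
import urllib.request


def api_get(lang, params):
    """Call the MediaWiki API of one Wikipedia language edition."""
    query = urllib.parse.urlencode({**params, "format": "json", "formatversion": "2"})
    url = f"https://{lang}.wikipedia.org/w/api.php?{query}"
    # Hypothetical User-Agent string; replace it with one identifying your tool.
    req = urllib.request.Request(url, headers={"User-Agent": "category-scraper-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def query_all(lang, params, fetch=api_get):
    """Yield every response batch, following the API's 'continue' tokens."""
    cont = {}
    while True:
        data = fetch(lang, {**params, **cont})
        yield data
        if "continue" not in data:
            break
        cont = data["continue"]


def category_versions(lang, title, fetch=api_get):
    """Map language code -> category title, including the starting language."""
    versions = {lang: title}
    params = {"action": "query", "prop": "langlinks", "titles": title, "lllimit": "max"}
    for data in query_all(lang, params, fetch):
        for page in data.get("query", {}).get("pages", []):
            for link in page.get("langlinks", []):
                versions[link["lang"]] = link["title"]
    return versions


def wikidata_ids(lang, title, fetch=api_get):
    """Collect the Wikidata item IDs of the pages in one category."""
    ids = set()
    params = {
        "action": "query",
        "generator": "categorymembers",
        "gcmtitle": title,
        "gcmlimit": "max",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
    }
    for data in query_all(lang, params, fetch):
        for page in data.get("query", {}).get("pages", []):
            item = page.get("pageprops", {}).get("wikibase_item")
            if item:  # pages without a linked Wikidata item are skipped
                ids.add(item)
    return ids


def members_in_all_languages(lang, title, fetch=api_get):
    """Union of the members of every language version of a category."""
    found = set()
    for version_lang, version_title in category_versions(lang, title, fetch).items():
        found |= wikidata_ids(version_lang, version_title, fetch)
    return found
```

Calling `members_in_all_languages("zh", "<category title on zh.wikipedia>")` would then return the combined set of IDs, so someone missing from the Chinese category but still present in the English one would be kept. The `fetch` parameter is only there so the network call can be swapped out when testing.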

@octopusinvitro

Since I want to learn to work with the Wikipedia API, and even though I haven't looked at the code yet, I'm boldly and adventurously assigning myself to this one.

@octopusinvitro octopusinvitro self-assigned this Aug 10, 2016
@octopusinvitro octopusinvitro removed their assignment Nov 19, 2016