c4meta

Lifecycle: experimental

The c4meta package provides metadata about the C4 dataset (Google’s Colossal Clean Crawled Corpus).

Installation

c4meta is not yet available on CRAN. Once it is released, you will be able to install it with:

# install.packages("c4meta")

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("macmillancontentscience/c4meta")

Example

Coming soon.

Code of Conduct

Please note that the c4meta project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Disclaimer

This is not an officially supported Macmillan Learning product.

Contact information

Questions or comments should be directed to Jonathan Bratt ([email protected]) and Jon Harmon ([email protected]).

License

The C4 dataset is released under the terms of ODC-BY. By using this package, you are also bound by the Common Crawl terms of use in respect of the content contained in the dataset.
