1.1.0: Windows support, Better Multiprocessing, New Datasets
Windows support
Add Windows support (#644):
add tests and CI for Windows
fix numerous Windows-specific issues
The library now fully supports Windows
Dataset changes
New: HotpotQA (#703)
New: OpenWebText (#660)
New: Winogrande - add debiased subset (#655)
Update: XNLI - update download link (#695)
Update: text - switch to pandas reader, better memory usage, fix delimiter issues (#689)
Update: csv - add features parameter to CSV (#685)
Fix: GAP - fix wrong computation of boolean features (#680)
Fix: C4 - fix manual instruction function (#681)
Metric changes
Update: ROUGE - add rouge2 and rougeLsum to the ROUGE metric outputs by default (#701, #702)
Fix: SQuAD - fix kwargs description (#670)
Dataset Features
Use multiprocess from pathos for multiprocessing (#656):
allow lambda functions in multiprocessed map
allow local functions in multiprocessed map
and more, as long as the functions are compatible with dill
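For background, the standard library's `pickle`, which the built-in `multiprocessing` module relies on to send functions to worker processes, cannot serialize lambdas or locally defined functions. The sketch below uses only the standard library to illustrate that limitation. The helper names `can_pickle` and `make_local_function` are hypothetical and introduced here for illustration; this is not the library's own code.

```python
import pickle


def can_pickle(fn):
    """Return True if the standard pickle module can serialize fn."""
    try:
        pickle.dumps(fn)
        return True
    except Exception:
        # pickle stores functions by qualified name; lambdas and local
        # functions cannot be looked up by name, so serialization fails.
        return False


def make_local_function():
    # A function defined inside another function (a "local" function).
    def add_one(x):
        return x + 1

    return add_one


print(can_pickle(len))                    # built-in, pickled by reference
print(can_pickle(lambda x: x + 1))        # lambda
print(can_pickle(make_local_function()))  # local function
```

The last two calls are expected to report `False`. dill serializes such functions by value instead, which is why a dill-based `multiprocess` backend can accept them.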
Bug fixes
Datasets: fix possible program hanging with tokenizers - disable tokenizers parallelism in multiprocessed map (#688)
Datasets: fix cast with unordered features - fix column order issue in cast (#684)
Datasets: fix first time creation of cache directory - move cache dir root creation in builder's init (#677)
Datasets: fix OverflowError when using negative ids - fix negative ids in slicing with an array (#679)
Datasets: fix empty dictionaries after multiprocessing - keep new columns in transmit format (#659)
Datasets: fix type inference for nested types - handle data alteration when trying type (#653)
Metrics: fix compute metric with empty input - pass metric features to the reader (#654)
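The first fix in this list concerns the tokenizers library's own parallelism, which can hang when combined with forked worker processes. As a rough sketch of how a user might apply the same precaution manually, the snippet below sets the `TOKENIZERS_PARALLELISM` environment variable (read by the tokenizers library) before any workers start. The helper name `tokenizers_parallelism_disabled` is hypothetical, and this is not necessarily how the fix in #688 is implemented.

```python
import os
from contextlib import contextmanager


@contextmanager
def tokenizers_parallelism_disabled():
    """Temporarily set TOKENIZERS_PARALLELISM=false, then restore the old value."""
    key = "TOKENIZERS_PARALLELISM"
    previous = os.environ.get(key)
    os.environ[key] = "false"
    try:
        yield
    finally:
        # Restore the environment exactly as it was before entering.
        if previous is None:
            os.environ.pop(key, None)
        else:
            os.environ[key] = previous


with tokenizers_parallelism_disabled():
    # Worker processes started inside this block inherit the variable.
    print(os.environ["TOKENIZERS_PARALLELISM"])
```

Using a context manager keeps the change scoped, so code that runs afterwards sees the original setting.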
Documentation
Elasticsearch integration documentation (#696)
Tests
Use GitHub instead of AWS in remote dataset tests (#694)