- performance enhancements
- use mmap() for code generation, to avoid memory copy
- use Slice<> structure to avoid memory copy
- parallel codegen
- auto mkdir when output dir doesn't exist
- fix: minor bugs
- export a C API for easy integration
- remove the boost::thread dependency, for a simpler build
- use ghc::filesystem instead of a hand-written Path.h
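
To illustrate the mmap() and auto-mkdir items above, here is a minimal POSIX-only sketch, not the project's actual code. The helper name `write_generated_file` is hypothetical, and `std::filesystem` (C++17) stands in for ghc::filesystem, which offers the same interface. For brevity the sketch copies the text into the mapping once; a real generator could presumably write straight into the mapped buffer to skip that copy.

```cpp
#include <cstring>
#include <filesystem>
#include <string_view>
#include <system_error>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

namespace fs = std::filesystem;

// Hypothetical helper: writes `text` to `path` through a shared file mapping,
// creating any missing parent directories first. Returns false on failure.
bool write_generated_file(const fs::path& path, std::string_view text) {
    std::error_code ec;
    if (path.has_parent_path()) {
        // No error is reported if the directory already exists.
        fs::create_directories(path.parent_path(), ec);
        if (ec) return false;
    }

    int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    // mmap() rejects a zero-length mapping, so an empty file is done here.
    if (text.empty()) {
        ::close(fd);
        return true;
    }

    // Grow the file to its final size before mapping it.
    if (::ftruncate(fd, static_cast<off_t>(text.size())) != 0) {
        ::close(fd);
        return false;
    }

    void* dst = ::mmap(nullptr, text.size(), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (dst == MAP_FAILED) return false;

    std::memcpy(dst, text.data(), text.size());

    // Flush the dirty pages to the file, then release the mapping.
    bool ok = ::msync(dst, text.size(), MS_SYNC) == 0;
    ok = (::munmap(dst, text.size()) == 0) && ok;
    return ok;
}
```

Calling `write_generated_file("out/gen/foo.h", source)` would create `out/gen` if needed and then write the file; the path and the `source` variable are placeholders.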