v.1.4.4 Minor Release
Changes:
- New tsv-sample option --i|inorder. This option preserves input order when using simple or weighted random sampling. These sampling modes are engaged when a sample size is selected via the --n|num NUM option. Documentation was updated to better reflect the distinction between shuffling the full data set and random sampling, which selects a subset of lines. (PR #226)
- tsv-summarize --min and --max operators changed to preserve original input string. The prior behavior of the operators was to read the values to a double, then use numeric formatting to print the recorded double. In some cases this would cause the original input to change, especially if it was a long format number, for example, 16 digits long. (PR #220)
  The prior behavior makes sense for calculations like mean and median, but not for min and max. In particular, preserving the original values allows them to be joined with or compared to the original data.
- Prebuilt binaries have been updated to use the latest LDC compiler (1.17.0).
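The difference between shuffling and in-order random sampling described in the tsv-sample item can be sketched in a few lines of Python. This is an illustration only, not tsv-sample's implementation: the function names `shuffle_lines` and `sample_lines` and the `inorder` and `seed` parameters are hypothetical, and only simple (unweighted) sampling is shown.

```python
import random

def shuffle_lines(lines, seed=None):
    """Return every line in random order (shuffling the full data set)."""
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    return shuffled

def sample_lines(lines, num, inorder=False, seed=None):
    """Pick up to `num` lines at random; optionally keep them in input order."""
    rng = random.Random(seed)
    lines = list(lines)
    # Sample positions rather than lines so the input order can be restored.
    picked = rng.sample(range(len(lines)), min(num, len(lines)))
    if inorder:
        picked.sort()
    return [lines[i] for i in picked]

if __name__ == "__main__":
    data = [f"row{i}" for i in range(10)]
    print(shuffle_lines(data, seed=1))                   # all 10 rows, random order
    print(sample_lines(data, 3, seed=1))                 # 3 rows, random order
    print(sample_lines(data, 3, inorder=True, seed=1))   # 3 rows, input order
```

Sampling positions and sorting them afterwards is one simple way to get the in-order result; a streaming tool would likely use a different technique to avoid holding all lines in memory.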
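The tsv-summarize change to min and max can be illustrated with a small Python sketch. It is not the tool's code: the function names are hypothetical, and the 12-significant-digit format is an assumption chosen only to show how printing a parsed double can alter a long number.

```python
def min_max_as_double(values, fmt=".12g"):
    """Prior-style behavior: parse to a double, then print with numeric formatting.

    The default format is an assumption for illustration.
    """
    nums = [float(v) for v in values]
    return format(min(nums), fmt), format(max(nums), fmt)

def min_max_preserving_input(values):
    """New-style behavior: compare numerically, but return the original strings."""
    return min(values, key=float), max(values, key=float)

if __name__ == "__main__":
    column = ["1234567890123456", "42", "1234567890123457"]
    print(min_max_as_double(column))         # max no longer matches the input text
    print(min_max_preserving_input(column))  # both values appear verbatim in the input
```

Because the second function returns the input strings unchanged, its results can be matched exactly against the original data, which is the join and comparison use case the release note mentions.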
To download and unpack the prebuilt binaries:
$ # Linux
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.4.4/tsv-utils-v1.4.4_linux-x86_64_ldc2.tar.gz | tar xz
$ # MacOS
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.4.4/tsv-utils-v1.4.4_osx-x86_64_ldc2.tar.gz | tar xz