From 9924c805a1edaeef3ee2c5361c5bde5630d7ebdf Mon Sep 17 00:00:00 2001
From: Adam Brown
Date: Wed, 4 Dec 2024 20:47:05 -0800
Subject: [PATCH] Update README.md grammar changes

---
 Fdic/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Fdic/README.md b/Fdic/README.md
index 37a7645..a14a731 100644
--- a/Fdic/README.md
+++ b/Fdic/README.md
@@ -27,7 +27,7 @@ FrequencyDictionaryIO.writeFdic(dictionary, "en-80k.fdic")
 ```

 ## Performance
-Performance varies greatly depending on the combination of machine and dictionary being decoded. But `fdic` always
+Performance varies greatly depending on the combination of machine and dictionary being decoded. But `fdic` is always
 superior in both size and speed to both a plain text dictionary, and a GZIPed text dictionary.

 Some Machine/Dictionary combinations show more modest speed improvements, but considering simply GZIPing would have
@@ -111,7 +111,7 @@ Integer permission to represent the frequency for `is`.
 Pretty quickly the terms require fewer than 8 characters to represent their frequency, so representing it as a binary
 number begins to increase the overall size of each entry.

-To solve this I came up with `Variable Length Longs` which only take as many bytes as necisary, the smallest numbers
+To solve this I came up with `Variable Length Longs` which only take as many bytes as necessary, the smallest numbers
 requiring just 1 byte when encoded.

 ### How
@@ -119,7 +119,7 @@ This is achieved by using the **Most Significant Bit** (_MSB_) as a `Continuatio
 encoded byte only represents 7 bits of data, meaning in worst case scenarios we could use up to 10 bytes to represent an
 8 byte Long. In this use case it never occurs, as term frequencies are never that large.

-When we begin reading a `vlong` feild, we mask out the continuation bit, take the 7 data bits, shift them depending on
+When we begin reading a `vlong` field, we mask out the continuation bit, take the 7 data bits, shift them depending on
 how many bytes we've read so far for this vlong, and add that to an accumulator Long.

 ```mermaid