
Add DigestInputStream utils #110

Merged · 2 commits merged into master on Feb 13, 2017
Conversation

pathikrit (Owner)

No description provided.

@pathikrit pathikrit merged commit 51c22e8 into master Feb 13, 2017
@pathikrit pathikrit deleted the cleanup-pr-109 branch February 13, 2017 20:05
@solicode (Contributor) left a comment

The API changes look fine to me. As for performance, it looks nearly identical, with the one exception of the buffer size. 1024 seems too conservative to me. I think you'll see a decent speed increase by picking a value like 4096 or 8192, since that aligns better with the block sizes of most systems. For example, BufferedInputStream.DEFAULT_BUFFER_SIZE is 8192. But if you'd rather tweak that value later, after a benchmark is added, that's fine too.
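The buffer-size point can be illustrated with a minimal JDK-only sketch (this is not better-files' actual implementation; `ChecksumSketch` and `checksum` are hypothetical names for illustration):

```java
import java.io.InputStream;
import java.io.ByteArrayInputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public class ChecksumSketch {
    // Reads the stream through a DigestInputStream using an 8192-byte
    // buffer (matching BufferedInputStream.DEFAULT_BUFFER_SIZE) instead
    // of 1024, so each read() call covers more of the underlying blocks.
    static String checksum(InputStream in, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            try (DigestInputStream dis = new DigestInputStream(in, md)) {
                byte[] buffer = new byte[8192];
                while (dis.read(buffer) != -1) {
                    // reading drives the digest; the bytes are otherwise discarded
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The digest result is identical regardless of buffer size; only the number of read calls (and thus throughput) changes.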

@@ -305,7 +302,6 @@ class FileSpec extends FlatSpec with BeforeAndAfterEach with Matchers {
}

it should "compute correct checksum for non-zero length string" in {
implicit val charset = StandardCharsets.UTF_8
@solicode (Contributor)

I added this because the default for writeText is Charset.defaultCharset(), which is system dependent. Otherwise, technically this test can fail on somebody else's machine if their default charset is something unusual.
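The underlying pitfall, in plain JDK terms: the same string encodes to different bytes (and therefore different checksums) under different charsets, and `Charset.defaultCharset()` varies by machine. A small sketch (`CharsetPin` and `encode` are hypothetical names):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetPin {
    // Encodes text with an explicit charset rather than the platform
    // default, so the byte-level output is reproducible on any machine.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }
}
```

Pinning the charset (here via Scala's implicit `charset` in the test) is what makes the expected checksum deterministic.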

@pathikrit (Owner, Author)
@solicode : Thanks for the feedback. Fixed here: 7bd9e6d

@pathikrit (Owner, Author)
@solicode: May I ask, how are you benchmarking? Can you add it to this repo? I need some help benchmarking another change: #108

The above PR gets around a bug in the JDK by monkey-patching the Charset but I am worried about performance. Thanks!

@solicode (Contributor)
Unfortunately, I wasn't doing anything too sophisticated. Basically, I used very large files (and many of them) to ensure that essentially all of the processing time was being spent in digest, so that I could optimize that.

I was using already existing files on my system to test with, but I suppose we could just generate large files programmatically and test with that.

I haven't used JMH yet, but I'd like to get familiar with it. If you're not going to be working on #62 in the near future, I could take a look at it if you want. I wouldn't be able to do it right away, but maybe sometime this month.
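Generating large test files programmatically, as suggested above, can look something like this (an illustrative helper, not part of better-files; `BigFileGen` is a hypothetical name):

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

public class BigFileGen {
    // Fills a temp file with pseudo-random bytes up to the requested size.
    // A fixed seed keeps the content (and its digest) reproducible.
    static Path generate(long sizeBytes) {
        try {
            Path p = Files.createTempFile("bench", ".bin");
            byte[] chunk = new byte[1 << 16];
            Random rnd = new Random(42);
            try (OutputStream out = Files.newOutputStream(p)) {
                long written = 0;
                while (written < sizeBytes) {
                    rnd.nextBytes(chunk);
                    int n = (int) Math.min(chunk.length, sizeBytes - written);
                    out.write(chunk, 0, n);
                    written += n;
                }
            }
            return p;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Convenience wrapper so callers don't deal with the checked IOException.
    static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```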

@pathikrit (Owner, Author)
pathikrit commented Feb 14, 2017

@solicode : I don't have any plans for #62 anytime soon. So please feel free to contribute. Look at the benchmarks sub-project: https://github.com/pathikrit/better-files/tree/master/benchmarks/src/test/scala/better/files
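For contrast with what a proper harness like JMH (#62) provides, a naive timing loop looks like this; it illustrates why warmup matters but handles none of the JIT, dead-code-elimination, or statistical concerns JMH exists to solve (`NaiveBench` is a hypothetical name):

```java
public class NaiveBench {
    // Runs the task a few times to let the JIT warm up, then reports the
    // mean wall-clock time per iteration in nanoseconds. Illustrative only.
    static long timeNanos(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) task.run();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) task.run();
        return (System.nanoTime() - start) / iterations;
    }
}
```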
