-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use tfhe_cuda_backend insteaof of concrete_cuda #1064
Conversation
BourgerieQuentin
commented
Sep 24, 2024
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Ubuntu.
|
99 files! it's a lot of changes |
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Ubuntu.
|
d5584ef
to
eb8c467
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Ubuntu.
|
p->output_lwe_dim.val, p->base_log.val, p->level.val, num_samples); | ||
cuda_drop_async(indexes_gpu, s, loc); | ||
free(indexes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is freed here, the CUDA runtime will likely try reading this array much later, once it can actually do the memory transfer. This needs to be deallocated after a synchronization point - see below.
cuda_drop_async(test_vector_idxes_gpu, s, loc); | ||
cuda_drop_async(glwe_ct_gpu, s, loc); | ||
cuda_drop_async(indexes_gpu, s, loc); | ||
free(indexes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
cuda_drop_async(test_vector_idxes_gpu, s, loc); | ||
cuda_drop_async(glwe_ct_gpu, s, loc); | ||
cuda_drop_async(indexes_gpu, s, loc); | ||
free(indexes); | ||
Dependence *dep = | ||
new Dependence(loc, out, out_gpu, false, false, d0->chunk_id); | ||
// As streams are not synchronized, we can only free this vector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can''t add comment at right line - just 4 lines below is the way to deallocate data after CUDA synch points - it registers the pointer with the runtime then will deallocate after the next cuda synch.
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Ubuntu.
|
1 similar comment
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Ubuntu.
|
5014daa
to
aa6563e
Compare
6a52c4e
to
ad106f7
Compare
If you're new to commit signing, there are different ways to set it up: Sign commits with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmark
Benchmark suite | Current: 7fc2663 | Previous: 2a1f449 | Ratio |
---|---|---|---|
v0 PBS table generation |
59009422 ns/iter (± 1243379 ) |
58930979 ns/iter (± 314241 ) |
1.00 |
v0 PBS simulate dag table generation |
37890203 ns/iter (± 414241 ) |
37554649 ns/iter (± 469151 ) |
1.01 |
v0 WoP-PBS table generation |
52298219 ns/iter (± 447515 ) |
52447087 ns/iter (± 585670 ) |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
e51a9c5
to
b9886d1
Compare
b9886d1
to
7fc2663
Compare