
Enabling GPU acceleration #10

Open
roshkins opened this issue Jul 1, 2023 · 6 comments

Comments

@roshkins

roshkins commented Jul 1, 2023

I read on the Piper readme that it supports GPU acceleration. I dug through the addon locally, but there doesn't seem to be a way to enable it easily, since it's all compiled Rust. Any idea on how to get GPU acceleration working so it doesn't eat up all my CPU?

@vortex1024

It is quite complicated. It can't use CUDA in its current form, since NVDA is 32-bit and CUDA is 64-bit only. I managed to enable DirectML, which is another programmatic way of using the GPU, but something is wrong: it frequently crashes with no error, and it does not respond any faster than the CPU version.

@mush42
Owner

mush42 commented Jul 2, 2023

I can confirm the issue with DirectML, which seems to be the best option for Windows since it is not limited to NVIDIA hardware the way CUDA is.

Still, I have some hope of getting DirectML working once we upgrade to ONNX Runtime v1.15.
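For reference, in ONNX Runtime's Python API, opting into DirectML is just a matter of requesting the `DmlExecutionProvider` when creating the session (our addon is compiled Rust, so this is only illustrative; `pick_providers` is a hypothetical helper, and DirectML support ships in the separate `onnxruntime-directml` package):

```python
def pick_providers(available):
    """Prefer DirectML when ONNX Runtime reports it, falling back to CPU."""
    if "DmlExecutionProvider" in available:
        # Keep CPU as a fallback provider after DirectML.
        return ["DmlExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

# Real usage would look like this (requires onnxruntime-directml installed,
# and a voice model on disk; shown commented out for that reason):
#
# import onnxruntime as ort
# sess = ort.InferenceSession(
#     "voice.onnx",
#     providers=pick_providers(ort.get_available_providers()),
# )
```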

ONNX Runtime also supports Intel-specific inference accelerators, such as oneDNN and OpenVINO. But they require building ONNX Runtime from source, and I don't know their impact on speed. I'll investigate them when I have some free time.

Also, WindowsML, which is different from DirectML, is under consideration, since it is the native ML platform built into Windows 10 and later.

Best

@roshkins
Author

roshkins commented Jul 2, 2023

Is it possible to launch a new 64-bit process that uses IPC to transmit the data? Or is that the complication you're talking about, @vortex1024?
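To make the idea concrete, here is a minimal sketch (not the addon's actual code) of a parent driving a 64-bit worker over stdin/stdout pipes with 4-byte length-prefixed frames. The worker here just echoes bytes back and reuses the current interpreter; a real setup would launch a 64-bit helper binary that loads the ONNX model and returns synthesized audio:

```python
import struct
import subprocess
import sys

# Hypothetical echo worker, standing in for a 64-bit inference helper.
WORKER = r"""
import struct, sys
stdin, stdout = sys.stdin.buffer, sys.stdout.buffer
while True:
    header = stdin.read(4)
    if not header:
        break  # parent closed the pipe
    payload = stdin.read(struct.unpack("<I", header)[0])
    stdout.write(struct.pack("<I", len(payload)) + payload)  # echo back
    stdout.flush()
"""

def start_worker():
    # In practice this would point at a 64-bit python.exe or helper .exe;
    # here we reuse the current interpreter for illustration.
    return subprocess.Popen([sys.executable, "-c", WORKER],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)

def roundtrip(proc, payload: bytes) -> bytes:
    """Send one length-prefixed frame and read one frame back."""
    proc.stdin.write(struct.pack("<I", len(payload)) + payload)
    proc.stdin.flush()
    size = struct.unpack("<I", proc.stdout.read(4))[0]
    return proc.stdout.read(size)
```

The length prefix matters: without it the reader can't tell where one message ends and the next begins on a byte stream.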

@mush42
Owner

mush42 commented Jul 3, 2023

Actually, ONNX Runtime on 32-bit systems is about 2x slower than it is on 64-bit systems.

The question is: which IPC approach should we use? We need one with very low overhead; we're talking about milliseconds here.

@vortex1024

@mush42 sockets with protobuf?
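A rough sketch of the socket variant over loopback, with raw length-prefixed bytes standing in for protobuf (protobuf isn't in the standard library; in a real implementation the payload would be a serialized protobuf request). The server thread and `roundtrip_over_loopback` name are illustrative, not anything from the addon:

```python
import socket
import struct
import threading

def recv_exact(conn: socket.socket, size: int) -> bytes:
    """recv() may return fewer bytes than asked; loop until we have them all."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed early")
        data += chunk
    return data

def serve_once(server: socket.socket) -> None:
    # Accept one connection and echo back one length-prefixed frame.
    conn, _ = server.accept()
    with conn:
        size = struct.unpack("<I", recv_exact(conn, 4))[0]
        payload = recv_exact(conn, size)
        conn.sendall(struct.pack("<I", len(payload)) + payload)

def roundtrip_over_loopback(payload: bytes) -> bytes:
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # OS-assigned port
    server.listen(1)
    t = threading.Thread(target=serve_once, args=(server,))
    t.start()
    with socket.create_connection(server.getsockname()) as client:
        client.sendall(struct.pack("<I", len(payload)) + payload)
        size = struct.unpack("<I", recv_exact(client, 4))[0]
        result = recv_exact(client, size)
    t.join()
    server.close()
    return result
```

Loopback TCP round-trips are typically well under a millisecond, so the serialization cost would likely dominate, not the transport.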

@beqabeqa473

Are stdin/stdout slow on Windows? What do you think? Or maybe COM?


4 participants