Replies: 6 comments 5 replies
-
According to your estimates, how much compute and how much video data would be required to train a Video-VAE (not a Video-VQVAE)? Approximately how long would the training take?
-
Does Sora use a VAE rather than a VQVAE?
-
It's unrealistic, bro; don't waste your time on this project.
-
We now support both VAE and VQVAE; please check our latest code.
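
For readers unfamiliar with the VAE side of the comparison, here is a minimal sketch (not the project's implementation) of the continuous Gaussian bottleneck a Video-VAE uses, to contrast with the discrete VQVAE bottleneck discussed further down. The `GaussianBottleneck` class, channel sizes, and latent shapes are illustrative assumptions, not the repo's API.

```python
# Minimal sketch of a continuous VAE bottleneck for video feature maps.
# Not the Open-Sora Plan code; shapes and names are placeholders.
import torch
import torch.nn as nn


class GaussianBottleneck(nn.Module):
    """Maps encoder features to a Gaussian latent via the reparameterization trick."""

    def __init__(self, in_channels: int = 256, latent_channels: int = 4):
        super().__init__()
        # 3D conv so the same bottleneck works on video features shaped (B, C, T, H, W).
        self.to_moments = nn.Conv3d(in_channels, 2 * latent_channels, kernel_size=1)

    def forward(self, h: torch.Tensor):
        mu, logvar = self.to_moments(h).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # sample a latent
        # KL divergence to the standard normal prior, averaged over all elements.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl


if __name__ == "__main__":
    h = torch.randn(2, 256, 4, 8, 8)        # toy encoder output for 2 clips
    z, kl = GaussianBottleneck()(h)
    print(z.shape, kl.item())               # (2, 4, 4, 8, 8) continuous latents
```

The continuous latent `z` is what a latent-diffusion model would be trained on; the KL term is what makes large-scale video training expensive to tune, which is part of the resource cost discussed in this thread.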
-
Refer to #93 for more details.
-
Looking at the intense resource requirements of a Video-VAE and the practicality of a Video-VQVAE, it seems like a smart compromise for now. Eager to see how it plays out 😊
-
In fact, we believe that both VQVAE and VAE can be used, but we noticed that there is already a Video-VQVAE in the community that can be used directly for testing.
Additionally, the combination of a VQVAE and a diffusion model works well, as in Latent Diffusion. We attempted to train a Video-VAE, but it was too resource-intensive. In the early stages of the Open-Sora Plan, the open-source Video-VQVAE can serve as a viable alternative.
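
To make the VQVAE alternative concrete, below is a minimal PyTorch sketch of the vector-quantization bottleneck that turns a video encoder's continuous features into discrete codebook tokens, which is what makes an off-the-shelf Video-VQVAE's latents usable for latent diffusion. This is not the community model or the Open-Sora Plan code; the `VectorQuantizer` class, codebook size, and latent grid shapes are assumptions for illustration.

```python
# Minimal sketch of a VQ bottleneck over video feature maps (B, C, T, H, W).
# Not the Open-Sora Plan or community Video-VQVAE code; names and shapes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through gradient estimator."""

    def __init__(self, num_codes: int = 1024, dim: int = 256, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z_e: torch.Tensor):
        b, c, t, h, w = z_e.shape
        flat = z_e.permute(0, 2, 3, 4, 1).reshape(-1, c)       # (B*T*H*W, C)
        dist = torch.cdist(flat, self.codebook.weight)          # distance to every code
        idx = dist.argmin(dim=1)                                # discrete token per position
        z_q = self.codebook(idx).view(b, t, h, w, c).permute(0, 4, 1, 2, 3)
        # Codebook loss + commitment loss, then straight-through gradients to the encoder.
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx.view(b, t, h, w), loss


if __name__ == "__main__":
    z_e = torch.randn(2, 256, 4, 8, 8)      # toy encoder output: 2 clips, 4x8x8 latent grid
    z_q, tokens, vq_loss = VectorQuantizer()(z_e)
    print(z_q.shape, tokens.shape, vq_loss.item())
    # z_q (or the token grid) is what a latent-diffusion model would be trained on,
    # instead of raw video frames.
```

The appeal of this route, as the comment above notes, is that the quantizer and its codebook come pre-trained in the community Video-VQVAE, so only the diffusion model on top needs to be trained.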