This model contains the weights of NExT-GPT covering text-image-video-audio (tiva), which is built upon:

- Vicuna-7B with version 0
- Stable Diffusion with version `v1-5`
- AudioLDM with version `l-full`
- ZeroScope with version `v2_576w`

For more details about the usage of the model, please refer to our code repository.