This model contains the weights of NExT-GPT covering text-image-video-audio (tiva), which is built upon:
- Vicuna-7B with version 0
- Stable Diffusion with version v1-5
- AudioLDM with version l-full
- ZeroScope with version v2_576w
 
For more details about the usage of the model, please refer to our code repository.