This model contains the weights of NExT-GPT covering text-image-video-audio (tiva), which is built upon

For details on how to use the model, please refer to our code repository.