This version, dubbed Spoonbill Garuda, uses the instruction-tuned sugiv/garuda-from-llama2-7B-chat as its language model. Spoonbill Garuda is also a vision-language model, trained on visual instruction datasets (a limited set taken from Otter).

