BLIP base model fine-tuned on the DiffusionDB 1k dataset for image-to-text tasks.
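A minimal usage sketch, assuming the checkpoint loads through the standard BLIP classes in Hugging Face `transformers` (`BlipProcessor` and `BlipForConditionalGeneration`). The repository id in `MODEL_ID` and the helper name `caption_image` are hypothetical placeholders, since the source does not name them. Replace the id with the actual one for this model.

```python
# Hypothetical repo id: the source does not state it, so substitute the real one.
MODEL_ID = "your-username/blip-base-diffusiondb-1k"


def caption_image(image_path, model_id=MODEL_ID, max_new_tokens=50):
    """Generate text for the image at image_path (illustrative helper)."""
    # Imported inside the function so the module itself loads without
    # the heavy dependencies (requires `transformers`, `torch`, `Pillow`).
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)

    # BLIP expects RGB input; convert in case the file is RGBA or grayscale.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Unconditional generation: the model produces text from the image alone.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption_image("example.png")` (a placeholder path) would download the weights on first use and return the generated text as a string.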