https://github.com/prasanthsagirala/image_to_captions

This model takes a text description as input and generates a caption as output.

When providing input, prefix it with the trigger word 'captionize: ' (including the trailing space).

Example: 'captionize: a boy sitting under a tree'
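
The prefixing step can be sketched in Python as below. This is a minimal illustration, not code from the repository: the helper name build_prompt and the constant TRIGGER are hypothetical, and loading or calling the model is left out because it is not described here.

```python
# Trigger word taken from the text above, trailing space included.
TRIGGER = "captionize: "


def build_prompt(description: str) -> str:
    """Return the description prefixed with the trigger word (hypothetical helper)."""
    text = description.strip()
    bare_trigger = TRIGGER.strip()  # "captionize:"
    # If the caller already included the trigger word, drop it so it is not doubled.
    if text.lower().startswith(bare_trigger):
        text = text[len(bare_trigger):].lstrip()
    return TRIGGER + text


if __name__ == "__main__":
    print(build_prompt("a boy sitting under a tree"))
```

The returned string is what would be passed to the model as input. Removing an existing prefix first keeps the helper safe to call on text that has already been prepared.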