Im2Text: Describing images using 1 million captioned photographs. (2011)

by Vicente Ordonez, Girish Kulkarni, Tamara L. Berg
Venue: In Neural Information Processing Systems (NIPS).