Joint video and text parsing for understanding events and answering queries (2014)

by K Tu, M Meng, M W Lee, T E Choe, S-C Zhu
Venue: MultiMedia, IEEE