INK: Intensive Neural Knowledge
- James Park
- Qiuyuan Huang
- Yonatan Bisk
- Jianwei Yang
- Subhojit Som
- Ali Farhadi
- Yejin Choi
- Jianfeng Gao
MSR-TR-2022-38 | Published by Microsoft
Knowledge-based vision language systems are increasingly ubiquitous in our everyday lives. However, despite the introduction of numerous benchmarks, the community has siloed models of different types of knowledge rather than building general knowledge-intensive models that encompass both commonsense and factoid knowledge. We introduce INK – Intensive Neural Knowledge – a new task that involves extracting the necessary knowledge to accurately perform image and text retrieval. In particular, INK leverages existing resources to require understanding of factoid, object-commonsense, or social-consciousness knowledge to successfully perform retrieval. Finally, we provide a set of competitive baseline models whose weak performance motivates the need to develop new knowledge understanding models and systems.