Downloads
Tip of the Tongue Known Item Retrieval Dataset for Movie Identification
August 2021
The Tip of the Tongue (ToT) dataset is from the paper Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification. It comprises 758 question/answer pairs scraped from the website iRememberThisMovie.com between 2013 and 2018. These…
Python Reasoning Challenges
May 2021
A short Python Reasoning Challenge can replace an entire page of English describing a typical programming problem. The goal is to teach computers how to program. This OSS repository will contain a dataset of short Python challenges. Most of them…
Conformer-Kernel Model with Query Term Independence (TREC Deep Learning Quick Start)
March 2021
This is a quick start guide for the document ranking task in the TREC Deep Learning (TREC-DL) benchmark. If you are new to TREC-DL, then this repository may make it more convenient for you to download all the required datasets…
Sepsis Cohort from MIMIC III
December 2020
This repo provides code for generating the sepsis cohort from the MIMIC III dataset. Our main goal is to facilitate reproducibility of results in the literature.
Generative Neural Visual Artist (GeNeVA) – Training and Evaluation Code
September 2019
Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction.
MetaLWOz: A Dataset of Multi-Domain Dialogues for the Fast Adaptation of Conversation Models
July 2019
We introduce the Meta-Learning Wizard of Oz (MetaLWOz) dialogue dataset for developing fast adaptation methods for conversation models. This data can be used to train task-oriented dialogue models, specifically to develop methods to quickly simulate user responses with a small…
TextWorld
July 2019
TextWorld is a text-based framework for generating games that are used to train artificially intelligent agents to play text adventure games. The goal is for this project to be used to advance the state of the art of AI research and to…
AMDIM – Augmented Multiscale Deep InfoMax
June 2019
AMDIM (Augmented Multiscale Deep InfoMax) is an approach to self-supervised representation learning based on maximizing mutual information between features extracted from multiple views of a shared context.