This dataset contains multimodal features extracted for the TRIPOD dataset and used in the AAAI 2021 paper "Movie Summarization via Sparse Graph Construction". It consists of 122 pickle files, one per movie in the dataset. Each file includes sentence-level textual features from the movie's screenplay, together with visual and audio features for the frames/segments of the corresponding video. All features/modalities are aligned and provided by scene (we take scenes to be those manually indicated in the screenplays).
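
Since the features are distributed as one pickle file per movie, a first step is usually to open a file and look at what it holds. The following is a minimal Python sketch of that step, not the authors' loading code. The `*.pkl` file pattern and the helper names `list_movie_files`, `load_movie_features` and `summarize` are assumptions made for illustration; the internal layout of each file is not described here, so the sketch only reports the top-level container rather than assuming any field names.

```python
import pickle
from pathlib import Path


def list_movie_files(data_dir, pattern="*.pkl"):
    """Return the per-movie files in data_dir, sorted by name.

    The "*.pkl" pattern is an assumption; adjust it to the actual file names.
    """
    return sorted(Path(data_dir).glob(pattern))


def load_movie_features(path):
    """Unpickle one per-movie file and return the stored object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(obj):
    """Describe the top-level container without assuming its layout."""
    if isinstance(obj, dict):
        return {"type": "dict", "keys": sorted(str(k) for k in obj)}
    if isinstance(obj, (list, tuple)):
        return {"type": type(obj).__name__, "length": len(obj)}
    return {"type": type(obj).__name__}
```

For example, `summarize(load_movie_features(list_movie_files("data")[0]))` would describe the first file in a hypothetical `data` directory. Unpickling can execute arbitrary code, so only load files obtained from the official record, and note that any third-party libraries whose objects were pickled must be installed for loading to succeed.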
Papalampidi, P; Keller, F; Lapata, M. (2021). multimodal TRIPOD, [dataset]. University of Edinburgh. https://doi.org/10.7488/ds/2974.