135 N Skinker Blvd, St. Louis, MO 63112, USA

Video frame interpolation aims to synthesize an intermediate frame guided by two successive frames. Recently, several works have shown excellent results in synthesizing scenes of temporally varying 3D objects with implicit neural representations (INRs), which can be rendered as video at any temporal resolution. However, these methods require camera poses, must be trained for each specific scene, and achieve sub-optimal results for video frame interpolation. To this end, we propose a new learning-based model, Neural Video Representation (NeVR), which learns a continuous temporal representation of a video for high-quality frame interpolation. Unlike traditional video interpolation algorithms, which directly synthesize the intermediate frame, our model maps the temporal-spatial coordinates of pixels to the corresponding pixel values of the interpolated frame. Additionally, NeVR takes as input a latent feature code associated with the queried pixels in the intermediate frame to enhance image quality. This feature code encodes local features and bilateral motion in the input frames and is produced by a jointly trained encoder. Our experiments show that the proposed method outperforms state-of-the-art methods in video frame interpolation on several benchmark datasets.
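To make the per-pixel query concrete, here is a minimal illustrative sketch of the interface the abstract describes: a coordinate-based network that maps a spatio-temporal coordinate (x, y, t), together with a latent feature code for that pixel, to an RGB value. The layer sizes, latent dimension, and function names below are assumptions for illustration only, not the authors' actual NeVR architecture, and the random weights stand in for a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

COORD_DIM = 3     # (x, y, t), normalized coordinates; t=0 and t=1 are the input frames
LATENT_DIM = 16   # assumed size of the per-pixel feature code from the jointly trained encoder
HIDDEN = 64       # assumed hidden width of the MLP

# Random weights stand in for a trained coordinate-based MLP.
W1 = rng.standard_normal((COORD_DIM + LATENT_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, 3)) * 0.1
b2 = np.zeros(3)

def query_pixel(coord, z):
    """Map one (x, y, t) coordinate plus its latent code z to an RGB value in [0, 1]."""
    h = np.maximum(np.concatenate([coord, z]) @ W1 + b1, 0.0)  # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))                # sigmoid -> RGB

# Query the frame midway between the two inputs (t = 0.5) at one pixel.
z = rng.standard_normal(LATENT_DIM)  # would come from the encoder in the real model
rgb = query_pixel(np.array([0.25, 0.75, 0.5]), z)
```

Rendering a full intermediate frame at time t then amounts to evaluating this query at every pixel coordinate, which is what allows the representation to be sampled at any temporal resolution.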
