Camera Keyframing with Style and Control

Our proposed deep-learning framework for camera keyframing offers both high-level style specification and low-level keyframe control.


In this work, we present a tool that enables artists to synthesize camera motions that follow a learned camera behavior while enforcing user-designed keyframes as constraints along the sequence. To solve this motion in-betweening problem, we train a camera motion generator on a collection of trajectories, with additional conditioning on target keyframes. We also condition the generator on a style code automatically extracted from real film clips by a gating LSTM network. This style code encodes the camera behavior, defined as the correlation between the character motions and the camera motions. We further extend the system with fine control of camera speed and direction via a hidden state mapping module. We evaluate our method on two aspects: i) the capacity to synthesize camera trajectories by extracting camera behaviors from real film clips and constraining them with user-defined keyframes; ii) the capacity to ensure that in-between motions still comply with the reference camera behavior while satisfying the keyframe constraints. As a result, our system is the first behavior-aware keyframe in-betweening technique for camera control that balances behavior-driven automation with precise and interactive control.

ACM Transactions on Graphics (Proceedings of SIGGRAPH ASIA 2021)



Our proposed framework for learning camera behaviors together with keyframe constraints is composed of a camera behavior extractor (Gating LSTM), which extracts camera behaviors from reference clips, and a camera motion generator, which generates camera trajectories that satisfy both the camera behaviors and the required keyframe constraints, speeds, and directions.
By learning the mapping from a style code and several camera frames (5 in this paper) to the starting hidden state of our autoregressive generator, we give users an additional degree of control through the specification of camera velocities.
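The hidden state mapping described in this caption can be illustrated with a minimal sketch. It is not the paper's implementation: the two-layer network, the untrained random weights, the dimensions (`STYLE_DIM`, `FRAME_DIM`, `HIDDEN_DIM`), the use of plain 3D positions as camera frames, and all function names are assumptions made for illustration. Only the use of 5 camera frames and a style code as input comes from the text.

```python
import math
import random

STYLE_DIM = 8     # hypothetical size of the style code
FRAME_DIM = 3     # hypothetical: one 3D camera position per frame
NUM_FRAMES = 5    # number of camera frames stated in the caption
HIDDEN_DIM = 16   # hypothetical LSTM hidden size


def make_linear(in_dim, out_dim, rng):
    """Create an untrained dense layer with random weights and zero bias."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(layer, x):
    """Apply a dense layer to a flat input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def map_to_hidden_state(style_code, frames, layers):
    """Map a style code and a few camera frames to an initial (h0, c0) pair."""
    if len(frames) != NUM_FRAMES:
        raise ValueError(f"expected {NUM_FRAMES} frames, got {len(frames)}")
    # Concatenate the style code with the flattened frames. The velocity is
    # implicit in the differences between consecutive frames.
    x = list(style_code) + [v for frame in frames for v in frame]
    hidden = [math.tanh(v) for v in linear(layers[0], x)]
    out = linear(layers[1], hidden)
    h0 = [math.tanh(v) for v in out[:HIDDEN_DIM]]  # bounded like an LSTM output
    c0 = out[HIDDEN_DIM:]                          # cell state left unbounded
    return h0, c0


rng = random.Random(0)
in_dim = STYLE_DIM + NUM_FRAMES * FRAME_DIM
layers = [make_linear(in_dim, 32, rng), make_linear(32, 2 * HIDDEN_DIM, rng)]
style = [rng.uniform(-1.0, 1.0) for _ in range(STYLE_DIM)]

# Same style code and start position, opposite velocity directions along x.
frames_right = [[0.1 * t, 0.0, 0.0] for t in range(NUM_FRAMES)]
frames_left = [[-0.1 * t, 0.0, 0.0] for t in range(NUM_FRAMES)]
h_right, c_right = map_to_hidden_state(style, frames_right, layers)
h_left, c_left = map_to_hidden_state(style, frames_left, layers)
```

In this sketch, the two frame sequences share the style code and the starting position but move in opposite directions, so they produce different starting states for the generator. A trained mapping would be expected to turn that difference into trajectories that leave the keyframe in the requested direction.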
The proposed system enables designers to specify keyframes with initial velocities. In this example, the same keyframe positions and camera style code are used. As displayed, the resulting trajectories are guided by the different velocity directions defined at the starting keyframe (time 001) and the mid keyframe (time 090), respectively. This is achieved by updating the LSTM hidden state through a dedicated network that maps velocities to hidden states.
User-specified keyframes are placed at increasingly large distances from the trajectory of a given style. As displayed, our system adapts well to the keyframes. We re-extract the style codes from the generated trajectories (shown with crosses in the PCA representation on the right of the figure). The re-extracted codes show that our system moves away from the given style in order to satisfy the keyframe constraints.
Our camera trajectory editing interface. The scene view (A) displays the animation. In the timeline (B), the user can add, drag, and delete keyframes (inverted triangles), as well as scrub through the animation. The keyframe editing panel (C) allows the user to select two target characters, the shot view, and the Toric camera pose at the keyframe. The trajectory selection panel (D) offers generated camera trajectories with different behaviors for the user to choose from. The buttons at the bottom are used to preview a result (Play), generate trajectories (Generate), and save results (Save).
Experiment with the same keyframes and different behaviors: different colors represent different camera behaviors. Frames with a red camera icon in the corner mark keyframe constraints. We observe that the constraints are well enforced in the 3D content and in the rendered snapshots.
Experiment with different keyframes and the same behavior: different colors represent different keyframes fed to the system with the same style code. All three sequences belong to the same style, but their trajectories adjust well in response to the required keyframes (frames with camera icons of different colors in the corner).
This figure displays a result designed by an animation artist using only 10 keyframes for a 24-second sequence of a zombie fighting scene. We show the keyframes and the camera trajectory together with the rendered animation snapshots.
In this hockey game scenario, our method is able to generate dynamic, high-quality camera motions using only 10 keyframes. Rendered animation snapshots and an overview of the trajectory are displayed.




This work was supported in part by the National Key R&D Program of China (2018YFB1403900, 2019YFF0302902). We also thank Anthony Mirabile and Yulong Zhang for their support and helpful discussions throughout this project, as well as Yu Xiong for his help in processing the MovieNet dataset.