Implementation of Animating Pictures
with Stochastic Motion Textures
Vivien Nguyen || cs194-26-afd
Original Paper by Yung-Yu Chuang et al., 2005
Overview
In this project, our goal is to take a single static photograph or painting and animate pieces of the image to bring it to life. We accomplish this with a system consisting of three main steps: layer generation (segmentation, matting, and inpainting), motion specification, and rendering. You can read my paper write-up here!
Layer Generation: Segmentation
Goal: Select a region to be the "foreground element" of this layer, i.e. the piece to be animated on this layer.
Process: Use Photoshop to brush in a foreground region (white), a background region (black), and an ambiguous or unknown region (gray).
Results: A trimap, such as the one to the right.
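Downstream steps only need the trimap split into its three regions. A minimal sketch of that split (the 0.1/0.9 thresholds are illustrative, not from the paper):

```python
import numpy as np

def trimap_regions(trimap):
    """Split a grayscale trimap (values in 0..1) into the three masks the
    matting step consumes: definite foreground (white), definite background
    (black), and the unknown band (gray). Thresholds are illustrative."""
    fg = trimap > 0.9
    bg = trimap < 0.1
    unknown = ~(fg | bg)
    return fg, bg, unknown
```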
Layer Generation: Matting
Process: Following "A Bayesian Approach to Digital Matting" by Yung-Yu Chuang et al., 2001, we alternately solve the MAP estimation for the most likely foreground/background colors and then the most likely alpha value, given the observed pixel color, for each pixel in the unknown region.
Results: A predicted foreground, background, and alpha matte.
L: Naive, R: Matted
Results from Bayesian Matting
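The per-pixel alternation can be sketched as follows. This is a simplified version assuming a single Gaussian each for the foreground and background color distributions (the full method clusters the nearby samples into several oriented Gaussians); the camera noise level sigma_C and the iteration count are illustrative choices:

```python
import numpy as np

def solve_alpha(C, F, B):
    """Closed-form alpha given fixed F and B: project the observed color C
    onto the line between F and B, clamped to [0, 1]."""
    d = F - B
    return float(np.clip((C - B) @ d / (d @ d + 1e-12), 0.0, 1.0))

def solve_fb(C, mu_F, Sigma_F_inv, mu_B, Sigma_B_inv, alpha, sigma_C=0.01):
    """MAP foreground/background colors given a fixed alpha: solve the
    6x6 linear system from Chuang et al. 2001 (single-Gaussian case)."""
    I = np.eye(3)
    A = np.block([
        [Sigma_F_inv + I * alpha**2 / sigma_C**2,
         I * alpha * (1 - alpha) / sigma_C**2],
        [I * alpha * (1 - alpha) / sigma_C**2,
         Sigma_B_inv + I * (1 - alpha)**2 / sigma_C**2],
    ])
    b = np.concatenate([
        Sigma_F_inv @ mu_F + C * alpha / sigma_C**2,
        Sigma_B_inv @ mu_B + C * (1 - alpha) / sigma_C**2,
    ])
    x = np.linalg.solve(A, b)
    return x[:3], x[3:]

def matte_pixel(C, mu_F, Sigma_F_inv, mu_B, Sigma_B_inv, iters=20):
    """Alternate the two solves until alpha settles."""
    alpha = 0.5
    for _ in range(iters):
        F, B = solve_fb(C, mu_F, Sigma_F_inv, mu_B, Sigma_B_inv, alpha)
        alpha = solve_alpha(C, F, B)
    return F, B, alpha
```

For a gray pixel halfway between a white foreground prior and a black background prior, this converges to F near white, B near black, and alpha near 0.5, as expected.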
Layer Generation: Inpainting
Goal: Now that we've segmented and matted our foreground, we'd like to continue segmenting new layers. However, we have a huge black spot in our background. Moreover, when we animate the foreground, it will reveal new parts of the background. So we'd like to inpaint the missing parts!
Process: We can use a pretty straightforward inpainting algorithm, "Region Filling and Object Removal by Exemplar-Based Image Inpainting", by Criminisi et al., 2003. The key element of this method is the fill order: we first compute the "fill front", or the edges of the remaining target region. Then, we compute high-priority places to inpaint -- regions that are near existing structures in the source region, or regions with high gradients. Then, we define an NxN patch around this point and search through all same-sized patches in the source region. We look for the most similar patch (measured over the pixels already known in the target patch), then copy over the needed pixels.
Results: An inpainted background, that we continue to use to segment new layers.
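The fill loop can be sketched on a grayscale image as below. This is a heavily simplified version: the priority is only the confidence term (fraction of known pixels around a front point), the gradient-based data term is dropped, and the source search is exhaustive SSD; patch size and border handling are illustrative choices:

```python
import numpy as np
from itertools import product

def fill_front(mask):
    """Pixels of the hole (mask == True) that border known pixels."""
    H, W = mask.shape
    return [(y, x) for y, x in product(range(H), range(W))
            if mask[y, x] and any(
                0 <= y + dy < H and 0 <= x + dx < W and not mask[y + dy, x + dx]
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)))]

def inpaint(img, mask, n=3):
    """Greedy exemplar-based fill, a simplified sketch of Criminisi et al."""
    img, mask = img.astype(float).copy(), mask.copy()
    h, (H, W) = n // 2, img.shape
    while mask.any():
        # Priority: the front point whose patch has the most known pixels.
        py, px = max(fill_front(mask),
                     key=lambda p: (~mask[max(p[0]-h, 0):p[0]+h+1,
                                          max(p[1]-h, 0):p[1]+h+1]).sum())
        # Clamp the patch inside the image for this simple sketch.
        py, px = min(max(py, h), H - h - 1), min(max(px, h), W - h - 1)
        tgt = img[py-h:py+h+1, px-h:px+h+1]
        known = ~mask[py-h:py+h+1, px-h:px+h+1]
        # Exhaustive SSD search over fully-known source patches.
        best, best_ssd = None, np.inf
        for y in range(h, H - h):
            for x in range(h, W - h):
                if mask[y-h:y+h+1, x-h:x+h+1].any():
                    continue
                ssd = ((img[y-h:y+h+1, x-h:x+h+1] - tgt)[known] ** 2).sum()
                if ssd < best_ssd:
                    best, best_ssd = img[y-h:y+h+1, x-h:x+h+1], ssd
        # Copy only the missing pixels, then mark the patch as known.
        hole = ~known
        img[py-h:py+h+1, px-h:px+h+1][hole] = best[hole]
        mask[py-h:py+h+1, px-h:px+h+1] = False
    return img
```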
Motion Specification
Goal: We now have N layers, each with a clear foreground element. In order to animate these layers, we need to define motions for each of them. We'd like to compute a displacement map, which in this implementation is a function of the time step t that returns an [x, y] shift.
Process: We bucket our layers into four types of motion: water, boats, clouds, and plants. All of these are sums of sines (with the exception of clouds, which use a simple translation), but applied in different ways. For water and boats, we use the same sine sum within the same image, with the phase offset dependent on the y coordinate of the pixel to be translated. This creates a nice rippling effect. We ask the user to click a point at the bottom of the boat so we can set it in phase with the nearby water. We can also use similar techniques for non-boat layers, like the circles example below; for that one, we use a damped sine function to create a bouncing effect.
For plants, we ask the user to define a line running through the plant from top to bottom. Then, we map all the y values into a [0, 1] range relative to this user-defined line. After generating our sinusoid, we apply it using linear interpolation across these y values.
Results: A function that we can call at each time step to generate a new frame.
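The per-type displacement functions can be sketched as below. All frequencies, amplitudes, and damping constants are made-up illustrative values; in practice they are tuned per image:

```python
import numpy as np

def water_displacement(t, y, freqs=(0.8, 1.3, 2.1), amps=(3.0, 1.5, 0.8),
                       phase_per_y=0.15):
    """Vertical shift for a water pixel at row y and time t: a sum of sines
    whose phase depends on y, so rows ripple out of step with each other."""
    return sum(a * np.sin(2 * np.pi * f * t + phase_per_y * y)
               for f, a in zip(freqs, amps))

def boat_displacement(t, y_anchor, **kw):
    """A boat shifts rigidly, using the water displacement evaluated at the
    user-clicked anchor row so it stays in phase with the nearby water."""
    return water_displacement(t, y_anchor, **kw)

def bounce_displacement(t, amp=8.0, freq=1.0, damping=1.5):
    """Damped sine used for the bouncing-circles example."""
    return amp * np.exp(-damping * t) * np.sin(2 * np.pi * freq * t)

def plant_displacement(t, y, y_top, y_bottom, amp=5.0, freq=0.5):
    """Horizontal sway for a plant pixel: map y onto [0, 1] along the
    user-drawn line (0 at the base, 1 at the tip) and scale the sinusoid
    linearly, so the base stays fixed and the tip swings the most."""
    s = np.clip((y_bottom - y) / (y_bottom - y_top), 0.0, 1.0)
    return s * amp * np.sin(2 * np.pi * freq * t)
```

Note how the plant's base is pinned: at y = y_bottom the scale s is 0, so the bottom of the stalk never moves, while the tip gets the full amplitude.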
Rendering and Results
The final step of our pipeline is to render our frames! We proceed through the layers from back to front (painter's algorithm) to synthesize each frame of our animation. At each time step, we call the displacement function for each layer, apply that shift, and recomposite the layers using our computed alpha mattes.
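The compositing loop can be sketched as follows. The layer representation (a dict with 'color', 'alpha', and a 'displace' callable) is my own illustrative structure, not an interface from the paper, and np.roll is used for integer shifts only as a shortcut; it wraps around at the borders, whereas a real implementation would resample at subpixel displacements over the inpainted background:

```python
import numpy as np

def render_frame(layers, t):
    """Composite layers back to front (painter's algorithm). Each layer is
    a dict with 'color' (H,W,3), 'alpha' (H,W), and 'displace', a function
    of t returning an integer (dx, dy) shift."""
    frame = np.zeros_like(layers[0]['color'])
    for layer in layers:                      # back to front
        dx, dy = layer['displace'](t)
        c = np.roll(layer['color'], (dy, dx), axis=(0, 1))
        a = np.roll(layer['alpha'], (dy, dx), axis=(0, 1))[..., None]
        frame = a * c + (1 - a) * frame       # standard "over" compositing
    return frame
```

Calling this once per time step t and stacking the results yields the final animation.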