A team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and ETH Zurich hopes to make drone cinematography simpler, more accessible, and more reliable.
At the International Conference on Robotics and Automation later this month, the researchers will present a system that allows a director to specify a shot’s framing — which figures or faces appear where, at what distance. Then, on the fly, it generates control signals for a camera-equipped autonomous drone, which preserve that framing as the actors move.
As long as the drone’s information about its environment is accurate, the system also guarantees that it won’t collide with either stationary or moving obstacles.
“There are other efforts to do autonomous filming with one drone,” said Daniela Rus, an Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT and a senior author on the new paper. “They can follow someone, but if you turn 180°, then you will show your back to the drone. This is a much coarser approach than what we are able to do. With our solution, if you turn 180°, our drones are able to circle around and get back to your face. What we are able to do is richer and offers more ways to describe how you would like the scene.”
Joining Rus on the paper are Javier Alonso-Mora, who was a postdoc in her group when the work was done and is now an assistant professor of robotics at the Delft University of Technology; Tobias Nägeli, a graduate student at ETH Zurich, and his advisor Otmar Hilliges, an assistant professor of computer science there; and Alexander Domahidi, CTO of Embotech, an autonomous-systems company that spun out of ETH.
In the picture
With the new system, the user can specify how much of the screen a face or figure should occupy, what part of the screen it should occupy, and what the subject’s orientation toward the camera should be — straight on, profile, three-quarter view from either side, or over the shoulder. Those parameters can be set separately for any number of subjects; in tests at MIT, the researchers used compositions involving up to three subjects.
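To make the per-subject parameters concrete, here is a minimal sketch in Python of how such a framing specification might be written down. It is not the researchers' interface: the class names, field names, and the 0-to-1 normalized screen coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class View(Enum):
    """Subject orientations named in the article (identifiers are hypothetical)."""
    STRAIGHT_ON = "straight_on"
    PROFILE = "profile"
    THREE_QUARTER_LEFT = "three_quarter_left"
    THREE_QUARTER_RIGHT = "three_quarter_right"
    OVER_THE_SHOULDER = "over_the_shoulder"


@dataclass(frozen=True)
class SubjectFraming:
    """Desired framing for one subject. All fields are illustrative assumptions."""
    name: str
    screen_fraction: float  # how much of the frame the subject should fill, 0 to 1
    screen_x: float         # desired horizontal position, 0 (left) to 1 (right)
    screen_y: float         # desired vertical position, 0 (top) to 1 (bottom)
    view: View

    def __post_init__(self):
        # Reject values outside the normalized range assumed above.
        for label, value in (("screen_fraction", self.screen_fraction),
                             ("screen_x", self.screen_x),
                             ("screen_y", self.screen_y)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{label} must be between 0 and 1, got {value}")


# A two-subject composition: each subject gets its own independent settings.
shot = [
    SubjectFraming("actor_a", 0.4, 0.3, 0.5, View.STRAIGHT_ON),
    SubjectFraming("actor_b", 0.3, 0.7, 0.5, View.PROFILE),
]
```

Because each subject carries its own record, adding a third subject to the composition is just one more entry in the list.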
Usually, the maintenance of the framing will be approximate. Unless the actors are extremely well-choreographed, the distances between them, the orientations of their bodies, and their distance from obstacles will vary, making it impossible to meet all constraints simultaneously. But the user can specify how the different factors should be weighed against each other. Preserving the actors’ relative locations onscreen, for instance, might be more important than maintaining a precise distance, or vice versa. The user can also assign a weight to minimize occlusion, ensuring that one actor doesn’t end up blocking another from the camera.
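One common way to trade off competing goals like these is a weighted sum of squared errors, in which a larger weight makes the corresponding error more expensive. The sketch below illustrates that general idea only; the article does not give the researchers' actual cost function, and the names `FramingWeights` and `framing_cost`, the error terms, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FramingWeights:
    """User-chosen priorities (hypothetical names; larger means more important)."""
    screen_position: float
    distance: float
    occlusion: float


def framing_cost(position_error, distance_error, occlusion, weights):
    """Weighted sum of squared errors for one candidate camera pose.

    This is an assumed, simplified cost, not the one used in the paper.
    """
    return (weights.screen_position * position_error ** 2
            + weights.distance * distance_error ** 2
            + weights.occlusion * occlusion ** 2)


# Two made-up candidate poses: A keeps the actors well placed onscreen but is
# too far away; B is at the right distance but shifts the actors in the frame.
candidate_a = dict(position_error=0.1, distance_error=1.0, occlusion=0.0)
candidate_b = dict(position_error=0.5, distance_error=0.1, occlusion=0.0)

favor_position = FramingWeights(screen_position=10.0, distance=1.0, occlusion=1.0)
favor_distance = FramingWeights(screen_position=1.0, distance=10.0, occlusion=1.0)

# With position weighted heavily, A is cheaper; with distance weighted
# heavily, B is cheaper.
a_wins = (framing_cost(**candidate_a, weights=favor_position)
          < framing_cost(**candidate_b, weights=favor_position))
b_wins = (framing_cost(**candidate_b, weights=favor_distance)
          < framing_cost(**candidate_a, weights=favor_distance))
```

The same pair of candidate poses ranks differently under the two weightings, which is the sense in which the user's weights decide which constraint gives way when not all of them can be met.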
The key to the system, Alonso-Mora explained, is that it continuously estimates the velocities of all of the moving objects in the drone’s environment and projects their locations a second or two into the future. This buys it a little time to compute optimal flight trajectories and also ensures that it can recover smoothly if the drone needs to take evasive action to avoid a collision.
The system updates its position projections about 50 times a second. Usually, the updates will have little effect on the drone’s trajectory, but the frequent updates ensure that the system can handle sudden changes of velocity.
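The estimate-and-project step can be illustrated with a simple constant-velocity model: estimate velocity from two successive position samples, then extrapolate along a straight line over a short horizon. This is an assumed simplification, not necessarily the prediction model the researchers use. The function names and the 0.5-second projection step are hypothetical; the 0.02-second sample interval corresponds to the roughly 50 updates per second mentioned above.

```python
def estimate_velocity(prev_pos, curr_pos, dt):
    """Finite-difference velocity from two position samples taken dt seconds apart."""
    return tuple((c - p) / dt for p, c in zip(prev_pos, curr_pos))


def project_positions(pos, vel, horizon_s, step_s):
    """Predicted positions at step_s intervals out to horizon_s seconds,
    assuming the object keeps its current velocity (a simplifying assumption)."""
    n_steps = int(round(horizon_s / step_s))
    return [tuple(p + v * step_s * k for p, v in zip(pos, vel))
            for k in range(1, n_steps + 1)]


UPDATE_DT = 1.0 / 50.0  # about 50 updates per second

# An actor who moved 2 cm along x between two consecutive updates.
prev_sample = (0.00, 0.0)
curr_sample = (0.02, 0.0)

velocity = estimate_velocity(prev_sample, curr_sample, UPDATE_DT)
forecast = project_positions(curr_sample, velocity, horizon_s=1.0, step_s=0.5)
```

Rerunning this on every new sample means a sudden change in an actor's velocity shows up in the forecast within one update interval, which fits the article's point that frequent updates let the system handle abrupt changes.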
The researchers tested the system at CSAIL’s motion-capture studio, using a quadrotor (four-propeller) drone. The motion-capture system provided highly accurate position data about the subjects, the studio walls, and the drone itself.
In one set of experiments, the subjects actively tried to collide with the drone, marching briskly toward it as it attempted to keep them framed within the shot. In all such cases, it avoided collision and immediately tried to resume the prescribed framing.