An artificial intelligence (AI) system developed by researchers at MIT can smoothly separate the objects in an image, enabling more realistic-looking photo edits. The researchers aim to make the same technology work on video as well.
According to a release from MIT, the technology was developed in-house at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Modern cinema is often CGI-heavy, and as films grow increasingly dependent on digital imagery, compositing (the process of seamlessly merging foreground and background images) has become invaluable. Examples of compositing in film include placing actors atop buildings or in impossible locations without actually going there.
While blue and green screen technology has been around for a long while, making these images look realistic is not simple. One of the most difficult parts of the process is handling the subtle transitions that editors have to pick out and blend between background and foreground. Fine details like hair are easily spotted in bad CGI, and this AI, say its makers, is designed to handle such details.
"The tricky thing about these images is that not every pixel solely belongs to one object," says Yagiz Aksoy, a visiting researcher at MIT CSAIL.
"In many cases it can be hard to determine which pixels are part of the background and which are part of a specific person."
Even for films that get it right, compositing is a tedious, time-consuming and massively expensive endeavour that requires the expertise of the best editors, the report notes. Now Aksoy and his MIT CSAIL team claim to have a way to use AI to automate a good share of the editing process for photos. The team believes that the same approach could soon be used for video editing as well.
The AI takes an image and breaks it down into a number of layers, separated by what the researchers call "soft transitions".
Called "semantic soft segmentation" (SSS), the method analyses the texture and colour of the original image and combines them with image information learned by a neural network. It tries to identify what the objects in the image actually are, the release notes.
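To picture what a soft transition means in practice, the short Python sketch below blends two layers using a per-pixel weight between 0 and 1, so a pixel such as a wisp of hair can belong partly to each layer. This is only an illustration of standard alpha blending, not the researchers' SSS method: the `composite` function, the pixel values and the weights are all hypothetical, and in a real system the weights would come from the segmentation step.

```python
# Illustrative only: standard per-pixel alpha blending with a soft matte.
# The weights below are made up; SSS itself is not implemented here.

def composite(foreground, background, alpha):
    """Blend two same-sized RGB images pixel by pixel.

    alpha[y][x] in [0, 1] says how much of that pixel belongs
    to the foreground layer; the rest comes from the background.
    """
    out = []
    for fg_row, bg_row, a_row in zip(foreground, background, alpha):
        row = []
        for fg, bg, a in zip(fg_row, bg_row, a_row):
            row.append(tuple(round(a * f + (1 - a) * b) for f, b in zip(fg, bg)))
        out.append(row)
    return out

# A 1x3 strip: solid subject, a soft transition pixel, pure background.
foreground = [[(200, 150, 100), (200, 150, 100), (200, 150, 100)]]
background = [[(0, 0, 250), (0, 0, 250), (0, 0, 250)]]
alpha = [[1.0, 0.5, 0.0]]

result = composite(foreground, background, alpha)
```

With a hard mask the middle pixel would have to be entirely subject or entirely background, which is what produces visible edges; the soft weight lets it mix the two.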
"Once these soft segments are computed, the user doesn't have to manually change transitions or make individual modifications to the appearance of a specific layer of an image," says Aksoy. Manual editing tasks like replacing backgrounds and adjusting colours would be made much easier, he said.
For now, SSS is focused on static images, the report points out. However, the team says that applying the same technique to video should be possible in the foreseeable future.