How Google's neural network will improve YouTube
Researchers from Google unveiled software that analyzes videos using a machine learning technology called a neural network, which some argue could lead to true artificial intelligence.
For many people who post videos on YouTube, an enticing thumbnail can make or break whether viewers decide to click on a video or scroll to the next one. But what if the video sharing site could pick the best image for each video automatically?
That was the question researchers from Google, which owns YouTube, attempted to answer recently by feeding thousands of high-quality images into a computer in order to train it to do what photographers would likely argue is a highly subjective task: select the best-quality image from each video on the site.
The company unveiled its automatic “thumbnailer” in a blog post last Thursday, explaining that the tool analyzes a user’s video at one frame per second, giving each frame a score. The software then selects the thumbnails with the highest quality scores and displays them.
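The sample-score-select process described in the blog post can be sketched roughly as follows. This is a minimal illustration, not Google’s implementation: the function names are hypothetical, frames are assumed to be small grids of grayscale pixel values, and pixel variance (a crude measure of contrast) stands in for the quality score that the trained neural network would actually produce.

```python
from statistics import pvariance


def sample_one_per_second(frames, fps):
    """Keep one frame per second of video: indices 0, fps, 2 * fps, ..."""
    return [(i, frames[i]) for i in range(0, len(frames), fps)]


def stand_in_quality_score(frame):
    """Hypothetical placeholder for the trained model's score.

    Uses the variance of the pixel values as a rough proxy for contrast.
    """
    pixels = [pixel for row in frame for pixel in row]
    return pvariance(pixels)


def pick_thumbnails(frames, fps, k=3, score=stand_in_quality_score):
    """Return the indices of the k highest-scoring sampled frames."""
    sampled = sample_one_per_second(frames, fps)
    ranked = sorted(sampled, key=lambda item: score(item[1]), reverse=True)
    return [index for index, _ in ranked[:k]]
```

Passing a different `score` function is the point of the design: the sampling and ranking steps stay the same no matter how the quality of a frame is judged.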
The technology builds on Google’s deep neural network of supercomputers that the company has been training to “think” and recognize images, such as identifying videos of cats on YouTube, even though the computer had no previous information about what a cat looked like.
Neural networks are one part of advances in so-called deep learning, a subset of machine learning that is designed to mimic higher-level thought and abstraction and may be one path toward developing true artificial intelligence, the Monitor reported in July.
But determining what makes a high-quality photo is an additional challenge, the researchers say.
“Unlike the task of identifying if a video contains your favorite animal, judging the visual quality of a video frame can be very subjective - people often have very different opinions and preferences when selecting frames as video thumbnails,” wrote Weilong Yang from Google’s Video Content Analysis team and Min-hsuan Tsai from the YouTube Creator team.
Part of the issue is that neural networks function in a different manner than the human brain, some researchers say. Like humans, neural networks are designed to learn “by example,” wrote Imperial College London researchers Christos Stergiou and Dimitrios Siganos, in a 2011 guide.
But to do this, they apply limited layers of computation to draw conclusions and perform specific tasks, such as pattern recognition, a process that differs from the “distributed, varied and compounded approach” used by human brains, Tufts University computer science professor Anselm Blumer told the Monitor in July.
This leads to an issue called “overfitting,” where “the network has memorized the training examples, but it has not learned to generalize to new situations,” a guide from the software developer MathWorks explains.
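A toy example, which is not a neural network and is not drawn from the MathWorks guide, shows what memorizing without generalizing looks like. The task, the two “models” and every name below are assumptions made for illustration: one model stores each training example in a lookup table, while the other learns a single cutoff that separates the two classes.

```python
def true_label(x):
    """Ground truth for the toy task: 1 if x is at least 50, else 0."""
    return 1 if x >= 50 else 0


def fit_memorizer(examples):
    """Store every training example; answer 0 for anything never seen."""
    table = dict(examples)
    return lambda x: table.get(x, 0)


def fit_threshold(examples):
    """Learn one cutoff halfway between the two classes."""
    largest_zero = max(x for x, y in examples if y == 0)
    smallest_one = min(x for x, y in examples if y == 1)
    cutoff = (largest_zero + smallest_one) / 2
    return lambda x: 1 if x > cutoff else 0


def accuracy(model, examples):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in examples) / len(examples)


# Train on even numbers, then evaluate on odd numbers the models never saw.
train = [(x, true_label(x)) for x in range(0, 100, 2)]
unseen = [(x, true_label(x)) for x in range(1, 100, 2)]

memorizer = fit_memorizer(train)
threshold = fit_threshold(train)

print("memorizer:", accuracy(memorizer, train), accuracy(memorizer, unseen))
print("threshold:", accuracy(threshold, train), accuracy(threshold, unseen))
```

Both models are perfect on the training examples, but the memorizer does no better than a coin flip on the unseen numbers, while the cutoff rule carries over to them.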
For Google’s network, this has led to some downright bizarre results when researchers fed images and video into the system, prompting the computer to create new, artistic images of its own that often scarcely resembled the original. In July, researchers unveiled a series of hallucinatory images created by the software, such as horses sprouting dogs’ heads and brightly colored glowing temples.
“A network like that is harder to train, and it’s much easier for it to come to false conclusions,” Professor Blumer told the Monitor.
The goal for machine learning researchers focused on neural networks is to introduce additional layers of abstraction into the process, better mimicking human brains in understanding concepts (like photographic composition or what may define an e-mail as spam) and allowing the computer to “learn” how to apply them.
In the case of YouTube’s thumbnail software, this training process appears to be succeeding.
To ensure the computer could distinguish high-quality images from low-quality ones, the Google researchers uploaded custom thumbnails created by YouTube users, which tended to be well framed and in focus, and designated these as high quality, contrasting them with “low quality” images selected randomly from a sampling of videos.
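The labeling scheme the researchers describe could be assembled along these lines. This is a sketch under assumptions, not the team’s actual pipeline: the function name, the use of 1 and 0 as labels, and the choice of one random frame per video are all hypothetical.

```python
import random


def build_training_set(custom_thumbnails, video_frames, negatives_per_video=1, seed=0):
    """Pair each image with a label: 1 for high quality, 0 for low quality.

    custom_thumbnails: images that creators chose themselves (positives).
    video_frames: one list of frames per sampled video; frames picked at
        random from each list serve as the low-quality examples (negatives).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    examples = [(thumbnail, 1) for thumbnail in custom_thumbnails]
    for frames in video_frames:
        count = min(negatives_per_video, len(frames))
        examples.extend((frame, 0) for frame in rng.sample(frames, count))
    rng.shuffle(examples)  # avoid feeding all positives first
    return examples
```

A randomly chosen frame is not guaranteed to be a poor one, so the “low quality” label is only an approximation that holds on average across many videos.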
This allowed the computer to “learn” about nuances of framing and composition and to favor images that emphasized a central character in the video, such as a music video performer or a family pet, two examples the researchers showed in their blog post.
They also put the new images to a more subjective test, showing them to human subjects side by side with images from YouTube’s previous thumbnail software. People who looked at the two sets of images preferred the images selected by the neural network more than 65 percent of the time, the researchers wrote.