Google fights spam with artificial intelligence

Gmail announced that it had prevented 99.9 percent of spam from reaching inboxes, but will artificial intelligence help bridge that final tenth of a percent?

Google DeepMind

Google's new AI program is capable of learning from experience, much like a human brain.

By Graham Starr Staff writer

July 13, 2015, 6:53 p.m. ET

The robot wars will be won by spambots, unless Google engineers have anything to say about it.

The company announced on Thursday that it has been using Google’s artificial neural network to help with e-mail spam filtering. Already, the company says that it’s been able to block 99.9 percent of spam from reaching inboxes, while incorrectly classifying legitimate e-mail as spam only 0.05 percent of the time.
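To put those two percentages in concrete terms, here is a minimal arithmetic sketch. The message volumes are hypothetical round numbers chosen for illustration, not figures from Google; only the 99.9 percent and 0.05 percent rates come from the announcement.

```python
def filter_outcomes(spam_count, ham_count,
                    spam_catch_rate=0.999, false_positive_rate=0.0005):
    """Return (spam that reaches the inbox, legitimate mail filed as spam)."""
    missed_spam = spam_count * (1 - spam_catch_rate)
    misfiled_ham = ham_count * false_positive_rate
    return missed_spam, misfiled_ham


# Hypothetical volumes: one million spam messages, one million legitimate ones.
missed, misfiled = filter_outcomes(spam_count=1_000_000, ham_count=1_000_000)
print(f"spam reaching inboxes: about {missed:.0f}")
print(f"legitimate mail filed as spam: about {misfiled:.0f}")
```

Under these assumed volumes, roughly a thousand junk messages would still get through and roughly five hundred legitimate ones would be misfiled, which is the gap the article describes.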

And it’s all thanks to data collection.

For the most part, Google’s system is based on Gmail’s “report spam” and “not spam” buttons. By taking this user input and referencing other user actions, the Internet giant can learn what counts as spam and what doesn’t. For e-mails that were sent with malicious intent, the server can learn to recognize them, parse them, and redirect them away from the inbox.
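The idea of learning from button clicks can be sketched with a toy word-counting classifier. This is a textbook naive Bayes filter, not Google's system (which the article says relies on a neural network); the class name, method names, and sample messages are all hypothetical.

```python
import math
from collections import Counter


class ReportTrainedFilter:
    """Toy naive Bayes filter trained only on user 'spam' / 'ham' reports."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def report(self, text, label):
        """Record one user click; label is 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_log_odds(self, text):
        """Positive means the message looks more like reported spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        score = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            # Add-one smoothing keeps every probability above zero.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score += sign * math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                likelihood = (self.word_counts[label][word] + 1) / (
                    total_words + len(vocab) + 1)
                score += sign * math.log(likelihood)
        return score

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0


spam_filter = ReportTrainedFilter()
spam_filter.report("win a free prize now", "spam")
spam_filter.report("claim your free money", "spam")
spam_filter.report("meeting notes for monday", "ham")
spam_filter.report("lunch on monday", "ham")

print(spam_filter.is_spam("free prize inside"))
print(spam_filter.is_spam("monday meeting"))
```

Each new report shifts the word counts, so the filter's verdicts follow what users actually flag rather than a fixed rule list.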

But spam can still make it past blockers in a variety of ways, the company says. Often, spam succeeds by using previously unaccounted-for domains (new ones such as .xyz or .horse can get past filters) or by mimicking desired e-mails (or “ham”). Despite new filters, spammers find ways to circumvent them.

Though we may not have completely eradicated spam, Internet companies have been able to at least limit its pervasiveness.

The remaining problem lies not in detecting which e-mails are junk. “Blacklisting is an efficient anti-spam mechanism, but is becoming more and more prone to false positives,” reads a report drawn from a gathering of experts who discussed the future of spam detection. Oftentimes, the “coarse granularity” of blacklists sweeps non-malicious addresses into the junk bin, the report says.
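The “coarse granularity” problem is easy to see in a small sketch. The domain names, addresses, and labels below are invented for illustration; the point is only that a list keyed on whole domains cannot tell a spammer from a legitimate sender at the same domain.

```python
# Hypothetical blacklist keyed on entire domains.
BLACKLISTED_DOMAINS = {"bulkmail.example"}


def domain_of(address):
    """Return the lowercased part of an address after the last '@'."""
    return address.rsplit("@", 1)[-1].lower()


def blocked_by_domain_list(address):
    return domain_of(address) in BLACKLISTED_DOMAINS


# Hypothetical senders with their true labels.
senders = {
    "promo@bulkmail.example": "spam",
    "alice@bulkmail.example": "ham",
    "bob@university.example": "ham",
}

false_positives = [
    address for address, truth in senders.items()
    if truth == "ham" and blocked_by_domain_list(address)
]
print(false_positives)
```

Here the legitimate sender who shares a domain with the spammer ends up blocked, which is the kind of false positive the report warns about.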

And even with whitelists, or lists of approved online addresses, the report asserts that services are just using heuristics to curb spam rather than adopting any computational approach.

So Google is using its neural network, a series of learning supercomputers designed to “think” and identify imagery, to detect spam and help close that remaining tenth of a percent of error.

This type of artificial intelligence grew out of a type of machine learning known as “deep learning.” These neural networks attempt to mimic higher-level thought and abstraction, and many see them as one of the roots for the development of artificial intelligence.

Google thinks this can stop junk. Instead of utilizing white- or blacklists to identify spam or ham e-mails, its neural network can use natural-language processing and information from other users to draw conclusions about the messages being analyzed.

But neural networks have their own problems, says Anselm Blumer, associate professor of computer science at Tufts University. To Dr. Blumer, these artificial “neural networks” approach learning from a perspective that is wholly different from how people actually think.

Neural networks apply limited layers of computation to draw conclusions and learn, which is different from the distributed, varied, and compounded approach that brains take, says Blumer, whose research is in machine learning, artificial intelligence (AI), and human-computer interaction.

“In a sense, neural networks is a bad name,” Blumer says. The computation simplifies the process, but deep-learning researchers are hoping that their work can pave the way for improved AI. With increased layers of abstraction, computational processes may do better to mimic human thought in understanding and creating concepts. Google’s AI recently demonstrated (even though it had never seen a cat before). With more layers, computers may soon be able to learn at the same pace as humans.

Even so, decision making using neural networks can lead to what Blumer calls “overfitting.”

“A network like that is harder to train, and it’s much easier for it to come to false conclusions,” he says.

Like the spam filters creating “false positives” of junk, artificial neural networks are looking to create things out of what they can find. If there isn’t anything there, they may run into the same problems that Gmail’s filter is currently facing.

Or it may decide that , like Google’s neural network did last month.

The problem, Blumer says, lies in the computing. It’s hard to program a computer that avoids false conclusions without also causing it to miss real ones.

As with human bias, it’s easy for this artificial intelligence to prefer simpler conclusions. The difficulty lies in understanding and weighing the nuance of semantic language, a problem that has plagued spam researchers for years.

But as the computing power for artificial intelligence improves, so will the spambots.

For right now, Google is focusing on improving its neural network to properly fit the needs of its users, and it will continue to improve data using the “spam” versus “not spam” filters in Gmail. And like the human brain, the more it learns, the more accurate its actions will be.
