How Google tries to keep ‘sneaky’ spam from your inbox
Google is using smarter technology to keep your Gmail inbox free of spam while ensuring that legitimate messages still get through.
On one end, Gmail is using artificial intelligence to better detect “sneaky” spam that would otherwise get through, Google product manager Sri Harsha Somanchi said Thursday in a blog post outlining the new ways in which smarter technology is being employed to create better spam filters. On the other end, Google has launched a feature called Gmail Postmaster Tools, which provides data that can help email senders better diagnose why certain emails get classified as spam.
Determining which messages are spam and which are not is a never-ending battle, especially since a message that one person considers spam could be legitimate to another. Messages from banks, airlines and other companies fall into this category. How does Google tell whether a certain email is a sales promotion or an important notice about your bank account or an upcoming flight? Spammers have also gotten smarter, using more tricks to disguise junk mail so that automated spam filters have a tougher time figuring out how to tag it. So what new tools and techniques is Google using in the fight against spam?
Gmail’s spam filter is now using an artificial neural network, a form of artificial intelligence, to become smarter. When you tag a message as spam or not spam, you help the spam filter learn from its mistakes, so it can be more accurate and reliable in the future. But the new neural network takes that a step further by better detecting and blocking “the especially sneaky spam,” meaning the kind that can easily pass for legitimate mail. And exactly how does this work? A Google spokesman explained the process as follows:
Our neural net system learns based on a huge collection of example “wanted” messages and a similar body of example spam mails. The system tracks thousands of attributes of each message (for example, the words in the message or the sender’s IP address). The spam filter then uses a technique called clustering analysis to find attribute groupings which differentiate spam from wanted mail. Essentially, the spam filter finds the sneaky spam by ignoring the similarities, and focusing only on the differences. As both spam and wanted mail evolve, the system is constantly relearning this differentiation. When users report spam (or not spam) that content is fed into the system, and it learns more. Ultimately, our spam filter learns from these user reports, which is how it has improved so much in the last few years.
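To make that description more concrete, here is a minimal sketch of a filter that learns from user reports. It is not Google's system: it swaps the neural network and clustering analysis for a much simpler smoothed log-odds score per attribute, and every name in it (ToySpamFilter, attributes, report, weight, score) is hypothetical. What it does share with the spokesman's account is the idea that attributes about equally common in spam and wanted mail carry little weight, so the decision rests on the differences.

```python
import math
from collections import Counter


def attributes(message, sender_ip):
    """Turn a message into a set of simple attributes: its words plus the sender's IP."""
    words = {f"word:{w}" for w in message.lower().split()}
    return words | {f"ip:{sender_ip}"}


class ToySpamFilter:
    """Illustrative only: a log-odds scorer, not a neural network."""

    def __init__(self):
        self.spam_counts = Counter()    # messages containing each attribute, spam
        self.wanted_counts = Counter()  # messages containing each attribute, wanted
        self.spam_total = 0
        self.wanted_total = 0

    def report(self, message, sender_ip, is_spam):
        """Feed one user report (spam or not spam) back into the filter."""
        attrs = attributes(message, sender_ip)
        if is_spam:
            self.spam_counts.update(attrs)
            self.spam_total += 1
        else:
            self.wanted_counts.update(attrs)
            self.wanted_total += 1

    def weight(self, attr):
        """Log-odds of an attribute; add-one smoothing avoids division by zero."""
        p_spam = (self.spam_counts[attr] + 1) / (self.spam_total + 2)
        p_wanted = (self.wanted_counts[attr] + 1) / (self.wanted_total + 2)
        return math.log(p_spam / p_wanted)

    def score(self, message, sender_ip):
        """Sum of attribute weights; positive means the message looks more like spam."""
        return sum(self.weight(a) for a in attributes(message, sender_ip))

    def is_spam(self, message, sender_ip):
        return self.score(message, sender_ip) > 0
```

In this sketch, each call to report plays the role of a user clicking "spam" or "not spam," and the weights shift immediately, which is a simplified stand-in for the constant relearning described in the quote.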
Further, advances in machine learning can now help the spam filter better learn each individual’s preferences, so it can correctly tag a certain message as spam for you but as legitimate for another user. One example is a weekly email newsletter, which one person may want and another may not. The Gmail spam filter has also gotten smarter at spotting email impersonation, meaning messages that come from a source other than the sender they claim to be from. Spammers typically use bogus addresses so they can’t be detected, but a more sophisticated spam filter can identify mail from such addresses.
Legitimate email senders also have to step in when their messages are inaccurately tagged as spam. And to aid them in this task, Google is offering its new Gmail Postmaster Tools. These tools can help high-volume mail senders analyze spam reports, delivery errors and other factors to determine why their legitimate messages are being falsely identified as junk.
Somanchi said that less than 0.1 percent of email in the average Gmail inbox is spam, and the amount of legitimate mail stuck in the spam folder is even lower, at under 0.05 percent. But the battle continues.
“Ultimately, we aspire to a spam-free Gmail experience,” Somanchi said. “So please keep those spam reports coming, and if you’re a company that sends email, then check out our new Postmaster Tools. Together we can get the wanted mail to the right place, and keep the spam where it belongs.”
This article originally appeared on CNET.com.