This article is a summarized translation of a post published on the Google Cloud blog by Neil Kumaran. We draw some conclusions after the translation.
Original title: Spam does not bring us joy - ridding Gmail of 100 million more spam messages with TensorFlow
1.5 billion people use Gmail each month, and 5 million businesses use Gmail via G Suite. For consumers and businesses alike, much of Gmail's appeal comes from its built-in security protections.
Good security means always staying ahead of threats, and our current Machine Learning models are very effective. In combination with our other protections, they prevent over 99.9% of spam, phishing and malware from reaching Gmail inboxes.
Recently, we implemented new protections using TensorFlow, an open-source Machine Learning framework developed by Google. These new protections complement the existing ones, both Machine Learning and rule-based. With TensorFlow, approximately 100 million additional spam messages are blocked each day.
Where were these additional 100 million spam messages found? Mostly in spam categories that are very difficult to detect. Using TensorFlow, we were able to block image-based messages, emails with hidden content, and messages from newly created domains that attempt to hide a small amount of spam in legitimate traffic.
Given that Gmail already blocks the majority of spam, blocking millions more with precision is a feat. TensorFlow makes it possible to block the last 0.1% without accidentally blocking messages that are important to users.
One person's spam is another person's treasure
Machine Learning makes spam interception possible by helping to identify patterns in large data sets that the humans creating the rules would not be able to detect. This allows us to quickly adapt to the ever-changing nature of spam attempts.
Machine Learning allows for more granular decision making on many parameters. Consider that each email contains several thousand potential signals. Just because an email contains signals commonly considered spam does not necessarily mean that the message is spam. Machine Learning allows you to check all these signals together to make a decision.
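To make the idea of weighing signals together more concrete, here is a minimal sketch. It is not Gmail's model: the signal names, weights, bias and threshold are all hypothetical, and a real filter learns thousands of weights from data instead of hard-coding five. It only shows how a signal that looks "spammy" on its own can be outweighed by others.

```python
import math

# Hypothetical signals and weights, invented for illustration only.
# Positive weights push towards "spam", negative weights towards "legitimate".
WEIGHTS = {
    "contains_image_only_body": 1.2,
    "sender_domain_age_lt_7_days": 1.5,
    "has_hidden_text": 2.0,
    "recipient_replied_before": -2.5,
    "passes_authentication": -1.0,
}
BIAS = -1.0  # hypothetical baseline: most mail is assumed legitimate


def spam_probability(signals):
    """Combine every signal that is present into one probability (logistic model)."""
    score = BIAS + sum(
        WEIGHTS.get(name, 0.0) for name, present in signals.items() if present
    )
    return 1.0 / (1.0 + math.exp(-score))


def is_spam(signals, threshold=0.5):
    """Decide on the combined score, never on a single signal."""
    return spam_probability(signals) >= threshold


# An image-only newsletter from an authenticated sender the recipient has replied to.
newsletter = {
    "contains_image_only_body": True,
    "recipient_replied_before": True,
    "passes_authentication": True,
}

# An image-only message with hidden text, sent from a domain created this week.
suspicious = {
    "contains_image_only_body": True,
    "sender_domain_age_lt_7_days": True,
    "has_hidden_text": True,
}
```

Both messages share the "image-only" signal, yet under these assumed weights only the second one ends up above the threshold, because the decision depends on the whole set of signals.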
Finally, it also helps us to customize our spam protection to each user. What one person considers spam, another person may consider an important message (think of newsletters or regular application notifications).
Some comments "Made in Badsender"
As always when we use email for marketing (as you and we probably do), a breakthrough in spam filter intelligence raises the question of its impact on our campaigns. And there will be an impact. As Google says, one person's spam can be another's treasure. The same goes for the perceived legitimacy of a message, which varies greatly from one person to another.
What is very clear, and very much in keeping with the current anti-spam mood, is that all marketing messages, and I do mean ALL, can potentially be considered spam. Even if you have collected consent, even if your messages have been opened, even if you have exemplary database hygiene.
In this evolution, we will not dwell on the detection of image-based messages, emails using hidden content, or messages coming from "fresh" domains. These practices are clearly the preserve of spammers, and it is advisable to stay as far away from them as possible. What we will remember is the notion of personalized anti-spam protection. In the past, we talked about positive signals (opens, clicks, replies, forwards, filing into folders) and negative signals (spam complaints, deletion before reading, bounces, ...) that affected your reputation as a sender. It now seems that Gmail is adding a personalized reputation per recipient.
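To picture what a per-recipient reputation could look like, here is a purely illustrative sketch. Gmail has not published how it personalizes filtering, so the class, the functions, the signal weights and the blending factor below are all our own assumptions. The sketch simply blends a sender's global reputation with one recipient's own history of positive and negative signals.

```python
from dataclasses import dataclass


@dataclass
class RecipientHistory:
    """Hypothetical per-recipient counters for one sender."""
    opens: int = 0
    clicks: int = 0
    replies: int = 0
    spam_complaints: int = 0
    deleted_unread: int = 0


def recipient_score(history):
    """Share of positive engagement, between 0 and 1 (None if there is no history).

    The multipliers are arbitrary assumptions: a reply counts more than an open,
    and a spam complaint counts more than a deletion before reading.
    """
    positive = history.opens + 2 * history.clicks + 3 * history.replies
    negative = 5 * history.spam_complaints + history.deleted_unread
    total = positive + negative
    if total == 0:
        return None
    return positive / total


def inbox_score(sender_reputation, history, personal_weight=0.7):
    """Blend the global sender reputation (0 to 1) with the personal score."""
    personal = recipient_score(history)
    if personal is None:
        # No history with this recipient: fall back to the global reputation.
        return sender_reputation
    return personal_weight * personal + (1 - personal_weight) * sender_reputation


# Same sender, same global reputation, three different recipients.
engaged = RecipientHistory(opens=10, clicks=3)
disengaged = RecipientHistory(spam_complaints=1, deleted_unread=12)
new_contact = RecipientHistory()
```

Under these assumptions, a sender with an excellent global reputation still scores poorly for a recipient who deletes everything unread, which is exactly why marketing pressure needs to be adjusted per individual.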
What does this say about the evolution of best practices? All the best practices of the past are probably still valid, but your engagement efforts must be ever more personalized. It is more important than ever to modulate your marketing pressure and to personalize your content according to the individual you are addressing.