How Spam Filters Work
There are countless articles on the web titled “How to Avoid Spam Filters.” The problem with most of these articles is that their advice rests on a false premise: that it is even possible to avoid spam filters.
Spam filters are part of the process. If you send email, it will be filtered: delivered to the inbox, sorted into a category tab, sent to the spam folder, or blocked completely. Filter technology plays a massive role in the success of your email campaigns. That’s why, at Return Path, we encourage our clients to embrace spam filters, learn how they work, and understand how mailbox providers use them.
Email filters organize email according to specified criteria. Originally, filters were designed primarily to identify spam and block it or place it in the spam folder. Today, some mailbox providers use email filters to categorize messages for inbox organization purposes (e.g., Gmail categories or Microsoft’s Focused inbox).
Mailbox providers have strong motivations to use spam filters, whether they build their own system, license third-party spam filter technology, or combine homegrown and partner anti-spam solutions.
Spam is annoying, no doubt, but it can also be dangerous. Malware and phishing are hugely profitable for scammers and can be costly for mailbox providers’ customers, as well as for the mailbox providers themselves, who face intense market competition. Practically speaking, spam filters also drastically reduce the load on server resources, considering that 70 percent of all mail sent globally is spam.
As a message travels from the sender to the subscriber’s inbox, various types of filters can influence deliverability and inbox placement.
Spam filter technology may be applied to inbound email (email entering the system), outbound email (email leaving the system), or both. Mailbox providers use both methods to help protect their customers. Senders may encounter both types of filters but are mostly concerned with inbound filters.
Both outbound and inbound filters rely on algorithms, heuristics, and a more advanced form of heuristics known as Bayesian filtering. Algorithms in this context are rules that tell a program what to do. Heuristic filters subject each email message to thousands of predefined rules (algorithms), and each rule contributes a numerical score reflecting the likelihood that the message is spam.
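To make the heuristic approach concrete, here is a minimal sketch in Python of rule-based scoring. The three rules, their weights, and the threshold are hypothetical values chosen only for illustration; real filters apply thousands of rules whose details are not public.

```python
import re
from typing import Callable, List, NamedTuple, Tuple


class Rule(NamedTuple):
    name: str
    score: float                          # points added when the rule matches
    matches: Callable[[str, str], bool]   # (subject, body) -> True if rule fires


# Hypothetical rules and weights, for illustration only.
RULES = [
    Rule("subject_all_caps", 1.5,
         lambda subject, body: subject.isupper()),
    Rule("many_exclamations", 1.0,
         lambda subject, body: (subject + body).count("!") >= 3),
    Rule("urgent_money_phrase", 2.0,
         lambda subject, body: re.search(
             r"\b(free money|act now|wire transfer)\b", body, re.I) is not None),
]

SPAM_THRESHOLD = 3.0  # assumed cut-off, not taken from any real filter


def score_message(subject: str, body: str) -> Tuple[float, List[str]]:
    """Run every rule and return the total score plus the names of rules that fired."""
    hits = [rule for rule in RULES if rule.matches(subject, body)]
    return sum(rule.score for rule in hits), [rule.name for rule in hits]


def is_spam(subject: str, body: str) -> bool:
    """Classify a message as spam when its total score reaches the threshold."""
    total, _ = score_message(subject, body)
    return total >= SPAM_THRESHOLD
```

Calling score_message("WIN BIG NOW", "Act now!!! Free money inside") fires all three rules, while an ordinary message such as a lunch invitation fires none. Keeping each rule small and independent is what allows a filter of this kind to grow to thousands of rules.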
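Mailbox providers look at four main aspects of mail when making filtering decisions: the source of the mail, the reputation of the sender, the content of the message, and subscriber engagement.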
Spammers often attempt to “game” reputation systems by using multiple (and constantly changing) IP addresses or domains. However, spam filters look at factors such as sender authentication, sending permanence, and the age of the IP addresses and domains in their filtering decisions.
To that end, email sent from new IP addresses and domains is treated with caution by mailbox providers. Senders that change IP addresses and domains infrequently and use authentication techniques can be seen as more trustworthy, which may lead to a stronger sending reputation.
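As a rough illustration of the caution applied to new or unauthenticated senders, the sketch below flags a sender whose IP address or domain was first seen recently, or whose mail fails authentication. The SenderIdentity fields, the needs_caution function, and the 30-day window are all assumptions made for this example, not values any mailbox provider has published.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SenderIdentity:
    ip_first_seen: date           # when this IP address first sent mail (assumed field)
    domain_first_seen: date       # when this domain first sent mail (assumed field)
    passes_authentication: bool   # e.g. the message passed SPF or DKIM checks


def needs_caution(sender: SenderIdentity, today: date, min_age_days: int = 30) -> bool:
    """Return True if the sender is too new or unauthenticated to be trusted yet."""
    ip_age = (today - sender.ip_first_seen).days
    domain_age = (today - sender.domain_first_seen).days
    too_new = min(ip_age, domain_age) < min_age_days
    return too_new or not sender.passes_authentication
```

A sender that has used the same IP address and domain for a year and authenticates its mail would not be flagged, while the same sender moving to a brand-new IP address would be.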
The reputation of the sender is calculated using algorithms and heuristics, leveraging millions of data points and hundreds of parameters. Based on the strength of the reputation score, mailbox providers will make filtering decisions about the email coming from a sender.
Reputation-based filters can automatically apply the mailbox provider’s mail flow policies based on the reputation score of the sender. As the filter receives inbound mail, a threat assessment of the sender is performed, and a range of parameters is used to generate a reputation score.
Each mailbox provider uses its own undisclosed reputation formula; however, they all incorporate similar elements into the calculation. Return Path’s Sender Score uses data points and reputation formulas similar to those most mailbox providers use, giving a relatively accurate representation of how mailbox providers view your email.
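The idea of applying a mail flow policy based on a reputation score can be sketched as a simple lookup. The 0 to 100 scale, the thresholds, and the action names below are assumptions for illustration; actual providers keep their formulas and cut-offs undisclosed.

```python
def mail_flow_policy(reputation_score: float) -> str:
    """Map a sender reputation score (assumed 0-100 scale) to a mail flow action."""
    if not 0 <= reputation_score <= 100:
        raise ValueError("reputation_score must be between 0 and 100")
    if reputation_score >= 80:   # hypothetical threshold: strong reputation
        return "accept"
    if reputation_score >= 50:   # hypothetical threshold: accept, but rate-limit
        return "throttle"
    if reputation_score >= 20:   # hypothetical threshold: deliver to spam folder
        return "spam_folder"
    return "block"               # very poor reputation: reject outright
```

Under these assumed thresholds, a sender scoring 95 is accepted, one scoring 65 is throttled, and one scoring 5 is blocked.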
Content analysis technology scans every part of an email, including the header, footer, code, HTML markup, images, text color, timestamp, URLs, subject line, text-to-image ratio, language, attachments, and more. For some content filters, there is not one part of the message that is ignored. Other content filters may look at only the structure of an email, or they might simply parse URLs out of the message and then reference them against blacklists.
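The simplest kind of content filter mentioned above, one that parses URLs out of a message and checks them against a blacklist, might look like the following sketch. The domains in BLOCKLISTED_DOMAINS are placeholders, and the regular expression is a deliberately simple approximation of URL matching rather than a complete parser.

```python
import re
from urllib.parse import urlparse

# Simple approximation: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

# Hypothetical blacklist entries, for illustration only.
BLOCKLISTED_DOMAINS = {"bad.example", "phish.example"}


def extract_hosts(message_body: str) -> set:
    """Return the set of hostnames found in URLs within the message body."""
    hosts = set()
    for url in URL_PATTERN.findall(message_body):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def blocklisted_hosts(message_body: str) -> set:
    """Return hosts that equal a blacklisted domain or are a subdomain of one."""
    flagged = set()
    for host in extract_hosts(message_body):
        if any(host == d or host.endswith("." + d) for d in BLOCKLISTED_DOMAINS):
            flagged.add(host)
    return flagged
```

Matching subdomains as well as exact domains matters because a sender can otherwise evade an exact-match list by prefixing a new label to a listed domain.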
In addition to the source, sender reputation, and message content of incoming mail, mailbox providers have begun to look at historical engagement metrics to determine whether or not to place incoming mail into the inbox. This allows them to not only judge whether incoming mail is legitimate, but also whether it is desired by their users.
Proving “desirability” is not as simple as it might sound. While you might believe your content is doing well and generating a valuable ROI, mailbox providers might not agree with you. This is because they use different metrics to measure your desirability than the ones you use to measure your email’s effectiveness. They are also evaluating you against other senders.
Mailbox providers define engagement as the positive and negative actions that users take in their mailboxes. Positive metrics such as read rate, reply rate, and forward rate show that users are opening, responding to, and sharing your content with others, while negative metrics such as complaint rate and delete-before-reading rate are strong indications of disinterest.
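The engagement metrics named above are all simple ratios. The sketch below computes them from raw counts, assuming each rate is measured against the number of delivered messages; mailbox providers do not publish their exact definitions, so the denominator and the field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class EngagementCounts:
    delivered: int        # messages that reached the mailbox (assumed denominator)
    read: int
    replied: int
    forwarded: int
    complaints: int       # messages marked as spam by the recipient
    deleted_unread: int   # messages deleted without being opened


def engagement_rates(counts: EngagementCounts) -> dict:
    """Compute positive and negative engagement rates as fractions of delivered mail."""
    if counts.delivered <= 0:
        raise ValueError("delivered must be greater than zero")
    d = counts.delivered
    return {
        # Positive signals
        "read_rate": counts.read / d,
        "reply_rate": counts.replied / d,
        "forward_rate": counts.forwarded / d,
        # Negative signals
        "complaint_rate": counts.complaints / d,
        "delete_before_reading_rate": counts.deleted_unread / d,
    }
```

For a campaign with 1,000 delivered messages, 250 reads, and 2 complaints, this yields a read rate of 25 percent and a complaint rate of 0.2 percent.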
Spam filters are an integral part of the email ecosystem. Without them, email simply wouldn’t work—billions of spam messages would overload the system. Filters are our friends; as users we appreciate when they keep unwanted mail out of our inbox but we’re also pleased to get the mail we do want. And filters are the reason that important transactional emails land in our inbox, where we can find them when we need them.