Database cleansing often emerges as a priority action in our deliverability audits. A list of email addresses that has not been checked for several years can contain spamtraps, generate complaints and repeated bounces, and accumulate a large volume of inactive addresses. These signals alert spam filters and put your deliverability at risk of being blocked.
In this article, we're going to focus on "mass" cleaning, mainly using dedicated tools. In other words, how to recover quality from a database whose hygiene has been neglected for too long. But there's no magic involved. Maintaining a quality database takes time, both at the time of collection and on a day-to-day basis.
If you find yourself with a list of e-mail addresses to clean up, you haven't done your homework beforehand. So it's not enough to clean up your e-mail database every two years when it's become too "dirty". You have to integrate good data hygiene practices and stick to them!
Why should you clean up your email database?
The main reason is simple: to reduce the signals that damage your reputation as a sender. In email deliverability, anti-spam filters rely on various signals, and some of them can be improved by cleaning up an email contact base:
- Reduce spam complaints by deleting contacts who don't seem interested in your messages and making it easier for them to unsubscribe.
- Reduce bounces by improving collection sources, detecting poorly formatted email addresses right from registration, or cleaning up bounces when they are detected in your campaigns.
- Reduce spamtraps: implement double opt-in, detect "typo" domains at registration, and deal with inactive domains, or clean them up afterwards.
- Improve open rates and click-through rates: this can be achieved by improving your inactivity management, working on targeting, or cleaning up on the basis of acquisition sources.
In short, the aim is to improve deliverability signals by cleansing your database of all those dubious addresses.
What do we mean by "clean"?
Please note that in this article, when we talk about "cleaning up", we don't mean deleting problematic e-mail addresses. Above all, it means "stop sending them emails". The nuance is subtle but important. In practice, it means identifying these poor-quality addresses to ensure that they are no longer targeted. The risk of deleting these addresses is that, if you import them again into your e-mail database, they will once again be considered contactable.
By adding these problematic addresses to a blacklist or marking them as unsubscribed (or using another mechanism, depending on your emailing tool), you ensure that they remain marked as "non-contactable" in your tool.
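To make the nuance concrete, here is a minimal sketch of a contact base where "cleaning" means suppressing rather than deleting. The `ContactBase` class and its method names are hypothetical and only illustrate the idea. Your emailing tool will have its own mechanism (blacklist, forced unsubscription...).

```python
from dataclasses import dataclass, field


def normalize(email: str) -> str:
    """Lowercase and trim so the same address always maps to the same key."""
    return email.strip().lower()


@dataclass
class ContactBase:
    """Hypothetical in-memory contact base, for illustration only."""

    contacts: set = field(default_factory=set)
    suppressed: dict = field(default_factory=dict)  # address -> reason

    def suppress(self, email: str, reason: str) -> None:
        """Stop targeting an address while remembering why we stopped."""
        self.suppressed[normalize(email)] = reason

    def import_contacts(self, emails) -> None:
        """Importing never clears a suppression, so re-imports stay harmless."""
        self.contacts.update(normalize(e) for e in emails)

    def targetable(self) -> list:
        """Addresses that may still receive campaigns."""
        return sorted(self.contacts - set(self.suppressed))


base = ContactBase()
base.import_contacts(["alice@example.com", "bob@example.com"])
base.suppress("Bob@example.com", "hard bounce")
base.import_contacts(["bob@example.com"])  # the same file is imported again
remaining = base.targetable()  # bob@example.com stays non-contactable
```

Had the address simply been deleted, the second import would have made it contactable again.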
3 distinct moments to maintain good email database hygiene
To improve the quality of your email data, there are many good practices and tactics to implement. One useful way of looking at them is over time, because each of these "moments" has its own specificities and rules.
Although this article focuses on the third "moment" (one-shot cleaning), it's worth covering the other two. It also lets us link to other articles available on this blog.
Improving the quality of email addresses as soon as they are collected
When it comes to maintaining good data hygiene, it's best to do it right from the start, when collecting data. This will avoid introducing dubious addresses from the outset.
At the collection point, we may encounter various problems:
- People who enter the wrong email address
- Robots that automatically register e-mail addresses
- Unknown parties deliberately spamming forms
- Unknown parties deliberately entering real people's addresses into forms (without their consent)
- Typing errors when an email address is dictated or copied from a paper medium (registration at the checkout, business card collected at an event, etc.).
- Poor-quality collection processes (contests, co-registration, affiliation, etc.)
- Shady collection methods (list buying, growth hacking, website scrapers, etc.)
- ... and no doubt many other cases.
Just by listing these issues, it's easy to understand the basic processes that can be used to improve the quality of email addresses right from the moment they are collected.
But that's not what this article is about, so we'll just mention the main ones, in order of importance:
- Do not collect using low-quality processes: if you knowingly collect with "borderline" methods, don't be surprised by the results. It's a bit like running at full speed toward a glass door, knowing full well that it's closed.
- Install anti-spam devices on your forms: this involves devices such as captchas, but also anti-spam solutions dedicated to forms such as Akismet or CleanTalk (there are plenty, and we haven't benchmarked them).
- Analyze the syntax of email addresses at the time of entry: is there any text before the "@" sign? Is there a domain name after it? Does that domain have an MX record?
- Use a tool to check email addresses as they are entered: more sophisticated than the previous step, there are services (via API) that check in real time whether the email address entered in your form poses a risk. Read on for a list of these solutions, which are the same as for bulk cleaning.
- Set up double opt-in: this recommendation is a long way down the list because companies generally refuse to implement double opt-in, fearing for their address acquisition performance. Yet it's one of the most effective ways of drastically increasing the quality of the email addresses collected.
- Set up an acquisition management system based on the performance of the collection sources: unfortunately, like double opt-in, this is not a widely followed best practice. It simply consists of keeping track in the database of the collection source (form, sale, partner, actions, campaign...) and analyzing the performance of these various sources (opens, conversions, bounces, activity...). This enables you to make informed decisions about problematic collection sources.
By doing so, you can avoid: some bounces, some spamtraps, and some addresses that are inactive from the start of the relationship.
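As an illustration of the last point, monitoring collection sources can start with a very small report. The sketch below assumes each contact record carries a `source` field plus two booleans, `bounced` and `opened`. These field names are our own, not those of any particular tool.

```python
from collections import defaultdict


def source_report(contacts):
    """Aggregate bounce and open rates per collection source."""
    totals = defaultdict(lambda: {"contacts": 0, "bounced": 0, "opened": 0})
    for contact in contacts:
        row = totals[contact["source"]]
        row["contacts"] += 1
        row["bounced"] += bool(contact["bounced"])
        row["opened"] += bool(contact["opened"])

    report = {}
    for source, row in totals.items():
        count = row["contacts"]
        report[source] = {
            "contacts": count,
            "bounce_rate": row["bounced"] / count,
            "open_rate": row["opened"] / count,
        }
    return report


# Made-up sample data, for illustration only.
contacts = [
    {"source": "signup_form", "bounced": False, "opened": True},
    {"source": "signup_form", "bounced": False, "opened": False},
    {"source": "partner_x", "bounced": True, "opened": False},
    {"source": "partner_x", "bounced": True, "opened": False},
]
report = source_report(contacts)
```

A source whose bounce rate stands far above the others is a good candidate for a closer look.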
Clean up contacts over the course of campaigns
Once you've collected email addresses, they start to be integrated into your marketing or lifecycle campaigns. Sending your campaigns generates various signals (clicks, opens, unsubscribes, bounces, conversions...) which are very valuable: they allow you to make choices that keep your database hygiene in top shape.
Once again, the aim of this article is not to go into this point in depth, but a reminder seems important nonetheless. Here's a list of best practices that will help you improve or maintain the quality of your e-mail database:
- Apply a strict cleaning policy for hard and soft bounces.
- Implement welcome scenarios to target new contacts as soon as the address is collected.
- Segment active and inactive contacts to apply different pressure rules.
- Stop targeting inactive contacts, and encourage them to unsubscribe before you do.
- Honor unsubscription requests immediately and globally (and don't play tricks by creating multiple subscription levels).
- Check that List-Unsubscribe is set up in your emails.
Some of these recommendations seem obvious, but in practice they are rarely applied. In our emailing audits, we still regularly see advertisers who don't allow their customers to unsubscribe from all their email communications in one go.
By doing so, you can avoid: too many bounces, too many inactives, too many spam complaints, and poor open and click rates.
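To give an idea of what a "strict" bounce policy can look like, here is a minimal sketch. The rules it encodes (suppress on the first hard bounce, suppress after a number of consecutive soft bounces) and the default threshold of 3 are assumptions chosen for the example, not values recommended by any mailbox provider.

```python
from collections import defaultdict

SOFT_BOUNCE_LIMIT = 3  # assumption: adapt to your own sending frequency


class BouncePolicy:
    """Hypothetical policy: hard bounces out at once, repeated soft bounces too."""

    def __init__(self, soft_limit=SOFT_BOUNCE_LIMIT):
        self.soft_limit = soft_limit
        self.soft_streak = defaultdict(int)  # consecutive soft bounces per address
        self.suppressed = set()

    def record(self, email, event):
        """Record one delivery event: 'hard_bounce', 'soft_bounce' or 'delivered'."""
        if event == "hard_bounce":
            self.suppressed.add(email)
        elif event == "soft_bounce":
            self.soft_streak[email] += 1
            if self.soft_streak[email] >= self.soft_limit:
                self.suppressed.add(email)
        elif event == "delivered":
            self.soft_streak[email] = 0  # a successful delivery resets the streak
        else:
            raise ValueError(f"unknown event: {event}")


policy = BouncePolicy(soft_limit=2)
policy.record("a@example.com", "hard_bounce")
policy.record("b@example.com", "soft_bounce")
policy.record("b@example.com", "delivered")
policy.record("b@example.com", "soft_bounce")
policy.record("b@example.com", "soft_bounce")
```

Counting consecutive soft bounces, rather than all of them, avoids suppressing an address over a temporary problem such as a full mailbox.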
One-shot, mass or batch cleaning
Phew! After that long intro, we're back to the main topic of this article: How to clean and verify your email addresses, and with what tools?
Normally, if you've taken precautions when collecting e-mail addresses and implemented good practices when running your campaigns, you shouldn't need to double-check your addresses en masse. But life doesn't always go according to plan, you may have been in denial, or worse, you may have inherited a situation for which you are not responsible.
In any case, before you start cleaning up your e-mail addresses en masse, you should start by planning the deployment of all the best practices mentioned above.
Once you've done that, you can get on with the job. Please note, however, that there's no magic involved, and none of the e-mail address cleansing tools is perfect.
With mass cleaning, you can avoid: some bounces, some spamtraps, some dubious addresses.
Tip: Combine analyses from cleaning tools with your collection sources!
Why not combine the useful with the useful? With email address verification tools, you'll obtain risk categories (high, low, moderate...). By cross-referencing these categories with your collection sources, you may discover that some sources are of lower quality than others. This will then enable you to make trade-offs, and possibly to stop working with certain acquisition partners in favor of others.
Use an email address verification tool
Let's get down to the nitty-gritty: how do you use an e-mail address cleansing tool?
Which email address verification tool should you choose?
This is the very first step. As a reminder, most of these tools allow you to verify your email addresses en masse, but they can also be used to implement email verification in your forms. The way they work is basically the same, except that instead of loading a file of addresses, you make an API call which returns the result individually for each email address entered in the form.
Unfortunately, we haven't carried out any benchmarking on the subject, so it's difficult for us to give you a ranking of the best tools. Nevertheless, here are a few criteria to help you choose an email verification solution that's right for you:
- Choose tools that use real campaign data to build up their repository (and often have partnerships with emailing tools).
- Avoid solutions that claim to query email servers to find out whether an address exists (this is highly frowned upon in deliverability circles, and mailbox providers implement strategies to trap these actors).
- Make sure that the solution doesn't oversell some revolutionary Artificial Intelligence, which would be of no interest here.
- Check where these platforms are located: if the GDPR is important to you, stay in Europe.
- Check available integrations with your campaign management tools.
Examples of tools you can use to test address validity:
- NeverBounce: https://www.neverbounce.com/
- ZeroBounce: https://www.zerobounce.net/
- Captain Verify (French): https://captainverify.com/
- Kickbox: https://kickbox.com/
- Capency (French): https://www.capency.com/
- Mailnjoy (French): https://check.mailnjoy.com/fr
- BriteVerify: https://www.validity.com/briteverify/
There are many more, so don't hesitate to scour the Internets for them.
How do I use these e-mail address verification tools?
These address verification tools generally operate in 4 main modes:
- By loading a file: this is the classic way to use these solutions. You collect all your email addresses in a file and upload it to the verification platform for analysis.
- Using the API in batch mode: if you have developers on hand, or if you've developed customized tools, you can also submit your email addresses en masse via an API.
- Using the API one address at a time: here, it's generally used to check the email addresses submitted to a form, one by one. It can also work for other systems, such as cash register software or a CRM solution. It's up to you to decide on the use cases.
- By using a connector with your emailing tool: some email cleansing tools connect directly to your campaign management tools or CRM via a connector. This is often the case for HubSpot, Mailchimp or Salesforce, for example.
If we take the simplest case (loading addresses into the platform), here are the various stages:
- Have credits with one of the tools available on the market (sounds obvious, but purchasing processes aren't always that simple in large organizations).
- Export your email addresses to a file (often in CSV format): the scope of the exported data obviously depends on why you wish to check these addresses. Bounces, inactives, complaints and unsubscribed addresses that are no longer targeted can be excluded, so as not to spend credits for nothing.
- Import your email addresses into the platform: the next step is to load this file into the platform and press the big button. It will grind for a while, depending on the volume of addresses you wish to analyze.
- Analyze results and perform actions based on them: that's what we'll see in the next chapter.
So far, so simple.
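For the export step, a small script is often enough. The sketch below assumes contacts are available as dictionaries with an `email` field, a `status` field and a few extra attributes. The status values and column names are hypothetical and will differ in your own tool.

```python
import csv
import io

# Hypothetical status values for contacts that are no longer targeted anyway.
EXCLUDED_STATUSES = {"hard_bounce", "unsubscribed", "complained", "inactive"}


def export_for_verification(contacts, out, extra_columns=("source", "segment")):
    """Write targeted contacts to CSV and return the number of rows exported."""
    fieldnames = ["email", *extra_columns]
    # extrasaction="ignore" drops any field we did not ask to export.
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    exported = 0
    for contact in contacts:
        if contact.get("status") in EXCLUDED_STATUSES:
            continue  # no need to spend credits on these addresses
        writer.writerow(contact)
        exported += 1
    return exported


# Made-up sample data, for illustration only.
contacts = [
    {"email": "a@example.com", "status": "active", "source": "form", "segment": "customer"},
    {"email": "b@example.com", "status": "hard_bounce", "source": "partner", "segment": "prospect"},
    {"email": "c@example.com", "status": "active", "source": "partner", "segment": "prospect"},
]
buffer = io.StringIO()  # stands in for a real file opened with newline=""
exported = export_for_verification(contacts, buffer)
```

The optional extra columns travel with each address, which makes it easier to analyze the results afterwards.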
Tip: Don't just export email addresses!
We've already talked about the value of analyzing collection sources. But perhaps other data can be cross-referenced with the results of your email quality analysis. Are there differences in quality between your customers and prospects? Are there quality differences between your active and inactive customers?
If you want to analyze this information, we recommend that you include it in other columns of your file. This way, after analysis by the tool, these columns will still be present, enabling you to cross-reference the data.
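Once the enriched file comes back, cross-referencing is a matter of counting. Here is a minimal sketch, assuming the tool added a column named `result` next to your own columns. That column name is an assumption: check your tool's documentation for the real one.

```python
import csv
import io
from collections import Counter, defaultdict


def cross_tab(enriched_csv, by, result_column="result"):
    """Count verification results for each value of one of your own columns."""
    table = defaultdict(Counter)
    for row in csv.DictReader(enriched_csv):
        table[row[by]][row[result_column]] += 1
    return {key: dict(counts) for key, counts in table.items()}


# Hypothetical enriched file: "segment" was exported by us, "result" added by the tool.
sample = io.StringIO(
    "email,segment,result\n"
    "a@example.com,customer,valid\n"
    "b@example.com,prospect,high risk\n"
    "c@example.com,prospect,valid\n"
    "d@example.com,prospect,high risk\n"
)
table = cross_tab(sample, by="segment")
```

The same function works for any column you exported, for example the collection source.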
How do you analyze and exploit the results?
Once the analysis is complete, you'll be presented with a file enriched with new columns. These columns will show you a whole range of information for each email address tested.
Each tool has its own way of analyzing and categorizing e-mail addresses. But on the whole, we'll find the same logic everywhere.
Risk classification, with scales that vary depending on the tool used:
- Very high risk
- High risk
- Moderate risk
- Very low risk
- Valid
- Unknown
In the end, it's up to you to decide what is acceptable and what isn't in your emailing tool. This decision is generally based on the volume of each category and the severity of the deliverability problems you're exposed to.
Classification by type of problem or address:
- Catch-all: these addresses are complicated to validate, as automatic verification will never return an error for them. Only experience (bounces, opens, clicks) will enable you to decide whether these email addresses are valid or not.
- Spamtraps: some email address verification solutions are able to detect certain spamtraps. Please note that none of these solutions will be able to detect ALL the spamtraps in your email lists.
- Role addresses: these are generic addresses (sales@, info@...) that are generally used by companies. It's up to you to decide whether these addresses are relevant to your email databases.
- Disposable addresses: these are addresses created on services that provide temporary addresses. People using these services are unlikely to read your emails over time.
- Incorrect syntax: when the very structure of the email address (a first part, separated by an "@" from a valid domain name) is not correct.
Unfortunately, we can't cover all the categories of all the tools, so we've concentrated on the most common ones. In general, the tool you have chosen will provide full documentation to help you understand what each of its categories means.
Now you have to decide what to do with the results of this analysis. For each level of risk and each category, you will have to make a decision:
- Do I continue to target them? In general, there is no specific action to be taken in this case. The addresses were targeted, and you want to keep them that way.
- Do I decide to stop targeting them? In this case, you'll have to see how you do it technically (forced unsubscription, block listing...) depending on your tools.
If you have any doubts, don't hesitate to contact an expert.
Above is an example of statistics from an analyzed file. Email addresses in red categories have been deactivated, while green categories have been retained for targeting.
B2B: Role address cases
B2B and B2C results are not necessarily to be interpreted in exactly the same way. This is particularly true of role addresses. In B2C, you can deactivate these addresses without asking too many questions. There's no reason to have many info@, sales@, contact@ addresses in your database.
In B2B, it's much more legitimate to have these addresses. However, bear in mind that these generic addresses are of lesser quality than a named business address.
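The decisions discussed above can be written down as a small rule set, which keeps them explicit and repeatable. The sketch below is one possible policy, not a recommendation. The labels (`high`, `role`, `catch_all`...) are hypothetical and must be mapped to those of your tool, and where to draw the line remains your call.

```python
# Assumption: these labels mirror the categories described above.
STOP_RISKS = {"very high", "high"}
STOP_TYPES = {"spamtrap", "disposable", "invalid_syntax"}


def decide(risk, address_type=None, b2b=False):
    """Return 'stop' (no longer target the address) or 'keep'."""
    if address_type in STOP_TYPES:
        return "stop"
    if address_type == "role" and not b2b:
        return "stop"  # in B2C, role addresses have little reason to be there
    if risk in STOP_RISKS:
        return "stop"
    # Catch-all and moderate-risk addresses are kept here: only campaign
    # signals (bounces, opens, clicks) will settle their case over time.
    return "keep"


decisions = {
    "b2c_role": decide("valid", "role"),
    "b2b_role": decide("valid", "role", b2b=True),
    "catch_all": decide("moderate", "catch_all"),
    "high_risk": decide("high"),
}
```

In this example the B2C role address is stopped, while the same address in a B2B database is kept.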
Develop your own email analysis tools?
While it's convenient to use paid email address verification tools, it can sometimes be a good idea to create your own analysis tools. This is particularly the case when you're part of a high-tech company with developers on hand.
What types of analysis can you easily perform in-house?
If you take the time to develop your own email verification tools, it's best to concentrate on analyses that don't require access to large data repositories.
Here are three ideas for analysis that are realistic without a major investment:
- Analysis of email address structure: clearly the simplest. An email address is divided into two parts separated by the "@" sign. The first part is the user's name (local-part in the standard), the second part is the domain. For more details on the official syntax, refer to RFC 5322, section 3.4.1.
- Checking the presence of an MX server: for an email to reach its destination, the domain of the email address must have an MX record in its DNS zone. Without it, there is no declared mail server to send the email to. It's a simple check to set up.
- Typo spamtrap detection: a little more ambitious, but still affordable, is to build a tool to detect typo spamtraps (addresses with a typo in the domain name). We'll take a closer look at this in the next section.
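The first two checks can be sketched in a few lines. The syntax check below is deliberately simpler than RFC 5322 (it rejects quoted local parts, for example), and the MX lookup is passed in as a function so that the example runs without network access. In practice that function could wrap a DNS library, such as `dns.resolver.resolve(domain, "MX")` from dnspython, assuming it is installed.

```python
import re

# Simplified on purpose: RFC 5322 allows more (quoted local parts, comments...).
LOCAL_RE = re.compile(r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+$")
LABEL_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def split_address(email):
    """Return (local_part, domain), or None when the structure is wrong."""
    email = email.strip()
    if email.count("@") != 1:
        return None
    local, domain = email.split("@")
    if not local or len(local) > 64 or not LOCAL_RE.match(local):
        return None
    if local.startswith(".") or local.endswith(".") or ".." in local:
        return None
    labels = domain.split(".")
    if len(labels) < 2 or not all(LABEL_RE.match(label) for label in labels):
        return None
    return local, domain.lower()


def check_address(email, mx_lookup):
    """Classify an address as 'invalid_syntax', 'no_mx' or 'ok'.

    mx_lookup is any callable returning the list of MX hosts of a domain.
    """
    parts = split_address(email)
    if parts is None:
        return "invalid_syntax"
    _, domain = parts
    if not mx_lookup(domain):
        return "no_mx"
    return "ok"


# Fake DNS data standing in for real lookups, for illustration only.
FAKE_MX = {"example.com": ["mx.example.com"]}


def fake_lookup(domain):
    return FAKE_MX.get(domain, [])


results = [
    check_address(address, fake_lookup)
    for address in ("jane.doe@example.com", "jane@no-mail.example", "@example.com")
]
```

Note that RFC 5321 lets a sending server fall back to the domain's address record when no MX record exists, so a `no_mx` result is better treated as a strong warning than as absolute proof.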
Example: cleaning typo spamtraps
In the spamtrap family, let me introduce the typo spamtrap. This is the most vicious one, the one you can have in your database even with the best practices (although double opt-in should largely protect you from it).
A "typo" spamtrap is an email address containing a typo in the domain name. Here's a long list of examples: gmaim.com, gmqil.com, wandoo.fr, msm.com, oulook.com, iclou.com, data-backup-store.com, lapost.net, hotlook.com, gmsil.com, hotmai.fr, ootlook.com, wanado.fr, gmail0.com, outook.com, gmailm.com, gmzil.com, hotlail.com, hotmail.no, hotmial.com, gmail.fr, lposte.net, gmaio.com, iclud.com, testetam.com, gmlail.com, adsl.fr, jmail.com, gmail.com.com, hotlail.fr, outloo.com, iclound.com, wahoo.fr, hotmailo.com, laspote.net, yhoo.fr, 6gmail.com, yayoo.fr, bluewing.ch, lapiste.net, etc.
And yes, this means that one of your customers may have fumbled on the keyboard and generated a nice typo spamtrap all by themselves.
How can I detect typo spamtraps?
Some of these domain names containing a typo still have an MX record in their DNS. This means that the e-mails will arrive on a server. Some of these servers are owned by organizations active in the fight against spam and will use these signals to analyze your practices. It's a good idea to hunt them down!
To detect these typo spamtraps, you need to proceed in several steps:
- Create a file with all the domain names contained in the email addresses of your contact base. (Ideally, add a column with a counter showing how many email addresses use each domain name. This is super useful for many deliverability analyses.)
- Enrich each of these domain names with its MX servers.
- Visually detect the domain names that are obviously typos.
- Use the MX servers of these hand-detected domains to find other domains using the same MX servers.
- Turn this list of MX servers into a repository that you can gradually add to as you discover new ones.
It's not 100% automatic, but it allows you to create an interesting repository for cleaning up your email addresses and/or doing automatic detection on collection.
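The steps above can be sketched as follows. As in any offline example, the MX lookup is injected as a function and fed with made-up data, and all function names are our own. The visual detection step stays manual: its output is the `hand_detected` set.

```python
from collections import Counter


def domain_counts(emails):
    """Step 1: count how many addresses use each domain name."""
    return Counter(
        email.rsplit("@", 1)[1].strip().lower() for email in emails if "@" in email
    )


def enrich_with_mx(domains, mx_lookup):
    """Step 2: map each domain to the set of its MX hosts (normalized)."""
    return {
        domain: {host.lower().rstrip(".") for host in mx_lookup(domain)}
        for domain in domains
    }


def expand_typo_repository(domain_mx, hand_detected, known_bad_mx=frozenset()):
    """Steps 4 and 5: grow the MX repository and flag domains that share it."""
    bad_mx = set(known_bad_mx)
    for domain in hand_detected:
        bad_mx |= domain_mx.get(domain, set())
    suspects = {domain for domain, hosts in domain_mx.items() if hosts & bad_mx}
    return suspects, bad_mx


# Made-up addresses and MX data, for illustration only.
emails = ["a@gmail.com", "b@gmail.com", "c@gmaim.com", "d@hotmial.com", "e@example.org"]
FAKE_MX = {
    "gmail.com": ["gmail-smtp-in.l.google.com."],
    "gmaim.com": ["mx.trap.example."],
    "hotmial.com": ["MX.trap.example"],
    "example.org": ["mail.example.org"],
}

counts = domain_counts(emails)
domain_mx = enrich_with_mx(counts, lambda domain: FAKE_MX.get(domain, []))
# Step 3 is done by eye: here we spotted gmaim.com in the domain list.
suspects, bad_mx = expand_typo_repository(domain_mx, hand_detected={"gmaim.com"})
```

Here hotmial.com is flagged because it shares an MX host with the hand-detected gmaim.com. Review the suspects before suppressing anything: an MX host shared with a typo domain may also serve legitimate domains.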
Conclusion
When we started writing this article, we had absolutely no idea we'd end up with such a long result! But we gave it our all to serve you.
In any case, before thinking "clean-up", think best practices. It's one thing if you're not ready for double opt-in, or if you don't have the analytical tools to monitor your collection sources, but the other best practices suggested in this article must be implemented at the time of collection and during campaign execution.
If necessary, get a tool to clean your addresses en masse once or twice a year. But it won't be miraculous, and in the event of a deliverability incident, it will already be too late.
And if you need help, maybe it's time to call on the experts at a committed emailing agency.