Testing and Monitoring Email Service Provider Speed
One day I was working on a project that sends emails, and while testing that feature I noticed it was taking a long time (sometimes minutes) for the messages to arrive, which slowed my progress. I blamed it on Gmail, switched projects, and came back to it another day… only to notice it again. I quickly switched to another email account, sent some test messages, and determined that the email sending service I was using was the problem. In this case, the messages were account creation confirmation emails. They needed to be delivered quickly, before users lost interest and moved on to something else.
Over the next few weeks, I started poking at Email Service Providers (ESPs). The whole point of using an ESP is to offload the complexities of quick delivery, so I was really surprised to see delays.
I set up an experiment to monitor the situation. I signed up for paid accounts with several ESPs and set up an MX server with no other mail traffic, no load, and no spam filter. Then I started dropping one email per minute into each ESP and timing how long each message took to arrive. I was floored.
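The sending side was simple. Here is a minimal sketch of that drip loop, with my assumptions labeled: `send_via_esp` is a hypothetical stand-in for each provider's actual web API call, and the provider names and `sent.csv` log format are illustrative, not the exact scripts I ran.

```python
import csv
import time
import uuid

ESPS = ["esp-a", "esp-b", "esp-c"]  # placeholder provider names

def send_via_esp(esp: str, subject: str) -> bool:
    # Hypothetical stand-in: in the real experiment this was an HTTPS POST
    # to the provider's transactional-send endpoint. True means the API
    # answered with an "OK".
    print(f"would send via {esp}: {subject}")
    return True

with open("sent.csv", "a", newline="") as log:
    writer = csv.writer(log)
    while True:
        for esp in ESPS:
            token = str(uuid.uuid4())
            # The UUID in the subject lets the MX-side parser match each
            # arriving message back to this send record.
            ok = send_via_esp(esp, subject=f"timing probe {token}")
            writer.writerow([esp, token, int(time.time()), ok])
        log.flush()
        time.sleep(60)  # one message per minute into each ESP
```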
Delivery times were all over the place. No provider was clearly better than the others all of the time, and when they went sideways, they did so in different ways. Some were slightly slow much of the time… others were great most of the time but truly awful when they were having problems.
Every provider lost mail, too: for the most part, about 1 in every 128 messages. The provider's API would return an OK, but the message never arrived. Remember, I had no spam filter configured.
MailJet went through a bad spell in which they failed to deliver more than half of all messages… but after recovering, they became the most reliable provider in my group. I'm confident that, given a long enough experiment window, more ESPs would have a bad day. MailJet did nothing to notify me that there was a problem, though, and downplayed it when I asked. That said, they did answer my questions about it, which not all ESPs will do. I'm also impressed that they delivered the delayed mail days later instead of just dropping it.
What about Mandrill?
I couldn't get Mandrill to work. I signed up for a paid MailChimp account and validated my domains, but kept getting error messages. I had used Mandrill before, so I knew how to integrate with it.
I contacted their support department, and they informed me that because my domain had the word "test" in it, it would not send. I found that odd, but set up a new domain anyway. Neither domain worked, and MailChimp stopped answering my support emails. The ticket is still open months later.
I'm sad to see this because MailChimp is a really cool company. They are local to me, and I know quite a few people there. But their specialty is newsletters, not transactional email… and, for what they focus on, they knock it out of the park. Transactional email clearly just isn't their interest.
Talking to ESPs
I was able to contact some of the ESPs. MailJet confirmed that they were having difficulties, but didn't acknowledge just how substantial those difficulties were. I should point out that I have seen similar problems with other ESPs, just not within this survey window. Everyone stumbles occasionally, and I suspect I caught them at a bad time.
I did have a nice conversation with SendGrid's deliverability department; a friend of mine happened to know someone there. I was impressed that they had all the email data in Splunk (an expensive product) and were able to look up some of my emails by subject line. Each subject contained a UUID, which is how I tracked and timed the messages. No explanation for the delays was offered.
A few notes about my experiments
- I set up an MX server on a third-level domain just for this experiment. There was no other load, and I never received spam.
- The MX server was just a procmail script piping each message to my parser. Every message sent had a UUID in its subject line; I used that UUID to match each arriving message with its outgoing record and compute the elapsed delivery time (see the sketch after this list).
- Times were recorded with a resolution of one second; no fractional seconds were kept, which distorts the smaller measurements.
- I used each provider's web API, except for Amazon. A web API gives me more assurance that the transaction made it to their server, and it keeps outgoing relays and proxies out of the measurement.
- While the ESPs I talked with said they make no quality-of-service distinction between paid and free accounts, I paid for service at each ESP.
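Here is a minimal sketch of the receiving side, under my assumptions about the setup: procmail pipes each arriving message to a script that pulls the UUID out of the subject and logs the arrival time. The file names and regex are illustrative, not the exact scripts I ran.

```python
#!/usr/bin/env python3
# Invoked by a catch-all procmail recipe along the lines of:
#   :0
#   | /usr/local/bin/record_arrival.py
import email
import re
import sys
import time

msg = email.message_from_binary_file(sys.stdin.buffer)
subject = msg.get("Subject", "")

# Pull out the UUID the sender embedded in the subject line.
match = re.search(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    subject,
)
if match:
    with open("received.csv", "a") as log:
        # One-second resolution, matching the precision noted above.
        log.write(f"{match.group(0)},{int(time.time())}\n")
```

Joining received.csv against the send log by UUID yields the elapsed delivery time for each message; any UUID with no matching arrival is a lost message.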
Conclusions
Service variation seems common at all ESPs. If your business relies on quick delivery, you should monitor your service and have at least two ESPs.
If you send a lot of mail, it makes sense to keep a constant drip of emails flowing through each of them so you can measure performance in advance.
If you only occasionally send mail, consider sending a test message through each ESP when it appears likely that a user will need an email. For example, when a user loads the "registration" or "forgot password" page, send a test while the user is filling out the form. Then send the actual email through whichever ESP wins the race.
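A sketch of that race, assuming arrivals are recorded somewhere the application can query (here, the hypothetical received.csv from the monitoring setup; the helper names and provider list are mine):

```python
import csv
import time
import uuid

ESPS = ["esp-a", "esp-b"]  # placeholder provider names
DEFAULT_ESP = ESPS[0]

def send_via_esp(esp: str, subject: str) -> bool:
    print(f"would send via {esp}: {subject}")  # placeholder, as before
    return True

def start_race() -> dict:
    """Fire a timing probe through each ESP when the form page loads."""
    tokens = {}
    for esp in ESPS:
        token = str(uuid.uuid4())
        send_via_esp(esp, subject=f"timing probe {token}")
        tokens[token] = esp
    return tokens

def race_winner(tokens: dict, timeout: float = 30.0) -> str:
    """Return the ESP whose probe arrived first; fall back on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with open("received.csv") as log:
                # Rows are appended in arrival order, so the first row
                # matching one of our tokens is the race winner.
                for row in csv.reader(log):
                    if row and row[0] in tokens:
                        return tokens[row[0]]
        except FileNotFoundError:
            pass
        time.sleep(1)
    return DEFAULT_ESP
```

In practice you would start the race when the form renders and resolve it when the user submits; if no probe has landed by then, the default keeps the signup from stalling.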
If you have a "didn’t get the message, send again" feature, consider sending the subsequent message with an alternate ESP.
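That failover can be as simple as rotating through the provider list on each retry; a sketch, with hypothetical names:

```python
ESPS = ["esp-a", "esp-b", "esp-c"]  # placeholder provider names

def esp_for_attempt(attempt: int) -> str:
    # Attempt 0 uses the primary provider; each "send it again" click
    # rotates to the next one, so a struggling ESP isn't retried blindly.
    return ESPS[attempt % len(ESPS)]
```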
In general, you can expect delivery in about 10 seconds, plus however long the recipient's spam filter and MX server take to process the message on their end.
In addition to monitoring your ESP, you should monitor your open times. While this isn't as accurate, it can give you some insight into other delivery delays, such as MX load and spam filters. Maybe someday I will be able to warn users that their email service is experiencing delays.