Parsing Email with Mailgun
I wrote a few days ago about why and how Carom will work with email. Today, I want to provide some details about how Mailgun makes that as easy as working with a simple POST request.
Accepting incoming email is hard. If you do it yourself, you need spam filtering so you’re not overrun with garbage, you need to connect your mail server to your application layer, and you need to learn a bunch of email formatting standards so you can properly interpret the messages you receive—even though many mail clients don’t respect those standards.
In short: running a mail server is hard. And since we don’t have to do it, we chose not to.
This post provides some technical details about how we’re working with Mailgun. If you’re not a developer, you might not be interested; if you are, read on!
Routing incoming emails to Mailgun
Routing email to Mailgun—and in turn to our servers—requires two steps, both very easy.
First, create a DNS MX record that points to Mailgun’s servers. We’re forwarding mail sent to *@dropbox.carom.io to Mailgun, which allows us to handle mail sent to the naked domain (e.g., email@example.com) using our own mail provider.
Next, tell Mailgun where to route the email it receives. For now, we have a simple matching rule:
Emails sent to addresses matching our rule are forwarded to an address we specify, with the first match group representing the account’s subdomain. Even though we check on the server side to make sure that the email corresponds to an existing dropbox (using the 12-character key included in the dropbox address), we also have Mailgun send an authorization code to our servers so we can ignore malicious POSTs to Carom from elsewhere.
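To make the idea of rejecting POSTs that did not come from Mailgun more concrete, here is a minimal sketch of one common approach. It assumes the HMAC scheme Mailgun documents for its webhooks, in which each request carries `timestamp`, `token`, and `signature` fields and the signature is the HMAC-SHA256 of the timestamp concatenated with the token, keyed with a secret from the Mailgun account. Whether this is the same "authorization code" described above is an assumption, and the function name `is_authentic` is made up for illustration.

```python
import hashlib
import hmac


def is_authentic(signing_key: str, timestamp: str, token: str, signature: str) -> bool:
    """Return True if signature equals HMAC-SHA256(signing_key, timestamp + token).

    Assumption: the request was signed with the scheme Mailgun documents for
    webhooks. The parameter names mirror the POSTed field names.
    """
    expected = hmac.new(
        key=signing_key.encode("utf-8"),
        msg=(timestamp + token).encode("utf-8"),
        digestmod=hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, so the comparison does not leak
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

A request handler would call this before doing any other work and return an error response when it yields False. A stricter version might also reject old timestamps to limit replayed requests.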
We created these routes using the Mailgun control panel, but you can also set routes via API. If we find ourselves receiving a lot of spam sent to invalid email addresses (since our routing rule is fairly simple), we can instead use the API
Parsing the incoming request
After receiving an email that matches one of the routes we’ve specified, Mailgun sends a POST request to the URL we specify with
It’s worth noting that we’re choosing to parse the email on our end, rather than rely on the stripped-text message that Mailgun provides. There are three major reasons we’re making that choice:
- Given that email is a big part of what Carom provides, we want a lot of flexibility in how we handle it. Mailgun’s parsing could well be better than ours right now—but we want to be able to improve our parser over time as we receive emails in the wild.
- Mailgun provides a stripped message without the previous replies in the email thread. But since we display these replies in our interface, we need to process them further to split replies, identify senders, and so on. Once we’re writing a parser anyway, splitting the most recent message from the rest is fairly simple.
- Most importantly, we expect that most of the emails we receive will be forwarded by our users. That means that the stripped-text message Mailgun provides will be the text forwarded by our users, not the text of the email they’re forwarding.
Having said all of that, based on our limited testing the stripped-text parsing seems robust, and certainly Mailgun has plenty of experience to draw upon in developing it. For many uses, it’s probably a better option than parsing on the application side.
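For a sense of what application-side parsing of a forwarded email might involve, here is a small sketch that separates the forwarder's own note from the forwarded message in a plain-text body (the kind of text Mailgun posts in its body-plain field). It is not Carom's parser. The function name `split_forwarded` and the two divider patterns (a Gmail-style dashed line and an Apple Mail-style header) are assumptions chosen for illustration, and real mail clients use many more variations.

```python
import re
from typing import Tuple

# Assumed markers only: a Gmail-style dashed divider and an Apple Mail-style
# header line. Other clients format forwards differently, so a real parser
# would need a longer, tested list.
FORWARD_MARKER = re.compile(
    r"^(?:-{2,}\s*Forwarded message\s*-{2,}|Begin forwarded message:)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_forwarded(body_plain: str) -> Tuple[str, str]:
    """Split a plain-text body into (forwarder's note, forwarded text).

    If no known marker is found, the whole body is returned as the note and
    the forwarded part is an empty string.
    """
    match = FORWARD_MARKER.search(body_plain)
    if match is None:
        return body_plain.strip(), ""
    note = body_plain[: match.start()].strip()
    forwarded = body_plain[match.end():].strip()
    return note, forwarded
```

The forwarded part still begins with the original headers (From, Subject, and so on), so identifying senders or splitting earlier replies would be a further step on top of this.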
Storing emails for processing
Mailgun also provides a convenient option to store emails and retrieve them for processing later. For now, I think we can process email quickly enough (and our volumes will be small enough) that it’s easier to simply receive a POST from Mailgun and process it in the moment than it would be to store an email ID and retrieve and process the email in a background process.
The option is a useful one, though—if we find ourselves inundated with email at some point, it’s nice to know that we can offload some of the work to our background workers, which can scale more or less indefinitely and don’t affect the performance of the main web server.
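The deferred approach can be sketched roughly as follows: the web request only records a reference to the stored email, and a background worker picks it up later. This is a sketch under stated assumptions, not Mailgun's API. The in-process queue, the `handle_notification` and `fetch_and_parse` names, and the shape of the message reference are all hypothetical, and the retrieval step is a placeholder rather than a real HTTP call.

```python
import queue
import threading

# Hypothetical in-process queue. A real deployment would more likely use a
# persistent job queue so that pending references survive a restart.
pending = queue.Queue()
processed = []


def handle_notification(message_ref: str) -> None:
    """Called from the web request: store the reference and return quickly."""
    pending.put(message_ref)


def fetch_and_parse(message_ref: str) -> str:
    """Placeholder for retrieving the stored email and parsing it."""
    return f"parsed:{message_ref}"


def worker() -> None:
    """Process queued references in order until a None sentinel arrives."""
    while True:
        ref = pending.get()
        if ref is None:
            pending.task_done()
            break
        processed.append(fetch_and_parse(ref))
        pending.task_done()
```

To try it, start `worker` in a `threading.Thread`, call `handle_notification` once per incoming notification, and put `None` on the queue to shut the worker down. The web handler stays fast because its only job is the `put` call.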
Perhaps best of all, Mailgun’s pricing is great for startups. Mailgun provides 10,000 free incoming or outgoing emails per day—a limit we might not pass for a while. When we do hit that threshold, Mailgun will cost us $1.50 per 1,000 emails, which should be affordable.
At a certain scale (and depending on the depth of Mailgun’s volume discounts), it might make sense to take email parsing in-house. But we’re a long way from that point, and for now Mailgun is making our email processing cheap (actually, free) and easy.