Inspecting emails and extracting data in Django using regex named groups

Marco Chiappetta
Dec 6, 2018


A while ago I wanted to test a flow in a Django (+ REST Framework) based API that sends a client an email containing a so-called magic link, similar to the ones used by Slack.

We normally rely on services such as AWS SES and fire an async task using Celery. Relying on a third-party service and, more importantly, on an async task makes tests slow and non-deterministic, so neither belongs in a test run.

Let’s talk about settings

So the first step is to override the default settings of your Django application to do two things: 1) tell Celery to execute tasks synchronously (i.e. they behave exactly like a normal function call, even when invoked with delay), and 2) use a local email backend that gives easy and fast access to sent messages.

The answer to the former problem is the two settings below (here’s a mapping between Django settings and Celery settings):

CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_EAGER_PROPAGATES = True

The first tells Celery to treat any task invocation as a “normal” function call. Strictly speaking, the execution emulates the Celery API: the task is not put on the queue, and the returned value is an EagerResult object, which emulates the AsyncResult object normally returned by a task invocation.

The docs warn against using task_always_eager unless you know what you’re doing.

The second tells Celery to propagate exceptions to the caller. Obviously this will only work for eagerly executed tasks.

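The answer to the latter problem is to change the email backend to the following:

EMAIL_BACKEND = 'django.core.mail.backends.locmem.EmailBackend'

This stores every sent message in django.core.mail.outbox instead of delivering it. (The console backend, django.core.mail.backends.console.EmailBackend, only prints messages to standard output and does not populate outbox.) Note that Django's test runner already switches to the in-memory backend while tests run.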

Fishing for links

Now that we have an easy way to access emails sent by the system, we need a way to parse the link and send the relevant token(s) to the API. This assumes that links are constructed so that native clients (e.g. iOS/Android) can intercept the link without having to open a web browser (known as Universal Links on iOS and App Links on Android).

As usual regexes are our friends. In particular we’re going to use a feature called named groups. Imagine we send the following email:

Hi there! Here's your magic link:
https://example.com/magic/MzIzNjQ2MTE2_MTY5NzQ0NDg1NQ/
Have a nice day!

We can access the body of the email, find and parse the link in just a few lines (the HOSTNAME_REGEX is just too long to include here):

import re
from django.core import mail
...
# Body of the first email captured in the outbox
email_body = mail.outbox[0].body
# Named groups: host, path and token
invitation_link_re = r'(?P<host>{})/(?P<path>magic)/(?P<token>[\w\-]+)/'.format(HOSTNAME_REGEX)
link_match = re.search(invitation_link_re, email_body)
host = link_match.group('host')  # example.com
path = link_match.group('path')  # magic
token = link_match.group('token')  # MzIzNjQ2MTE2_MTY5NzQ0NDg1NQ
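If you want to try the named groups without a Django project, the following self-contained sketch does the same parsing on a plain string. The HOSTNAME_REGEX here is a deliberately simplified stand-in (an assumption, not the real one), and parse_magic_link is a hypothetical helper name. It also returns None when no link is found, so a test can fail with a clear assertion instead of an AttributeError.

```python
import re

# Hypothetical, simplified stand-in for the much longer HOSTNAME_REGEX.
HOSTNAME_REGEX = r"[a-z0-9\-]+(?:\.[a-z0-9\-]+)+"

# Compile once; each (?P<name>...) group is retrievable by name.
MAGIC_LINK_RE = re.compile(
    r"https://(?P<host>{})/(?P<path>magic)/(?P<token>[\w\-]+)/".format(HOSTNAME_REGEX)
)


def parse_magic_link(email_body):
    """Return the named groups of the first magic link as a dict, or None."""
    match = MAGIC_LINK_RE.search(email_body)
    return match.groupdict() if match else None


email_body = (
    "Hi there! Here's your magic link:\n"
    "https://example.com/magic/MzIzNjQ2MTE2_MTY5NzQ0NDg1NQ/\n"
    "Have a nice day!"
)

parts = parse_magic_link(email_body)
print(parts["host"], parts["path"], parts["token"])
```

Using groupdict() keeps all named groups together, which is handy when the parsed values are passed straight on to the next API call in the test.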

Now we can use the parsed components of the URL as we please. The power of regex!
