Python Hijinks: Outlook Safelink *Decoder*

published-date: 21 Jan 2023 23:12 +0700
categories: python-hijinks
tags: python

Synopsis

There were times when my colleague and I were asked to regularly copy links out of Outlook 365 mail and insert them into a certain form. But the urls we copied were always too long, and the form they gave us had a character limit. For some reason (read: a good one), these links always redirect through Microsoft Defender for Office 365's malicious link detection system.

Here’s a sample of what the link looks like if you copy it directly from Outlook.

https://your.company.windows.defender.domain.com/?url=https%3A//en.wikipedia.org/wiki/Sergeant_Reckless&data=05%7C01%7Cmy.mail%40domain.com%7C46575da2930c2aa97e550b27b1090595%7Cd0dd9bc880d23c27c6e5821a10edc04e%7C1%7C0%7C844353527029278340%7CUnknown%7Cpegytgkototbwiuhiqkbodlblaacaxulmzsqdwdtgekwfuableebpvucgfhdobzygxpohqqcare%3D%7C3000%7C%7C%7C&sdata=iXZKYIVxymHZqRIuefqinvbeXKHv%2BnRxdXMrYexvfRh%3D&reserved=0

Note that all the obfuscated data in this link has been manually randomized. There’s no use trying to decode anything out of it.

It’s pretty obvious that our target link is embedded, with its special characters percent-escaped, in the url query parameter. On top of that, the query goes through the defender system alongside an additional payload, presumably information about the receiver and the mail the link was copied from. idk honestly.
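To see the escaping in isolation, here is a minimal sketch that unquotes only the value of the url parameter from the sample link. The variable name escaped is just for illustration.

```python
from urllib.parse import unquote

# The percent-escaped value of the "url" parameter, copied from the sample link.
escaped = "https%3A//en.wikipedia.org/wiki/Sergeant_Reckless"

# %3A is the escaped form of ":", so unquoting restores the plain link.
print(unquote(escaped))
```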

Knowing it’s there, I did some quick googling and found an online tool to extract the og link (site: http://www.o365atp.com/, which looks dodgy asf but I tested it and it’s safe 🤞). I decided to write myself a little python script just in case the website goes offline (it’s just a python implementation of the website).

The Script Tidbits

You can get the full version in my GitHub repo: Outlook Safelink Decoder, which uses urllib.parse in python. The title itself is a bit misleading, though, because the script doesn’t do any real decoding; it only unquotes the query string.

This is a snippet of the only functional part of the script.

from urllib import parse

def safelinkDecode(url):
    # Break the url into its components and keep only the query string.
    query = parse.urlparse(url).query
    if not query:
        raise ValueError(f'tried to parse {url}: \n\tno valid query string in the given url')
    # Split each "key=value" pair on the first "=" only, so a raw "=" inside a value survives.
    queryfragment = [i.split('=', 1) for i in query.split('&')]
    qkeys, qvals = tuple(zip(*queryfragment))
    # Unquote every value (e.g. %3A becomes ":") and rebuild the dictionary.
    qvals = map(parse.unquote, qvals)
    return dict(zip(qkeys, qvals))

In summary, urllib.parse.urlparse is used to get the query string from the url. Because the special characters in the values are escaped (quoted), urllib.parse.unquote is used to unquote the quoted quote (try saying that out loud). The other stuff is just a fancy way to unpack the key-value pairs, map the unquote function over the list of values, and wrap the result back into a dictionary.
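For comparison, the standard library can also do the splitting and unquoting in one call. This is only a sketch of an alternative, not what the script in the repo does: safelink_decode_qs is a made-up name, and sample is a shortened version of the link above.

```python
from urllib.parse import urlparse, parse_qs

def safelink_decode_qs(url):
    """Hypothetical alternative to safelinkDecode, built on parse_qs."""
    query = urlparse(url).query
    if not query:
        raise ValueError(f"no query string in {url!r}")
    # parse_qs unquotes the values and returns a list of values per key;
    # keep_blank_values=True keeps empty parameters instead of dropping them.
    parsed = parse_qs(query, keep_blank_values=True)
    # Keep only the first value of each key to get a flat dictionary.
    return {key: values[0] for key, values in parsed.items()}

# Shortened version of the sample link (the data and sdata parameters are left out).
sample = (
    "https://your.company.windows.defender.domain.com/"
    "?url=https%3A//en.wikipedia.org/wiki/Sergeant_Reckless&reserved=0"
)
print(safelink_decode_qs(sample)["url"])
```

One difference to keep in mind: parse_qs treats a raw + in the query string as a space, while unquote leaves it alone. An escaped %2B still comes out as + either way.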

Passing the previous link into this function gives this result.

{
    "url": "https://en.wikipedia.org/wiki/Sergeant_Reckless",
    "data": "05|01|my.mail@domain.com|46575da2930c2aa97e550b27b1090595|d0dd9bc880d23c27c6e5821a10edc04e|1|0|844353527029278340|Unknown|pegytgkototbwiuhiqkbodlblaacaxulmzsqdwdtgekwfuableebpvucgfhdobzygxpohqqcare=|3000|||",
    "sdata": "iXZKYIVxymHZqRIuefqinvbeXKHv+nRxdXMrYexvfRh=",
    "reserved": "0"
}

Aaand presto! The url can now be accessed from the url key.

P.S. It seems the additional payload sent to the Defender system uses | as a delimiter and includes two 32-character strings (presumably base64), the sender email in plain text, and another 75 characters of lowercase mumbo jumbo.
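If you want to poke at that payload yourself, a minimal sketch is to split the decoded data value on | and print each field with its length. What each field actually means is a guess; the code only lists them.

```python
# The decoded "data" value from the result above (randomized sample data).
data = (
    "05|01|my.mail@domain.com|46575da2930c2aa97e550b27b1090595|"
    "d0dd9bc880d23c27c6e5821a10edc04e|1|0|844353527029278340|Unknown|"
    "pegytgkototbwiuhiqkbodlblaacaxulmzsqdwdtgekwfuableebpvucgfhdobzygxpohqqcare=|"
    "3000|||"
)

# Split on the delimiter; the trailing "|||" leaves three empty fields at the end.
fields = data.split("|")
for index, field in enumerate(fields):
    print(index, len(field), field)
```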