Android Email Extraction to .eml

Tool support in the Android ecosystem is sometimes lacking. For instance, I needed to extract a set of sent items from a POP3 mailbox, but the stock mail client only allows three actions on a message: delete, mark as unread or favourite.

Armed with the Android SDK, some SQL queries and a Groovy script we’ll see how it’s possible to recover email to RFC822 .eml files.

Email storage

The stock com.android.email client stores message headers and message bodies in two separate SQLite databases (EmailProvider.db and EmailProviderBody.db), and the AttachmentProvider stores attachments on disk.

The source code is available from https://android.googlesource.com/platform/packages/apps/Email. In this case the device was a Samsung phone running Android 4.1.2, so I checked out the jb-mr0-release branch to inspect the code when required.

Backup with ADB

ADB is the Android Debug Bridge, a debugging tool that is part of the Android SDK platform tools. Executing ‘adb usb’ (re)starts the adbd daemon listening for USB connections.

The next step is to select Developer options from the System section of the settings list (Figure 1) and then enable USB debugging on the device (Figure 2).

Figure 1 - System settings

Figure 2 - Developer options

Then connect the device using a USB cable and execute ‘adb backup -f mybackup.ab -all’ (you can also be more selective and name the package you want, e.g. ‘adb backup -f mybackup.ab com.android.email’). ADB will prompt you to unlock the phone and permit the backup to proceed.

Inflating the backup

An unencrypted Android backup is a zlib-deflated tar file preceded by a 24-byte text header. Thanks to http://nelenkov.blogspot.jp/2012/06/unpacking-android-backups.html for the hint that it can be reinflated by skipping the header with the following command:
dd if=mybackup.ab bs=24 skip=1|openssl zlib -d > mybackup.tar

Note that the openssl install I had on OS X wasn’t compiled with zlib support, so I ran the command in an Ubuntu Vagrant VM to inflate the tar file.
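If openssl without zlib support is the obstacle, the same inflation can be sketched with the Python standard library. This is not part of the original workflow; it assumes the usual four-line text header (magic string, format version, compression flag, encryption algorithm) and deliberately refuses encrypted backups. The function name inflate_backup is hypothetical.

```python
import zlib

HEADER_LINES = 4  # magic, format version, compression flag, encryption algorithm


def inflate_backup(ab_path, tar_path):
    """Write the tar archive held in an unencrypted .ab file to tar_path."""
    with open(ab_path, "rb") as src:
        header = [src.readline().decode("ascii").strip() for _ in range(HEADER_LINES)]
        magic, _version, compressed, encryption = header
        if magic != "ANDROID BACKUP":
            raise ValueError("not an Android backup file")
        if encryption != "none":
            raise ValueError("encrypted backups are not handled by this sketch")
        # A compression flag of "1" means the payload is a zlib stream.
        inflater = zlib.decompressobj() if compressed == "1" else None
        with open(tar_path, "wb") as dst:
            for chunk in iter(lambda: src.read(64 * 1024), b""):
                dst.write(inflater.decompress(chunk) if inflater else chunk)
            if inflater:
                dst.write(inflater.flush())
```

Calling inflate_backup("mybackup.ab", "mybackup.tar") should give the same tar file as the dd and openssl pipeline above.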

Examining the backup

Unpacking the tar file will give you a folder structure as shown in Figure 3; in this case the sqlite databases are in db and photo attachments taken with the phone are in f.

Figure 3 - Expanded tar
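As an alternative to unpacking the whole archive, the database files alone can be pulled out with a few lines of Python. This is a sketch only: it assumes the tar entries are laid out as apps/<package>/db/<file>, which matches the db folder described above, and the function name extract_databases is hypothetical.

```python
import os
import shutil
import tarfile


def extract_databases(tar_path, dest_dir, package="com.android.email"):
    """Copy every file under the package's db folder into dest_dir."""
    prefix = f"apps/{package}/db/"  # assumed layout of an adb backup tar
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.startswith(prefix):
                target = os.path.join(dest_dir, os.path.basename(member.name))
                # Stream the member to disk rather than calling extractall(),
                # so nothing outside dest_dir can be written.
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                written.append(target)
    return sorted(written)
```

The returned list contains the paths of the copied files, which can then be opened in sqlitebrowser or queried directly.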

Exploring the databases

Sqlitebrowser is a good tool for browsing SQLite databases and is the tool shown in the following screenshots.

The current script restricts the messages processed to a single mailbox for one account. Determining the mailboxKey requires locating the target account in the Account table (Figure 4) and then isolating the desired mailbox for that account within the Mailbox table (Figure 5).

Figure 4 - Account table structure

Figure 5 - Mailbox table structure

You can browse the data or execute the following query to get a list of mailboxes and their corresponding accounts:

SELECT a.displayName as accountName, m._id as mailboxKey, m.displayName as mailboxName
FROM Account a, Mailbox m
WHERE m.accountKey = a._id

If you don’t want to limit the extraction to a single mailbox or account then you can remove the WHERE clause from the first SELECT query in the extraction script (not from the query above, where it is the join condition).
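The account and mailbox query above can also be run without a GUI. The following is a minimal Python sketch, separate from the author's Groovy scripts; it uses only the table and column names that appear in that query, and the function name list_mailboxes is hypothetical.

```python
import sqlite3

MAILBOX_QUERY = """
SELECT a.displayName AS accountName, m._id AS mailboxKey, m.displayName AS mailboxName
FROM Account a JOIN Mailbox m ON m.accountKey = a._id
ORDER BY a.displayName, m.displayName
"""


def list_mailboxes(db_path="EmailProvider.db"):
    """Return (accountName, mailboxKey, mailboxName) tuples from the main database."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(MAILBOX_QUERY).fetchall()
    finally:
        conn.close()
```

Printing each tuple from list_mailboxes() gives the mailboxKey values needed to configure the extraction.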

Email Extraction

Being comfortable with the RFC822 specs & JavaMail (I built an IMAP extension for Alfresco in 2006), I decided it would be easier to reconstruct a MimeMessage using SQL & Groovy than attempt to adapt the Android source code to run off the device or build a custom app to run in the emulator.

Shortcomings

The script is a first cut that was good enough for my purposes, so it has a few deficiencies:
1. It is set to default the Sender field rather than converting the value of Message.fromList and using the addFrom method
2. Addresses only use the address rather than the label
3. The body handling only uses plain text rather than alternative multiparts
4. Body.textReply is separated from the Body.textContent with a separator line of 25 dashes; it does not attempt to reconstruct the header information of the preceding message in the thread
5. The script does not handle attachments – this was a conscious decision as whilst the camera photos were in com.android.email/f, the other ‘RAW’ attachments were not in com.android.email/1.db_att/ as per the javadoc of the AttachmentProvider class

As an initial hint, to reconstruct the attachments you would need to use a MimeMultipart and create the MimeBodyPart objects, starting with a query like:
"""SELECT fileName, mimeType, size, contentId, contentUri, encoding
FROM Attachment
WHERE messageKey = ${msgKey}"""
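For comparison, the equivalent step in Python's email package is sketched below. It is not the author's method: the helper name add_attachment is hypothetical, fileName and mimeType are taken to be the columns from the query above, and locating the attachment bytes on disk is left open because, as noted in the shortcomings, they were not where the javadoc suggested.

```python
from email.message import EmailMessage


def add_attachment(msg, file_name, mime_type, data):
    """Attach raw bytes to msg, converting it to multipart/mixed if needed."""
    # Fall back to a generic type when the Attachment row has no mimeType.
    maintype, _, subtype = (mime_type or "application/octet-stream").partition("/")
    msg.add_attachment(
        data,
        maintype=maintype,
        subtype=subtype or "octet-stream",
        filename=file_name,
    )
```

The message keeps its existing text body as the first part and gains one further part per call.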

Addresses

The script provides one utility method to convert a String-ified list where records are separated by the SOH character (ASCII 1) and email addresses and their corresponding display name are separated by the STX (ASCII 2) character.
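The packed format described above can be unpacked along these lines. This Python sketch is an illustration rather than the script's Groovy method; it assumes the address comes before the STX character and the display name after it, and unlike the script it keeps the display name. The function names are hypothetical.

```python
from email.utils import formataddr

SOH = "\x01"  # separates records in the packed list
STX = "\x02"  # separates an address from its display name


def unpack_addresses(packed):
    """Return (display_name, address) pairs from a packed address list."""
    pairs = []
    for record in (packed or "").split(SOH):
        if not record:
            continue
        # Assumed order: address first, optional display name after STX.
        address, _, name = record.partition(STX)
        pairs.append((name, address))
    return pairs


def to_header(packed):
    """Format a packed list as an RFC822 address header value."""
    return ", ".join(formataddr(pair) for pair in unpack_addresses(packed))
```

A record without a display name is emitted as the bare address.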

The processing consists of iterating through the rows of the Message table result set. For each row, the message body is obtained from the separate SQLite body database, then a JavaMail MimeMessage is constructed and written to a file.
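The loop described above is sketched here in Python with the standard sqlite3 and email modules instead of Groovy and JavaMail. Several details are assumptions rather than facts taken from the script: the column names subject, timeStamp (treated as milliseconds since the epoch), toList and Body.messageKey, the .eml file naming, and taking the From header from fromList where the script defaults the sender.

```python
import os
import sqlite3
from email.message import EmailMessage
from email.utils import formatdate

SEPARATOR = "-" * 25  # line placed between the body text and the quoted reply


def addresses(packed):
    """Address-only view of a packed list (SOH between records, STX before the name)."""
    return [rec.split("\x02")[0] for rec in (packed or "").split("\x01") if rec]


def build_message(fields, body_row):
    subject, timestamp_ms, from_list, to_list = fields
    text, reply = body_row if body_row else (None, None)
    msg = EmailMessage()
    msg["Subject"] = subject or ""
    if timestamp_ms is not None:
        msg["Date"] = formatdate(timestamp_ms / 1000)  # assumed millisecond epoch
    if addresses(from_list):
        msg["From"] = ", ".join(addresses(from_list))
    if addresses(to_list):
        msg["To"] = ", ".join(addresses(to_list))
    content = text or ""
    if reply:
        content += "\n" + SEPARATOR + "\n" + reply
    msg.set_content(content)  # plain text only, as in the original script
    return msg


def export_mailbox(provider_db, body_db, mailbox_key, out_dir):
    """Write one .eml file per message in the mailbox and return the paths."""
    main, bodies = sqlite3.connect(provider_db), sqlite3.connect(body_db)
    try:
        rows = main.execute(
            "SELECT _id, subject, timeStamp, fromList, toList "
            "FROM Message WHERE mailboxKey = ?", (mailbox_key,)).fetchall()
        written = []
        for msg_id, *fields in rows:
            body = bodies.execute(
                "SELECT textContent, textReply FROM Body WHERE messageKey = ?",
                (msg_id,)).fetchone()
            path = os.path.join(out_dir, f"{msg_id}.eml")
            with open(path, "wb") as out:
                out.write(build_message(fields, body).as_bytes())
            written.append(path)
        return written
    finally:
        main.close()
        bodies.close()
```

Keeping build_message separate from the database access makes it easy to inspect a single message, for example by printing it, before writing any files.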

Troubleshooting

The script cannot cause data loss on your device as it is operating on a backup. If you want to experiment first you may like to set a limit (e.g. LIMIT 10) on the first SELECT query to reduce the number of messages that it retrieves.
Also, a simple way of viewing the output within e.g. the Groovy Console is to use msg.writeTo(System.out) instead of creating the .eml file.

Get the script

The usual disclaimers apply that the script is without warranty, support etc. – get it from GitHub: https://gist.github.com/rbramley/65261127dfb857b03bb6

3 responses to “Android Email Extraction to .eml”

  1. Informative post and nicely explained. Thanks for sharing a great post.

  2. Sorry, but this script seems to be useful only for experts. I copied the db files from com.android.email/databases, but I have no idea how to run the script in sqlitebrowser. Isn’t there an executable “Android Email Extraction to eml” tool for inexperienced Windows XP users like me?

    • Sorry there isn’t an executable tool…

      The main script needs to be run using Groovy (download from http://www.groovy-lang.org/download.html); sqlitebrowser is only used for a database query to determine which account and mailbox to process (the IDs are configured on lines 38 & 39 of the AndroidEmailExtractor.groovy script).

      To simplify this, I’ve just added another Groovy script ‘AndroidEmailAccounts.groovy’ which performs the database query for you and prints out the list of accounts / mailboxes.

      Once you’ve installed Groovy and downloaded the two Groovy scripts, you then need to use a command line / terminal window in the directory containing the EmailProvider.db and EmailProviderBody.db files to execute e.g. groovy c:\Users\Alecxs\Downloads\AndroidEmailAccounts.groovy

      From the output, modify lines 37-39 of the AndroidEmailExtractor.groovy script to set the default sender email address, the chosen account ID and mailbox ID.
      If you want to process all mailboxes for an account, then remove “and mailboxKey = ${mailboxKey}” from line 85, or remove line 85 altogether to process all accounts and mailboxes – but note the caveats around shortcomings (the script was good enough to fulfil my need).
      Then execute e.g. groovy c:\Users\Alecxs\Downloads\AndroidEmailExtractor.groovy
