Feb 01 2011

Migrating From Zimbra to Google: How We’re Doing It (Part 2 – Folder Renaming)

Published by under Uncategorized

Once the notifications were ironed out, the second challenge we faced was working around the restrictions put on Google’s labels:

  • Can be at most 40 characters long
  • Cannot contain adjacent spaces

This presented us with the following challenges:

  • How do we check if a user has these folders?
  • What notifications, if any, do we send?
  • If the user doesn’t fix their folders before the migration process runs, do we rename the folders for them?

Disclaimer!! I’m not a programmer by any stretch of the imagination!

Challenge 1: Checking for bad folders

I hate to post snippets of code without giving much context around it, especially since it’s part of a much larger application, but hopefully I can explain it enough to give the basic idea behind our methods.

import logging
import re
import subprocess

def checkForBadFolders(username):
    """ Command to get the folders """
    command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu gaf | egrep -i ' mess | conv ' | egrep -v ' /Trash/.*| /Trash ()' \
                | sed -e 's/^.* \///1' " % username
    p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (output, error) = p.communicate()
    if error != '':
        logging.error('Problem getting list of Bmail folders')

    """ Crazy way to get folder names. All folders end with () or (.*)
        It was tricky getting only the last occurrence of (.*). So, we
        * reverse the string
        * find the index of the first (
        * get the string from this index + 2 to end
        * reverse the string again """
    folder_list = []
    for folder in output.splitlines():
        tmp = folder[::-1][folder[::-1].index('( ') + 2:len(folder)]
        folder_list.append(tmp[::-1])
    folders = { 'long' : [], 'space' : [] }

    """ Check for folders with adjacent white space, > 40 chars long, or reserved folder name """
    r = re.compile(r'\s\s+')    # more than 1 white space
    for folder in folder_list:
        if len(folder) > 40:
            folders['long'].append(folder)
        if r.search(folder):
            folders['space'].append(folder)

    """ If bad folders were found, return this count and the folders """
    if len(folders['long']) > 0 or len(folders['space']) > 0:
        logging.info('Found bad folders for %s' % username)
        return len(folders['long']) + len(folders['space']), folders
    else:
        logging.info('Folders are OK')
        return 0, folders

In the above code, I use the zmmailbox command (bundled with Zimbra) to get the list of folders for a particular user. Each line of output ends with () or (.*). The trick here is to strip this portion off and get only the folder name. Presumably, if my regex-fu were better, this could be handled in a much simpler way. Instead, I:

  1. Reversed the string
  2. Grabbed the index of the first (
  3. Grabbed all characters from the index in step 2 + 2 until the end
  4. Reversed the string again

Convoluted, I know, but it did the trick! As an alternative, I could have connected to the account over IMAP and listed the folders that way. Pick your poison I guess.
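
For what it’s worth, the reverse/index/reverse steps boil down to "cut the line at its last space-plus-open-parenthesis". Here is a minimal sketch of that shortcut, assuming Python 3 and assuming (as described above) that every line ends with a space followed by () or (.*). The function names and the sample line are made up for illustration and are not part of our application:

```python
def strip_suffix_reversed(line):
    """The approach from the post: reverse, find the first '( ', slice, reverse back."""
    rev = line[::-1]
    return rev[rev.index('( ') + 2:][::-1]


def strip_suffix(line):
    """Same result: keep everything before the last ' (' in the line."""
    name, sep, _ = line.rpartition(' (')
    if not sep:
        # No ' (' found; the reversed version would raise ValueError here too.
        raise ValueError('no trailing "(...)" in %r' % line)
    return name


# Hypothetical sample line; a folder name may itself contain parentheses.
sample = 'Inbox/Budget (old) drafts ()'
print(strip_suffix(sample))
```

Both functions only drop the final parenthesized chunk, so parentheses inside the folder name itself survive.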

This folder check runs three times between when the user is first added to our migration tool and when their migration actually takes place. As you can see from the code above, the function returns the number of bad folders along with the names of those that are over 40 characters long or contain 2 or more adjacent spaces. This list of folders is then sent to the user, with an explanation of why they are receiving the notification and how to fix their folders.
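
The two rules can also be pulled out into a small function that knows nothing about zmmailbox, which makes them easy to try on a handful of names. This is only a sketch, assuming Python 3; classify_folders and the sample folder names are invented for illustration:

```python
import re

MAX_LABEL_LENGTH = 40                    # Google's label length limit, per the post
ADJACENT_SPACES = re.compile(r'\s\s+')   # two or more whitespace characters in a row


def classify_folders(folder_names):
    """Sort folder names into the 'long' and 'space' buckets used above."""
    bad = {'long': [], 'space': []}
    for name in folder_names:
        if len(name) > MAX_LABEL_LENGTH:
            bad['long'].append(name)
        if ADJACENT_SPACES.search(name):
            bad['space'].append(name)
    return len(bad['long']) + len(bad['space']), bad


# Hypothetical folder names for illustration only.
count, bad = classify_folders([
    'Inbox',
    'Committees/2010/Budget Planning/Meeting Minutes',
    'To  file',
])
print(count, bad)
```

Note that the length is measured on the full path, so every ‘/’ in a nested folder counts toward the 40 characters.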

Challenge 2: Renaming Folders

Eventually, if the user still has not fixed their folders, we rename the folders right before the migration takes place. This was a tough decision for us as there are a number of challenges associated with renaming a user’s e-mail folders:

  • Coming up with a naming convention. We chose ‘_renamed-X-abcd…wxyz’, where X is a sequential number, and abcd…wxyz are the first and last few characters of the original folder name (the code below takes five from each end). Prepending an underscore brought the label to the top of the list when viewed in an e-mail client, bringing it to the user’s attention as soon as possible.
  • Sometimes users had hundreds of folders which needed to be renamed. In our tool we could see how many folders were illegal, and if the user had a bunch, we’d contact them individually and suggest ways to shorten the folders. Often, when a folder was too long, it was part of a series of nested folders which could be remedied by renaming just the parent folder.

def renameBadFolders(username, uid):
    """ Command to get the folders """
    command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu gaf | egrep -i ' mess | conv ' |  egrep -v ' /Trash/.*| /Trash ()' \
                | sed -e 's/^.* \///1' " % username
    p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (output, error) = p.communicate()
    if error != '':
        logging.error('Problem getting list of Bmail folders')

    """ Crazy way to get folder names. All folders end with () or (.*)
        It was tricky getting only the last occurrence of (.*). So, we
        * reverse the string
        * find the index of the first (
        * get the string from this index + 2 to end
        * reverse the string again """
    folder_list = []
    for folder in output.splitlines():
        tmp = folder[::-1][folder[::-1].index('( ') + 2:len(folder)]
        folder_list.append(tmp[::-1])
    folders = { 'long' : [], 'space' : [] }

    """ Check for folders with adjacent white space or > 40 chars long """
    r = re.compile(r'\s\s+')    # more than 1 white space
    for folder in folder_list:
        if len(folder) > 40:
            folders['long'].append(folder)
        elif r.search(folder):
            folders['space'].append(folder)
        else:
            pass

    renamed_count = 0
    final_folders = []
    if len(folders['long']) > 0 or len(folders['space']) > 0:
        logging.info('Found bad folders for %s. Renaming...' % username)
        for folder in folders['long']:
            stripped_folder = folder.replace('/', '-').replace(' ', '_')
            folder_start = stripped_folder[0:5]
            folder_end = stripped_folder[len(stripped_folder) - 5:len(stripped_folder)]
            renamed_folder = '_renamed-%s-%s...%s' % (str(renamed_count), folder_start, folder_end)
            command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu rf \"%s\" \"/%s\" " % (username, folder, renamed_folder)
            p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            (output, error) = p.communicate()
            if error != '':
                logging.error('Problem renaming folder for %s : %s' % (username, error))
            else:
                final_folders.append((folder, renamed_folder))
                renamed_count += 1
        for folder in folders['space']:
            if len(folder) < 10:
                renamed_folder = '_renamed-%s-%s' % (str(renamed_count), folder.replace(' ', '_').replace('/', '-'))
            else:
                stripped_folder = folder.replace('/', '-').replace(' ', '_')
                folder_start = stripped_folder[0:5]
                folder_end = stripped_folder[len(stripped_folder) - 5:len(stripped_folder)]
                renamed_folder = '_renamed-%s-%s...%s' % (str(renamed_count), folder_start, folder_end)
            command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu rf \"%s\" \"/%s\" " % (username, folder, renamed_folder)
            p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            (output, error) = p.communicate()
            if error != '':
                logging.error('Problem renaming folder for %s : %s' % (username, error))
            else:
                final_folders.append((folder, renamed_folder))
                renamed_count += 1
    else:
        logging.info('Folders are OK')
    if len(final_folders) > 0:
        logging.debug(final_folders)
        details = 'Renamed bad folders for %s\n' % username
        details += str(final_folders)
        emailAdmins(others=[''], details=details)
        notify(username, final_folders, uid)

Not much to this code. It takes all the bad folders (both too long and with 2+ adjacent spaces) and uses the zmmailbox rf command to rename them.
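
One thing worth sketching separately: with shell=True, a folder name containing a double quote or a $ gets interpreted by the shell. Passing the command as a list of arguments avoids that. The following is a rough sketch, assuming Python 3, and not what our tool actually runs; renamed_name and rename_command are hypothetical helpers that follow the naming convention from the code above:

```python
ZMMAILBOX = '/opt/zimbra/bin/zmmailbox'  # path used in the post


def renamed_name(folder, counter):
    """Build the replacement name following the convention in the code above."""
    flat = folder.replace('/', '-').replace(' ', '_')
    if len(folder) < 10:
        # Short names are kept whole.
        return '_renamed-%d-%s' % (counter, flat)
    # Otherwise keep the first five and last five characters.
    return '_renamed-%d-%s...%s' % (counter, flat[:5], flat[-5:])


def rename_command(username, folder, new_name):
    """Return the zmmailbox rf call as an argument list (no shell involved)."""
    return [ZMMAILBOX, '-z', '-m', '%s@brandeis.edu' % username,
            'rf', folder, '/' + new_name]


# Hypothetical user and folder, for illustration only.
new_name = renamed_name('Committees/2010/Budget Planning/Meeting Minutes', 0)
print(rename_command('jdoe', 'Committees/2010/Budget Planning/Meeting Minutes', new_name))
```

The returned list can be handed straight to subprocess.Popen without shell=True, so the folder name reaches zmmailbox exactly as it is.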

2 responses so far

Jul 10 2010

Migrating From Zimbra to Google: How We’re Doing It (Part 1 – Notifications)

A little while back I wrote up a short evaluation of the Google Migration Tool – highlighting the pros and cons I found. After a ton of iterations, I think we may have stumbled upon success. (Knowing my luck the process will now fail miserably because of this post).

The migration process consists of 3 main components:

  • End-user notifications
  • Folder renaming
  • Migrating mail

The notification and folder renaming processes are pretty straightforward. The actual migrating of mail is rather complex, and I’ll cover the details of that in Part 3.

Since Brandeis has chosen to take a rolling migration approach, we have the luxury of moving only a select number of users at a time. When we made the move to Bmail, we found that moving a single department at a time worked well. At least with this method, we can catch a set of users who work in close proximity to each other and can look to one another for some first tier support.

The secret to moving departments one by one is to first create a master spreadsheet containing all users who have a Bmail account, along with each user’s department, Brandeis affiliation, number of messages, number of folders, number of Google-illegal folders, and mailbox size (we have no quota, so this number is important). Sure, this spreadsheet only captures numbers at a moment in time, but it at least gives us a rough idea of how large a user’s mailbox is or how many e-mails they have.

Next we choose a department and work with them on selecting a migration date. Not knowing any one department’s work schedule, this coordination has proven invaluable. It also helps to catch the one-off cases where a user may be traveling, and having to change to a new e-mail system while on the road would cause some confusion.

Once the department or group has agreed upon a date, a CSV list of their Brandeis UNet IDs is uploaded to our Google Migration Application. Doing so makes the following things happen:

  1. Once uploaded, key dates are calculated.
    Dates from the Google Migration Application

    1. Add Date – the date they are added to the application.
    2. Reminder Date – 2 business days before they are scheduled to move
    3. Migrate Date – the actual day the migration will take place
    4. Close Date – the date their Bmail account will close
  2. A kick-off notification is sent at 4 PM on the Add Date. This e-mail contains general information about the move, links to documentation, as well as a pre-migration checklist of things to do.
  3. On the evening of the Add Date, a separate process is run which checks a user’s folders for length, illegal characters, Google specific system labels, and folders with 2 or more adjacent spaces. If any are found, a notification is sent to the user highlighting each of these folders and why they will not migrate to Google. (More on this process will be covered in Part 2).
  4. On the Reminder Date, a reminder e-mail (surprise!) is sent to the users at 9:30 AM. This is more or less the same e-mail as the kick-off notification, with some slightly different verbiage since the migration is only 2 business days away.
  5. At 11 PM on the Reminder Date, the folder check process is run again, basically repeating the same e-mail as before. However, if the user fixed all folders from the initial notification, a ‘Thanks for renaming your folders!’ e-mail is sent instead.
  6. At 5 PM on the Migrate Date, the folder check process is run one last time, only this time it renames any folders which are still illegal. An e-mail with each folder, before and after being renamed, is sent to the user.
  7. At 5:30 PM on the Migrate Date, the migration begins. We do some magic to calculate the number of e-mails a user has and provide them a rough estimate of how long the migration will take. Last I checked we assume a rate of 2,000 e-mails per hour – so if you have 4,000 e-mails the notification will state 2 – 3 hours to complete.
  8. During the migration, the user may be notified of e-mails which couldn’t be migrated to Google. If this is the case, they will be sent an e-mail with a link to download all e-mails in a folder which could not be migrated, along with the date, subject, from address, and message ID of each individual e-mail which did not get migrated.
  9. Once the migration is complete, a final notification is sent with the next steps – most importantly how to point your mail clients or mobile device to Google instead of Bmail.
  10. No notifications are sent on the Close Date, rather, the Bmail account is silently closed via an automated process.
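
Two of the calculations in the steps above are simple enough to sketch: the Reminder Date (2 business days before the Migrate Date) and the rough time estimate. This is a minimal sketch, assuming Python 3, and not the code from our application. The function names are made up, holidays are ignored, and the rounding rule for the estimate is a guess that happens to reproduce the "4,000 e-mails means 2 to 3 hours" example:

```python
import datetime

EMAILS_PER_HOUR = 2000  # rate quoted in the post


def business_days_before(day, count):
    """Step back `count` weekdays from `day`, skipping Saturdays and Sundays."""
    while count > 0:
        day -= datetime.timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            count -= 1
    return day


def estimate_hours(email_count):
    """Return a (low, high) range in hours; the rounding rule is a guess."""
    low = max(1, email_count // EMAILS_PER_HOUR)
    return low, low + 1


migrate_date = datetime.date(2010, 7, 12)  # a Monday, chosen for illustration
print(business_days_before(migrate_date, 2))
print('%d - %d hours' % estimate_hours(4000))
```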

All notifications and their subject

All notifications are stored in our Google Migration App, making it easy to see who got notified and when. The app also allows users to be contacted directly from the interface, again storing the notification in the database.

Notifications are all made possible by:

  • Turbogears, the Python based web framework which the Google Migration Application was written in
  • Cron to automate and schedule when the notifications run
  • MySQL which stores all the users, dates, notifications, etc. from the Google Migration Application
  • Python + MySQLdb and smtplib (and other misc. modules) which all the scripts are written in
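
To give a flavor of the notification side, here is a rough sketch of composing a bad-folder notice with Python’s standard email package, assuming Python 3. It is not our actual script: build_folder_notice, the sender address, the subject line, and the wording are all placeholders invented for this example.

```python
from email.message import EmailMessage


def build_folder_notice(username, bad_folders, sender='migration@example.edu'):
    """Compose (but do not send) a notice listing a user's problem folders."""
    lines = ['The following folders need to be renamed before your move to Google:', '']
    for name in bad_folders['long']:
        lines.append('  %s (longer than 40 characters)' % name)
    for name in bad_folders['space']:
        lines.append('  %s (contains adjacent spaces)' % name)
    msg = EmailMessage()
    msg['From'] = sender                                         # placeholder address
    msg['To'] = '%s@brandeis.edu' % username
    msg['Subject'] = 'Folders that will not migrate to Google'   # made-up subject
    msg.set_content('\n'.join(lines))
    return msg


# Hypothetical user and folder list, for illustration only.
notice = build_folder_notice('jdoe', {'long': [], 'space': ['To  file']})
print(notice)
```

Sending the message would then be a matter of handing it to smtplib’s SMTP.send_message against whatever mail host is appropriate for the site.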

In Part 2 I will explore in a little more detail the folder renaming process, which like everything else, has undergone a few iterations before we got it ‘right’. Finally, in Part 3, when I have a few *free* hours, I’ll document the actual mail migration process.

5 responses so far

May 12 2010

Google App Status Page

You may have noticed the side bar feed from Google’s app status page, but the feed doesn’t do the status page any justice.

The status page provides the current status for all of Google’s services and up to date information on any issues they are having. This is a good place to check first if you are having problems accessing any of the Google services.

http://www.google.com/appsstatus

Comments Off

May 11 2010

People’s names in Google

The Google account sync gets information about people (such as their name and email address) from the Brandeis directory and stores it in Google. Unfortunately there seem to be some issues with names. Primarily what’s noticeable is that when someone has a space in their first name, all that’s ending up in Google are the characters before the space!

Doing a quick search, there are literally thousands of people at Brandeis with a space in their first name! Sounds like something we should rectify, eh?

I’m working with my co-workers on improving how names get from Brandeis to Google. The good news is it’s getting us thinking and working towards improving some identity management items. While it shouldn’t be a huge deal to correct, I didn’t anticipate this problem and the time/effort to correct it!

If your name looks incorrect in Google please be patient as we work to set up a better way of sending Google our names.

UPDATE: We are now syncing names in a better manner! If your name still looks incorrect let us know.

Comments Off

May 07 2010

Google Apps Update – Copy Sheets from one spreadsheet to another

Published by under Docs

Posted: 06 May 2010 01:51 AM PDT
You now have the ability to copy a sheet from one spreadsheet to another when using the new version of Google spreadsheets.

http://googleappsupdates.blogspot.com/2010/05/copy-sheets-from-one-spreadsheet-to.html

Comments Off

May 06 2010

Good help isn’t hard to find

LTS is laying the groundwork for some special support services that will ease the Brandeis community’s transition to Google Apps.

Skilled assistance and support. LTS will shortly establish a dedicated Google help center in the Goldfarb Library.  Members of the community will be able to get help with Google Apps by phone, via email, and in-person.  The LTS Help Desk staff will also provide additional expert technical assistance.

Online documentation. Google provides a wealth of documentation for Google Apps.  LTS staff will create a special website that organizes these materials and makes it easy for Brandeis community members to find needed information.

Workshops.  LTS will offer weekly workshops to help community members become skilled with Google Calendar and Google Mail.  Starting in late May, these workshops will be offered throughout the coming summer and academic year as needed.

4 responses so far

May 05 2010

Multiple Google Notifiers

One of my favorite features of Bmail was the Zimbra Toaster. It was a simple, but helpful pop-up notification whenever I received a new e-mail. Super useful when not browsing my e-mail client, but annoying (and potentially embarrassing) when giving a presentation.

Google’s toaster equivalent is the Google Notifier (also available for Windows I believe). I’ve been using this nifty little menu bar app for my personal account for as long as I can remember but was concerned I couldn’t run a second instance for my Brandeis Google Apps account. After a quick Google search, I stumbled upon this article from macosxhints.com. From the site:

  1. Duplicate the Google Notifier application.
  2. Select the duplicate, control-click on its icon, and choose Show Package Contents from the pop-up menu.
  3. Navigate into Contents, and then open Info.plist in an editor.
  4. There is a property in that file called CFBundleIdentifier, with the value com.google.GmailNotifier. Change the property’s value to com.google.GmailNotifierMysociety instead.
  5. Save your change and quit the editor.

Apparently you can also change the icon, but you need developer tools to do so. I think I can live with the same icon for now.
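
If you would rather not edit Info.plist by hand, the same change can be scripted. This is only a sketch, assuming Python 3 and its standard plistlib module. set_bundle_identifier is a made-up helper, and the demo below runs against a throwaway file rather than the real application bundle:

```python
import os
import plistlib
import tempfile


def set_bundle_identifier(plist_path, new_identifier):
    """Rewrite CFBundleIdentifier in an Info.plist and return the old value."""
    with open(plist_path, 'rb') as fh:
        info = plistlib.load(fh)
    old = info.get('CFBundleIdentifier')
    info['CFBundleIdentifier'] = new_identifier
    with open(plist_path, 'wb') as fh:
        plistlib.dump(info, fh)
    return old


# Demo on a temporary stand-in file, not the real Google Notifier bundle.
demo_plist = os.path.join(tempfile.mkdtemp(), 'Info.plist')
with open(demo_plist, 'wb') as fh:
    plistlib.dump({'CFBundleIdentifier': 'com.google.GmailNotifier'}, fh)

old_id = set_bundle_identifier(demo_plist, 'com.google.GmailNotifierMysociety')
with open(demo_plist, 'rb') as fh:
    new_id = plistlib.load(fh)['CFBundleIdentifier']
print(old_id, '->', new_id)
```

For the duplicated application, the path to pass in would be the Info.plist inside the copy’s Contents folder.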

Comments Off

May 01 2010

Maximum Label Length: 40!?

Published by under Gmail,Technical details

Are you a user with tons of folders, with folders within those folders, and folders within those, and so on? Or maybe you prefer to keep a flat folder structure, but use-hyphens-in-your-folders-creating-really-long-folder-names?

User beware!

Google Mail has a 40 character label limit. Plus, users with sub-folders have to eat a character for each ‘/’ in the name. Users have voiced their concerns with the limit, but I think a better approach would be to suggest it.

One response so far

Apr 30 2010

Google Migration Tool. Nice, but …

Published by under Gmail,Technical details

It seems like yesterday I was writing a Python program and web-based account verification system to migrate our users’ mail from a legacy UW-IMAP system to Zimbra NE. Is it déjà vu?

I am now tasked with migrating mail from Zimbra to Google. How hard could it be? I’ve done it once, how different could this be?

For starters, Google offers a slew of methods for getting our mail over there. The question is, which?
Continue Reading »

One response so far

Apr 27 2010

Beware invitations in gmail!!

Published by under Calendar

Gmail has a nice link that allows you to quickly add an invitation to your email. COOL!!! I get a lot of mail asking if I can meet with people.

I sent 8 of these yesterday! – too bad they don’t work correctly.

If you use the Gmail interface they seem to work properly; if you do not, the link that was sent takes you to a non-hosted Google page. Bad…

Invites from the calendar work correctly.

If your email client handles the ICS file well it also should work…

See this Google discussion post by me.

Comments Off
