
Feb 01 2011

Migrating From Zimbra to Google: How We’re Doing It (Part 2 – Folder Renaming)

Once the notifications were ironed out, the second challenge we faced was working around the restrictions Google places on labels:

  • Can be at most 40 characters long
  • Cannot contain two or more adjacent spaces

This presented us with the following challenges:

  • How do we check if a user has these folders?
  • What notifications, if any, do we send?
  • If the user doesn’t fix their folders before the migration process runs, do we rename the folders for them?

Disclaimer!! I’m not a programmer by any stretch of the imagination!

Challenge 1: Checking for bad folders

I hate to post snippets of code without giving much context around it, especially since it’s part of a much larger application, but hopefully I can explain it enough to give the basic idea behind our methods.

import logging
import re
import subprocess

def checkForBadFolders(username):
    """ Command to get the folders """
    command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu gaf | egrep -i ' mess | conv ' | egrep -v ' /Trash/.*| /Trash ()' \
                | sed -e 's/^.* \///1' " % username
    p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (output, error) = p.communicate()
    if error != '':
        logging.error('Problem getting list of Bmail folders')

    """ Crazy way to get folder names. All folders end with () or (.*)
        It was tricky getting only the last occurrence of (.*). So, we
        * reverse the string
        * find the index of the first (
        * get the string from this index + 2 to end
        * reverse the string again """
    folder_list = []
    for folder in output.splitlines():
        tmp = folder[::-1][folder[::-1].index('( ') + 2:len(folder)]
        folder_list.append(tmp[::-1])
    folders = { 'long' : [], 'space' : [] }

    """ Check for folders with adjacent white space or > 40 chars long """
    r = re.compile(r'\s\s+')    # more than 1 white space
    for folder in folder_list:
        if len(folder) > 40:
            folders['long'].append(folder)
        if r.search(folder):
            folders['space'].append(folder)

    """ If bad folders were found, return this count and the folders """
    if len(folders['long']) > 0 or len(folders['space']) > 0:
        logging.info('Found bad folders for %s' % username)
        return len(folders['long']) + len(folders['space']), folders
    else:
        logging.info('Folders are OK')
        return 0, folders

In the above code, I use the zmmailbox command (bundled with Zimbra) to get the list of folders for a particular user. Each folder ends with (.*). The trick here is to strip this portion off and get only the folder name. Presumably if my regex-fu were better, this could be handled in a much simpler way. Instead, I:

  1. Reversed the string
  2. Grabbed the index of the first (
  3. Grabbed all characters from the index in step 2 + 2 until the end
  4. Reversed the string again

Convoluted, I know, but it did the trick! As an alternative, I could have connected to the account over IMAP and listed the folders that way. Pick your poison I guess.
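For comparison, the reverse-and-index steps above can probably be collapsed into a single split from the right. The sketch below is only an illustration: the function names are made up for this example, and it assumes each line looks like "folder path (something)", as described above.

```python
import re

def folder_name(line):
    """Return the folder path from one line, dropping the trailing ' (...)'."""
    # rsplit with maxsplit=1 cuts at the LAST ' (' only, so parentheses
    # inside the folder name itself are left alone. Unlike the index()
    # approach, a line with no ' (' is returned unchanged instead of
    # raising ValueError.
    return line.rsplit(' (', 1)[0]

# A regex version of the same idea: remove one final " (...)" group
# anchored at the end of the line.
TRAILING_PARENS = re.compile(r' \([^()]*\)$')

def folder_name_re(line):
    """Regex variant; assumes the trailing group has no nested parentheses."""
    return TRAILING_PARENS.sub('', line)
```

Both versions give the same result as the reversal trick on lines that end in " ()" or " (...)"; the split version is the closer match, since it also cuts at the last " (".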

This folder check runs three times between when a user is first added to our migration tool and when their migration actually takes place. As the code above shows, the function returns the number of problem folders along with their names: folders longer than 40 characters, and folders containing two or more adjacent spaces. This list is then sent to the user, with an explanation of why they are receiving the notification and how to fix their folders.
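To give a rough idea of how the returned dictionary could be turned into the body of a notification, here is a minimal sketch. The function name and the wording of the message are hypothetical and are not taken from our actual tool; only the 'long' and 'space' keys come from the code above.

```python
def format_folder_notice(folders):
    """Build a plain-text list of problem folders.

    `folders` is the dict returned by checkForBadFolders, with the
    keys 'long' and 'space'. The message wording is illustrative only.
    """
    lines = []
    if folders['long']:
        lines.append('Folders longer than 40 characters:')
        lines.extend('  - %s' % name for name in folders['long'])
    if folders['space']:
        lines.append('Folders with two or more adjacent spaces:')
        lines.extend('  - %s' % name for name in folders['space'])
    # An empty string means there is nothing to report.
    return '\n'.join(lines)
```

A caller would only send the notification when the count returned by the check is greater than zero.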

Challenge 2: Renaming Folders

Eventually, if the user still has not fixed their folders, we rename the folders right before the migration takes place. This was a tough decision for us as there are a number of challenges associated with renaming a user’s e-mail folders:

  • Coming up with a naming convention. We chose ‘_renamed-X-abcd…wxyz’, where X is a sequential number and abcd…wxyz stands for the first and last few characters of the original folder name (the code below keeps five from each end). Prepending an underscore brought the label to the top of the list when viewed in an e-mail client, so the user would notice it as soon as possible.
  • Sometimes users had hundreds of folders that needed to be renamed. Our tool showed how many folders were illegal, and if a user had a lot of them, we’d contact them individually and suggest ways to shorten the folder names. Often, when a folder path was too long, it was a series of nested folders, which could be remedied by renaming just the parent folder.
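The naming convention on its own can be sketched as a small helper. This is an illustration rather than the code we ran: the function name and the `keep` parameter are made up here, and the five-character slices and the under-ten-characters rule are taken from the rename function in this post.

```python
def renamed_name(folder, index, keep=5):
    """Return '_renamed-<index>-<start>...<end>' for a folder path."""
    # Slashes and spaces are replaced first, so the new name is a single
    # top-level folder with no spaces in it.
    flat = folder.replace('/', '-').replace(' ', '_')
    if len(flat) < 2 * keep:
        # Short names are kept whole rather than abbreviated.
        return '_renamed-%d-%s' % (index, flat)
    # Otherwise keep only the first and last `keep` characters.
    return '_renamed-%d-%s...%s' % (index, flat[:keep], flat[-keep:])
```

With the default of five characters from each end, the result stays well under the 40-character limit for any realistic number of renamed folders.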

def renameBadFolders(username, uid):
    """ Command to get the folders """
    command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu gaf | egrep -i ' mess | conv ' |  egrep -v ' /Trash/.*| /Trash ()' \
                | sed -e 's/^.* \///1' " % username
    p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (output, error) = p.communicate()
    if error != '':
        logging.error('Problem getting list of Bmail folders')

    """ Crazy way to get folder names. All folders end with () or (.*)
        It was tricky getting only the last occurrence of (.*). So, we
        * reverse the string
        * find the index of the first (
        * get the string from this index + 2 to end
        * reverse the string again """
    folder_list = []
    for folder in output.splitlines():
        tmp = folder[::-1][folder[::-1].index('( ') + 2:len(folder)]
        folder_list.append(tmp[::-1])
    folders = { 'long' : [], 'space' : [] }

    """ Check for folders with adjacent white space or > 40 chars long """
    r = re.compile(r'\s\s+')    # more than 1 white space
    for folder in folder_list:
        if len(folder) > 40:
            folders['long'].append(folder)
        elif r.search(folder):
            folders['space'].append(folder)
        else:
            pass

    renamed_count = 0
    final_folders = []
    if len(folders['long']) > 0 or len(folders['space']) > 0:
        logging.info('Found bad folders for %s. Renaming...' % username)
        for folder in folders['long']:
            stripped_folder = folder.replace('/', '-').replace(' ', '_')
            folder_start = stripped_folder[0:5]
            folder_end = stripped_folder[len(stripped_folder) - 5:len(stripped_folder)]
            renamed_folder = '_renamed-%s-%s...%s' % (str(renamed_count), folder_start, folder_end)
            command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu rf \"%s\" \"/%s\" " % (username, folder, renamed_folder)
            p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            (output, error) = p.communicate()
            if error != '':
                logging.error('Problem renaming folder for %s : %s' % (username, error))
            else:
                final_folders.append((folder, renamed_folder))
                renamed_count += 1
        for folder in folders['space']:
            if len(folder) < 10:
                renamed_folder = '_renamed-%s-%s' % (str(renamed_count), folder.replace(' ', '_').replace('/', '-'))
            else:
                stripped_folder = folder.replace('/', '-').replace(' ', '_')
                folder_start = stripped_folder[0:5]
                folder_end = stripped_folder[len(stripped_folder) - 5:len(stripped_folder)]
                renamed_folder = '_renamed-%s-%s...%s' % (str(renamed_count), folder_start, folder_end)
            command = "/opt/zimbra/bin/zmmailbox -z -m %s@brandeis.edu rf \"%s\" \"/%s\" " % (username, folder, renamed_folder)
            p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            (output, error) = p.communicate()
            if error != '':
                logging.error('Problem renaming folder for %s : %s' % (username, error))
            else:
                final_folders.append((folder, renamed_folder))
                renamed_count += 1
    else:
        logging.info('Folders are OK')
    if len(final_folders) > 0:
        logging.debug(final_folders)
        details = 'Renamed bad folders for %s\n' % username
        details += str(final_folders)
        emailAdmins(others=[''], details=details)
        notify(username, final_folders, uid)

There isn’t much to this code: it takes all the bad folders (those that are too long and those with two or more adjacent spaces) and uses the zmmailbox rf command to rename them.
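One thing worth noting about the rename call is that the folder name is pasted into a shell command string, so a folder containing a double quote, a backtick or a dollar sign could break the command. A possible way around that, sketched below, is to pass the arguments as a list so no shell is involved. The function names are hypothetical; the zmmailbox path and the -z, -m and rf arguments are the ones already used in this post, and checking the exit code instead of stderr is an assumption about how failures show up.

```python
import subprocess

ZMMAILBOX = '/opt/zimbra/bin/zmmailbox'

def rename_command(address, old_path, new_name):
    """Build the argument list for 'zmmailbox -z -m <address> rf <old> /<new>'."""
    # Each argument is its own list element, so quotes, spaces and '$'
    # in a folder name reach zmmailbox untouched.
    return [ZMMAILBOX, '-z', '-m', address, 'rf', old_path, '/' + new_name]

def rename_folder(address, old_path, new_name):
    """Run the rename without a shell; return (ok, stderr_text)."""
    p = subprocess.Popen(rename_command(address, old_path, new_name),
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _, error = p.communicate()
    # Assumption: a non-zero exit status signals failure.
    return p.returncode == 0, error.decode('utf-8', 'replace')
```

The same list-based call would work for the gaf listing, with the egrep and sed filtering done in Python instead of in the shell pipeline.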
