Google Chat History Downloader
Based on the comments, this doesn’t work anymore. I’d recommend checking out this thread for solutions: http://www.google.com/support/forum/p/gmail/thread?tid=7a7d2d6da5be047f
A couple of weeks ago, I decided to migrate from one Google Account to another. I was able to transfer all of my emails from one to the other without too much difficulty. However, I looked around for a while and could not find any way to export all of my Google Talk chat history. I don’t think there is any way to access saved chats over either IMAP or POP. I did notice, though, that the Gmail web interface lets you view a saved chat as a raw message. There happens to be an old Python library for interacting with the Gmail web interface called libgmail. I found, however, that it does not scale well to large numbers of messages, so I had to write my own method that processes the search results one page at a time. I also found that I was easily blocked when running this over a long period, so I added 13-second delays after every request so as not to get my account suspended. It took me a day and a half to export all of the messages. I’m not sure whether this is overkill, but I am tired of getting my account blocked.
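The page-at-a-time idea with a fixed pause does not depend on libgmail, so here is a minimal sketch of it in isolation. The names `paged_results`, `fetch_page`, and the `sleep` parameter are hypothetical and are not part of libgmail; `fetch_page(offset)` is assumed to return the items on that page together with the total number of results. The sketch is written for Python 3 and uses only the standard library.

```python
import time


def paged_results(fetch_page, delay=13, sleep=time.sleep):
    """Yield one page of results at a time, pausing after every request.

    fetch_page(offset) is a hypothetical callable returning (items, total).
    """
    offset = 0
    total = None
    while total is None or offset < total:
        items, total = fetch_page(offset)
        sleep(delay)  # throttle: wait after each request
        if not items:
            break  # stop rather than loop forever on an empty page
        yield items
        offset += len(items)


if __name__ == "__main__":
    data = list(range(7))

    def fake_fetch(offset, page_size=3):
        # Stand-in for a real search request; serves slices of a local list.
        return data[offset:offset + page_size], len(data)

    for page in paged_results(fake_fetch, delay=0):
        print(page)
```

Because the function is a generator, only one page is held in memory at a time, and the caller can stop early without fetching the remaining pages. Passing `sleep` as a parameter is just a convenience so the pause can be replaced when trying the code out.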
Anyway, this program goes through and saves each chat history message as an .eml file. Once they are in that format, it is not super hard to get them into a different Gmail account, but I’ll save that for another post.
# Python 2 script (libgmail is a Python 2 library).
import os
import time

import libgmail  # http://libgmail.sourceforge.net/


def thread_search(ga, searchType, **kwargs):
    # Generator: fetch and yield one page of search results at a time
    # instead of collecting every thread before returning.
    index = 0
    while (index == 0) or index < threadListSummary[libgmail.TS_TOTAL]:
        threadsInfo = []
        items = ga._parseSearchResult(searchType, index, **kwargs)
        try:
            threads = items[libgmail.D_THREAD]
        except KeyError:
            # No thread records on this page: nothing more to fetch.
            break
        else:
            for th in threads:
                # The [0] subscripts here and on the summary below were
                # missing from this listing; they are restored on the
                # assumption that this loop mirrors libgmail's own
                # _parseThreadSearch.
                if not type(th[0]) is libgmail.types.ListType:
                    th = [th]
                threadsInfo.append(th)
            threadListSummary = items[libgmail.D_THREADLIST_SUMMARY][0]
            threadsPerPage = threadListSummary[libgmail.TS_NUM]
            index += threadsPerPage
        yield libgmail.GmailSearchResult(ga, (searchType, kwargs), threadsInfo)


ga = libgmail.GmailAccount("firstname.lastname@example.org", "password")
ga.login()

# Make sure the output directory exists before writing into it.
if not os.path.isdir("chats"):
    os.makedirs("chats")

for page in thread_search(ga, "query", q="is:chat"):
    print "New Page"
    time.sleep(13)
    for thread in page:
        # NOTE: the subscripts on thread.info are missing from this listing,
        # so the test below is always true as written. The intent appears to
        # be to detect a thread whose only message shares the thread's id.
        if thread.info == thread.info:
            # Common case: Chats that only span one message
            filename = "chats/%s_%s.eml" % (thread.id, thread.id)
            # only download the message if we don't have it already
            if os.path.exists(filename):
                print "already have %s" % filename
                continue
            print "Downloading raw message: %s" % filename,
            message = ga.getRawMessage(thread.id).decode('utf-8').lstrip()
            print "done."
            file(filename, 'wb').write(message)
            time.sleep(13)
            continue
        # Less common case: A thread that has multiple messages
        print "Looking up messages in thread %s" % thread.id
        time.sleep(13)
        for message in thread:
            filename = "chats/%s_%s.eml" % (thread.id, message.id)
            # only download the message if we don't have it already
            if os.path.exists(filename):
                print "already have %s" % filename
                continue
            print "Downloading raw message: %s" % filename,
            file(filename, 'wb').write(message.source.lstrip())
            print "done."
            time.sleep(13)
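Since the script strips leading whitespace so that each saved file starts directly with its headers, a quick sanity check is to parse the .eml files back and look at a few header fields. The following is only a sketch under assumptions: it is separate from the downloader, written for Python 3 with the standard `email` module, and the function names `summarize_eml` and `summarize_directory` are made up for illustration. Which headers a saved chat actually carries is also an assumption; a header that is absent, or a file whose headers did not parse, shows up as `None`.

```python
import email
import glob
import os


def summarize_eml(path):
    """Parse one saved .eml file and return a few of its headers."""
    with open(path, "rb") as fp:
        msg = email.message_from_binary_file(fp)
    return {
        "file": os.path.basename(path),
        "from": msg["From"],  # None if the header is missing
        "date": msg["Date"],  # None if the header is missing
        "multipart": msg.is_multipart(),
    }


def summarize_directory(directory="chats"):
    """Summarize every .eml file in a directory, sorted by file name."""
    paths = sorted(glob.glob(os.path.join(directory, "*.eml")))
    return [summarize_eml(p) for p in paths]


if __name__ == "__main__":
    for info in summarize_directory("chats"):
        print(info)
```

Running it against the `chats` directory prints one dictionary per file, which makes it easy to spot files that were saved empty or with unparseable headers before trying to move them anywhere else.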