SaltyCrane Blog — Notes on Javascript and web development

Notes on migrating this blog from SQLite to PostgreSQL using Django

Here are my notes on migrating this blog from SQLite to PostgreSQL. For the parts of my database that were in proper order, the migration was made very easy using Django's ./manage.py dumpdata and ./manage.py loaddata commands. However, the database tables used for storing the comments on this blog were kind of screwed up because I had previously migrated them from my old Blogger blog. So I had to write another (not so pretty) script for that.

Thanks to this article for showing me how to use Django's dumpdata and loaddata for this migration.

Create a new Postgres database

See my previous notes for creating a Postgres database. I named the new database "saltycrane_db" owned by "django_user" with password "my_password".

Clone and modify Django project

  • Clone my SaltyCrane project
    hg clone SaltyCrane SaltyCraneNew
    or
    cp -rp SaltyCrane SaltyCraneNew
  • Edit settings.py:
    DATABASE_ENGINE = 'postgresql_psycopg2'
    DATABASE_NAME = 'saltycrane_db'
    DATABASE_USER = 'django_user'
    DATABASE_PASSWORD = my_password'
  • Create database tables
    python manage.py syncdb

Migrate data

  • Create JSON dumps from the existing SQLite database for my 3 Django apps (myblogapp, comments, and tagging):
    cd /srv/SaltyCrane/iwiwdsmi
    ./manage.py dumpdata myblogapp > dump_myblogapp.json
    ./manage.py dumpdata comments > dump_comments.json
    ./manage.py dumpdata tagging > dump_tagging.json
  • Load the JSON data dumps to the new Postgres database:
    cd /srv/SaltyCraneNew/iwiwdsmi
    ./manage.py loaddata ../../SaltyCrane/iwiwdsmi/dump_tagging.json
    ./manage.py loaddata ../../SaltyCrane/iwiwdsmi/dump_myblogapp.json

However, loading comments didn't work because I had some missing fields so...

Migration script

...I wrote a migration script:

import setup_environment

import simplejson
from pprint import pprint
from django.contrib.comments.models import Comment
from django.contrib.contenttypes.models import ContentType
from iwiwdsmi.myblogapp.models import Post

JSON_FILENAME = "/srv/SaltyCrane/iwiwdsmi/dump_comments.json"
ct = ContentType.objects.get(name='post', app_label='myblogapp')
N_OLD_COMMENTS = 4000

def main():
    delete_all_comments()
    create_dummy_comments()
    pydata = open_json_file(JSON_FILENAME)
    save_items_to_database(pydata)
    delete_dummy_comments()

def delete_all_comments():
    """ Clears the database of all comments.
    """
    comments = Comment.objects.all()
    comments.delete()
    print "All comments deleted."

def create_dummy_comments():
    """ Create a bunch of filler dummy comments
    """
    for i in range(N_OLD_COMMENTS):
        c = Comment()
        c.comment = "Filler comment."
        c.content_type = ct
        c.ip_address = None
        c.is_public = False
        c.is_removed = False
        c.object_pk = 243
        c.site_id = 1
        c.user_email = "filler@email.com"
        c.user_name = "Filler"
        c.save()
    print "Filler comments created."

def delete_dummy_comments():
    comments = Comment.objects.filter(is_public=False)
    comments.delete()
    print "Dummy comments deleted."

def open_json_file(filename):
    """ open the json file and return the python data structure
    """
    json_fp = open(filename)
    pydata = simplejson.load(json_fp)
    return pydata

def save_items_to_database(pydata):
    """ Process the python data structure and save to the new database
    """
    for item in pydata:
        pprint(item)
        pk = item['pk']
        item = item['fields']
        if item['is_public'] and not item['is_removed']:
            c = Comment.objects.get(id=pk)
            c.comment = item['comment']
            c.content_type = ct
            c.ip_address = get_ip_address(item['ip_address'])
            c.is_public = item['is_public']
            c.is_removed = item['is_removed']
            c.object_pk = item['object_pk']
            c.site_id = 1
            c.submit_date = item['submit_date']
            c.user_id = item['user']
            c.user_email = item['user_email']
            c.user_name = item['user_name']
            c.user_url = item['user_url']
            c.save()

def get_ip_address(ip):
    """ Handle bad input for IP addresses
    """
    if ip == "" or ip == "unknown":
        return None
    else:
        return ip

if __name__ == '__main__':
    main()

Comments


#1 Thomas E. Glasgow commented on :

Hi,

Any reason why you're migrating away from sqlite? Did you hit any performance wall?