SaltyCrane Blog — Notes on JavaScript and web development

Migrating Excel data to SQLite using Python

In a previous post, I described how I designed a SQLite relational database from an Excel table. It was a small example, so I hardcoded the data into the Python script. For my actual problem, I need to convert my Excel data into a SQLite database automatically. To do this, I used the win32com module and the sqlite3 module included in Python 2.5.

Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description. Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description.

idnamemoduletypedesc
1fooModuleExtdoubleDescription of foo
2barModuleExtdoubleDescription of bar
3knarkModule1intDescription of knark
4wertModule1doubleDescription of wert
5jibModule1doubleDescription of jib
6lazModule2doubleDescription of laz
7kewModule2doubleDescription of kew

After installing the win32com module from http://sourceforge.net/project/platformdownload.php?group_id=78018, I used the following code.

import os
import sqlite3
from win32com.client import constants, Dispatch

#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = os.getcwd() + "\\example.xls"
ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
exceldata = [[ws.Cells(row, col).Value 
              for col in xrange(COL_SPAN[0], COL_SPAN[1])] 
             for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]

#----------------------------------------
# create SQL table and fill it with data
#----------------------------------------
conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('''CREATE TABLE exceltable (
   id INTEGER,
   name TEXT,
   module TEXT,
   type TEXT,
   desc TEXT
)''')
for row in exceldata:
    c.execute('INSERT INTO exceltable VALUES (?,?,?,?,?)', row)
conn.commit()

#----------------------------------------
# display SQL data
#----------------------------------------
c.execute('SELECT * FROM exceltable')
for row in c:
    print row

The Excel filename is example.xls and the table data begins at B14 (2nd column, 14th row) and ends at F20 (6th column, 20th row) in Sheet 1. The script uses a declarative approach to store the data in a Python list of lists. It creates a SQLite database named example.db and creates a connection to it. It then fills the database using the Python list data structure. Finally, it displays the newly created database. The screen output is shown below.

(1, u'foo', u'ModuleExt', u'double', u'Description of foo')
(2, u'bar', u'ModuleExt', u'double', u'Description of bar')
(3, u'knark', u'Module1', u'int', u'Description of knark')
(4, u'wert', u'Module1', u'double', u'Description of wert')
(5, u'jib', u'Module1', u'double', u'Description of jib')
(6, u'laz', u'Module2', u'double', u'Description of laz')
(7, u'kew', u'Module2', u'double', u'Description of kew')

If I want to interact with the newly created database using SQLite directly, I can run sqlite3 example.db from my Cygwin bash command line. (Note the conn.commit() line in my Python script is very important for this step to work. For some reason, I didn't see this in the Python documentation.) Here is an example session using SQLite directly with the new database.

$ sqlite3 example.db
SQLite version 3.5.1
Enter ".help" for instructions
sqlite> .schema
CREATE TABLE exceltable (
   id INTEGER,
   name TEXT,
   module TEXT,
   type TEXT,
   desc TEXT
);
sqlite> select * from exceltable;
1|foo|ModuleExt|double|Description of foo
2|bar|ModuleExt|double|Description of bar
3|knark|Module1|int|Description of knark
4|wert|Module1|double|Description of wert
5|jib|Module1|double|Description of jib
6|laz|Module2|double|Description of laz
7|kew|Module2|double|Description of kew
sqlite>

Comments


#1 Chris commented on :

You just saved me hours of time, and it worked straight off, and having not used Python before I'm very impressed by that too. Took me about 20 minutes to import 140 lines into SQLite which would have been loads of typing. thank you!


#2 rowinggolfer commented on :

nice work, and fine for .xls files where you know the format, or everything is text.

You'd quickly come unstuck with dates/floats etc..

So if you have a more complex spreadsheet, and want to have fine control over the datatypes in your sqlite database, consider using the 3rd party python module xlrd.

http://pypi.python.org/pypi/xlrd


#3 sanjeev commented on :

Thanks man.. thanku very much


#4 Marty commented on :

Could you send me a machine readable copy of this code? I found a few small errors in what is listed on the web page and when I execute the code, it opens the Excel file "example.xls" OK but then leaves it open and the output from the last lines of code all produce 'None' for each piece of data. Either its not reading the data correctly from Excel or not printing it correctly. As you might have guessed, I am somewhat of a newbie with Python.

Thanks,


#5 Urban commented on :

Thanks a lot! really helped a lot.. :)


#6 Schuster commented on :

Thank you for that tutorial. It helps me a lot


#7 Becky commented on :

I am leanring Python and this code and explanaiton is very helpful. Thanks a lot!


#8 Will commented on :

If you are trying to run this in Python 3 you need to change xrange to range.

The code as written expects the data to start in cell B13. If you want your data to start in A1 you need to change:

ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)

to

ROW_SPAN = (2, 9)
COL_SPAN = (1, 6)

If you want to run the code several times you will need to add the following lines to the end of the code:

c.execute('DROP TABLE exceltable')
conn.close()

And if you want to edit your xls file after running the script you will need to add:

ws = app.Workbooks.Close()

If your file path contains backslashes, preface the string with the letter 'r' to indicate that it is a Raw string.

Nice work, Eliot, saved me a whole bunch of time!

disqus:2439278363