Migrating Excel data to SQLite using Python
In
a previous
post, I described how I designed a SQLite relational database
from an Excel table. It was a small example, so I hardcoded the data
into the Python script. For my actual problem, I need to convert my
Excel data into a SQLite database automatically. To do this, I used
the win32com module and the sqlite3 module
included in Python 2.5.
Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description. Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description.
| id | name | module | type | desc |
| 1 | foo | ModuleExt | double | Description of foo |
| 2 | bar | ModuleExt | double | Description of bar |
| 3 | knark | Module1 | int | Description of knark |
| 4 | wert | Module1 | double | Description of wert |
| 5 | jib | Module1 | double | Description of jib |
| 6 | laz | Module2 | double | Description of laz |
| 7 | kew | Module2 | double | Description of kew |
After installing the win32com module
from http://sourceforge.net/project/platformdownload.php?group_id=78018,
I used the following code.
import os
import sqlite3
from win32com.client import constants, Dispatch
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = os.getcwd() + "\\example.xls"
ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
exceldata = [[ws.Cells(row, col).Value
for col in xrange(COL_SPAN[0], COL_SPAN[1])]
for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#----------------------------------------
# create SQL table and fill it with data
#----------------------------------------
conn = sqlite3.connect(example.db')
c = conn.cursor()
c.execute('''CREATE TABLE exceltable (
id INTEGER,
name TEXT,
module TEXT,
type TEXT,
desc TEXT
)''')
for row in exceldata:
c.execute('INSERT INTO exceltable VALUES (?,?,?,?,?)', row)
conn.commit()
#----------------------------------------
# display SQL data
#----------------------------------------
c.execute('SELECT * FROM exceltable')
for row in c:
print row
The Excel filename is example.xls and the table data begins at B14 (2nd column, 14th row) and ends at F20 (6th column, 20th row) in Sheet 1. The script uses a declarative approach to store the data in a Python list of lists. It creates a SQLite database named example.db and creates a connection to it. It then fills the database using the Python list data structure. Finally, it displays the newly created database. The screen output is shown below.
(1, u'foo', u'ModuleExt', u'double', u'Description of foo') (2, u'bar', u'ModuleExt', u'double', u'Description of bar') (3, u'knark', u'Module1', u'int', u'Description of knark') (4, u'wert', u'Module1', u'double', u'Description of wert') (5, u'jib', u'Module1', u'double', u'Description of jib') (6, u'laz', u'Module2', u'double', u'Description of laz') (7, u'kew', u'Module2', u'double', u'Description of kew')
If I want to interact with the newly created database using SQLite
directly, I can run sqlite3 example.db from my Cygwin
bash command line. (Note the conn.commit() line in my
Python script is very important for this step to work. For some
reason, I didn't see this in the Python documentation.) Here is an
example session using SQLite directly with the new database.
$ sqlite3 example.db SQLite version 3.5.1 Enter ".help" for instructions sqlite> .schema CREATE TABLE exceltable ( id INTEGER, name TEXT, module TEXT, type TEXT, desc TEXT ); sqlite> select * from exceltable; 1|foo|ModuleExt|double|Description of foo 2|bar|ModuleExt|double|Description of bar 3|knark|Module1|int|Description of knark 4|wert|Module1|double|Description of wert 5|jib|Module1|double|Description of jib 6|laz|Module2|double|Description of laz 7|kew|Module2|double|Description of kew sqlite>
2
Comments
—
Comments feed for this post
#2 rowinggolfer commented on 2010-07-01:
nice work, and fine for .xls files where you know the format, or everything is text.
You'd quickly come unstuck with dates/floats etc..
So if you have a more complex spreadsheet, and want to have fine control over the datatypes in your sqlite database, consider using the 3rd party python module xlrd.
http://pypi.python.org/pypi/xlrd
Post a comment
About
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc... more »
Search Blog
Tags
-
algorithms
(4)
-
aws
(8)
-
blogproject
(20)
-
c_cplusplus
(12)
-
cardstore
(8)
-
colinux
(2)
-
concurrency
(9)
-
conkeror
(2)
-
cygwin
(18)
-
datastructures
(15)
-
datetime
(3)
-
dell
(3)
-
django
(39)
-
emacs
(20)
-
files_directories
(10)
-
install_setup
(7)
-
javascript
(3)
-
keyboard
(6)
-
matplotlib
(5)
-
mercurial
(4)
-
nginx
(2)
-
preferences
(8)
-
processes
(3)
-
pyqt
(18)
-
python
(122)
-
ratpoison
(3)
-
regexes
(5)
-
rsync
(3)
-
softwaretools
(17)
-
sql
(13)
-
ssh
(7)
-
subversion
(6)
-
twisted
(6)
-
ubuntu
(60)
-
urxvt
(5)
-
vxworks
(25)
-
webservices
(4)
-
wmii
(7)
Blogroll
- Adam Gomaa
- Alex Clemesha
- Amir Salihefendic
- Armin Ronacher
- David Beazley
- David Ziegler
- Duncan McGreggor
- Gareth Rushgrave
- Glyph Lefkowitz
- Guido van Rossum
- Ian Bicking
- Jacob Kaplan-Moss
- James Bennett
- James Tauber
- Jesper Noehr
- Matt Harrison
- Nikolay Kolev
- Parand Darugar
- Peter Baumgartner
- Peter Bengtsson
- Rob Hudson
- Simon Willison
- Will McGugan
#1 Chris commented on 2009-05-18:
You just saved me hours of time, and it worked straight off, and having not used Python before I'm very impressed by that too. Took me about 20 minutes to import 140 lines into SQLite which would have been loads of typing. thank you!