Relational database introduction with Python and SQLite

Date: 2007-11-14 | Modified: 2008-11-03 | Tags: persistence, python, sql | 3 Comments

I mentioned in a previous post that I have an Excel spreadsheet containing a bunch of information about the development C code I'm working on. It is a large table showing which variables are input and output from which functions. The variable names are in the first column and the function names are in the last several columns of the first row. I use "i" or "o" to denote if a variable is an input to or an output from a particular function. I also have a few columns for the variable type and description. A simplified example is shown in the table below.

ORIGINAL TABLE

name	module	type	desc	ExtSource	DoThis	CalcThis	CalcThat	ModifyStuff	ExtSink
foo	ModuleExt	double	Description of foo	o	i		i
bar	ModuleExt	double	Description of bar	o	i	i
knark	Module1	int	Description of knark		io	i
wert	Module1	double	Description of wert		o		i	i
jib	Module1	double	Description of jib			o		i
laz	Module2	double	Description of laz			o		o	i
kew	Module2	double	Description of kew				o	o	i

As shown above, the table also includes the variable descriptions, type, and scope. In my real, unsimplified Excel table, I also include function-related information such as the function description, and the function prototype. This additional information makes the table very messy. Try to imagine where you would put the function description and function prototype information in the table above. I came up with a kludge solution, but it involved repeated information, and rows and columns with inconsistent meaning. I was realizing the limitations of a 2-dimensional table.

I had been interested in learning about relational databases, so this seemed like a good opportunity. Relational databases imply the SQL query language-- relational databases that use SQL are about the only kind around. They include: Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and SQLite among others. I chose SQLite for my task because it is lightweight (no separate server), free, popular, and fast for small databases. (See http://www.sqlite.org/different.html for other distinctive features.) Also Python 2.5 now includes an interface module, sqlite3, as part of the standard distribution. See http://docs.python.org/lib/module-sqlite3.html for the documentation.

Relational model design

I could create a relational database using my original table in the Excel spreadsheet. However, this organization is not in the spirit of relational database design. The [relational] approach is to create many smaller tables and relationships between those tables. For my example, I created a variable table, a function table, and a variable_function table to indicate the relationship between the variable and function tables. Though it looks more complicated than the original single table form, inputting this into a relational database, such as SQLite, will allow for more advanced manipulation of the data. It is also much more [conducive] to maintenance and expansion. For example, now there is a natural place to put the additional function-related information which was so clumsy in the original single table. To do this, I add desc and prototype columns to the function table. If I needed to add more information to about the different modules, I could create a separate module table and include the module_id as a column in the variable table.

VARIABLE TABLE

id	name	module	type	desc
1	foo	ModuleExt	double	Description of foo
2	bar	ModuleExt	double	Description of bar
3	knark	Module1	int	Description of knark
4	wert	Module1	double	Description of wert
5	jib	Module1	double	Description of jib
6	laz	Module2	double	Description of laz
7	kew	Module2	double	Description of kew

FUNCTION TABLE

id	name
1	ExtSource
2	DoThis
3	CalcThis
4	CalcThat
5	ModifyStuff
6	ExtSink

VARIABLE_FUNCTION TABLE

variable_id	function_id	type
1	1	output
1	2	input
1	4	input
2	1	output
2	2	input
2	3	input
3	2	input-output
3	3	input
4	2	output
4	4	input
4	5	input
5	3	output
5	5	input
6	3	output
6	5	output
6	6	input
7	4	output
7	5	output
7	6	input

SQLite Implementation (w/ Python)

To implement this database, all I needed was Python 2.5. If you have an older version of Python, you can install the pysqlite module. Here is the Python code to implement the database above.

import sqlite3

# data
VARIABLES = (
    (1, 'foo', 'ModuleExt', 'double', 'Description of foo'),
    (2, 'bar', 'ModuleExt', 'double', 'Description of bar'),
    (3, 'knark', 'Module1', 'int', 'Description of knark'),
    (4, 'wert', 'Module1', 'double', 'Description of wert'),
    (5, 'jib', 'Module1', 'double', 'Description of jib'),
    (6, 'laz', 'Module2', 'double', 'Description of laz'),
    (7, 'kew', 'Module2', 'double', 'Description of kew'),
)
FUNCTIONS = (
    (1, 'ExtSource'),
    (2, 'DoThis'),
    (3, 'CalcThis'),
    (4, 'CalcThat'),
    (5, 'ModifyStuff'),
    (6, 'ExtSink'),
)
VAR_FUNC = (
    (1, 1, 'output'),
    (1, 2, 'input'),
    (1, 4, 'input'),
    (2, 1, 'output'),
    (2, 2, 'input'),
    (2, 3, 'input'),
    (3, 2, 'input-output'),
    (3, 3, 'input'),
    (4, 2, 'output'),
    (4, 4, 'input'),
    (4, 5, 'input'),
    (5, 3, 'output'),
    (5, 5, 'input'),
    (6, 3, 'output'),
    (6, 5, 'output'),
    (6, 6, 'input'),
    (7, 4, 'output'),
    (7, 5, 'output'),
    (7, 6, 'input'),
)

# get connection and cursor objects
conn = sqlite3.connect('iodatabase')
c = conn.cursor()

# create tables
c.execute('''create table variable (
    id integer,
    name text,
    module text,
    type text,
    desc text
)''')
c.execute('''create table function (
    id integer,
    name text
)''')
c.execute('''create table var_func (
    variable_id integer,
    function_id integer,
    type text
)''')

# fill tables with data
for row in VARIABLES:
    c.execute('insert into variable values (?,?,?,?,?)', row)
for row in FUNCTIONS:
    c.execute('insert into function values (?,?)', row)
for row in VAR_FUNC:
    c.execute('insert into var_func values (?,?,?)', row)

Querying the Database

Now that I have created the database, I can get answers to interesting questions, such as What variables are output by CalcThis? Here is the Python/SQLite code to answer this question:

c.execute(''.join([
            'SELECT variable.name, variable.module, variable.type, variable.desc ',
            'FROM variable, var_func, function ',
            'WHERE variable.id=var_func.variable_id ', 
            'AND function.id=var_func.function_id ',
            'AND function.name="CalcThis" ',
            'AND var_func.type="output" ',
            ]))
FORMAT = '%-6s%-10s%-8s%-20s'
print FORMAT % ('name', 'module', 'type', 'desc')
print '-' * 44
for row in c:
    print FORMAT % row

Here is the output.

name    module    type    desc
----------------------------------------------
jib     Module1   double  Description of jib
laz     Module2   double  Description of laz

I can see this is consistent with my original table. The query works. For such a small example, the original method may seem easier, but as the number of entries grows, the benefit of the relational database grows as well. Here is another example which asks Which functions use the variable wert as an input?

c.execute(''.join([
            'SELECT function.name ',
            'FROM variable, var_func, function ',
            'WHERE variable.id=var_func.variable_id ', 
            'AND function.id=var_func.function_id ',
            'AND variable.name="wert" ',
            'AND var_func.type="input" ',
            ]))
print 'name'
print '------------'
for row in c:
    print '%s' % row

Output:

name
---------
CalcThat
ModifyStuff

I would like to do even more complicated things like determine the prerequisite input variables across all functions for a given output variable. However, I still need to read more about that, so that will have to wait.

Comments

#1 jean commented on 2007-11-15:

Spreadsheet? python?
You should take a look at our product:
www.resolversystems.com

jean viry-babel

#2 sofeng commented on 2007-11-16:

jean viry-babel:
looks like a cool product-- wish it was my own.

#3 Florin Burdea commented on 2013-01-31:

I have a question it might be a lil bit out of topic, but do you have such an example for "playing" with an oracle db?

Thanks, Florin.

SaltyCrane Blog — Notes on JavaScript and web development

Relational database introduction with Python and SQLite

Relational model design

SQLite Implementation (w/ Python)

Querying the Database

Comments

Links