SaltyCrane: persistencehttps://www.saltycrane.com/blog/2011-09-28T19:03:19-07:00Notes on sqlalchemy w/ pyodbc, freetds on Ubuntu
2011-09-28T19:03:19-07:00https://www.saltycrane.com/blog/2011/09/notes-sqlalchemy-w-pyodbc-freetds-ubuntu/<p>These are my notes for connecting to a MS SQL Server using Python on Linux.
My developement environment is Ubuntu 10.10 Maverick (64-bit). The production environment
is Redhat/Centos-based Scientific Linux 6. Here is the layers of stuff:
</p>
<ul>
<li><a href="http://www.freetds.org/">FreeTDS</a>: an open source ODBC driver. Started with version
0.82 and moved to 0.91. Other options:
<a href="http://blogs.technet.com/b/dataplatforminsider/archive/2011/11/28/available-today-preview-release-of-the-sql-server-odbc-driver-for-linux.aspx">
Microsoft ODBC driver for Linux</a> Free, currently only for 64-bit Redhat.
<a href="http://www.easysoft.com/products/data_access/odbc-sql-server-driver/index.html">
Easysoft ODBC driver
</a> Commercial, not free.
</li>
<li><a href="http://www.unixodbc.org/">unixODBC</a>: driver manager, a thin wrapper around the ODBC driver.
Version 2.2 (may move to version 2.3 if we use the Microsoft driver).
</li>
<li><a href="http://code.google.com/p/pyodbc/">pyodbc</a>: python library.
Started with 2.1.11, currently using 3.0.5.
I chose pyodbc over pymssql because, after minimal
research, it seemed to be the preferred choice.
Other options: <a href="http://code.google.com/p/pymssql/">pymssql</a> (GPL, doesn't use ODBC),
<a href="http://www.egenix.com/products/python/mxODBC/">mxODBC</a>
(commercial, much better documentation, may be more robust)
</li>
<li><a href="http://www.sqlalchemy.org/">SQLAlchemy</a>: wraps pyodbc, provides an ORM and better interface.
I chose to use sqlalchemy because it supported named parameters.
pyodbc only supported the 'qmark' paramstyle and pymssql returns
'pyformat' as the paramstyle, but it seems to actually be 'format'.
See <a href="http://www.python.org/dev/peps/pep-0249/">PEP-0249</a>
for more information on paramstyle.
See <a href="http://www.sqlalchemy.org/docs/core/expression_api.html#sqlalchemy.sql.expression.text">
http://www.sqlalchemy.org/docs/core/expression_api.html#sqlalchemy.sql.expression.text</a>
for more information on using named parameters.
</li>
</ul>
<h4 id="contents">Contents</h4>
<ul>
<li><a href="#install">Install (on Ubuntu)</a></li>
<li><a href="#connect-tsql">Try to connect using tsql (FreeTDS)</a></li>
<li><a href="#connect-python">Try to connect using Python</a></li>
<li><a href="#unicode-error">Error inserting unicode</a></li>
<li><a href="#select-minus-1-bug">Pyodbc SELECT -1 bug</a></li>
<li><a href="#list-tables">How to list tables in the database with sqlalchemy</a></li>
<li><a href="#pyodbc-freetds-truncates-varchar-instead-of-error">
pyodbc/FreeTDS truncates a varchar(max) upon insert instead of raising an error
<em>(Install FreeTDS 0.91 from source)</em>
</a></li>
<li><a href="#segfault">Segfault error when using Free TDS 0.91 and pyodbc 2.1.11</a></li>
<li><a href="#sql-server-equivalent-of-mysqls-limit">What is the MS SQL server equivalent of MySQL's LIMIT in a SELECT</a></li>
<li><a href="#versions">How to get the versions of stuff</a></li>
<li><a href="#microsoft-driver">Install the Microsoft ODBC driver for Linux</a></li>
<li><a href="#install-mxodbc">Install mxodbc</a></li>
<li><a href="#unixodbc-gui-tools">unixODBC GUI tools</a></li>
<li><a href="#sql-developer-ubuntu">Install SQL Developer on Ubuntu</a></li>
<li><a href="#calling-sprocs">Calling sprocs from sqlalchemy and pyodbc</a></li>
<li><a href="#errors">Other misc errors</a></li>
<li><a href="#install-pyodbc-sl6">Install pyodbc on Scientific Linux 6</a></li>
<li><a href="#see-also">See also</a></li>
<li><a href="#location-odbcinst-file">How to find the location of the odbcinst.ini file</a></li>
</ul>
<h4 id="install">Install (on Ubuntu)</h4>
<p>Set up odbc, freetds. Notes thanks to
<a href="http://www.pauldeden.com/2008/12/how-to-setup-pyodbc-to-connect-to-mssql.html">
http://www.pauldeden.com/2008/12/how-to-setup-pyodbc-to-connect-to-mssql.html
</a>
</p>
<pre class="console">$ sudo apt-get install unixodbc unixodbc-dev freetds-dev tdsodbc </pre>
<p>Edit <code>/etc/odbcinst.ini</code>:</p>
<pre>[FreeTDS]
Description = TDS driver (Sybase/MS SQL)
Driver = /usr/lib/odbc/libtdsodbc.so
Setup = /usr/lib/odbc/libtdsS.so
CPTimeout =
CPReuse =
TDS Version = 7.2
client charset = utf-8</pre>
<p><em>UPDATE 2012-07-09:</em> On Ubuntu 12.04 (64-bit), the libraries are in a different location.
Also, Ubuntu 12.04 includes FreeTDS version 0.91.
</p>
<pre>[FreeTDS]
Description = TDS driver (Sybase/MS SQL)
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
CPTimeout =
CPReuse =
TDS Version = 7.2
client charset = utf-8</pre>
<p>Install pyodbc</p>
<pre class="console">$ pip install pyodbc </pre>
<p>Install sqlalchemy</p>
<pre class="console">$ pip install SQLAlchemy </pre>
<p>This installed the following versions:</p>
<pre class="console">$ apt-cache show tdsodbc | grep Version:
Version: 0.82-7</pre>
<pre class="console">$ odbcinst --version
unixODBC 2.2.14</pre>
<pre class="console">$ pip freeze | grep pyodbc
pyodbc==2.1.11</pre>
<pre class="console">$ pip freeze | grep Alchemy
SQLAlchemy==0.7.3</pre>
<h4 id="connect-tsql">Try to connect using tsql (FreeTDS)</h4>
<p>Run the following command to ensure you can connect using tsql.
For more information see: <a href="http://www.freetds.org/userguide/confirminstall.htm">http://www.freetds.org/userguide/confirminstall.htm</a>
See also: <a href="http://www.freetds.org/userguide/serverthere.htm">FreeTDS User Guide: Chapter 8. Troubleshooting: Is the server there?</a>
</p>
<pre class="console">$ tsql -S 10.2.14.18 -U myusername -P mypassword
locale is "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_COLLATE=C;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=en_US.UTF-8;LC_ADDRESS=en_US.UTF-8;LC_TELEPHONE=en_US.UTF-8;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=en_US.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"
1>
</pre>
<h4 id="connect-python">Try to connect using Python</h4>
<p>pyodbc - this works</p>
<pre class="python">
import pyodbc
conn = pyodbc.connect(
'DRIVER={FreeTDS};SERVER=dev01.myserver.com;'
'DATABASE=MyDatabase;UID=MyUsername;PWD=mypassword;port=1234;'
'TDS_Version=8.0;')
print conn
</pre>
<p>sqlalchemy - this doesn't work</p>
<pre class="python">
import sqlalchemy
engine = sqlalchemy.create_engine(
'mssql+pyodbc://MyUsername:mypassword@dev01.myserver.com:1234/MyDatabase')
conn = engine.connect()
print conn
</pre>
<pre>
Traceback (most recent call last):
File "debug_sqlalchemy.py", line 22, in <module>
conn = engine.connect()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 2310, in connect
return self._connection_cls(self, **kwargs)
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 872, in __init__
self.__connection = connection or engine.raw_connection()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 2396, in raw_connection
return self.pool.unique_connection()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 169, in unique_connection
return _ConnectionFairy(self).checkout()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 370, in __init__
rec = self._connection_record = pool._do_get()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 696, in _do_get
con = self._create_connection()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 174, in _create_connection
return _ConnectionRecord(self)
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 255, in __init__
self.connection = self.__connect()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/pool.py", line 315, in __connect
connection = self.__pool._creator()
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/engine/strategies.py", line 80, in connect
return dialect.connect(*cargs, **cparams)
File "/home/saltycrane/.virtualenvs/myproject/lib/python2.6/site-packages/sqlalchemy/engine/default.py", line 275, in connect
return self.dbapi.connect(*cargs, **cparams)
sqlalchemy.exc.DBAPIError: (Error) ('IM002', '[IM002] [unixODBC][Driver Manager]Data source name not found, and no default driver specified (0) (SQLDriverConnectW)') None None
</pre>
<p>sqlalchemy - this works</p>
<pre class="python">
import sqlalchemy
import urllib
engine = sqlalchemy.create_engine(
'mssql+pyodbc:///?odbc_connect=%s' % (
urllib.quote_plus(
'DRIVER={FreeTDS};SERVER=dev01.myserver.com;'
'DATABASE=MyDatabase;UID=MyUsername;PWD=mypassword;port=1234;'
'TDS_Version=8.0;')))
conn = engine.connect()
print conn
</pre>
<p>sqlalchemy - this works also</p>
<pre class="python">
import pyodbc
import sqlalchemy
def connect():
return pyodbc.connect(
'DRIVER={FreeTDS};SERVER=dev01.myserver.com;'
'DATABASE=MyDatabase;UID=MyUsername;PWD=mypassword;port=1234;'
'TDS_Version=8.0;')
engine = sqlalchemy.create_engine('mssql://', creator=connect)
conn = engine.connect()
print conn
</pre>
<p>You can also connect by specifying the path to the driver directly and bypass /etc/odbcinst.ini.
Not sure of the disadvantages of doing this.<p>
<pre class="python">
SQL_SERVER_HOST = '10.5.18.11'
SQL_SERVER_UID = 'myusername'
SQL_SERVER_PWD = 'mypassword'
FREETDS_DRIVER = '/usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so'
before_quoted = ';'.join(
[
'DRIVER={}'.format(FREETDS_DRIVER),
'SERVER={}'.format(SQL_SERVER_HOST),
'DATABASE=mydatabase',
'UID={}'.format(SQL_SERVER_UID),
'PWD={}'.format(SQL_SERVER_PWD),
'PORT=1433',
'TDS_VERSION=7.2'
]
)
print before_quoted + '\n'
connection_string = 'mssql+pyodbc:///?odbc_connect={}'.format(
urllib.quote_plus(before_quoted))
print connection_string + '\n'
engine = sqlalchemy.create_engine(connection_string)
print engine
</pre>
<pre>DRIVER=/usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so;SERVER=10.5.18.11;DATABASE=mydatabase;UID=myusername;PWD=mypassword;PORT=1433;TDS_VERSION=7.2
mssql+pyodbc:///?odbc_connect=DRIVER%3D%2Fusr%2Flib%2Fx86_64-linux-gnu%2Fodbc%2Flibtdsodbc.so%3BSERVER%3D10.5.18.11%3BDATABASE%3Dmydatabase%3BUID%3Dmyusername%3BPWD%3Dmypassword%3BPORT%3D1433%3BTDS_VERSION%3D7.2
Engine(mssql+pyodbc:///?odbc_connect=DRIVER=/usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so;SERVER=10.5.18.11;DATABASE=mydatabase;UID=myusername;PWD=mypassword;PORT=1433;TDS_VERSION=7.2)
</pre>
<h5>References</h5>
<ul>
<li><a href="http://www.pauldeden.com/2008/12/how-to-setup-pyodbc-to-connect-to-mssql.html">
http://www.pauldeden.com/2008/12/how-to-setup-pyodbc-to-connect-to-mssql.html
</a></li>
<li><a href="http://www.sqlalchemy.org/docs/dialects/mssql.html#id1">
http://www.sqlalchemy.org/docs/dialects/mssql.html#id1
</a></li>
<li><a href="http://kipb7.wordpress.com/2008/06/12/pyodbc-unixodbc-freetds-config/">
http://kipb7.wordpress.com/2008/06/12/pyodbc-unixodbc-freetds-config/
</a></li>
<li><a href="http://stackoverflow.com/questions/4493614/sqlalchemy-equivalent-of-pyodbc-connect-string-using-freetds">
http://stackoverflow.com/questions/4493614/sqlalchemy-equivalent-of-pyodbc-connect-string-using-freetds
</a></li>
</ul>
<h4 id="unicode-error">Error inserting unicode</h4>
<p>If you get this "Error: The incoming tabular data stream (TDS)
remote procedure call (RPC) protocol stream is incorrect." error during insert, it
may be due to unicode in your data.</p>
<pre>Traceback (most recent call last):
File "tmp.py", line 10, in <module>
(unicode(u'some unicode data: \xa1').encode("utf-8"), '123', '123456'),
pyodbc.ProgrammingError: ('42000', '[42000] [FreeTDS][SQL Server]The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 3: The parameter name is invalid. (8005) (SQLExecDirectW)')</pre>
<p>The fix for me was to change the TDS version number. "7.0" and "8.0" did not work. "7.1", "7.2", "9.0", and some others worked as well.
There is
<a href="http://www.freetds.org/userguide/tdshistory.htm">this FreeTDS user guide page on version numbers</a>,
and <a href="http://www.freetds.org/userguide/choosingtdsprotocol.htm">also this other user guide page</a>,
but I'm not sure I'm clear on it. The following works for me:
</p>
<pre class="python">import pyodbc
conn = pyodbc.connect(
'DRIVER={FreeTDS};SERVER=dev01.myserver.com;'
'DATABASE=MyDatabase;UID=MyUsername;PWD=mypassword;port=1234;'
'TDS_Version=7.2;')
cursor = conn.cursor()
cursor.execute(
"""exec MyStoredProc @Param1 = ?, @Param2 = ?, @Param3 = ?""",
(unicode(u'some unicode data: \xa1').encode("utf-8"), '123', '123456'),
)
conn.commit()</pre>
<h5><code>Unrecognized server version info</code> warning</h5>
<p>When I changed <code>TDS_Version</code> to 7.2, I got the following warning:</p>
<pre>/myvirtualenv/lib/python2.6/site-packages/sqlalchemy/connectors/pyodbc.py:119: SAWarning: Unrecognized server version info '95.10.13055'. Version specific behaviors may not function properly. If using ODBC with FreeTDS, ensure server version 7.0 or 8.0, not 4.2, is configured in the FreeTDS configuration.</pre>
<p>This bothered me for a while, but later, when I
<a href="/blog/2011/09/notes-sqlalchemy-w-pyodbc-freetds-ubuntu/#pyodbc-freetds-truncates-varchar-instead-of-error">
upgraded freetds to 0.91</a>, this warning went away.</p>
<h5>See also these pages for more on unicode and pyodbc:</h5>
<ul>
<li><a href="http://stackoverflow.com/questions/947077/using-pyodbc-on-linux-to-insert-unicode-or-utf-8-chars-in-a-nvarchar-mssql-field">
http://stackoverflow.com/questions/947077/using-pyodbc-on-linux-to-insert-unicode-or-utf-8-chars-in-a-nvarchar-mssql-field
</a></li>
<li><a href="http://stackoverflow.com/questions/2192982/django-sql-server-2005-text-encoding-problem">
http://stackoverflow.com/questions/2192982/django-sql-server-2005-text-encoding-problem
</a></li>
</ul>
<h4 id="select-minus-1-bug">Select -1 bug</h4>
<p>Lin alerted me to this bug in pyodbc 2.1.11:
<a href="http://code.google.com/p/pyodbc/issues/detail?id=157">Issue 157: Negative integers are retrieved as positive integers</a>
Executing a <code>SELECT -1</code> returns 2^32-1 (4294967295).
Solution is to apply
<a href="http://code.google.com/p/pyodbc/issues/attachmentText?id=157&aid=1570005000&name=p&token=c5ff21e481c7cdf536d8e138dbab03be">
this patch</a> or upgrade to 3.0.x of pyodbc. I had trouble installing some versions of
pyodbc. This is what worked for me:
<pre class="console">$ pip install --upgrade -e git+https://code.google.com/p/pyodbc#egg=pyodbc </pre>
This gave me:
<pre class="console">$ pip freeze | grep pyodbc
-e git+https://code.google.com/p/pyodbc@404a3ba414ab15863c34db9c49ff0a02caa63600#egg=pyodbc-dev </pre>
<pre>$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyodbc
>>> pyodbc.version
'3.0.4-beta01'</pre>
</p>
<h4 id="list-tables">How to list the tables in the database</h4>
<pre class="python">import urllib
import sqlalchemy
# Recommended to be a module-level variable
# http://www.sqlalchemy.org/docs/core/connections.html
engine = sqlalchemy.create_engine(
'mssql+pyodbc:///?odbc_connect=%s' % (
urllib.quote_plus(
'DRIVER={FreeTDS};SERVER=myserver.mydomain.com;'
'DATABASE=MyDatabase;UID=myusername;PWD=mypassword;port=12345;'
'TDS_Version=7.2;')))
metadata = sqlalchemy.MetaData()
metadata.bind = engine
metadata.reflect()
for t in metadata.sorted_tables:
print t.name</pre>
<h4 id="pyodbc-freetds-truncates-varchar-instead-of-error">pyodbc/FreeTDS truncates a varchar(max) upon insert instead of raising an error</h4>
<br><p>When trying to insert a string with more than 100 characters into a varchar(100) field, pyodbc/FreeTDS truncated the string instead of raising an error.
This occurred because I was using FreeTDS version 0.82. I found this out from
<a href="http://comments.gmane.org/gmane.comp.db.tds.freetds/13751">this FreeTDS mailing list thread</a>.
The solution is to upgrade to FreeTDS version 0.91. I could not find any APT packages for Ubuntu 10.10.
Looks like there is a package for Ubuntu Precise (12.04). So I compiled it from source.
This assumes you've installed unixodbc as described above:
<code>$ sudo apt-get install unixodbc unixodbc-dev </code>
</p>
<pre class="console">$ cd /tmp
$ wget http://ibiblio.org/pub/Linux/ALPHA/freetds/stable/freetds-stable.tgz
$ tar zxvf freetds-stable.tgz
$ cd freetds-0.91
$ ./configure
$ make
$ sudo make install </pre>
<p>Then edit odbcinst.ini to point at the new library /usr/local/lib/libtdsodbc.so:</p>
<pre>[FreeTDS]
Description = TDS driver (Sybase/MS SQL)
Driver = /usr/local/lib/libtdsodbc.so
Setup = /usr/lib/odbc/libtdsS.so
CPTimeout =
CPReuse =
TDS Version = 7.2
client charset = utf-8</pre>
<p>After upgrading, pyodbc now gives me the following error instead of failing silently.</p>
<pre>pyodbc.DataError: ('22001', '[22001] [FreeTDS][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW)')</pre>
<p>References:</p>
<ul>
<li><a href="http://www.unixodbc.org/doc/FreeTDS.html">http://www.unixodbc.org/doc/FreeTDS.html
</a></li>
<li><a href="http://serverfault.com/questions/312895/compiling-freetds-0-91-on-ubuntu-11-04-x64">
http://serverfault.com/questions/312895/compiling-freetds-0-91-on-ubuntu-11-04-x64
</a></li>
<li><a href="http://www.freetds.org/userguide/config.htm#EVERYONE">
http://www.freetds.org/userguide/config.htm#EVERYONE
</a></li>
</ul>
<h4 id="segfault">Segfault error when using FreeTDS 0.91 and pyodbc 2.1.11</h4>
<p>Got this error trying to connect to a database on production (Scientific Linux 6)</p>
<pre>*** glibc detected *** python: free(): invalid next size (fast): 0x00000000019ed440 ***</pre>
<p>Solution was to upgrade pyodbc to 3.0.5 because there was a bug.</p>
<ul>
<li><a href="http://code.google.com/p/pyodbc/issues/detail?id=170">
http://code.google.com/p/pyodbc/issues/detail?id=170
</a></li>
<li><a href="http://code.google.com/p/pyodbc/issues/detail?id=212">
http://code.google.com/p/pyodbc/issues/detail?id=212
</a></li>
</ul>
<h4 id="sql-server-equivalent-of-mysqls-limit">What is the MS SQL server equivalent of MySQL's LIMIT in a SELECT</h4>
<p>TOP
See <a href="http://www.w3schools.com/sql/sql_top.asp">http://www.w3schools.com/sql/sql_top.asp</a>
</p>
<pre>SELECT TOP * FROM mytable</pre>
<h4 id="versions">How to get the versions of stuff</h4>
<ul>
<li>unixODBC
<pre>$ odbcinst --version
unixODBC 2.3.0</pre>
or
<pre>$ odbc_config --version
2.3.0</pre>
</li>
<li>freetds
<pre>$ tsql -C
Compile-time settings (established with the "configure" script)
Version: freetds v0.91
freetds.conf directory: /etc
MS db-lib source compatibility: yes
Sybase binary compatibility: yes
Thread safety: yes
iconv library: yes
TDS version: 4.2
iODBC: no
unixodbc: yes
SSPI "trusted" logins: no
Kerberos: no</pre>
</li>
<li>pyodbc
<pre>$ python
Python 2.7.0+ (r27:82500, Sep 15 2010, 18:14:55)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyodbc
>>> pyodbc.version
'3.0.5'</pre>
</li>
</ul>
<h4 id="microsoft-driver">Trying out the Microsoft ODBC driver for Linux</h4>
<p>This was installed on Scientific Linux 6. Ubuntu is not supported.</p>
<ul>
<li><a href="http://blogs.technet.com/b/dataplatforminsider/archive/2011/11/28/available-today-preview-release-of-the-sql-server-odbc-driver-for-linux.aspx">
http://blogs.technet.com/b/dataplatforminsider/archive/2011/11/28/available-today-preview-release-of-the-sql-server-odbc-driver-for-linux.aspx
</a></li>
<li><a href="http://www.microsoft.com/download/en/details.aspx?id=28160">
http://www.microsoft.com/download/en/details.aspx?id=28160
</a></li>
<li><a href="http://msdn.microsoft.com/en-us/library/hh568455%28v=sql.110%29.aspx">
http://msdn.microsoft.com/en-us/library/hh568455%28v=sql.110%29.aspx
</a></li>
<li><a href="http://www.connectionstrings.com/sql-server-2008">
http://www.connectionstrings.com/sql-server-2008
</a></li>
</ul>
<p>Install make</p>
<pre>$ yum install make </pre>
<p>Followed instructions to install the unixODBC driver manager and the driver.
It installed OK.</p>
<pre class="console">
$ sudo su root
$ tar xvf sqlncli-11.0.1720.0.tar.gz
$ cd sqlncli-11.0.1720.0
$ ./build_dm.sh --help
$ ./build_dm.sh --download-url=file://../unixODBC-2.3.0.tar.gz
$ cd /tmp/unixODBC.8922.6568.26033/unixODBC-2.3.0; make install
$ sudo su root
$ tar xvf sqlncli-11.0.1720.0.tar.gz
$ cd sqlncli-11.0.1720.0
$ ./install.sh --help
$ cp /etc/odbcinst.ini /etc/odbcinst.ini.bak
$ ./install.sh verify
Microsoft SQL Server ODBC Driver V1.0 for Linux Installation Script
Copyright Microsoft Corp.
Starting install for Microsoft SQL Server ODBC Driver V1.0 for Linux
Checking for 64 bit Linux compatible OS ..................................... OK
Checking required libs are installed ........................................ OK
unixODBC utilities (odbc_config and odbcinst) installed ..................... OK
unixODBC Driver Manager version 2.3.0 or later installed .................... OK
unixODBC Driver Manager configuration correct .............................. OK*
Microsoft SQL Server ODBC Driver V1.0 for Linux already installed .... NOT FOUND
$ ./install.sh install
...
Enter YES to accept the license or anything else to terminate the installation: YES
Checking for 64 bit Linux compatible OS ..................................... OK
Checking required libs are installed ........................................ OK
unixODBC utilities (odbc_config and odbcinst) installed ..................... OK
unixODBC Driver Manager version 2.3.0 or later installed .................... OK
unixODBC Driver Manager configuration correct .............................. OK*
Microsoft SQL Server ODBC Driver V1.0 for Linux already installed .... NOT FOUND
Microsoft SQL Server ODBC Driver V1.0 for Linux files copied ................ OK
Symbolic links for bcp and sqlcmd created ................................... OK
Microsoft SQL Server ODBC Driver V1.0 for Linux registered ........... INSTALLED
Install log created at /tmp/sqlncli.31091.8029.6563/install.log.
One or more steps may have an *. See README for more information regarding
these steps.
$ odbcinst -q -d -n "SQL Server Native Client 11.0"
[SQL Server Native Client 11.0]
Description=Microsoft SQL Server ODBC Driver V1.0 for Linux
Driver=/opt/microsoft/sqlncli/lib64/libsqlncli-11.0.so.1720.0
UsageCount=1
</pre>
<p>Connection string: Driver is "SQL Server Native Client 11.0".
Port is specified with a comma after the host. Thanks to
<a href="http://www.connectionstrings.com/sql-server-2008">
http://www.connectionstrings.com/sql-server-2008
</a></p>
<pre> connection = pyodbc.connect(
';'.join([
'DRIVER={SQL Server Native Client 11.0}',
'SERVER=%s,%s' % (db_profile['host'], db_profile['port']),
'DATABASE=%s' % db_profile['name'],
'UID=%s' % db_profile['user'],
'PWD=%s' % db_profile['password'],
]) + ';')</pre>
<p>Ran my app and got the following error message:</p>
<pre>Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib '/opt/microsoft/sqlncli/lib64/libsqlncli-11.0.so.1720.0' : file not found (0) (SQLDriverConnect)")</pre>
<p>That's a dumb error message.
<code>/opt/microsoft/sqlncli/lib64/libsqlncli-11.0.so.1720.0</code> existed and it had the correct
permissions. Turns out it was missing dependencies. Google help:
<a href="http://mailman.unixodbc.org/pipermail/unixodbc-support/2011-July/002902.html">
http://mailman.unixodbc.org/pipermail/unixodbc-support/2011-July/002902.html
</a> and
<a href="http://www.linuxquestions.org/questions/fedora-35/how-to-install-libcrypto-so-4-a-594511/">
http://www.linuxquestions.org/questions/fedora-35/how-to-install-libcrypto-so-4-a-594511/
</a>.
</p>
<pre>$ ldd /opt/microsoft/sqlncli/lib64/libsqlncli-11.0.so.1720.0
linux-vdso.so.1 => (0x00007fff5258d000)
libcrypto.so.6 => not found
libdl.so.2 => /lib64/libdl.so.2 (0x00007feec91fd000)
libodbc.so.1 => /usr/lib64/libodbc.so.1 (0x00007feec8f90000)
librt.so.1 => /lib64/librt.so.1 (0x00007feec8d88000)
libssl.so.6 => not found
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007feec8b83000)
libodbcinst.so.1 => /usr/lib64/libodbcinst.so.1 (0x00007feec896c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007feec868c000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007feec844b000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007feec8144000)
libm.so.6 => /lib64/libm.so.6 (0x00007feec7ebf000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007feec7ca9000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007feec7a8d000)
libc.so.6 => /lib64/libc.so.6 (0x00007feec76eb000)
/lib64/ld-linux-x86-64.so.2 (0x00007feec97b0000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007feec74bf000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007feec72bc000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007feec70b1000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007feec6eaf000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007feec6c95000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007feec6a75000)</pre>
<p>libcrypto.so.6 and libssl.so.6 are missing.</p>
<pre>$ yum install openssl098e
$ ldd /opt/microsoft/sqlncli/lib64/libsqlncli-11.0.so.1720.0
linux-vdso.so.1 => (0x00007fff5c7fe000)
libcrypto.so.6 => /usr/lib64/libcrypto.so.6 (0x00007f7ebcdd8000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f7ebcbd4000)
libodbc.so.1 => /usr/lib64/libodbc.so.1 (0x00007f7ebc966000)
librt.so.1 => /lib64/librt.so.1 (0x00007f7ebc75e000)
libssl.so.6 => /usr/lib64/libssl.so.6 (0x00007f7ebc50f000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f7ebc30a000)
libodbcinst.so.1 => /usr/lib64/libodbcinst.so.1 (0x00007f7ebc0f3000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f7ebbe14000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f7ebbbd1000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f7ebb8ca000)
libm.so.6 => /lib64/libm.so.6 (0x00007f7ebb646000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7ebb42f000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7ebb213000)
libc.so.6 => /lib64/libc.so.6 (0x00007f7ebae72000)
libz.so.1 => /lib64/libz.so.1 (0x00007f7ebac5c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7ebd4e3000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f7ebaa59000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f7eba82d000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f7eba621000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f7eba41f000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f7eba205000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f7eb9fe5000)</pre>
<p>Good. It works.</p>
<h4 id="install-mxodbc">Install mxodbc</h4>
<pre class="console">$ python -c 'print("UCS%s"%len(u"x".encode("unicode-internal")))'
UCS4
$ pip install http://downloads.egenix.com/python/egenix-mx-base-3.2.3.linux-x86_64-py2.7_ucs4.prebuilt.zip
$ pip install http://downloads.egenix.com/python/egenix-mxodbc-3.1.1.linux-x86_64-py2.7_ucs4.prebuilt.zip
$ #copy license files to "site-packages" directory
</pre>
<h4 id="unixodbc-gui-tools">unixODBC GUI tools</h4>
<p>I didn't end up actually using this. I think this is useful if you need to set up DSNs (e.g. if you're using mxodbc).</p>
<pre class="console">$ sudo apt-get install unixodbc-bin </pre>
<p>Description: Graphical tools for ODBC management and browsing
This package contains three graphical applications for use with
unixODBC, the Open DataBase Connectivity suite: ODBCConfig, a graphical
configuration tool for managing database drivers and access to
individual databases; DataManager, a simple browser and query tool for
ODBC databases; and odbctest, a tool for testing the ODBC API itself.
</p>
<p>Provides:</p>
<pre>/usr/bin/DataManager
/usr/bin/DataManagerII
/usr/bin/ODBCConfig
/usr/bin/odbctest</pre>
<h4 id="sql-developer-ubuntu">Installing SQL Developer on Ubuntu</h4>
<p>For a GUI interface to MS SQL Server, I like Oracle's SQL Developer better than Eclipse. Thanks Kris. (Too bad it doesn't execute stored procedures.)</p>
<pre>
- $ sudo apt-get install openjdk-6-jdk
- Create an Oracle account
- Download "Oracle SQL Developer for other platforms" from http://www.oracle.com/technetwork/developer-tools/sql-developer/downloads/index.html
- $ unzip sqldeveloper-3.1.07.42-no-jre.zip
- $ cd sqldeveloper
- $ sh sqldeveloper.sh
Type the full pathname of a J2SE installation (or Ctrl-C to quit), the path will be stored in ~/.sqldeveloper/jdk
usr/lib/jvm/java-6-openjdk
http://djiang.wordpress.com/2007/02/27/using-oracle-sqldeveloper-to-access-sqlserver/
Download jTDS from here: http://sourceforge.net/projects/jtds/
mkdir jtds-1.2.5
mv jtds-1.2.5-dist.zip jtds-1.2.5
cd jtds-1.2.5
unzip jtds-1.2.5-dist.zip
- Start Oracle’s SQLDeveloper, Tools->Preferences->Database->Third Party JDBC Drivers. Click “Add Entry” and point to the jtds-1.2.jar
- File -> New... -> Database Connection -> choose SQLServer tab, type in hostname, port, username and password. It appears that the initial connection name has to be the same as the database and you can click the “Retrieve database” button. Once you found the database, you can rename the connection.
</pre>
<h4 id="calling-sprocs">Calling stored procs from sqlalchemy and pyodbc</h4>
<pre class="python">
connection_string = ';'.join([
'DRIVER={FreeTDS}',
'SERVER=myhost.com',
'DATABASE=my_database',
'UID=myusername',
'PWD=mypassword',
'port=1433',
'TDS_Version=7.2',
]) + ';'
print connection_string
def use_pyodbc():
import pyodbc
connection = pyodbc.connect(connection_string)
print connection
sql = " ".join([
"EXEC [dbo].my_sproc",
"@my_param = ?",
])
print sql
cursor = connection.cursor()
result = cursor.execute(sql, ('my_value',))
for r in result.fetchall():
print r
print r.a_param
connection.commit()
def use_sqlalchemy():
import urllib
import sqlalchemy
engine = sqlalchemy.create_engine(
'mssql+pyodbc:///?odbc_connect=%s' % (
urllib.quote_plus(connection_string)))
print engine
sql = sqlalchemy.sql.text(" ".join([
"EXEC [dbo].my_sproc",
"@my_param = :my_param",
]))
print sql
result = engine.execute(sql, {'my_param': 'my_value'})
for r in result.fetchall():
for k in r.keys():
try:
print '%50s %s' % (k, getattr(r, k))
except sqlalchemy.exc.InvalidRequestError as e:
print e
break
</pre>
<h4 id="install-pyodbc-sl6">Install pyodbc on Scientific Linux 6</h4>
<p>you will also need freetds installed</p>
<pre class="console">$ yum install unixODBC-devel
$ yum install gcc-c++
$ pip install pyodbc </pre>
<h4 id="errors">Other misc errors</h4>
<ul>
<li>
<pre>ProgrammingError: (ProgrammingError) ('42000', '[42000] [FreeTDS][SQL Server]Statement(s) could not be prepared. (8180) (SQLExecDirectW)') </pre>
Problem was missing commas.
</li>
<li>
<pre>DBAPIError: (Error) ('IM002', '[IM002] [unixODBC][Driver Manager]Data source name not found, and no default driver specified (0) (SQLDriverConnect)') None None</pre>
<ul>
<li>This occurred when I could not connect to the database server. To troubleshoot this, see:
<a href="http://www.freetds.org/userguide/serverthere.htm">FreeTDS User Guide: Chapter 8. Troubleshooting: Is the server there?</a>
</li>
<li>This also occurred when I did not configure <code>/etc/odbcinst.ini</code> correctly
as described in the <a href="#install">Install section</a>
</li>
</ul>
</li>
</ul>
<h4 id="see-also">See also</h4>
<ul>
<li><a href="http://www.easysoft.com/developer/interfaces/odbc/linux.html">
Easysoft Linux ODBC documentation
</a></li>
</ul>
<h4 id="location-odbcinst-file">How to find the location of the odbcinst.ini file</h4>
<pre class="console">$ odbcinst -j
unixODBC 2.3.0
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
FILE DATA SOURCES..: /etc/ODBCDataSources
USER DATA SOURCES..: /home/eliot/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8</pre>
<h4 id="trace-odbc">Trace odbc</h4>
<p><a href="http://www.easysoft.com/support/kb/kb00945.html">http://www.easysoft.com/support/kb/kb00945.html</a>
Add the following to the odbcinst.ini file. Be sure to turn it off after debugging.</p>
<pre>[ODBC]
Trace = yes
TraceFile = /tmp/my_odbc_tracfile.log</pre>
Python MongoDB notes
2010-02-22T15:01:23-08:00https://www.saltycrane.com/blog/2010/02/python-mongodb-notes/<p><a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> is a
popular new
<a href="http://blog.mongodb.org/post/119945109/why-schemaless">schemaless</a>,
document-oriented,
<a href="http://en.wikipedia.org/wiki/NoSQL">NoSQL</a> database.
It is <a href="http://blog.mongodb.org/post/116405435/capped-collections">useful</a>
for <a href="http://blog.mongodb.org/post/172254834/mongodb-is-fantastic-for-logging">logging</a>
and <a href="http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics">real-time
analytics</a>. I'm working on a tool
to store log files from multiple remote hosts to MongoDB, then analyze it
in real-time and print pretty plots. My work in progress is located
<a href="http://github.com/saltycrane/live-log-analyzer">on github</a>.
</p>
<p>Here are my first steps using
<a href="http://api.mongodb.org/python">PyMongo</a>.
I store an Apache access log to MongoDB and then
query it for the number of requests in the last minute. I am running on
Ubuntu Karmic 32-bit (though I think MongoDB really
<a href="http://www.mongodb.org/display/DOCS/Developer+Zone">wants to run
on 64-bit</a>).
</p>
<h4>Install and run MongoDB</h4>
<ul>
<li>Download and install MongoDB
(<a href="http://www.mongodb.org/display/DOCS/Quickstart+Unix">Reference</a>)
<pre>cd ~/lib
curl http://downloads.mongodb.org/linux/mongodb-linux-i686-latest.tgz | tar zx
ln -s mongodb-linux-i686-2010-02-22 mongodb</pre>
</li>
<li>Create data directory
<pre>mkdir -p ~/var/mongodb/db</pre>
</li>
<li>Run MongoDB
(<a href="http://www.mongodb.org/display/DOCS/Getting+Started">Reference</a>)
<pre>~/lib/mongodb/bin/mongod --dbpath ~/var/mongodb/db</pre>
</li>
</ul>
<h4>Install PyMongo</h4>
<ul>
<li><a href="http://www.saltycrane.com/blog/2010/02/how-install-pip-ubuntu/">
Install pip</a>
</li>
<li>Install PyMongo
(<a href="http://api.mongodb.org/python/1.4%2B/installation.html">Reference</a>)
<pre>sudo pip install pymongo</pre>
</li>
</ul>
<h4>Simple Example</h4>
<p>writer.py:</p>
<pre class="python">import re
from datetime import datetime
from subprocess import Popen, PIPE, STDOUT
from pymongo import Connection
from pymongo.errors import CollectionInvalid
HOST = 'us-apa1'
LOG_PATH = '/var/log/apache2/http-mydomain.com-access.log'
DB_NAME = 'mydb'
COLLECTION_NAME = 'apache_access'
MAX_COLLECTION_SIZE = 5 # in megabytes
def main():
# connect to mongodb
mongo_conn = Connection()
mongo_db = mongo_conn[DB_NAME]
try:
mongo_coll = mongo_db.create_collection(COLLECTION_NAME,
capped=True,
size=MAX_COLLECTION_SIZE*1048576)
except CollectionInvalid:
mongo_coll = mongo_db[COLLECTION_NAME]
# open remote log file
cmd = 'ssh -f %s tail -f %s' % (HOST, LOG_PATH)
p = Popen(cmd, shell=True, stdout=PIPE, stderr=STDOUT)
# parse and store data
while True:
line = p.stdout.readline()
data = parse_line(line)
data['time'] = convert_time(data['time'])
mongo_coll.insert(data)
def parse_line(line):
"""Apache combined log format
%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"
"""
m = re.search(' '.join([
r'(?P<host>(\d+\.){3}\d+)',
r'.*',
r'\[(?P<time>[^\]]+)\]',
r'"\S+ (?P<url>\S+)',
]), line)
if m:
return m.groupdict()
else:
return {}
def convert_time(time_str):
time_str = re.sub(r' -\d{4}', '', time_str)
return datetime.strptime(time_str, "%d/%b/%Y:%H:%M:%S")
if __name__ == '__main__':
main()</pre>
<p>reader.py:</p>
<pre class="python">import time
from datetime import datetime, timedelta
from pymongo import Connection
DB_NAME = 'mydb'
COLLECTION_NAME = 'apache_access'
def main():
# connect to mongodb
mongo_conn = Connection()
mongo_db = mongo_conn[DB_NAME]
mongo_coll = mongo_db[COLLECTION_NAME]
# find the number of requests in the last minute
while True:
d = datetime.now() - timedelta(seconds=60)
N_requests = mongo_coll.find({'time': {'$gt': d}}).count()
print 'Requests in the last minute:', N_requests
time.sleep(2)
if __name__ == '__main__':
main()</pre>
<p>Running <code>python writer.py</code> in one terminal and
<code>python reader.py</code> in another terminal, I get the
following results:</p>
<pre>Requests in the last minute: 13
Requests in the last minute: 14
Requests in the last minute: 14
Requests in the last minute: 14
Requests in the last minute: 13
Requests in the last minute: 14
Requests in the last minute: 15
...</pre>
<h4>Related Documentation</h4>
<ul>
<li><a href="http://www.mongodb.org/display/DOCS/Advanced+Queries">
Advanced Queries - MongoDB Documentation</a></li>
<li><a href="http://www.mongodb.org/display/DOCS/Aggregation">
Aggregation - MongoDB Documentation</a></li>
<li><a href="http://api.mongodb.org/python/1.5.2%2B/api/pymongo/collection.html">
Collection level operations — PyMongo v1.5.2+ documentation</a></li>
</ul>
Using Python to write to an Excel / OpenOffice Calc spreadsheet on Ubuntu Linux
2010-02-15T17:41:25-08:00https://www.saltycrane.com/blog/2010/02/using-python-write-excel-openoffice-calc-spreadsheet-ubuntu-linux/<p>Via <a href="http://panela.blog-city.com/pyexcelerator_xlwt_cheatsheet_create_native_excel_from_pu.htm">
Matt Harrison's blog post</a>, here is how to write Excel or OpenOffice.org Calc
spreadsheet files using Python and the <a href="http://pypi.python.org/pypi/xlwt">xlwt</a>
library. Xlwt is a fork of <a href="http://pypi.python.org/pypi/pyExcelerator/">
pyExcelerator</a> which handles only writing spreadsheet files. For reading
spreadsheets, see <a href="http://pypi.python.org/pypi/xlrd">xlrd</a>.
Note, these libraries don't use <a href="http://en.wikipedia.org/wiki/Component_Object_Model">
COM</a>, so they will work on non-Windows OSes, such as Linux. For more information,
see <a href="http://panela.blog-city.com/pyexcelerator_xlwt_cheatsheet_create_native_excel_from_pu.htm">
Matt's blog post</a>. He even has a <a href="http://files.blog-city.com/files/F05/96843/b/cheatsheet.pdf">
PDF cheat sheet</a>.
</p>
<ul>
<li><a href="http://www.saltycrane.com/blog/2010/02/how-install-pip-ubuntu/">
Install pip</a></li>
<li>Install xlwt
<pre>sudo pip install xlwt</pre></li>
<li>Create an example script:
<pre class="python">import xlwt
DATA = (("The Essential Calvin and Hobbes", 1988,),
("The Authoritative Calvin and Hobbes", 1990,),
("The Indispensable Calvin and Hobbes", 1992,),
("Attack of the Deranged Mutant Killer Monster Snow Goons", 1992,),
("The Days Are Just Packed", 1993,),
("Homicidal Psycho Jungle Cat", 1994,),
("There's Treasure Everywhere", 1996,),
("It's a Magical World", 1996,),)
wb = xlwt.Workbook()
ws = wb.add_sheet("My Sheet")
for i, row in enumerate(DATA):
for j, col in enumerate(row):
ws.write(i, j, col)
ws.col(0).width = 256 * max([len(row[0]) for row in DATA])
wb.save("myworkbook.xls")</pre>
</li>
<li>Results:
<a href="http://picasaweb.google.com/lh/photo/MFnV-ijtOBQmBywJWIbJVw?feat=embedwebsite"><img src="http://lh5.ggpht.com/_WnP2PKiLI14/S3nySGDHAwI/AAAAAAAAAEk/udiOfW38N-s/s800/python_oocalc_ex.png" /></a>
</li>
</ul>
Saving a Python dict to a file using pickle
2008-01-07T18:00:00-08:00https://www.saltycrane.com/blog/2008/01/saving-python-dict-to-file-using-pickle/<p>Per <em>Programming Python, 3rd Edition</em>, there are a number of methods
to store persistent data with Python:
<ul>
<li>I often use <strong>flat files</strong> to read or write text (string)
data using the <a href="http://docs.python.org/lib/module-os.html">
os library</a>.
</li>
<li>Flat files are read sequentially, but <strong><a href="http://docs.python.org/lib/module-anydbm.html">dbm files</a></strong>
allow for keyed access to string data</li>
<li>The <strong><a href="http://docs.python.org/lib/module-pickle.html">pickle</a></strong> module can be used to store non-string
Python data structures, such as Python dicts. However, the data is not
keyed as with dbm files.</li>
<li><strong><a href="http://docs.python.org/lib/module-shelve.html">shelve files</a></strong> combine the best of the dbm and pickle
methods by storing pickled objects in dbm keyed files.</li>
<li>I've read good things about the <strong><a href="http://www.zope.org/Documentation/Articles/ZODB1">ZODB object-oriented database
</a></strong>, but I don't know too much about it. Per the book, it is a
more powerful alternative to shelves.</li>
<li>The final option is interfacing with a full-fledged <strong>SQL
relational databases</strong>. As I <a href="http://www.saltycrane.com/blog/2007/11/python-sqlite-tutorial-relational/">
mentioned before</a>, Python 2.5 has an interface to SQLite as part of
the standard distribution.</li>
</ul>
</p>
<p>Here is an example using <code>pickle</code> which writes a Python dict to
a file and reads it back again:</p>
<pre class="python">import pickle
# write python dict to a file
mydict = {'a': 1, 'b': 2, 'c': 3}
output = open('myfile.pkl', 'wb')
pickle.dump(mydict, output)
output.close()
# read python dict back from the file
pkl_file = open('myfile.pkl', 'rb')
mydict2 = pickle.load(pkl_file)
pkl_file.close()
print mydict
print mydict2
</pre>
<br />Results:
<pre>{'a': 1, 'c': 3, 'b': 2}
{'a': 1, 'c': 3, 'b': 2}</pre>
Migrating Excel data to SQLite using Python
2007-11-27T18:13:00-08:00https://www.saltycrane.com/blog/2007/11/migrating-excel-to-sqlite-using-python/<p>In
a <a href="http://www.saltycrane.com/blog/2007/11/python-sqlite-tutorial-relational/">previous
post</a>, I described how I designed a SQLite relational database
from an Excel table. It was a small example, so I hardcoded the data
into the Python script. For my actual problem, I need to convert my
Excel data into a SQLite database automatically. To do this, I used
the <code>win32com</code> module and the <code>sqlite3</code> module
included in Python 2.5.</p>
<p>Here is the table from my previous post. It shows some variables in
in my C program. It shows the variable name, type, the module it belongs to,
and a short description. Here is the table from my previous post. It shows some variables in
in my C program. It shows the variable name, type, the module it belongs to,
and a short description.
</p>
<table style="font-family: arial;
font-size: x-small;
background-color: #ffffff;">
<tr style="background-color:aaaaaa">
<td style="padding: 2px; background-color: #aaaaaa">id</td>
<td style="padding: 2px; background-color: #aaaaaa">name</td>
<td style="padding: 2px; background-color: #aaaaaa">module</td>
<td style="padding: 2px; background-color: #aaaaaa">type</td>
<td style="padding: 2px; background-color: #aaaaaa">desc</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">foo</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of foo</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">bar</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of bar</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">knark</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">int</td>
<td style="padding: 2px; background-color: #dddddd">Description of knark</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">wert</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of wert</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">jib</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of jib</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">laz</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of laz</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">7</td>
<td style="padding: 2px; background-color: #dddddd">kew</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of kew</td>
</tr>
</table>
<p>After installing the <code>win32com</code> module
from <a href="http://sourceforge.net/project/platformdownload.php?group_id=78018">http://sourceforge.net/project/platformdownload.php?group_id=78018</a>,
I used the following code.</p>
<pre class="python">import os
import sqlite3
from win32com.client import constants, Dispatch
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = os.getcwd() + "\\example.xls"
ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
exceldata = [[ws.Cells(row, col).Value
for col in xrange(COL_SPAN[0], COL_SPAN[1])]
for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#----------------------------------------
# create SQL table and fill it with data
#----------------------------------------
conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('''CREATE TABLE exceltable (
id INTEGER,
name TEXT,
module TEXT,
type TEXT,
desc TEXT
)''')
for row in exceldata:
c.execute('INSERT INTO exceltable VALUES (?,?,?,?,?)', row)
conn.commit()
#----------------------------------------
# display SQL data
#----------------------------------------
c.execute('SELECT * FROM exceltable')
for row in c:
print row</pre>
<p>The Excel filename is <em>example.xls</em> and the table data
begins at B14 (2nd column, 14th row) and ends at F20 (6th column,
20th row) in Sheet 1. The script uses a declarative approach to
store the data in a Python list of lists. It creates
a SQLite database named <em>example.db</em> and creates a connection to
it. It then fills the database using the Python list data
structure. Finally, it displays the newly created database. The
screen output is shown below.</p>
<pre>(1, u'foo', u'ModuleExt', u'double', u'Description of foo')
(2, u'bar', u'ModuleExt', u'double', u'Description of bar')
(3, u'knark', u'Module1', u'int', u'Description of knark')
(4, u'wert', u'Module1', u'double', u'Description of wert')
(5, u'jib', u'Module1', u'double', u'Description of jib')
(6, u'laz', u'Module2', u'double', u'Description of laz')
(7, u'kew', u'Module2', u'double', u'Description of kew')</pre>
<p>If I want to interact with the newly created database using SQLite
directly, I can run <kbd>sqlite3 example.db</kbd> from my Cygwin
bash command line. (Note the <code>conn.commit()</code> line in my
Python script is very important for this step to work. For some
reason, I didn't see this in the Python documentation.) Here is an
example session using SQLite directly with the new database.</p>
<pre>$ sqlite3 example.db
SQLite version 3.5.1
Enter ".help" for instructions
sqlite> .schema
CREATE TABLE exceltable (
id INTEGER,
name TEXT,
module TEXT,
type TEXT,
desc TEXT
);
sqlite> select * from exceltable;
1|foo|ModuleExt|double|Description of foo
2|bar|ModuleExt|double|Description of bar
3|knark|Module1|int|Description of knark
4|wert|Module1|double|Description of wert
5|jib|Module1|double|Description of jib
6|laz|Module2|double|Description of laz
7|kew|Module2|double|Description of kew
sqlite></pre>
Relational database introduction with Python and SQLite
2007-11-14T17:37:00-08:00https://www.saltycrane.com/blog/2007/11/python-sqlite-tutorial-relational/<p>I mentioned in a <a href="http://www.saltycrane.com/blog/2007/09/python-word-wrap-function/">previous post</a> that I have an Excel spreadsheet containing a bunch of information about the development C code I'm working on. It is a large table showing which variables are input and output from which functions. The variable names are in the first column and the function names are in the last several columns of the first row. I use "i" or "o" to denote if a variable is an input to or an output from a particular function. I also have a few columns for the variable type and description. A simplified example is shown in the table below.</p>
<strong style="font-family: arial; font-size: small; text-align: right">ORIGINAL TABLE</strong><br />
<table style="font-family: arial;
font-size: x-small;
background-color: #ffffff;">
<tr style="background-color:aaaaaa">
<td style="padding: 2px; background-color: #aaaaaa">name</td>
<td style="padding: 2px; background-color: #aaaaaa">module</td>
<td style="padding: 2px; background-color: #aaaaaa">type</td>
<td style="padding: 2px; background-color: #aaaaaa">desc</td>
<td style="padding: 2px; background-color: #aaaaaa">ExtSource</td>
<td style="padding: 2px; background-color: #aaaaaa">DoThis</td>
<td style="padding: 2px; background-color: #aaaaaa">CalcThis</td>
<td style="padding: 2px; background-color: #aaaaaa">CalcThat</td>
<td style="padding: 2px; background-color: #aaaaaa">ModifyStuff</td>
<td style="padding: 2px; background-color: #aaaaaa">ExtSink</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">foo</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of foo</td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">bar</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of bar</td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">knark</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">int</td>
<td style="padding: 2px; background-color: #dddddd">Description of knark</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">io</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">wert</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of wert</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">jib</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of jib</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">i</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">laz</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of laz</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">kew</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of kew</td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd"> </td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd">o</td>
<td style="padding: 2px; background-color: #dddddd">i</td>
</tr>
</table>
<p>As shown above, the table also includes the variable descriptions, type, and scope. In my real, unsimplified Excel table, I also include function-related information such as the function description, and the function prototype. This additional information makes the table very messy. Try to imagine where you would put the function description and function prototype information in the table above. I came up with a kludge solution, but it involved repeated information, and rows and columns with inconsistent meaning. I was realizing the limitations of a 2-dimensional table.</p>
<p>I had been interested in learning about relational databases, so this seemed like a good opportunity. Relational databases imply the SQL query language-- relational databases that use SQL are about the only kind around. They include: Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and SQLite among others. I chose SQLite for my task because it is lightweight (no separate server), free, popular, and fast for small databases. (See <a href="http://www.sqlite.org/different.html">http://www.sqlite.org/different.html</a> for other distinctive features.) Also Python 2.5 now includes an interface module, <code>sqlite3</code>, as part of the standard distribution. See <a href="http://docs.python.org/lib/module-sqlite3.html">http://docs.python.org/lib/module-sqlite3.html</a> for the documentation.</p>
<h4>Relational model design</h4>
<p>I could create a relational database using my original table in the Excel spreadsheet. However, this organization is not in the spirit of relational database design. The [relational] approach is to create many smaller tables and relationships between those tables. For my example, I created a <em>variable</em> table, a <em>function</em> table, and a <em>variable_function</em> table to indicate the relationship between the variable and function tables. Though it looks more complicated than the original single table form, inputting this into a relational database, such as SQLite, will allow for more advanced manipulation of the data. It is also much more [conducive] to maintenance and expansion. For example, now there is a natural place to put the additional function-related information which was so clumsy in the original single table. To do this, I add <em>desc</em> and <em>prototype</em> columns to the <em>function</em> table. If I needed to add more information to about the different modules, I could create a separate <em>module</em> table and include the module_id as a column in the <em>variable</em> table.</p>
<strong style="font-family: arial; font-size: small; text-align: right">VARIABLE TABLE</strong><br />
<table style="font-family: arial;
font-size: x-small;
background-color: #ffffff;">
<col>
</col><tr style="background-color:aaaaaa">
<td style="padding: 2px; background-color: #aaaaaa">id</td>
<td style="padding: 2px; background-color: #aaaaaa">name</td>
<td style="padding: 2px; background-color: #aaaaaa">module</td>
<td style="padding: 2px; background-color: #aaaaaa">type</td>
<td style="padding: 2px; background-color: #aaaaaa">desc</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">foo</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of foo</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">bar</td>
<td style="padding: 2px; background-color: #dddddd">ModuleExt</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of bar</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">knark</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">int</td>
<td style="padding: 2px; background-color: #dddddd">Description of knark</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">wert</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of wert</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">jib</td>
<td style="padding: 2px; background-color: #dddddd">Module1</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of jib</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">laz</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of laz</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">7</td>
<td style="padding: 2px; background-color: #dddddd">kew</td>
<td style="padding: 2px; background-color: #dddddd">Module2</td>
<td style="padding: 2px; background-color: #dddddd">double</td>
<td style="padding: 2px; background-color: #dddddd">Description of kew</td>
</tr>
</table>
<br />
<strong style="font-family: arial; font-size: small; text-align: right">FUNCTION TABLE</strong><br />
<table style="font-family: arial;
font-size: x-small;
background-color: #ffffff;">
<tr style="background-color:aaaaaa"><td style="padding: 2px; background-color: #aaaaaa">id</td><td style="padding: 2px; background-color: #aaaaaa">name</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">1</td><td style="padding: 2px; background-color: #dddddd">ExtSource</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">2</td><td style="padding: 2px; background-color: #dddddd">DoThis</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">3</td><td style="padding: 2px; background-color: #dddddd">CalcThis</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">4</td><td style="padding: 2px; background-color: #dddddd">CalcThat</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">5</td><td style="padding: 2px; background-color: #dddddd">ModifyStuff</td></tr>
<tr style="background-color:dddddd"><td style="padding: 2px; background-color: #dddddd">6</td><td style="padding: 2px; background-color: #dddddd">ExtSink</td></tr>
</table>
<br />
<strong style="font-family: arial; font-size: small; text-align: right">VARIABLE_FUNCTION TABLE</strong><br />
<table style="font-family: arial;
font-size: x-small;
background-color: #ffffff;">
<tr style="background-color:aaaaaa">
<td style="padding: 2px; background-color: #aaaaaa">variable_id</td>
<td style="padding: 2px; background-color: #aaaaaa">function_id</td>
<td style="padding: 2px; background-color: #aaaaaa">type</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">1</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">input-output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">2</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">3</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">7</td>
<td style="padding: 2px; background-color: #dddddd">4</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">7</td>
<td style="padding: 2px; background-color: #dddddd">5</td>
<td style="padding: 2px; background-color: #dddddd">output</td>
</tr>
<tr style="background-color:dddddd">
<td style="padding: 2px; background-color: #dddddd">7</td>
<td style="padding: 2px; background-color: #dddddd">6</td>
<td style="padding: 2px; background-color: #dddddd">input</td>
</tr>
</table>
<h4>SQLite Implementation (w/ Python)</h4>
<p>To implement this database, all I needed was Python 2.5. If you have an older version of Python, you can install the pysqlite module. Here is the Python code to implement the database above.
<pre class="python">import sqlite3
# data
VARIABLES = (
(1, 'foo', 'ModuleExt', 'double', 'Description of foo'),
(2, 'bar', 'ModuleExt', 'double', 'Description of bar'),
(3, 'knark', 'Module1', 'int', 'Description of knark'),
(4, 'wert', 'Module1', 'double', 'Description of wert'),
(5, 'jib', 'Module1', 'double', 'Description of jib'),
(6, 'laz', 'Module2', 'double', 'Description of laz'),
(7, 'kew', 'Module2', 'double', 'Description of kew'),
)
FUNCTIONS = (
(1, 'ExtSource'),
(2, 'DoThis'),
(3, 'CalcThis'),
(4, 'CalcThat'),
(5, 'ModifyStuff'),
(6, 'ExtSink'),
)
VAR_FUNC = (
(1, 1, 'output'),
(1, 2, 'input'),
(1, 4, 'input'),
(2, 1, 'output'),
(2, 2, 'input'),
(2, 3, 'input'),
(3, 2, 'input-output'),
(3, 3, 'input'),
(4, 2, 'output'),
(4, 4, 'input'),
(4, 5, 'input'),
(5, 3, 'output'),
(5, 5, 'input'),
(6, 3, 'output'),
(6, 5, 'output'),
(6, 6, 'input'),
(7, 4, 'output'),
(7, 5, 'output'),
(7, 6, 'input'),
)
# get connection and cursor objects
conn = sqlite3.connect('iodatabase')
c = conn.cursor()
# create tables
c.execute('''create table variable (
id integer,
name text,
module text,
type text,
desc text
)''')
c.execute('''create table function (
id integer,
name text
)''')
c.execute('''create table var_func (
variable_id integer,
function_id integer,
type text
)''')
# fill tables with data
for row in VARIABLES:
c.execute('insert into variable values (?,?,?,?,?)', row)
for row in FUNCTIONS:
c.execute('insert into function values (?,?)', row)
for row in VAR_FUNC:
c.execute('insert into var_func values (?,?,?)', row)</pre>
<h4>Querying the Database</h4>
</p><p>Now that I have created the database, I can get answers to interesting questions, such as <em>What variables are output by CalcThis?</em> Here is the Python/SQLite code to answer this question:
<pre class="python">c.execute(''.join([
'SELECT variable.name, variable.module, variable.type, variable.desc ',
'FROM variable, var_func, function ',
'WHERE variable.id=var_func.variable_id ',
'AND function.id=var_func.function_id ',
'AND function.name="CalcThis" ',
'AND var_func.type="output" ',
]))
FORMAT = '%-6s%-10s%-8s%-20s'
print FORMAT % ('name', 'module', 'type', 'desc')
print '-' * 44
for row in c:
print FORMAT % row</pre>
Here is the output.
<pre>name module type desc
----------------------------------------------
jib Module1 double Description of jib
laz Module2 double Description of laz</pre>
I can see this is consistent with my original table. The query works. For such a small example, the original method may seem easier, but as the number of entries grows, the benefit of the relational database grows as well. Here is another example which asks <em>Which functions use the variable wert as an input?</em>
<pre class="python">c.execute(''.join([
'SELECT function.name ',
'FROM variable, var_func, function ',
'WHERE variable.id=var_func.variable_id ',
'AND function.id=var_func.function_id ',
'AND variable.name="wert" ',
'AND var_func.type="input" ',
]))
print 'name'
print '------------'
for row in c:
print '%s' % row</pre>
Output:
<pre>name
---------
CalcThat
ModifyStuff</pre>
I would like to do even more complicated things like determine the prerequisite input variables across all functions for a given output variable. However, I still need to read more about that, so that will have to wait.
</p>