SaltyCrane Blog — Notes on JavaScript and web development

Iterating over lines in multiple Linux log files using Python

I needed to parse through my Nginx log files to debug a problem. However, the logs are separated into many files, most of which are gzipped, and I wanted the ordering within the files reversed. So I abstracted the logic to handle this into a function. Now I can pass a glob pattern such as /var/log/nginx/cache.log* to my function and iterate over each line in all the files as if they were one file. Here is my function. Let me know if there is a better way to do this.

Update 2010-02-24: To handle multiple log files on a remote host, see my script on github.

import glob
import gzip
import re
 
def get_lines(log_glob):
    """Return an iterator of each line in all files matching log_glob.
    Lines are sorted most recent first.
    Files are sorted by the integer in the suffix of the log filename.
    Suffix may be one of the following:
         .X (where X is an integer)
         .X.gz (where X is an integer)
    If the filename does not end in either suffix, it is treated as if X=0
    """
    def sort_by_suffix(a, b):
        def get_suffix(fname):
            m = re.search(r'.(?:\.(\d+))?(?:\.gz)?$', fname)
            if m.lastindex:
                suf = int(m.group(1))
            else:
                suf = 0
            return suf
        return get_suffix(a) - get_suffix(b)
 
    filelist = glob.glob(log_glob)
    for filename in sorted(filelist, sort_by_suffix):
        if filename.endswith('.gz'):
            fh = gzip.open(filename)
        else:
            fh = open(filename)
        for line in reversed(fh.readlines()):
            yield line
        fh.close()

Here is an example run on my machine. It prints the first 15 characters of every 1000th line of all my syslog files.

for i, line in enumerate(get_lines('/var/log/syslog*')):
    if not i % 1000:
        print line[:15]

File listing:

$ ls -l /var/log/syslog*
-rw-r----- 1 syslog adm 169965 2010 01/23 00:18 /var/log/syslog
-rw-r----- 1 syslog adm 350334 2010 01/22 08:03 /var/log/syslog.1
-rw-r----- 1 syslog adm  18078 2010 01/21 07:49 /var/log/syslog.2.gz
-rw-r----- 1 syslog adm  16700 2010 01/20 07:43 /var/log/syslog.3.gz
-rw-r----- 1 syslog adm  18197 2010 01/19 07:52 /var/log/syslog.4.gz
-rw-r----- 1 syslog adm  15737 2010 01/18 07:45 /var/log/syslog.5.gz
-rw-r----- 1 syslog adm  16157 2010 01/17 07:54 /var/log/syslog.6.gz
-rw-r----- 1 syslog adm  20285 2010 01/16 07:48 /var/log/syslog.7.gz

Results:

Jan 22 23:57:01
Jan 22 14:09:01
Jan 22 03:51:01
Jan 21 17:35:01
Jan 21 14:37:33
Jan 21 08:35:01
Jan 20 22:12:01
Jan 20 11:56:01
Jan 20 01:41:01
Jan 19 15:18:01
Jan 19 04:53:01
Jan 18 18:35:01
Jan 18 08:40:01
Jan 17 22:10:01
Jan 17 11:32:01
Jan 17 01:05:01
Jan 16 14:27:01
Jan 16 04:01:01
Jan 15 17:25:01
Jan 15 08:50:01
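If you are on Python 3, sorted() no longer accepts a cmp-style comparison function and gzip files open in binary mode by default, so the function above needs a few changes. Here is a sketch of an equivalent version using a key function (the name get_lines_py3 is just for illustration; I have only lightly tested it):

```python
import glob
import gzip
import re

def get_lines_py3(log_glob):
    """Python 3 sketch of get_lines: yield each line in all files matching
    log_glob, most recent first (newest file first, lines reversed within
    each file)."""
    def suffix_key(fname):
        # ".X" or ".X.gz" suffix -> X; no numeric suffix -> 0 (current log)
        m = re.search(r'(?:\.(\d+))?(?:\.gz)?$', fname)
        return int(m.group(1)) if m.group(1) else 0

    for filename in sorted(glob.glob(log_glob), key=suffix_key):
        # open gzipped files in text mode so both cases yield str lines
        opener = gzip.open if filename.endswith('.gz') else open
        with opener(filename, 'rt') as fh:
            for line in reversed(fh.readlines()):
                yield line
```

Using key= instead of a comparator also sidesteps the cmp argument that was removed in Python 3.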

Wmii Python script to monitor remote machines

I like to monitor our web servers by ssh'ing into the remote machine and watching "top", tailing log files, etc. Normally, I open a terminal, ssh into the remote machine, run the monitoring command (e.g. "top"), then repeat for the rest of the remote machines. Then I adjust the window sizes so I can see everything at once.

My window manager, wmii, is great for tiling a bunch of windows at once. It is also scriptable with Python, so I wrote a Python script to create my web server monitoring view. Below is my script. I also put a video on YouTube.

#!/usr/bin/env python

import os
import time

NGINX_MONITOR_CMD = "tail --follow=name /var/log/nginx/cache.log | grep --color -E '(HIT|MISS|EXPIRED|STALE|UPDATING|\*\*\*)'"
APACHE_MONITOR_CMD = "top"
MYSQL_MONITOR_CMD = "mysqladmin extended -i10 -r | grep -i 'questions\|aborted_clients\|opened_tables\|slow_queries\|threads_created' "

CMDS_COL1 = ['urxvt -title "Nginx 1" -e ssh -t us-ng1 "%s" &' % NGINX_MONITOR_CMD,
             'urxvt -title "Nginx 2" -e ssh -t us-ng2 "%s" &' % NGINX_MONITOR_CMD,
             ]
CMDS_COL2 = ['urxvt -title "Apache 1" -e ssh -t us-med1 "%s" &' % APACHE_MONITOR_CMD,
             'urxvt -title "Apache 2" -e ssh -t us-med2 "%s" &' % APACHE_MONITOR_CMD,
             'urxvt -title "Apache 3" -e ssh -t us-med3 "%s" &' % APACHE_MONITOR_CMD,
             ]
CMDS_COL3 = ['urxvt -title "MySQL 1" -e ssh -t us-my1 "%s" &' % MYSQL_MONITOR_CMD,
             'urxvt -title "MySQL 2" -e ssh -t us-my2 "%s" &' % MYSQL_MONITOR_CMD,
             ]
COLUMNS = [CMDS_COL1, CMDS_COL2, CMDS_COL3]

def create_windows():
    for i, col in enumerate(COLUMNS):
        cindex = str(i+1)
        for cmd in col:
            os.system(cmd)
            time.sleep(1)
            os.system('wmiir xwrite /tag/sel/ctl send sel %s' % cindex)
        os.system('wmiir xwrite /tag/sel/ctl colmode %s default-max' % cindex)
    os.system('wmii.py 45.5 31.5 23')

if __name__ == '__main__':
    create_windows()

Note 1: The script above uses another script I wrote previously, wmii.py, to set the column widths.

Note 2: The remote server addresses are specified by the nicknames us-ng1, us-ng2, us-med1, etc. configured in my ~/.ssh/config file as described here.

Note 3 (on using ssh and top): I first tried doing ssh host top, but this gave me a "TERM environment variable not set." error. I then tried ssh host "export TERM=rxvt-unicode; top", but this gave me a "top: failed tty get" error. The solution that worked for me was to use the -t option with ssh, e.g. ssh -t host top. This is what I used in the script above.

Note 4 (added 2010-03-05): I used "tail --follow=name" instead of "tail -f" so that tail will follow the log file even after it has been rotated. For more information, see the man page for tail.

Note 5 (added 2010-03-05): To prevent your ssh session from timing out, add the following 2 lines to your ~/.ssh/config file (via):

Host *
  ServerAliveInterval 60

Trying out a Retry decorator in Python

The Python wiki has a Retry decorator example which retries calling a failure-prone function using an exponential backoff algorithm. I modified it slightly to check for exceptions instead of a False return value to indicate failure. Each time the decorated function throws an exception, the decorator waits a period of time and retries calling the function until the maximum number of tries is used up. If the decorated function fails on the last try, the exception propagates unhandled.

import time
from functools import wraps


def retry(ExceptionToCheck, tries=4, delay=3, backoff=2, logger=None):
    """Retry calling the decorated function using an exponential backoff.

    http://www.saltycrane.com/blog/2009/11/trying-out-retry-decorator-python/
    original from: http://wiki.python.org/moin/PythonDecoratorLibrary#Retry

    :param ExceptionToCheck: the exception to check. may be a tuple of
        exceptions to check
    :type ExceptionToCheck: Exception or tuple
    :param tries: number of times to try (not retry) before giving up
    :type tries: int
    :param delay: initial delay between retries in seconds
    :type delay: int
    :param backoff: backoff multiplier e.g. value of 2 will double the delay
        each retry
    :type backoff: int
    :param logger: logger to use. If None, print
    :type logger: logging.Logger instance
    """
    def deco_retry(f):

        @wraps(f)
        def f_retry(*args, **kwargs):
            mtries, mdelay = tries, delay
            while mtries > 1:
                try:
                    return f(*args, **kwargs)
                except ExceptionToCheck, e:
                    msg = "%s, Retrying in %d seconds..." % (str(e), mdelay)
                    if logger:
                        logger.warning(msg)
                    else:
                        print msg
                    time.sleep(mdelay)
                    mtries -= 1
                    mdelay *= backoff
            return f(*args, **kwargs)

        return f_retry  # true decorator

    return deco_retry

Try an "always fail" case

@retry(Exception, tries=4)
def test_fail(text):
    raise Exception("Fail")

test_fail("it works!")

Results:

Fail, Retrying in 3 seconds...
Fail, Retrying in 6 seconds...
Fail, Retrying in 12 seconds...
Traceback (most recent call last):
  File "retry_decorator.py", line 47, in <module>
    test_fail("it works!")
  File "retry_decorator.py", line 26, in f_retry
    f(*args, **kwargs)
  File "retry_decorator.py", line 33, in test_fail
    raise Exception("Fail")
Exception: Fail

Try a "success" case

@retry(Exception, tries=4)
def test_success(text):
    print "Success: ", text

test_success("it works!")

Results:

Success:  it works!

Try a "random fail" case

import random

@retry(Exception, tries=4)
def test_random(text):
    x = random.random()
    if x < 0.5:
        raise Exception("Fail")
    else:
        print "Success: ", text

test_random("it works!")

Results:

Fail, Retrying in 3 seconds...
Success:  it works!

Try handling multiple exceptions

Added 2010-04-27

import random

@retry((NameError, IOError), tries=20, delay=1, backoff=1)
def test_multiple_exceptions():
    x = random.random()
    if x < 0.40:
        raise NameError("NameError")
    elif x < 0.80:
        raise IOError("IOError")
    else:
        raise KeyError("KeyError")

test_multiple_exceptions()

Results:

IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
Traceback (most recent call last):
  File "retry_decorator.py", line 61, in <module>
    test_multiple_exceptions()
  File "retry_decorator.py", line 14, in f_retry
    f(*args, **kwargs)
  File "retry_decorator.py", line 56, in test_multiple_exceptions
    raise KeyError("KeyError")
KeyError: 'KeyError'

Unit tests

Added 2013-01-22. Note: Python 2.7 is required to run the tests.

import logging
import unittest

from decorators import retry


class RetryableError(Exception):
    pass


class AnotherRetryableError(Exception):
    pass


class UnexpectedError(Exception):
    pass


class RetryTestCase(unittest.TestCase):

    def test_no_retry_required(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def succeeds():
            self.counter += 1
            return 'success'

        r = succeeds()

        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 1)

    def test_retries_once(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def fails_once():
            self.counter += 1
            if self.counter < 2:
                raise RetryableError('failed')
            else:
                return 'success'

        r = fails_once()
        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 2)

    def test_limit_is_reached(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def always_fails():
            self.counter += 1
            raise RetryableError('failed')

        with self.assertRaises(RetryableError):
            always_fails()
        self.assertEqual(self.counter, 4)

    def test_multiple_exception_types(self):
        self.counter = 0

        @retry((RetryableError, AnotherRetryableError), tries=4, delay=0.1)
        def raise_multiple_exceptions():
            self.counter += 1
            if self.counter == 1:
                raise RetryableError('a retryable error')
            elif self.counter == 2:
                raise AnotherRetryableError('another retryable error')
            else:
                return 'success'

        r = raise_multiple_exceptions()
        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 3)

    def test_unexpected_exception_does_not_retry(self):

        @retry(RetryableError, tries=4, delay=0.1)
        def raise_unexpected_error():
            raise UnexpectedError('unexpected error')

        with self.assertRaises(UnexpectedError):
            raise_unexpected_error()

    def test_using_a_logger(self):
        self.counter = 0

        sh = logging.StreamHandler()
        logger = logging.getLogger(__name__)
        logger.addHandler(sh)

        @retry(RetryableError, tries=4, delay=0.1, logger=logger)
        def fails_once():
            self.counter += 1
            if self.counter < 2:
                raise RetryableError('failed')
            else:
                return 'success'

        fails_once()


if __name__ == '__main__':
    unittest.main()

Code / License

This code is also on github at: https://github.com/saltycrane/retry-decorator. It is BSD licensed.
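A note for Python 3 users: the except ExceptionToCheck, e syntax and the print statements in the decorator above are Python 2 only. Here is a minimal sketch of a Python 3 port, keeping the same logic:

```python
import time
from functools import wraps

def retry(exception_to_check, tries=4, delay=3, backoff=2, logger=None):
    """Python 3 sketch of the retry decorator above: retry the decorated
    function on exception_to_check, with exponential backoff."""
    def deco_retry(f):
        @wraps(f)
        def f_retry(*args, **kwargs):
            mtries, mdelay = tries, delay
            while mtries > 1:
                try:
                    return f(*args, **kwargs)
                except exception_to_check as e:
                    msg = "%s, Retrying in %d seconds..." % (e, mdelay)
                    if logger:
                        logger.warning(msg)
                    else:
                        print(msg)
                    time.sleep(mdelay)
                    mtries -= 1
                    mdelay *= backoff
            # last attempt: let any exception propagate unhandled
            return f(*args, **kwargs)
        return f_retry
    return deco_retry
```

Only the except-as syntax and print() call differ; the backoff logic is unchanged.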

Using Nginx as a caching proxy with Wordpress+Apache

We have been evaluating caching reverse proxy servers at work. We looked at Nginx+memcached, Squid, and Varnish. Most recently, we found that Nginx version 0.7 has support for caching static files using the proxy_cache directive in the NginxHttpProxyModule. This allows us to use Nginx as a caching proxy without having to handle the complication (or flexibility depending on how you look at it) of setting and invalidating the cache as with the Nginx+memcached setup. Here are my notes for setting it up with an Apache+Wordpress backend.

Update 2010-01-05: Over a couple months, we switched to Nginx 0.8 and we made a few tweaks to our Nginx configuration. Here is our updated conf file: nginx_wordpress_100105.conf.

Install Nginx 0.7

The version of Nginx in Ubuntu is an older version so we used a PPA created by Jeff Waugh: https://launchpad.net/~jdub/+archive/ppa. (He also has a development PPA which contains Nginx 0.8.)

  • Add the following to /etc/apt/sources.list:
    deb http://ppa.launchpad.net/jdub/ppa/ubuntu hardy main 
    deb-src http://ppa.launchpad.net/jdub/ppa/ubuntu hardy main
  • Tell Ubuntu how to authenticate the PPA
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E9EEF4A1

    Alternatively, if the keyserver is down, you can follow the instructions for copying the public key from http://forum.nginx.org/read.php?2,5177,11272.

  • Install Nginx from new PPA
    apt-get update
    apt-get install nginx
  • Check the version of Nginx
    nginx -V
    nginx version: nginx/0.7.62
    configure arguments: --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --with-debug --with-http_stub_status_module --with-http_flv_module --with-http_ssl_module --with-http_dav_module --with-http_gzip_static_module --with-ipv6 --with-http_realip_module --with-http_xslt_module --with-http_image_filter_module --with-sha1=/usr/include/openssl

Configure Nginx cache logging

Within the http {} block, add:

    log_format cache '***$time_local '
                     '$upstream_cache_status '
                     'Cache-Control: $upstream_http_cache_control '
                     'Expires: $upstream_http_expires '
                     '"$request" ($status) '
                     '"$http_user_agent" ';
    access_log  /var/log/nginx/cache.log cache;

Nginx configuration for backend servers

Within the http {} block, add:

    include /etc/nginx/app-servers.include;

And /etc/nginx/app-servers.include looks like:

upstream backend {
    ip_hash;

    server 10.245.275.88:80;
    server 10.292.150.34:80;
}

Configure cache path/parameters

Within the http {} block, add:

    proxy_cache_path /var/www/nginx_cache levels=1:2
                     keys_zone=one:10m
                     inactive=7d max_size=200m;
    proxy_temp_path /var/www/nginx_temp;

More proxy cache configuration

We added the username from the wordpress_logged_in_* cookie as part of the cache key so that different logged in users will get the appropriate page from the cache. However, our Wordpress configuration sends HTTP headers disabling the cache when a user is logged in so this is actually not used. But it does not hurt to include this, in case we change our Wordpress configuration in the future.

Within the server {} block, add:

        location / {
            # capture cookie for use in cache key
            if ($http_cookie ~* "wordpress_logged_in_[^=]*=([^%]+)%7C") {
                set $my_cookie $1;
            }

            proxy_pass http://backend;
            proxy_cache one;
            proxy_cache_key $scheme$proxy_host$uri$is_args$args$my_cookie;
            proxy_cache_valid  200 302 304 10m;
            proxy_cache_valid  301 1h;
            proxy_cache_valid  any 1m;
        }

Configure locations that shouldn't be cached

If WordPress sends the appropriate HTTP Cache-Control headers, this step is not necessary. But we have added it to be on the safe side. Within the server {} block, add:

        location /wp-admin { proxy_pass http://backend; }
        location /wp-login.php { proxy_pass http://backend; }

Restart Nginx

The Nginx reverse proxy cache should work without modification to the Apache configuration. In our case, we had to disable WP Super Cache because we had been using that previously.

/etc/init.d/nginx restart

View the log

Check /var/log/nginx/cache.log to see if everything is working correctly. The log should display HIT, MISS, and EXPIRED appropriately. If the log shows only misses, check the Cache-Control and Expires HTTP headers that are sent from Apache+Wordpress.

Example Apache/Wordpress configuration that disabled the Nginx cache

Part of the WP Super Cache configuration included the following in the .htaccess file. It had to be removed for Nginx to cache the pages. (In particular, the must-revalidate part had to be removed.)

     Header set Cache-Control 'max-age=300, must-revalidate'

How to make urxvt look like gnome-terminal

My terminal of choice is rxvt-unicode (urxvt) because it is fast and lightweight. However, I recently opened up gnome-terminal and it was so much prettier than my urxvt. Here's how I made my urxvt look like gnome-terminal. The last step involves compiling urxvt from source because the latest source includes a patch to configure horizontal spacing of letters.

Set up colors

Add the following to your ~/.Xdefaults file:

! to match gnome-terminal "Linux console" scheme
! foreground/background
URxvt*background: #000000
URxvt*foreground: #ffffff
! black
URxvt.color0  : #000000
URxvt.color8  : #555555
! red
URxvt.color1  : #AA0000
URxvt.color9  : #FF5555
! green
URxvt.color2  : #00AA00
URxvt.color10 : #55FF55
! yellow
URxvt.color3  : #AA5500
URxvt.color11 : #FFFF55
! blue
URxvt.color4  : #0000AA
URxvt.color12 : #5555FF
! magenta
URxvt.color5  : #AA00AA
URxvt.color13 : #FF55FF
! cyan
URxvt.color6  : #00AAAA
URxvt.color14 : #55FFFF
! white
URxvt.color7  : #AAAAAA
URxvt.color15 : #FFFFFF

Select font

Also add the following to your ~/.Xdefaults file:

URxvt*font: xft:Monospace:pixelsize=11

Don't use a bold font

Also add the following to your ~/.Xdefaults file:

URxvt*boldFont: xft:Monospace:pixelsize=11

Fix urxvt font width

This is the most difficult thing to fix. It requires installing urxvt from CVS source.

  • Install prerequisites:
    apt-get build-dep rxvt-unicode
  • Get CVS source code:
    cvs -z3 -d :pserver:[email protected]/schmorpforge co rxvt-unicode
  • Configure:
    cd rxvt-unicode
    ./configure --prefix=/home/saltycrane/lib/rxvt-unicode-20091102
  • Make & make install:
    make
    make install
  • Link urxvt executable to your ~/bin directory:
    cd ~/bin
    ln -s ../lib/rxvt-unicode-20091102/bin/urxvt .
  • Edit ~/.Xdefaults once again:
    URxvt*letterSpace: -1

Also cool: Open links in Firefox

Here is another trick (thanks to Zachary Tatlock) to make clicking on URLs open in your Firefox browser. Add the following to your ~/.Xdefaults (yes there's Perl in your urxvt!):

URxvt.perl-ext-common : default,matcher
URxvt.urlLauncher     : firefox
URxvt.matcher.button  : 1

Screenshots

Urxvt (default):

ugly urxvt screenshot

Gnome-terminal:

gnome-terminal screenshot

Urxvt (modified):

pretty urxvt screenshot

If you're interested, here is how I printed the terminal colors:

#!/bin/bash
echo -e "\\e[0mCOLOR_NC (No color)"
echo -e "\\e[1;37mCOLOR_WHITE\\t\\e[0;30mCOLOR_BLACK"
echo -e "\\e[0;34mCOLOR_BLUE\\t\\e[1;34mCOLOR_LIGHT_BLUE"
echo -e "\\e[0;32mCOLOR_GREEN\\t\\e[1;32mCOLOR_LIGHT_GREEN"
echo -e "\\e[0;36mCOLOR_CYAN\\t\\e[1;36mCOLOR_LIGHT_CYAN"
echo -e "\\e[0;31mCOLOR_RED\\t\\e[1;31mCOLOR_LIGHT_RED"
echo -e "\\e[0;35mCOLOR_PURPLE\\t\\e[1;35mCOLOR_LIGHT_PURPLE"
echo -e "\\e[0;33mCOLOR_YELLOW\\t\\e[1;33mCOLOR_LIGHT_YELLOW"
echo -e "\\e[1;30mCOLOR_GRAY\\t\\e[0;37mCOLOR_LIGHT_GRAY"

Notes on switching my Djangos to mod_wsgi

I'm slowly trying to make my Django web servers conform to current best practices. I've set up an Nginx reverse proxy for serving static files, started using virtualenv to isolate my Python environments, and migrated my database to PostgreSQL. I ultimately want to implement memcached+Nginx caching in my reverse proxy, but the next task on my to-do list is switching from mod_python to mod_wsgi.

Within the past year (or maybe before), mod_wsgi has become the preferred method for serving Django applications. I also originally thought switching from mod_python to mod_wsgi would save me some much needed memory on my 256MB VPS. But after trying it out, running with a single Apache process in each case, the memory footprint was about the same. Even switching from mod_wsgi's embedded mode to daemon mode didn't make a significant difference. Likely the performance is better with mod_wsgi, though.

Here are my notes on installing mod_wsgi.

Configuration References

Advice from mod_wsgi author Graham Dumpleton

Install mod_wsgi and apache mpm-worker

I'm not 100% sure about prefork vs. worker mpm, but Graham Dumpleton favors worker mpm.

sudo apt-get install libapache2-mod-wsgi
sudo apt-get install apache2-mpm-worker

Create .wsgi application file

My virtualenv is located at /srv/python-environments/saltycrane. My Django settings file is at /srv/SaltyCrane/iwiwdsmi/settings.py.

/srv/SaltyCrane/saltycrane.wsgi:

import os
import sys
import site

site.addsitedir('/srv/python-environments/saltycrane/lib/python2.5/site-packages')

os.environ['DJANGO_SETTINGS_MODULE'] = 'iwiwdsmi.settings'

sys.path.append('/srv/SaltyCrane')

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

Edit Apache's httpd.conf file

I went back and forth between using embedded mode or daemon mode. I've ended up with embedded mode for now since it seems to use a tad less memory and is supposed to be a little bit faster. However, Graham Dumpleton seems to recommend daemon mode for people on VPSs. I may change my mind again later. To use daemon mode, I just need to uncomment the WSGIDaemonProcess and WSGIProcessGroup lines. I have StartServers set to 1 because I can only afford to have one Apache process running. This is assuming nginx is proxying requests to apache. For more on my nginx setup, see here.

Edit /etc/apache2/httpd.conf:

<IfModule mpm_worker_module>
    StartServers 1
    ServerLimit 1
    ThreadsPerChild 5
    ThreadLimit 5
    MinSpareThreads 5
    MaxSpareThreads 5
    MaxClients 5
    MaxRequestsPerChild 500
</IfModule>

KeepAlive Off
NameVirtualHost 127.0.0.1:8080
Listen 8080

<VirtualHost 127.0.0.1:8080>
    ServerName www.saltycrane.com
    # WSGIDaemonProcess saltycrane.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup saltycrane.com
    WSGIScriptAlias / /srv/SaltyCrane/saltycrane.wsgi
</VirtualHost>

<VirtualHost 127.0.0.1:8080>
    ServerName supafu.com
    # WSGIDaemonProcess supafu.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup supafu.com
    WSGIScriptAlias / /srv/Supafu/supafu.wsgi
</VirtualHost>

<VirtualHost 127.0.0.1:8080>
    ServerName handsoncards.com
    # WSGIDaemonProcess handsoncards.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup handsoncards.com
    WSGIScriptAlias / /srv/HandsOnCards/handsoncards.wsgi
</VirtualHost>

Restart Apache

sudo /etc/init.d/apache2 restart

Install WordPress 2.8.4 on Ubuntu 9.04 Jaunty

Since we're using WordPress at work, I decided to install WordPress on my local machine for testing and educational purposes. Here are my notes. The versions of stuff are: WordPress 2.8.4, Ubuntu 9.04 Jaunty Jackalope, MySQL 5.1, PHP 5.2.6, Apache 2.2.11 with prefork mpm.

Install prerequisites

  • Install Apache and PHP
    sudo apt-get install php5
    sudo apt-get install php5-mysql
  • Install MySQL Server
    sudo apt-get install mysql-server

    Set a password for the MySQL root user when prompted.

Download Wordpress code

cd /var/www
sudo wget http://wordpress.org/wordpress-2.8.4.tar.gz
sudo tar zxvf wordpress-2.8.4.tar.gz

Create MySQL database and user

mysql -uroot -p

Enter the password you created above.

CREATE DATABASE wordpress;
CREATE USER 'wp_user'@'localhost' IDENTIFIED BY 'wp_password';
GRANT ALL PRIVILEGES ON wordpress.* TO 'wp_user'@'localhost';
\q

Edit WordPress wp-config.php file

  • Copy the sample file
    cd /var/www/wordpress
    sudo cp wp-config-sample.php wp-config.php
  • Edit the following lines in /var/www/wordpress/wp-config.php
    /** The name of the database for WordPress */
    define('DB_NAME', 'wordpress');
    
    /** MySQL database username */
    define('DB_USER', 'wp_user');
    
    /** MySQL database password */
    define('DB_PASSWORD', 'wp_password');
    
    /** MySQL hostname */
    define('DB_HOST', 'localhost');

Set up Apache virtual host

  • Edit /etc/apache2/sites-available/wordpress
    ServerName localhost
    <VirtualHost *:80>
    	DocumentRoot /var/www/wordpress
    	ErrorLog /var/log/apache2/wordpress.error.log
    </VirtualHost>
        
  • Symlink to sites-enabled
    sudo ln -s /etc/apache2/sites-available/wordpress /etc/apache2/sites-enabled/wordpress
  • Remove default virtual host from sites-enabled
    sudo rm /etc/apache2/sites-enabled/000-default

Restart Apache and view your Wordpress site

  • sudo /etc/init.d/apache2 restart
  • Go to http://localhost in your browser. You should get the WordPress Welcome page.

Notes on Python logging

mylogging.py:

import logging
import sys

DEBUG_LOG_FILENAME = '/var/log/my-debug.log'
WARNING_LOG_FILENAME = '/var/log/my-warning.log'

# set up formatting
formatter = logging.Formatter('[%(asctime)s] %(levelno)s (%(process)d) %(module)s: %(message)s')

# set up logging to STDOUT for all levels DEBUG and higher
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
sh.setFormatter(formatter)

# set up logging to a file for all levels DEBUG and higher
fh = logging.FileHandler(DEBUG_LOG_FILENAME)
fh.setLevel(logging.DEBUG)
fh.setFormatter(formatter)

# set up logging to a file for all levels WARNING and higher
fh2 = logging.FileHandler(WARNING_LOG_FILENAME)
fh2.setLevel(logging.WARN)
fh2.setFormatter(formatter)

# create Logger object
mylogger = logging.getLogger('MyLogger')
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(sh)
mylogger.addHandler(fh)
mylogger.addHandler(fh2)

# create shortcut functions
debug = mylogger.debug
info = mylogger.info
warning = mylogger.warning
error = mylogger.error
critical = mylogger.critical

testlogging.py:

from mylogging import debug, info, warning, error

debug('debug message')
info('info message')
warning('warning message')
error('error message')

Run it:

python testlogging.py

Console output:

[2009-10-07 12:45:59,713] 10 (22886) testlogging: debug message
[2009-10-07 12:45:59,718] 20 (22886) testlogging: info message
[2009-10-07 12:45:59,718] 30 (22886) testlogging: warning message
[2009-10-07 12:45:59,719] 40 (22886) testlogging: error message

cat /var/log/my-debug.log:

[2009-10-07 12:45:59,713] 10 (22886) testlogging: debug message
[2009-10-07 12:45:59,718] 20 (22886) testlogging: info message
[2009-10-07 12:45:59,718] 30 (22886) testlogging: warning message
[2009-10-07 12:45:59,719] 40 (22886) testlogging: error message

cat /var/log/my-warning.log:

[2009-10-07 12:45:59,718] 30 (22886) testlogging: warning message
[2009-10-07 12:45:59,719] 40 (22886) testlogging: error message

Note: if you get a permission denied error for the log file, you can do this:

sudo touch /var/log/my-debug.log
sudo touch /var/log/my-warning.log
sudo chmod 666 /var/log/my-debug.log
sudo chmod 666 /var/log/my-warning.log

Notes on Python Fabric 0.9b1

Fabric is a Python package used for deploying websites or, more generally, running commands on a remote server. I first used Fabric about a year ago and thought it was great. Since then, Fabric has gained a new maintainer, a new domain, and a few new revisions.

Here are my notes on installing the latest stable version (0.9b1) on Ubuntu Jaunty and running a simple example.

Install Fabric 0.9b1

  • Install Easy Install & pip
    sudo apt-get install python-setuptools python-dev build-essential
    sudo easy_install -U pip
  • Install Fabric

    Note: According to the Fabric website, the latest version of the prerequisite Python library, Paramiko, has a bug, so it is recommended to install the previous version, 1.7.4, instead. This can be accomplished by creating a requirements file for pip:

    http://www.lag.net/paramiko/download/paramiko-1.7.4.tar.gz
    http://git.fabfile.org/cgit.cgi/fabric/snapshot/fabric-0.9b1.tar.gz

    To install, use the pip install command with the -r option and the path to your requirements file. For convenience, you can install Fabric using my requirements file:

    sudo pip install -r http://www.saltycrane.com/site_media/code/fabric-requirements.txt

Using Fabric

  • Create a file called fabfile.py in ~/myproject:
    from __future__ import with_statement # needed for python 2.5
    from fabric.api import env, run
    
    def ec2():
        env.hosts = ['ec2-65-234-55-183.compute-1.amazonaws.com']
        env.user = 'saltycrane'
        env.key_filename = '/path/to/my/id_ssh_keyfile'
    
    def ps_apache():
        run('ps -e -O rss,pcpu | grep apache')
    
  • Run it
    cd ~/myproject
    fab ec2 ps_apache

    Results:

    [ec2-65-234-55-183.compute-1.amazonaws.com] run: ps -e -O rss,pcpu | grep apache
    [ec2-65-234-55-183.compute-1.amazonaws.com] err: stdin: is not a tty
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  3571 10996  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5047 28352  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5048 27756  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5049 23752  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5050 27344  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5055 27344  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5166 28404  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  5167 27900  0.0 S ?        00:00:00 /usr/sbin/apache2 -k start
    [ec2-65-234-55-183.compute-1.amazonaws.com] out:  9365  1208  0.0 S ?        00:00:00 /bin/bash -l -c ps -e -O rss,pcpu | grep apache
    
    Done.
    Disconnecting from ec2-65-234-55-183.compute-1.amazonaws.com... done.

List of available env options

I extracted this list from state.py (0.9b1); you can also view the tip version.

env.reject_unknown_hosts = True         # reject unknown hosts
env.disable_known_hosts = True          # do not load user known_hosts file
env.user = 'username'                   # username to use when connecting to remote hosts
env.password = 'mypassword'             # password for use with authentication and/or sudo
env.hosts = ['host1.com', 'host2.com']  # comma-separated list of hosts to operate on
env.roles = ['web']                     # comma-separated list of roles to operate on
env.key_filename = 'id_rsa'             # path to SSH private key file. May be repeated.
env.fabfile = '../myfabfile.py'         # name of fabfile to load, e.g. 'fabfile.py' or '../other.py'
env.warn_only = True                    # warn, instead of abort, when commands fail
env.shell = '/bin/sh'                   # specify a new shell, defaults to '/bin/bash -l -c'
env.rcfile = 'myfabconfig'              # specify location of config file to use
env.hide = ['everything']               # comma-separated list of output levels to hide
env.show = ['debug']                    # comma-separated list of output levels to show
env.version = '1.0'
env.sudo_prompt = 'sudo password:'
env.use_shell = False
env.roledefs = {'web': ['www1', 'www2', 'www3'],
                'dns': ['ns1', 'ns2'],
                }
env.cwd = 'mydir'

How to check the status code of a command

To check the return code of your command, set the env.warn_only option to True and check the return_code attribute of the object returned from run(). For example:

def ec2():
    env.hosts = ['ec2-65-234-55-183.compute-1.amazonaws.com']
    env.user = 'saltycrane'
    env.key_filename = '/path/to/my/id_ssh_keyfile'
    env.warn_only = True

def getstatus():
    output = run('ls non_existent_file')
    print 'output:', output
    print 'failed:', output.failed
    print 'return_code:', output.return_code
Run it:

fab ec2 getstatus
[ec2-65-234-55-183.compute-1.amazonaws.com] run: ls non_existent_file
[ec2-65-234-55-183.compute-1.amazonaws.com] err: ls: cannot access non_existent_file: No such file or directory

Warning: run() encountered an error (return code 2) while executing 'ls non_existent_file'

output:
failed: True
return_code: 2

Done.
Disconnecting from ec2-65-234-55-183.compute-1.amazonaws.com... done.

Other notes

  • Error message: paramiko.SSHException: Channel closed.

    Try using Paramiko version 1.7.4 instead of 1.7.5. See http://www.mail-archive.com/[email protected]/msg00844.html.

  • How to check the version of Paramiko:
    $ python
    Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) 
    [GCC 4.3.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import paramiko
    >>> paramiko.__version__
    '1.7.5 (Ernest)'
  • Error message: Fatal error: No existing session

    This occurred when I used the wrong username.

Notes on working with files and directories in Python

How to list files in a directory

See my separate post: How to list the contents of a directory with Python

How to rename a file: os.rename

Documentation: http://docs.python.org/library/os.html#os.rename

import os
os.rename("/tmp/oldname", "/tmp/newname")

How to imitate mkdir -p

import os
if not os.path.exists(directory):
    os.makedirs(directory)
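One caveat: the check-then-create pattern above has a race condition. If another process creates the directory between the os.path.exists check and the os.makedirs call, makedirs raises OSError. A sketch that handles this by catching the error instead (the name mkdir_p is just for illustration):

```python
import errno
import os

def mkdir_p(path):
    """Imitate `mkdir -p`: create path and any missing parents; do nothing
    if the directory already exists."""
    try:
        os.makedirs(path)
    except OSError as e:
        # swallow "already exists"; re-raise anything else (e.g. permissions)
        if e.errno != errno.EEXIST:
            raise
```

This also makes repeated calls safe, just like running mkdir -p twice.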

How to imitate cp -r (but copy only files, including hidden dotfiles)

What didn't work for my purpose:

import os
from fabric.api import run  # this ran inside a fabfile; run() below is Fabric's remote run

def _copy_dash_r_filesonly(src, dst):
    """Like "cp -r src/* dst" but copy files only (don't include directories)
    (and include hidden dotfiles also)
    """
    for (path, dirs, files) in os.walk(src):
        for filename in files:
            srcfilepath = os.path.join(path, filename)
            dstfilepath = os.path.join(dst, os.path.relpath(srcfilepath, src))
            dstdir = os.path.dirname(dstfilepath)
            if not os.path.exists(dstdir):
                run('mkdir -p %s' % dstdir)
            run('cp -f %s %s' % (srcfilepath, dstfilepath))