How to remove ^M characters from a file with Python
Use the following Python script to remove ^M (carriage
return) characters from your file, leaving only a newline
character at the end of each line. Note that rstrip() also removes
any other trailing whitespace. To do this in Emacs,
see my notes here.
remove_ctrl_m_chars.py:
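import os
import sys
import tempfile


def main():
    filename = sys.argv[1]
    # write the cleaned lines to a temporary file first
    with tempfile.NamedTemporaryFile(delete=False) as fh:
        for line in open(filename):
            # strip trailing whitespace, including the ^M and the newline
            line = line.rstrip()
            fh.write(line + '\n')
    # keep the original as a backup, then move the cleaned file into place
    os.rename(filename, filename + '.bak')
    os.rename(fh.name, filename)


if __name__ == '__main__':
    main()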
Run it
$ python remove_ctrl_m_chars.py myfile.txt
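For Python 3, a rough equivalent might look like the sketch below. It is not the script above: it assumes the file fits in memory, works on bytes, and replaces only \r\n pairs, so other trailing whitespace is left alone. The function name remove_ctrl_m is made up for illustration.

```python
import os
import tempfile


def remove_ctrl_m(filename):
    """Convert CRLF line endings in filename to LF, keeping a .bak copy."""
    with open(filename, 'rb') as src:
        data = src.read()
    cleaned = data.replace(b'\r\n', b'\n')
    # Assumption: writing the temp file next to the original keeps both
    # renames on the same filesystem.
    dirname = os.path.dirname(os.path.abspath(filename))
    with tempfile.NamedTemporaryFile(mode='wb', dir=dirname, delete=False) as fh:
        fh.write(cleaned)
    # Keep the original as a backup, then move the cleaned file into place.
    os.replace(filename, filename + '.bak')
    os.replace(fh.name, filename)
```

Calling remove_ctrl_m('myfile.txt') should leave the converted text in myfile.txt and the original in myfile.txt.bak.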
Documentation
- Built-in Functions — open()
- Built-in Types — File Objects
- tempfile — Generate temporary files and directories
- String Methods ...
Notes on debugging ssh connection problems
- Run the ssh client in verbose mode
  $ ssh -vvv user@host
- On the server, check auth.log for errors
  $ sudo tail -f /var/log/auth.log
  On Red Hat, it's /var/log/secure
- For more debugging info (assuming you have control of the ssh server), run the sshd server in debug mode on another port
  $ sudo /usr/sbin/sshd -ddd -p 33333
  Then specify the port, -p 33333, with the ssh client, e.g.
  $ ssh -vvv -p 33333 user@host
Commands run on Ubuntu 10.04
sftp error: Received message too long 170160758
Problem was in the .bashrc. See ...
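One possible way to read the number in that error, assuming sftp took the first four bytes of unexpected shell startup output as a big-endian packet length: turn the number back into bytes and see what text it spells. The helper name length_as_bytes is hypothetical.

```python
import struct


def length_as_bytes(n):
    """Return the four bytes that encode n as a big-endian unsigned 32-bit integer."""
    return struct.pack('>I', n)


# The number from the error message above
print(length_as_bytes(170160758))  # b'\n$rv'
```

If the bytes look like the start of something a shell startup file prints, that output is a likely suspect.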
Example using git bisect to narrow in on a commit
I learned about git bisect from this Stack Overflow poll: What are your favorite git features or tricks?
I thought it was so cool that I wanted to share an example.
After upgrading from Django 1.2.3 to Django 1.3, something broke on the website
I was working on. To figure out what was wrong, I used git bisect
to find the Django revision that introduced the relevant change. I cloned the
Django github repo and
pip install -e'd it into my virtualenv. Then
I used git bisect as follows:
- Start git bisect
  $ git bisect start
  $ git ...
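The idea behind bisecting can be sketched as a binary search. This is only an illustration on a plain list, assuming a linear history where the first entry is known good and the last is known bad; it is not how git itself is implemented, and the names first_bad and is_bad are hypothetical.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


# Toy history: revisions 0..99, where revision 37 introduced the breakage
print(first_bad(list(range(100)), lambda rev: rev >= 37))  # 37
```

Each test halves the remaining range, so 100 revisions need about 7 checks instead of up to 100.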
Remove leading and trailing whitespace from a csv file with Python
I'm reading a CSV file with the Python csv module and could not find a setting to remove trailing whitespace. I found the Dialect.skipinitialspace setting, but I think it only applies to leading whitespace. Here's a one-liner to delete leading and trailing whitespace that worked for me.
import csv

reader = csv.DictReader(
    open('myfile.csv'),
    fieldnames=('myfield1', 'myfield2', 'myfield3'),
)

# skip the header row
next(reader)

# remove leading and trailing whitespace from all values
reader = (
    dict((k, v.strip()) for k, v in row.items()) for row in reader)

# print results
for row in reader:
    print row
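Here is a self-contained variation on the same idea, written as a sketch for Python 3 with an in-memory sample instead of myfile.csv. The helper name read_stripped and the sample data are made up; the isinstance check is an added assumption to cope with rows that have missing or extra columns, where DictReader yields non-string values.

```python
import csv
import io


def read_stripped(fileobj, fieldnames):
    """Yield one dict per data row with whitespace stripped from string values."""
    reader = csv.DictReader(fileobj, fieldnames=fieldnames)
    next(reader)  # skip the header row
    for row in reader:
        # Short rows give None values and long rows give a list under the
        # None key, so only strip actual strings.
        yield {k: v.strip() if isinstance(v, str) else v
               for k, v in row.items()}


sample = io.StringIO("a, b ,c\n 1 ,2 , 3\n")
rows = list(read_stripped(sample, ('myfield1', 'myfield2', 'myfield3')))
print(rows)
```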
Example parsing XML with lxml.objectify
Example run with lxml 2.3, Python 2.6.6 on Ubuntu 10.10
from lxml import objectify
xml = '''
<dataset>
<statusthing>success</statusthing>
<datathing gabble="sent">joe@email.com</datathing>
<datathing gabble="not sent"></datathing>
</dataset>
'''
root = objectify.fromstring(xml)
print root.tag
print root.text
print root.attrib
# dataset
# None
# {}
print root.statusthing.tag
print root.statusthing.text
print root.statusthing.attrib
# statusthing
# success
# {}
for e in root.datathing:
    print e.tag
    print e.text
    print e.attrib
    print e.attrib['gabble']
# datathing
# joe@email.com
# {'gabble': 'sent'}
# sent
# datathing
# None
# {'gabble': 'not sent'}
# not sent
for e in ...
Colorized, interactive "git blame" in Emacs: vc-annotate
A few months ago, I learned that vc-annotate displays a nicely colorized
git blame in Emacs.
Today I learned that it is also interactive. I can cycle through revisions
using p and n, open the file at a specified revision
with f, view the log with l, and show a diff
with d or changeset diff with D. (Unfortunately,
the diff is ugly and not in color.)
vc-annotate is included with Emacs. To use it, open a version-controlled file and type C-x v g or M-x vc-annotate.
I am using Emacs 23.1 on Ubuntu Maverick.
How to download a tarball from github using curl
The -L option is the key. It tells curl to follow redirects to the next URL. Here's how to download a tarball from github and untar it inline:
$ curl -L https://github.com/pinard/Pymacs/tarball/v0.24-beta2 | tar zx
Via http://support.github.com/discussions/repos/1789-you-cant-download-a-tarball-with-curl
Alternatively, using wget:
$ wget --no-check-certificate https://github.com/pinard/Pymacs/tarball/v0.24-beta2 -O - | tar xz
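The same download-and-extract step could be sketched in Python 3 with only the standard library. This is an illustration under assumptions, not a replacement for the commands above: the function name download_and_untar is made up, and extractall should only be used on archives you trust.

```python
import tarfile
import urllib.request


def download_and_untar(url, dest='.'):
    """Fetch a gzipped tarball from url and extract it into dest."""
    # urlopen follows HTTP redirects by default, much like curl -L
    with urllib.request.urlopen(url) as resp:
        # 'r|gz' reads the archive as a non-seekable stream,
        # similar to piping the download into tar zx
        with tarfile.open(fileobj=resp, mode='r|gz') as tar:
            tar.extractall(dest)
```

For example, download_and_untar('https://github.com/pinard/Pymacs/tarball/v0.24-beta2') would be the counterpart of the curl command, assuming that URL still resolves.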
(Not too successfully) trying to use Unix tools instead of Python utility scripts
Inspired by articles such as Why you should learn just a little Awk and
Learn one sed command, I am trying to make use of Unix tools
(sed, awk, grep, cut, uniq, sort, etc.) instead of writing short Python utility scripts.
Here is a Python script I wrote this week. It greps a file for a given regular expression pattern and returns a unique, sorted list of the matches inside the capturing parentheses.
# grep2.py
import re
import sys


def main():
    patt = sys.argv[1]
    filename = sys.argv[2]
    text = open(filename).read()
    matchlist = set(m.group(1) for m in ...
How to use the bash shell with Python's subprocess module instead of /bin/sh
By default, running subprocess.Popen with shell=True
uses /bin/sh as the shell. If you want to change the shell to
/bin/bash, set the executable keyword argument to /bin/bash.
Solution thanks to this great article: Working with Python subprocess - Shells, Processes, Streams, Pipes, Redirects and More
import subprocess


def bash_command(cmd):
    subprocess.Popen(cmd, shell=True, executable='/bin/bash')


bash_command('a="Apples and oranges" && echo "${a/oranges/grapes}"')
Output:
Apples and grapes
For some reason, the above didn't work for my specific case, so I had to use the following instead:
import subprocess


def bash_command(cmd):
    subprocess ...
This time I decided to build my Linux desktop PC myself
Three years ago, I decided to buy a Dell Ubuntu desktop PC instead of building it myself. This time around, I went the other way. My main reasons were: better parts and easier upgrades. We'll see what I decide in 2014.
| Motherboard | ASUS AM3 AMD 880G HDMI and USB 3.0 Micro ATX Motherboard M4A88T-M/USB3 | Amazon | $99.99 |
| CPU | AMD Athlon II X3 445 Rana 3.1GHz Socket AM3 95W Triple-Core Desktop Processor ADX445WFGMBOX | Newegg | $77.00 |
| Power supply | OCZ Technology StealthXStream 2 500-Watt Power Supply | Amazon | $59.99 |
| Case | Cooler Master Elite 360 RC-360-KKN1-GP ATX Mid Tower ... |
About
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc.