UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1' in position 0: ordinal not in range(128)
If you've ever gotten this error, Django's smart_str
function might be able to help. I found it in James Bennett's
article,
Unicode in the real world. He gives a very good explanation
of Python's Unicode strings and bytestrings, their use in Django, and
how to use Django's Unicode utilities when working with Python
libraries that are not Unicode-friendly. Here are my notes from his
article as they apply to the above error. Much of the wording is taken
directly from James Bennett's article.
This error occurs when you pass a Unicode string containing non-ASCII characters (code points 128 and above) to something that expects an ASCII bytestring. The default encoding Python uses when converting a Unicode string to a bytestring is ASCII, "which handles exactly 128 (English) characters". This is why trying to convert characters with code points of 128 or higher produces the error.
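As a minimal sketch of the failure described above, the snippet below encodes to ASCII explicitly rather than relying on an implicit conversion, so it should behave the same on Python 2 and Python 3. The helper name try_ascii is my own, chosen for illustration only.

```python
def try_ascii(s):
    """Return the ASCII bytes for s, or the error text if s cannot be encoded."""
    try:
        return s.encode('ascii')
    except UnicodeEncodeError as exc:
        return str(exc)

# Plain ASCII text encodes without trouble.
ascii_ok = try_ascii(u'abc')

# u'\xa1' is code point 161, outside the 0-127 range that ASCII covers.
ascii_fail = try_ascii(u'\xa1')

print(ascii_ok)
print(ascii_fail)
```

The second call returns the same "ordinal not in range(128)" message shown in the error at the top of this post.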
The good news is that you can encode Unicode strings into bytestrings
using encodings other than ASCII. Django's smart_str function, in the
django.utils.encoding module, converts a Unicode string
to a bytestring using a default encoding of UTF-8.
Here is an example using the built-in function, str:
a = u'\xa1'
print str(a) # this throws an exception
Results:
Traceback (most recent call last):
  File "unicode_ex.py", line 3, in <module>
    print str(a) # this throws an exception
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1' in position 0: ordinal not in range(128)
Here is an example using smart_str:
from django.utils.encoding import smart_str, smart_unicode
a = u'\xa1'
print smart_str(a)
Results:
¡
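To see what is actually inside the bytestring, here is a small sketch that does not need Django installed. It assumes that, for a Unicode string, smart_str with its default encoding gives the same bytes as calling .encode('utf-8') directly. The variable names are illustrative.

```python
a = u'\xa1'

# Assumption: for a Unicode string, smart_str(a) matches a.encode('utf-8').
encoded = a.encode('utf-8')

# The single character needs two bytes in UTF-8.
byte_values = list(bytearray(encoded))

# Decoding with the same encoding recovers the original Unicode string.
round_trip = encoded.decode('utf-8')

print(byte_values)      # [194, 161]
print(round_trip == a)  # True
```

The terminal shows ¡ only because it interprets those two bytes as UTF-8; a terminal set to a different encoding would display something else.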
A simpler way to do this is:
print unicode(u'\xa1').encode("utf-8")
Arthur, thanks for the tip. I'm not sure what differences the Django utility functions have. I will have to look into this further. For other readers, here is the documentation for encode: http://www.python.org/doc/2.5.2/lib/string-methods.html
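The encode documentation linked above also describes an optional second argument, errors, that controls what happens when a character does not fit the target encoding. Here is a rough sketch of three of the standard handlers, in case ASCII output is a hard requirement. Note that these are lossy or escaping workarounds, not a substitute for encoding as UTF-8.

```python
a = u'\xa1'

# 'replace' substitutes a question mark for each unencodable character.
replaced = a.encode('ascii', 'replace')

# 'ignore' silently drops unencodable characters.
ignored = a.encode('ascii', 'ignore')

# 'xmlcharrefreplace' writes an XML/HTML numeric character reference instead.
escaped = a.encode('ascii', 'xmlcharrefreplace')

print(replaced)
print(ignored)
print(escaped)
```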