Using a Python timeout decorator for uploading to S3
At work we are uploading many images to S3 using Python's boto library. However we are experiencing a RequestTimeTooSkewed error once every 100 uploads on average. We googled, but did not find a solution. Our system time was in sync and our file sizes were small (~50KB).
Since we couldn't find the root cause, we added a watchdog timer as a bandaid solution. We already use a retry decorator to retry uploads to S3 when we get a 500 Internal Server Error response. To this we added a timeout decorator which cancels the S3 upload if it takes more than a couple minutes. With this decorator, we don't have to wait the full 15 minutes before S3 returns the 403 Forbidden (RequestTimeTooSkewed error) response.
I found the timeout decorator at Activestate's Python recipes. It makes use of Python's signal library. Below is an example of how it's used.
import signal
class TimeoutError(Exception):
def __init__(self, value = "Timed Out"):
self.value = value
def __str__(self):
return repr(self.value)
def timeout(seconds_before_timeout):
def decorate(f):
def handler(signum, frame):
raise TimeoutError()
def new_f(*args, **kwargs):
old = signal.signal(signal.SIGALRM, handler)
signal.alarm(seconds_before_timeout)
try:
result = f(*args, **kwargs)
finally:
signal.signal(signal.SIGALRM, old)
signal.alarm(0)
return result
new_f.func_name = f.func_name
return new_f
return decorate
Try it out:
import time
@timeout(5)
def mytest():
print "Start"
for i in range(1,10):
time.sleep(1)
print "%d seconds have passed" % i
if __name__ == '__main__':
mytest()
Results:
Start
1 seconds have passed
2 seconds have passed
3 seconds have passed
4 seconds have passed
Traceback (most recent call last):
File "timeout_ex.py", line 47, in <module>
function_times_out()
File "timeout_ex.py", line 17, in new_f
result = f(*args, **kwargs)
File "timeout_ex.py", line 42, in function_times_out
time.sleep(1)
File "timeout_ex.py", line 12, in handler
raise TimeoutError()
__main__.TimeoutError: 'Timed Out'
The RequestTimeTooSkewed error
S3ResponseError: 403 Forbidden <?xml version="1.0" encoding="UTF-8"?> <Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message><MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds><RequestId>7DDDS67HF8E37</RequestId><HostId>LKE893JFGDLASKJR9BJ-A9NASFPNAOPWEJORG-98DFGJA498JVJ-A04320JF0293JLKE</HostId><RequestTime>Tue, 27 Apr 2010 22:20:58 GMT</RequestTime><ServerTime>2010-04-27T22:55:24Z</ServerTime></Error>
See also
- http://nick.vargish.org/clues/python-tricks.html
- http://programming-guides.com/python/timeout-a-function
Related posts
- Fabric post-run processing Python decorator — posted 2010-11-06
- Two of the simplest Python decorator examples — posted 2010-03-09
- Trying out a Retry decorator in Python — posted 2009-11-17
6
Comments
—
Comments feed for this post
#3 Eliot commented on 2010-05-17:
Edwrrd: We did apt-get install ntp on our Ubuntu Hardy server and still got the RequestTimeTooSkewed errors.
Alexander: I don't know much about thread safety so I'm not sure.
#4 Kevin commented on 2011-10-19:
Hey Eliot, I know this is an old thread, but did you ever figure out the root cause of your RequestTimeTooSkewed errors? I think I'm running into the same issue -- NTPD is installed and the time on the server is correct, but requests will randomly hang for 15 minutes before returning a 403 error. Subsequent requests work immediately.
I'll probably end up implementing a similar fix, but I'd like to fix the source of the problem if I can...
#5 Eliot commented on 2011-10-19:
Hi Kevin, I don't think I ever did find the root cause. If I did, I don't remember now. Sorry.
#6 Mansour commented on 2011-11-02:
I have sort of run into related issue with boto and Google Storage and I seem to be able to use socket.setdefaulttimeout() function:
import socket
socket.setdefaulttimeout(30)
It sets the timeout on reads/writes to newly created sockets. Currently the only advantage of using this function over the alarm signal approach that I can think of is the ability to use it in a multithreaded environment (@Alexander signals and threads don't match well).
Mansour
Post a comment
About
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc... more »
Search Blog
Tags
-
algorithms
(5)
-
aws
(9)
-
blogproject
(20)
-
c_cplusplus
(12)
-
cardstore
(8)
-
colinux
(2)
-
concurrency
(13)
-
conkeror
(2)
-
core
(2)
-
cygwin
(17)
-
datastructures
(14)
-
datetime
(4)
-
decorators
(4)
-
django
(40)
-
emacs
(22)
-
files_directories
(11)
-
git
(5)
-
hardware
(5)
-
install_setup
(8)
-
javascript
(3)
-
keyboard
(9)
-
matplotlib
(5)
-
mercurial
(4)
-
nginx
(2)
-
persistence
(5)
-
preferences
(7)
-
processes
(4)
-
pyqt
(18)
-
python
(144)
-
ratpoison
(3)
-
regexes
(6)
-
rsync
(3)
-
softwaretools
(17)
-
sql
(14)
-
ssh
(10)
-
subversion
(6)
-
twisted
(7)
-
ubuntu
(65)
-
urxvt
(5)
-
vxworks
(25)
-
webdev
(5)
-
wmii
(7)
Blogroll
- Adam Gomaa
- Alex Clemesha
- Amir Salihefendic
- Armin Ronacher
- David Beazley
- David Ziegler
- Duncan McGreggor
- Gareth Rushgrave
- Glyph Lefkowitz
- Guido van Rossum
- Ian Bicking
- Jacob Kaplan-Moss
- James Bennett
- James Tauber
- Jesper Noehr
- Marty Alchin
- Matt Harrison
- Nikolay Kolev
- Parand Darugar
- Peter Baumgartner
- Peter Bengtsson
- Rob Hudson
- Simon Willison
- Will McGugan
#1 Edwrrd M. Goldberg commented on 2010-05-14:
The problem is that the internal "hardware" clock is off the mark. You need to NTPD on that server and re-sync the clock. It is important to skip the use the local hardware clock and adjust time ONLY to the NTP time standard.
This issue comes up most of the time on Windows EC2 Servers. But I have seen CentOS servers that have this issue also.
NTPD or xntpd is the cure. Just add this service to the server with yum and set up the external resources.
Edward M. Goldberg http://myCloudWaatcher.com/ e.m.g.