I'm reading a CSV file with the Python csv module and could not find a setting to remove trailing whitespace. I found Dialect.skipinitialspace, but I think it only applies to leading whitespace. Here's a one-liner to delete leading and trailing whitespace that worked for me.
    import csv

    reader = csv.DictReader(
        open('myfile.csv'),
        fieldnames=('myfield1', 'myfield2', 'myfield3'),
    )

    # skip the header row
    next(reader)

    # remove leading and trailing whitespace from all values
    # (the "if v" drops empty and missing values, so None is never stripped)
    reader = (
        dict((k, v.strip()) for k, v in row.items() if v) for row in reader)

    # print results
    for row in reader:
        print(row)
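To check what skipinitialspace does and does not strip, here is a minimal sketch written for Python 3. The one-row sample string is made up for illustration, and io.StringIO stands in for a real file. It compares the default reader, a reader with skipinitialspace=True, and an explicit strip() on each value.

```python
import csv
import io

# Hypothetical one-row input: spaces on both sides of "b", a trailing space after "c".
sample = "a,  b  ,c \n"

# Default dialect: all whitespace is kept.
default_row = next(csv.reader(io.StringIO(sample)))

# skipinitialspace=True: only whitespace right after a delimiter is dropped.
skip_row = next(csv.reader(io.StringIO(sample), skipinitialspace=True))

# Stripping each value removes the trailing whitespace as well.
stripped_row = [value.strip() for value in skip_row]

print(default_row)   # ['a', '  b  ', 'c ']
print(skip_row)      # ['a', 'b  ', 'c ']
print(stripped_row)  # ['a', 'b', 'c']
```

The trailing spaces survive skipinitialspace, which is why the values still need strip().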
Wouldn't this load all the CSV values into memory? And is this a practical solution for large datasets?
It actually creates another generator, so rows are read and stripped one at a time as you iterate; it won't load all values into memory at once.
If you make it a list instead of a generator (change the parentheses to square brackets), it will load all values into memory at one time:
    reader = [
        dict((k, v.strip()) for k, v in row.items()) for row in reader]
    print(type(reader))
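One way to see the difference between the generator and the list is to count how many input lines have been consumed at each step. This is a rough sketch for Python 3, not a benchmark: the sample data, the tracked_lines helper, and the strip_row helper are all hypothetical names introduced for the illustration.

```python
import csv

# Hypothetical data: a header row plus two data rows with padded values.
sample = "f1,f2\n x , y \n z , w \n"

lines_read = []

def tracked_lines(text):
    """Yield lines one at a time, recording each line as it is consumed."""
    for line in text.splitlines(True):
        lines_read.append(line)
        yield line

def strip_row(row):
    """Strip whitespace from every non-empty value in a row dict."""
    return {k: v.strip() for k, v in row.items() if v}

reader = csv.DictReader(tracked_lines(sample))

# Generator expression: building it reads nothing from the input.
lazy_rows = (strip_row(row) for row in reader)
count_before = len(lines_read)

# Pulling one row reads only the header and the first data row.
first = next(lazy_rows)
count_after_first = len(lines_read)

# list() drains whatever is left, holding those rows in memory together.
remaining = list(lazy_rows)

print(count_before, count_after_first, len(lines_read))
print(first, remaining)
```

With the generator, memory use stays at roughly one row at a time; wrapping it in list() (or using square brackets) trades that for random access to every row.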
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc.