Trivial Thoughts
Thoughts and discussion on programming projects using the Python language.



Friday, April 23, 2004
 

The Python CSV Module and Legacy Data

When you work with CSV files as much as I do, particularly with CSV files created by legacy applications, you tend to run into the odd problem.  Consider the following legacy CSV data (a real example):

"this is","an, example","10","of problem data",20

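The reader of Python's csv module will turn this into a list like so (by default every field comes back as a string, and the reader does not record which fields were quoted):

['this is', 'an, example', '10', 'of problem data', '20']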
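For anyone who wants to try this, here is a minimal sketch of the read step, assuming Python 3 and default reader settings. The variable names are mine, chosen only for illustration.

```python
import csv
import io

# The legacy line from above, held in memory instead of a file.
legacy_line = '"this is","an, example","10","of problem data",20\n'

# csv.reader accepts any iterable of lines; StringIO stands in for a file.
row = next(csv.reader(io.StringIO(legacy_line)))

# With default settings every field is a str, quoted or not in the input.
print(row)
```

Note that nothing in the resulting list says which fields were quoted in the source, which is what makes the round trip below awkward.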

No problem so far.  Now, let's use the csv writer to turn this same data back into csv data again, round trip.  Without taking any special precautions, we would get:

this is,"an, example",10,of problem data,20

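What happened here?  Well, by default (QUOTE_MINIMAL) the csv writer only quotes a field when it contains a special character such as the field separator, the quote character, or a line break.  We can get closer to what we want (that is, recreating the original CSV data) by passing quoting=csv.QUOTE_NONNUMERIC to the writer, which quotes every field that is not a number.  Provided the numeric-looking fields have been converted to numbers first, we get: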

"this is","an, example",10,"of problem data",20

Closer, but the third field, which was quoted in the original data, is not.  We could try using the QUOTE_ALL parameter, which would give us the third field quoted, but unfortunately we'd also get the fifth field quoted, which was not the way the original data had it.
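The three quoting modes discussed so far can be compared side by side with a short sketch. It assumes Python 3 and that the numeric-looking fields were converted to int before writing; `write_row` is a hypothetical helper of mine, not part of the csv module.

```python
import csv
import io

def write_row(row, quoting):
    """Return one CSV line for `row` using the given csv quoting constant."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue().rstrip("\n")

# Assumption: the third and fifth fields were converted to int after reading.
row = ["this is", "an, example", 10, "of problem data", 20]

minimal = write_row(row, csv.QUOTE_MINIMAL)        # the writer's default
nonnumeric = write_row(row, csv.QUOTE_NONNUMERIC)  # quote non-numbers only
quote_all = write_row(row, csv.QUOTE_ALL)          # quote every field

for line in (minimal, nonnumeric, quote_all):
    print(line)
```

None of the three lines matches the original data, where the third field is quoted and the fifth is not.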

What I need is a way of controlling the quoting of fields on a field-by-field basis.  Sadly, Python's csv module doesn't give me that level of control over field quoting: the quoting mode is set once for the whole writer, not per field.  So when I have to deal with legacy CSV data like that above, I'm forced to bypass the csv module for writing, and roll my own.  I can still use the csv module for reading.
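A rough sketch of what "roll my own" might look like follows. It is an illustration under my own assumptions rather than the exact code I use: `format_row` and `quote_mask` are hypothetical names, and the mask of which fields to quote has to come from knowledge of the legacy format, since the reader does not supply it.

```python
import csv
import io

def format_row(fields, quoted, delimiter=",", quotechar='"'):
    """Join fields into one CSV line, quoting only where `quoted` says so."""
    if len(fields) != len(quoted):
        raise ValueError("fields and quoted must have the same length")
    out = []
    for value, want_quotes in zip(fields, quoted):
        text = str(value)
        # Quote anyway if leaving the field bare would make the line ambiguous.
        needs_quotes = any(c in text for c in (delimiter, quotechar, "\r", "\n"))
        if want_quotes or needs_quotes:
            # Embedded quote characters are escaped by doubling them.
            text = quotechar + text.replace(quotechar, quotechar * 2) + quotechar
        out.append(text)
    return delimiter.join(out)

legacy_line = '"this is","an, example","10","of problem data",20\n'
row = next(csv.reader(io.StringIO(legacy_line)))  # still read with csv

# Assumed layout of the legacy file: first four fields quoted, last one bare.
quote_mask = [True, True, True, True, False]
rebuilt = format_row(row, quote_mask)
print(rebuilt)
```

With this mask the rebuilt line matches the original data, so reading with the csv module and writing by hand gives a clean round trip for this format.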


11:29:11 PM


© Copyright 2004 Michael Kent.
Last update: 4/23/2004; 11:29:19 PM.