CsvSplit

July 30, 2019

There was a question the other day on Reddit or Stack Overflow or someplace about handling CSV files with awk. We’ve done that in a previous exercise, but today I decided to handle CSV files in a different way. Specifically, I wrote an awk function csvsplit that works the same way as awk’s built-in split function except that it handles CSV strings instead of splitting on a regular expression:

n = csvsplit(str,arr)

Csvsplit takes a string and an array, deletes any current contents of the array, splits the string into fields using the normal CSV rules, stores the fields in arr[1] .. arr[n], and returns n. The splitting rules are: every comma separates two fields, except that a comma inside a field enclosed in double-quotes is part of that field, and a double-quote inside a quoted field is written as two successive double-quotes.
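To make the rules concrete, here is a minimal sketch of one way such a function could be written. It is not the suggested solution, just a character-by-character scan under a few assumptions of my own: the enclosing double-quotes are dropped from the stored field, an empty string yields zero fields (as the built-in split does), and a double-quote that shows up in the middle of an unquoted field is treated leniently as the start of a quoted section. The shell wrapper csv_fields and the array name f are hypothetical names used only for the demonstration.

```sh
#!/bin/sh
# csv_fields (hypothetical demo wrapper): read CSV lines on stdin and
# print every field on its own line as "<index>: <field>".
csv_fields() {
    awk '
    # csvsplit(str, arr): split one CSV record into arr[1] .. arr[n], return n.
    # Names after the extra spaces are local variables, by awk convention.
    function csvsplit(str, arr,    n, i, len, c, field, quoted) {
        split("", arr)              # portable way to empty the array
        if (str == "") return 0     # assumption: mirrors split() on an empty string
        n = 0; field = ""; quoted = 0
        len = length(str)
        for (i = 1; i <= len; i++) {
            c = substr(str, i, 1)
            if (quoted) {
                if (c == "\"") {
                    if (substr(str, i + 1, 1) == "\"") {
                        field = field "\""   # doubled quote: keep one, skip the other
                        i++
                    } else {
                        quoted = 0           # closing quote
                    }
                } else {
                    field = field c          # commas are protected here
                }
            } else if (c == "\"") {
                quoted = 1                   # opening quote, not stored
            } else if (c == ",") {
                arr[++n] = field             # unquoted comma ends the field
                field = ""
            } else {
                field = field c
            }
        }
        arr[++n] = field                     # the last field has no trailing comma
        return n
    }
    { n = csvsplit($0, f); for (i = 1; i <= n; i++) printf "%d: %s\n", i, f[i] }
    '
}

printf '%s\n' 'a,"b,c","say ""hi""",d' | csv_fields
# 1: a
# 2: b,c
# 3: say "hi"
# 4: d
```

Because awk hands the function one input line at a time, this sketch only handles records that fit on a single line.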

Your task is to write a csvsplit function for awk. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises


2 Responses to “CsvSplit”

John Cowan said
July 30, 2019 at 6:23 PM
The only problem is that newlines are allowed within a double-quoted field, at least by some programs as well as by RFC 4180, the nearest thing to a standard. So awk’s line-by-line model really doesn’t work without great pain.

programmingpraxis said
July 30, 2019 at 6:27 PM
That’s correct. If you need that functionality, the previous exercise linked in the task description provides it. But the current exercise provides a function that is useful in a large percentage of cases.

Programming Praxis
