- 08-10-2011 #1Just Joined!
- Join Date
- Aug 2011
- Posts
- 11
Help needed: Shell Script - Korn Shell - Split Lines
Hi,
I have a 'little' problem for which I really badly need a solution -- maybe there is someone out there who could help?
I need to parse a comma-separated file (using ksh on AIX), but some fields contain strings that themselves contain commas. For example:
Code:
#Name,Phone,Address,City,Description,YearOfBirth
Mike,0123456,Street 1,City A,"Description, which contains a comma!",1980
Andrew,234567,Street 2,City B,"Description without a comma",1981
(This is just an example. Real data can contain more than one field with double quotes and commas!)
As you can see, most fields are not enclosed in double quotes, but any field that contains at least one comma is.
I was thinking of replacing the commas inside the double-quoted strings with something else so that I can use 'cut' to parse the lines. Or maybe I could replace the commas that are not enclosed in double quotes with something else. I was thinking of using 'sed' to get this done, but I can't get it to work.
Any suggestions? I need something like this:
Code:
cat data.txt | <something magic happens here> | while read LINE ; do
  NAME=$(echo $LINE | cut -f1 -d,)
  DESCRIPTION=$(echo $LINE | cut -f5 -d,)
  YEAR=$(echo $LINE | cut -f6 -d,)
done
Any help would be highly appreciated!
Many thanks in advance.
Regards,
theMickey.
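The first idea in the post above (replacing the commas inside the quoted strings) can be sketched with awk, which should be available on AIX. This is only a sketch under assumptions that are not in the original post: the placeholder character '|' never occurs in the data, quoted fields contain no escaped quotes ("") and no line breaks, and protect_commas is a made-up helper name.

```shell
# Replace commas inside double-quoted fields with '|' and drop the quotes.
# Assumption: '|' does not appear anywhere in the real data.
protect_commas() {
  awk 'BEGIN { FS = "\""; OFS = "" }
       {
         # Splitting on the quote character puts quoted text in even fields.
         for (i = 2; i <= NF; i += 2) gsub(/,/, "|", $i)
         $1 = $1   # force the record to be rebuilt so the quotes are dropped
         print
       }' "$@"
}

# Usage: split on commas with read, then turn the placeholder back into a comma.
printf '%s\n' 'Mike,0123456,Street 1,City A,"Description, which contains a comma!",1980' |
  protect_commas |
  while IFS=, read -r name phone address city description year; do
    description=$(printf '%s' "$description" | tr '|' ',')
    printf '%s / %s / %s\n' "$name" "$description" "$year"
  done
```

Using read with IFS=, instead of three calls to cut avoids starting several processes per line, but cut -d, would work on the protected lines just as well.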
- 08-11-2011 #2Linux Engineer
- Join Date
- Apr 2006
- Location
- Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
- Posts
- 1,117
Hi.
I found this some time ago; as noted in the script, it comes from a Usenet post:
Code:
#!/usr/bin/env ksh

# @(#) s1 Demonstrate csv parsing in shell.
# From a post in comp.unix.shell,
# Thu Dec 2 10:18:32 CST 2010

# Section 1, setup, pre-solution.
# Infrastructure details, environment, commands for forum posts.
# Uncomment export command to test script as external user.
# export PATH="/usr/local/bin:/usr/bin:/bin"
set +o nounset
pe() { for i;do printf "%s" "$i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
C=$HOME/bin/context && [ -f $C ] && . $C
set -o nounset

# Section 3, solution.
s='a,"b,c",d,e,f,g,h,i'
pl " Input string, '$s'"

pl " Results:"
while [ -n "$s" ]
do
  case $s in
    \"*) temp=${s#*\"*\",*} ;;
    *,*) temp=${s#*,} ;;
    *) break;;
  esac
  var+=( ${s%",$temp"} )
  s=$temp
done

pe " Entire string '${var[*]}'"
length=${#var[@]}
for (( q=0; q<length; q++ ))
do
  pe " $q of $length, var[$q]: ${var[$q]}"
done

exit 0
producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0.8 (lenny)
ksh 93s+

-----
 Input string, 'a,"b,c",d,e,f,g,h,i'

-----
 Results:
 Entire string 'a "b,c" d e f g h'
 0 of 7, var[0]: a
 1 of 7, var[1]: "b,c"
 2 of 7, var[2]: d
 3 of 7, var[3]: e
 4 of 7, var[4]: f
 5 of 7, var[5]: g
 6 of 7, var[6]: h
There are probably better vehicles for doing this -- perl and its modules come to mind, for example.
Best wishes ... cheers, drl
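Note that the run above lists seven fields for an input with eight: the loop reaches the final field ('i') in the "*) break" branch and leaves before storing it. The unquoted expansion in the array assignment would also, as far as I can tell, split a value such as "Street 1" into two elements. A minimal variation that keeps the last field and avoids arrays entirely might look like the sketch below. It assumes no escaped quotes ("") and no line breaks inside a field, and split_csv is a hypothetical name, not part of the original script.

```shell
# Print each field of one CSV record on its own line, with quotes removed.
# The body runs in a subshell so no variables leak into the caller.
split_csv() (
  s=$1
  while :; do
    case $s in
      \"*\",*)            # quoted field, more fields follow
        s=${s#\"}
        field=${s%%\",*}
        s=${s#*\",}
        ;;
      \"*\")              # quoted field, last in the record
        s=${s#\"}
        printf '%s\n' "${s%\"}"
        break
        ;;
      *,*)                # unquoted field, more fields follow
        field=${s%%,*}
        s=${s#*,}
        ;;
      *)                  # last field: print it before leaving the loop
        printf '%s\n' "$s"
        break
        ;;
    esac
    printf '%s\n' "$field"
  done
)

# Usage: the sample string from the script above.
split_csv 'a,"b,c",d,e,f,g,h,i'
```

Printing one field per line instead of filling an array keeps the sketch usable in shells without the += array syntax; a single field can then be picked with, for example, sed -n 5p.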
- 08-11-2011 #3Just Joined!
- Join Date
- Aug 2011
- Posts
- 11
Code:
while [ -n "$s" ]
do
  case $s in
    \"*) temp=${s#*\"*\",*} ;;
    *,*) temp=${s#*,} ;;
    *) break;;
  esac
  var+=( ${s%",$temp"} )
  s=$temp
done
This helped a lot! Thank you very much!
- 08-12-2011 #4Just Joined!
- Join Date
- Aug 2011
- Posts
- 7
There is also a little tool called csvtool. You might have to install it first, though (apt-get install csvtool). It can be used to create, join, and parse CSV files. Its call command is especially useful. It is executed like this:
Code:
csvtool call ./myscript test.csv
Here test.csv is your CSV file and myscript can be any executable command or script. csvtool calls this script once for each line of the file, with the parameters set to the column values of that row.
E.g. if the script is:
Code:
echo -e param1=$1"\n"param2=$2"\n"param3=$3"\n"param4=$4"\n"param5=$5"\n"param6=$6
The output would be
Code:
param1=#Name
param2=Phone
param3=Address
param4=City
param5=Description
param6=YearOfBirth
param1=Mike
param2=0123456
param3=Street 1
param4=City A
param5=Description, which contains a comma!
param6=1980
param1=Andrew
param2=234567
param3=Street 2
param4=City B
param5=Description without a comma
param6=1981
As you can see, it handles the quoted columns correctly.
Last edited by netzgewitter; 08-12-2011 at 06:04 PM. Reason: typo
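Since echo -e is not handled the same way by every shell, a variant of the script based on printf may travel better. The following is a sketch only: print_params is a made-up name, and it relies solely on the behaviour described above, namely that each column value arrives as one argument.

```shell
# Hypothetical replacement for ./myscript: print every argument it receives,
# one per line, numbered like the echo version (param1=..., param2=...).
print_params() {
  n=1
  for value in "$@"; do
    printf 'param%s=%s\n' "$n" "$value"
    n=$((n + 1))
  done
}

# Forward the arguments the script itself was called with.
print_params "$@"
```

Looping over "$@" also means the script does not need to know in advance how many columns a row has.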
- 08-12-2011 #5Just Joined!
- Join Date
- Aug 2011
- Posts
- 11
- 08-13-2011 #6Just Joined!
- Join Date
- Aug 2011
- Posts
- 7
As I said, it's probably not preinstalled. It should at least be available on every Linux distro, but I can't say much about AIX. To be honest, I only realized after I posted that you are on AIX - sorry about that.


