  1. #1
    Just Joined!
    Join Date
    Aug 2011
    Posts
    11

    Help needed: Shell Script - Korn Shell - Split Lines


    Hi,

    I have a 'little' problem for which I really badly need a solution -- maybe there is someone out there who could help?

    I need to parse a comma-separated file (using ksh on AIX), but some fields contain strings that themselves contain commas. For example:

    Code:
    #Name,Phone,Address,City,Description,YearOfBirth
    Mike,0123456,Street 1,City A,"Description, which contains a comma!",1980
    Andrew,234567,Street 2,City B,"Description without a comma",1981
    (This is just an example. Real data can contain more than one field with double quotes and commas!)

    As you can see, most fields are not enclosed in double quotes ("), but a field that contains at least one comma is.

    My idea was to replace the commas inside the double-quoted strings with some other character so that I can use 'cut' to parse the lines. Alternatively, I could replace the commas that are not enclosed in double quotes. I tried to do this with 'sed', but I can't get it to work.

    Any suggestions? I need something like this:

    Code:
    cat data.txt | <something magic happens here> | while read LINE ; do
      NAME=$(echo $LINE | cut -f1 -d,)
      DESCRIPTION=$(echo $LINE | cut -f5 -d,)
      YEAR=$(echo $LINE | cut -f6 -d,)
    done
    Any help would be highly appreciated!

    Many thanks in advance.

    Regards,
    theMickey.
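
The idea raised in the question (swap the field-separating commas for another character so that 'cut' can split the line) could be sketched roughly as follows. This is only a sketch under assumptions: it uses awk rather than sed, the function name csv_to_pipes is made up for illustration, the sample rows are fed in through a here-document instead of data.txt, quotes are assumed to be balanced with no doubled quotes ("") inside a field, and the replacement character "|" is assumed never to occur in the data.

```shell
# Hypothetical helper: read CSV lines on stdin, replace the commas that
# sit OUTSIDE double quotes with "|" and drop the quotes themselves.
# Assumes balanced quotes, no "" escapes, and no "|" in the data.
csv_to_pipes() {
  awk -F'"' '{
    out = ""
    for (i = 1; i <= NF; i++) {
      piece = $i
      # Splitting on the quote character puts unquoted text in the
      # odd-numbered pieces and quoted text in the even-numbered ones.
      if (i % 2 == 1) gsub(/,/, "|", piece)
      out = out piece
    }
    print out
  }'
}

csv_to_pipes <<'EOF' |
#Name,Phone,Address,City,Description,YearOfBirth
Mike,0123456,Street 1,City A,"Description, which contains a comma!",1980
Andrew,234567,Street 2,City B,"Description without a comma",1981
EOF
while IFS= read -r LINE; do
  case $LINE in \#*) continue ;; esac      # skip the header line
  NAME=$(printf '%s\n' "$LINE" | cut -d'|' -f1)
  DESCRIPTION=$(printf '%s\n' "$LINE" | cut -d'|' -f5)
  YEAR=$(printf '%s\n' "$LINE" | cut -d'|' -f6)
  printf '%s / %s / %s\n' "$NAME" "$DESCRIPTION" "$YEAR"
done
```

Replacing the separators rather than the embedded commas keeps the field contents unchanged, at the cost of needing a delimiter that is guaranteed not to appear in the data.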

  2. #2
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,283
    Hi.

    I found this some time ago in a Usenet post (comp.unix.shell), as noted in the script header:
    Code:
    #!/usr/bin/env ksh
    
    # @(#) s1	Demonstrate csv parsing in shell.
    # From a post in comp.unix.shell,
    # Thu Dec  2 10:18:32 CST 2010
    
    # Section 1, setup, pre-solution.
    # Infrastructure details, environment, commands for forum posts. 
    # Uncomment export command to test script as external user.
    # export PATH="/usr/local/bin:/usr/bin:/bin"
    set +o nounset
    pe() { for i;do printf "%s" "$i";done; printf "\n"; }
    pl() { pe;pe "-----" ;pe "$*"; }
    C=$HOME/bin/context && [ -f $C ] && . $C 
    set -o nounset
    
    # Section 3, solution.
    
    s='a,"b,c",d,e,f,g,h,i'
    pl " Input string, '$s'"
    
    pl " Results:"
    while [ -n "$s" ]
    do
      case $s in
        \"*) temp=${s#*\"*\",*}
             ;;
        *,*)  temp=${s#*,} ;;
        *) break;;
      esac
      var+=( ${s%",$temp"} )
      s=$temp
    done
    
    pe " Entire string '${var[*]}'"
    length=${#var[@]}
    for (( q=0; q<length; q++ ))
    do
      pe " $q of $length, var[$q]: ${var[$q]}"
    done
    
    exit 0
    producing:
    Code:
    % ./s1
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
    ksh 93s+
    
    -----
     Input string, 'a,"b,c",d,e,f,g,h,i'
    
    -----
     Results:
     Entire string 'a "b,c" d e f g h'
     0 of 7, var[0]: a
     1 of 7, var[1]: "b,c"
     2 of 7, var[2]: d
     3 of 7, var[3]: e
     4 of 7, var[4]: f
     5 of 7, var[5]: g
     6 of 7, var[6]: h
    There are probably better vehicles for doing this -- perl and its modules come to mind, for example.

    Best wishes ... cheers, drl

  3. #3
    Just Joined!
    Join Date
    Aug 2011
    Posts
    11
    Code:
    while [ -n "$s" ]
    do
      case $s in
        \"*) temp=${s#*\"*\",*}
             ;;
        *,*)  temp=${s#*,} ;;
        *) break;;
      esac
      var+=( ${s%",$temp"} )
      s=$temp
    done
    This helped a lot! Thank you very much!

  4. #4
    Just Joined!
    Join Date
    Aug 2011
    Posts
    7
    There is also a little tool called csvtool. You might have to install it though (apt-get install csvtool). It can be used to create, join and parse csv files. Especially its call command can be useful. It's executed like:

    Code:
    csvtool call ./myscript test.csv
    Where test.csv is your csv file and myscript could be just any executable command or script. csvtool calls this script for each line of the file with the parameters set to the column values of the row.

    E.g. if the script is:

    Code:
    echo -e param1=$1"\n"param2=$2"\n"param3=$3"\n"param4=$4"\n"param5=$5"\n"param6=$6
    The output would be

    Code:
    param1=#Name
    param2=Phone
    param3=Address
    param4=City
    param5=Description
    param6=YearOfBirth
    param1=Mike
    param2=0123456
    param3=Street 1
    param4=City A
    param5=Description, which contains a comma!
    param6=1980
    param1=Andrew
    param2=234567
    param3=Street 2
    param4=City B
    param5=Description without a comma
    param6=1981
    As you can see, it handles the quoted columns correctly.
    Last edited by netzgewitter; 08-12-2011 at 06:04 PM. Reason: typo

  5. #5
    Just Joined!
    Join Date
    Aug 2011
    Posts
    11
    Hi!

    Originally posted by netzgewitter:
    There is also a little tool called csvtool. You might have to install it though (apt-get install csvtool). It can be used to create, join and parse csv files. Especially its call command can be useful. [...]
    I have to run this script on AIX (IBM's UNIX), and there is no tool like csvtool there. But thanks anyway.

    I converted drl's while loop into a "function fnCut" which I can use now instead of "cut". And it works perfectly.
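
The fnCut function itself was not posted in the thread. A minimal sketch of what such a cut replacement could look like is shown here; the name csv_field and its argument order are made up for illustration. It assumes one record per line and no doubled quotes ("") inside a quoted field. Note that the loop as posted breaks out before storing the final field (the sample run lists 7 of the 8 input fields), so this sketch handles the last field explicitly. It sticks to case patterns and parameter expansion and avoids arrays, which should keep it usable in shells without array append.

```shell
# csv_field N LINE: print field N (1-based) of one CSV line.
# Hypothetical helper, not the fnCut from the thread.
# Assumes no "" escapes inside quoted fields and no embedded newlines.
csv_field() {
  want=$1
  rest=$2
  n=1
  while :
  do
    case $rest in
      \"*\",*) remain=${rest#\"*\",}       # quoted field, more fields follow
               field=${rest%",$remain"}
               more=1 ;;
      \"*\")   field=$rest                 # quoted field at end of line
               more=0 ;;
      *,*)     field=${rest%%,*}           # plain field, more fields follow
               remain=${rest#*,}
               more=1 ;;
      *)       field=$rest                 # plain last field
               more=0 ;;
    esac
    if [ "$n" -eq "$want" ]; then
      field=${field#\"}                    # strip the surrounding quotes
      field=${field%\"}
      printf '%s\n' "$field"
      return 0
    fi
    [ "$more" -eq 1 ] || return 1          # ran out of fields
    rest=$remain
    n=$((n + 1))
  done
}

line='Mike,0123456,Street 1,City A,"Description, which contains a comma!",1980'
NAME=$(csv_field 1 "$line")
DESCRIPTION=$(csv_field 5 "$line")
YEAR=$(csv_field 6 "$line")
printf '%s / %s / %s\n' "$NAME" "$DESCRIPTION" "$YEAR"
```

Each call rescans the line from the start, which is wasteful for wide files but mirrors how the original loop in the question calls cut once per field.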

  6. #6
    Just Joined!
    Join Date
    Aug 2011
    Posts
    7
    As I said, it's probably not preinstalled, but it should at least be available on every Linux distro. I can't say much about AIX. To be honest, I only realized after I posted it that you are on AIX - sorry about that.
