  1. #1
    Linux Newbie
    Join Date
    Mar 2009
    Posts
    228

    [SOLVED] Ignoring commas within double quotes


    I'm trying to write a bash script that has to extract values from a csv file. Problem is there are lines like this:

    a,b,c,"dd,dd,dd",e,f,g

    I'm using awk to extract the values, but when I try to extract field 4 with awk I get:

    "dd

    instead of:

    "dd,dd,dd"

    Does anyone know how to get awk to ignore commas within double quotes?

  2. #2
    Just Joined! barriehie
    Join Date
    Apr 2008
    Location
    The Desert!
    Posts
    85
    Quote Originally Posted by lomcevak
    I'm trying to write a bash script that has to extract values from a csv file. Problem is there are lines like this:

    a,b,c,"dd,dd,dd",e,f,g

    I'm using awk to extract the values, but when I try to extract field 4 with awk I get:

    "dd

    instead of:

    "dd,dd,dd"

    Does anyone know how to get awk to ignore commas within double quotes?
    Try this:
    file: testfile
    a,b,c,"dd,dd,dd",e,f,g
    h,i,j,k,l,"mm,mm,mm",n
    Code:
     gawk '{ gsub(/"/, ""); print }' ./testfile
    Edit: I just reread your post after posting. Do you want to just nuke the quotes, or make it look like dddddd or dd,dd,dd?

    Output:
    Code:
    a,b,c,dd,dd,dd,e,f,g
    h,i,j,k,l,mm,mm,mm,n
    Last edited by barriehie; 11-04-2010 at 06:42 AM. Reason: Oops

  3. #3
    Linux Newbie
    Join Date
    Mar 2009
    Posts
    228
    Thanks for the reply. Right now if I say:

    Code:
    awk -F, '{print $4}'
    The output is "dd

    I want the output to be "dd,dd,dd"

    Ideally I would like the double quotes removed but I can do that later with sed if I have to.

    My knowledge of awk is very limited.

  5. #4
    Just Joined! barriehie
    Join Date
    Apr 2008
    Location
    The Desert!
    Posts
    85
    Quote Originally Posted by lomcevak
    Thanks for the reply. Right now if I say:

    Code:
    awk -F, '{print $4}'
    The output is "dd

    I want the output to be "dd,dd,dd"

    Ideally I would like the double quotes removed but I can do that later with sed if I have to.

    My knowledge of awk is very limited.
    Makes 2 of us! So you want each field on its own line?
    Like this:
    Code:
    a
    b
    c
    "dd,dd,dd"
    e
    f
    g

  6. #5
    Linux Enthusiast
    Join Date
    Aug 2006
    Posts
    631
    You can do something like this:

    Code:
    awk -F\" '{for(i=1;i<=NF;i+=2) {gsub(",", ";", $i)}}1' OFS= file.csv |
    awk -F\; '{ print $1;print $2;print $3;print $4;print $5}'
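
For anyone wondering why this works: splitting on the double quote puts the text outside quotes in the odd-numbered fields, so only those commas get replaced, and the empty OFS drops the quotes when the line is rebuilt. Below is a rough sketch of the same idea wrapped in a shell function. The name csv_field and the use of the ASCII unit separator (octal 037) in place of the semicolon are my own assumptions, not part of the post above. It also assumes there are no escaped quotes ("") and no quoted fields that span several lines.

```shell
#!/bin/sh
# csv_field N [FILE...]: print field N of each CSV line, treating commas
# inside double quotes as data. Hypothetical helper, not from the thread.
csv_field() {
    n=$1
    shift
    awk -F'"' -v OFS='' -v sep="$(printf '\037')" '
        {
            # Odd-numbered fields are the parts outside double quotes.
            for (i = 1; i <= NF; i += 2)
                gsub(/,/, sep, $i)
            $1 = $1    # force a rebuild with the empty OFS so quotes always go
            print
        }' "$@" |
    awk -F"$(printf '\037')" -v n="$n" '{ print $n }'
}

# Usage with the line from the first post:
printf '%s\n' 'a,b,c,"dd,dd,dd",e,f,g' | csv_field 4    # prints: dd,dd,dd
```

A control character is less likely than a semicolon to already appear in the data, which is the only reason for that swap. If the file can contain escaped quotes, a real CSV parser is probably the safer route.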

  7. #6
    Linux Newbie
    Join Date
    Mar 2009
    Posts
    228
    Thanks Franklin52, that works.
