  1. #1
    Just Joined!
    Join Date
    Oct 2010
    Posts
    4

    extract xml from txt doc using sed/awk

    Hello,


    I would like to create a .ksh script which cuts certain XML data from a document. I have tried to use sed and awk with piping, but it's been some time since I have worked on Linux and my memory is patchy.

    I would like the script to open a file that the user chooses, cut certain information from that document and write it to another file.

    Code:
    sed 's/^[ \t]*//' Statement.xml | awk '/objectName|fieldValue|fieldName/' | less

    Firstly, I did the above, which works: it aligns all the information to the left, making it easier on the eye initially, then keeps only the lines matching objectName etc. However, I now need the script to ask the user which file they wish to open, and the XML that was extracted needs to be put into another file.


    Code:
        sed 's/^[ \t]*//' trace.txt | awk '/objectName|fieldValue|fieldName|fieldID|retirementDate|policyId/' > output
    I have come up with the above. It does work, however it ONLY EXTRACTS the information that I have specified and it already has the filename specified, whereas I would like to prompt the user for the file they wish to use...

    "objectName|fieldValue|fieldName|fieldID|retirementDate|policyId"

    Not all documents will contain objectName, fieldID etc., so now I need to look at how to copy the XML from a certain point in the document to another point, which will differ between docs. In the txt file 'trace.txt', everything after the words 'Sending XML' needs to be put in the output file; that marker is consistent across documents.

    I have quite a few .ksh and .sh scripts that do different things, one of which meant I had to delete the first 5 lines of the output file, but this is obviously not consistent enough.

    Can anyone help?
    Thanks in advance!
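
    For the "everything after 'Sending XML'" requirement described above, a minimal sketch follows. It assumes the marker sits on a line of its own and the XML starts on the next line; the function name after_marker is hypothetical, not something from the original scripts.

```shell
#!/bin/sh
# after_marker INFILE OUTFILE  (hypothetical helper name)
# Copies every line that follows the first line containing "Sending XML".
after_marker() {
    # "found" stays unset until the marker line has been seen.
    # The print rule runs first, so the marker line itself is not copied.
    awk 'found { print } /Sending XML/ { found = 1 }' "$1" > "$2"
}
```

    It could be called as: after_marker trace.txt output. If the XML begins on the marker line itself, the awk program would need adjusting.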

  2. #2
    Just Joined!
    Join Date
    Sep 2006
    Location
    Norfolk Island
    Posts
    31
    Hi richiep,

    It looks like you've done the hard part. Basically all you need now is a simple shell script file which you can use to ask for file names, different parameters to search for (or a file where they are located) and any other variable you want to use in your sed/awk string.

    There's a pretty good bash scripting guide on this site (I'm not allowed to post links yet) and lots about bash scripting can be found via Google, but a basic rendition could be as simple as:

    Code:
    #!/bin/bash
    echo "Get my XML"
    echo -n "Enter the source file name : "
    read infile
    echo -n "Enter the search string, separate tags with pipe character : "
    read search
    echo -n "Enter output file name : "
    read outfile
    sed 's/^[ \t]*//' $infile | awk '/$search/' > $outfile
    echo "Data should be in $outfile if this fool got his code right"
    I'm no coder, so that might work. It'll give you the idea anyway.

    cheers

  3. #3
    Just Joined!
    Join Date
    Oct 2010
    Posts
    4
    ni_boy,


    Thanks for getting back to me, I'll try that out shortly! I don't really know the difference between all of the shells, so looking up bash will most certainly help. I'll let you know how I get on. Thanks again.

  4. #4
    Just Joined!
    Join Date
    Oct 2010
    Posts
    4
    Hello again,

    I just tried it. I created the script and named it extractxml.bsh

    I have put it in a directory on my PATH and have also given it the correct permissions (777), however when I run it as a script, extract_xml.bsh, it says incorrect command.

    Is there anything else I can try? I don't really know much about this problem!

    Thanks for the help

  5. #5
    Just Joined!
    Join Date
    Oct 2010
    Posts
    4

    almost there...

    OK, I have managed to get that working now, although it doesn't put anything into the outfile... it creates it but it is empty.

    I am so very close to finishing this script. Can anyone see why the script is not searching through the infile and writing the results to the outfile?


    Many thanks

  6. #6
    Just Joined!
    Join Date
    Sep 2006
    Location
    Norfolk Island
    Posts
    31
    Hi Richie,

    The script lines I gave you were pretty basic and off the top of my head with no checks. It might pay to post what you have in your script to give others a look.

    Also, try a few echos of your variables to make sure they look right: echo "$infile", etc.

    There are also no paths in the file, so you need to make sure that the file you are searching is in the directory you are in.

    Last, but not least, this script runs a bash shell but I think you're running ksh, is that right? If so, does your sed/awk work the same in bash? I'm no guru and I guess it should, but it wouldn't hurt to try.
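
    The checks suggested above can be sketched as a small function to call before the sed/awk line. This is only an illustration: the name check_input is hypothetical, and it simply echoes the variable and tests that the file is readable and non-empty from the current directory.

```shell
#!/bin/sh
# check_input FILE  (hypothetical helper name)
# Prints the value it was given and fails if the file cannot be used.
check_input() {
    # Show exactly what the variable holds, quotes make stray spaces visible.
    echo "infile is: '$1'" >&2
    if [ ! -r "$1" ]; then
        echo "cannot read '$1' from $(pwd)" >&2
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "'$1' exists but is empty" >&2
        return 1
    fi
    return 0
}
```

    In the script it could be used as: check_input "$infile" || exit 1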

  7. #7
    Just Joined!
    Join Date
    Sep 2006
    Location
    Norfolk Island
    Posts
    31
    D'Oh! My bad, you need to change the quotes in the awk from single to double. Inside single quotes the shell leaves $search unexpanded, so awk gets the literal text /$search/ instead of your tags.

    Code:
    sed 's/^[ \t]*//' "$infile" | awk "/$search/" > "$outfile"
    cheers
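
    A variation on the awk line above, as a minimal sketch: passing the pattern with awk's -v option and matching with $0 ~ pat keeps the awk program in single quotes, so a / in the search string does not break the regex. The function name extract_tags is hypothetical, and [[:space:]] is swapped in for [ \t] because not every sed understands \t.

```shell
#!/bin/sh
# extract_tags INFILE PATTERN OUTFILE  (hypothetical helper name)
# PATTERN is an extended regex such as 'objectName|fieldValue'.
extract_tags() {
    # Strip leading whitespace, then keep only lines matching the pattern.
    # -v hands the shell value to awk as the variable "pat".
    sed 's/^[[:space:]]*//' "$1" | awk -v pat="$2" '$0 ~ pat' > "$3"
}
```

    With the variables from the script it could be called as: extract_tags "$infile" "$search" "$outfile". Note that awk processes backslash escapes in a -v value, so a pattern containing backslashes may need them doubled.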
