  1. #1

    How do I use sed/awk to get strings from a file?


    Hello community!

    I'm elbow deep in a script and I need some assistance - I'm stuck and I can't figure this one out.

    I'm trying to get a list from a file.

    Here is what the file looks like:
    Code:
    file_example_1_name=example_1
    file_example_1_type=1
    file_example_1_location=/foo/
    file_example_2_name=example_2
    file_example_2_type=2
    file_example_2_location=/bar/
    file_example_3_name=example_3
    file_example_3_type=1
    file_example_3_location=/foobar/
    The file contains bits of information about the files. I would like to be able to find the location for each file of type 1. So I would like my output to look like:
    Code:
    example_1
    example_3
    I can figure out how to get to the final output but I'm stuck trying to extract the bits of data needed from all of the lines in the file. If I can get the name I should be able to do the rest.

    I've tried a lot of things, the best I've been able to do is:
    Code:
    sed -n 's/file_.*_type=1/???/p'
    and I get very close but I don't have the proper info to replace. I think I just need to replace the '???' with something but I'm really not sure if I'm even going down the right path. (I know my example doesn't work, it just lists 2 lines of '???')

    How do I go through a file or standard output and for each line search for an unknown value between two known strings and then print the previously unknown strings?

    Any help would be appreciated. There is so much talent in this forum it astounds me. I'm sure there are a million ways to do this - I'm just a newb and can't figure it out.

    Thanks!

    - robbie -

  2. #2
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    Your specification is a bit confused. You say that you want to find the "location" of the files, but your example of the output lists only the lines which say "name" in the input, not "location".

    Either way, what complicates the task most is that the records in the input (several lines, grouped by an ID) do not match the records naturally processed by sed or awk (individual lines). As you say, there are obviously a million ways to go about this, but if it were me, I'd write an explicit impedance matcher, maybe like this:

    Code:
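    # Append a dummy record so that the last real record gets flushed.
    # The sed step splits each line into three lines: ID, field name, value.
    # (\n in the replacement is a GNU sed extension.)
    echo "endmarker_=" | cat input - |
    sed -n 's/^\([^=]*\)_\([^_=]*\)=\(.*\)$/\1\n\2\n\3/p' | (
        cur=; type=; loc=
        while read -r id && read -r recname && read -r data; do
            # A new ID means the previous record is complete.
            if [ "$id" != "$cur" ]; then
                cur="$id"
                if [ "$type" = 1 ] && [ -n "$loc" ]; then echo "$loc"; fi
                type=; loc=
            fi
            if [ "$recname" = type ]; then type="$data"; fi
            if [ "$recname" = location ]; then loc="$data"; fi
        done
    )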
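    For comparison, here is a rough awk sketch of the same grouping idea. It assumes every key has the form <id>_<field>, split at the last underscore, and that the ID is what ties a record together. The names `locations_of_type_1`, `rec`, `seen` and `order` are made up for illustration. Because it collects each record by ID before printing, the order of the lines within a record should not matter:

```shell
#!/bin/sh
# Sample data copied from the first post.
input='file_example_1_name=example_1
file_example_1_type=1
file_example_1_location=/foo/
file_example_2_name=example_2
file_example_2_type=2
file_example_2_location=/bar/
file_example_3_name=example_3
file_example_3_type=1
file_example_3_location=/foobar/'

# Hypothetical helper: print the location of every record whose type is 1.
locations_of_type_1() {
    awk '
    {
        eq = index($0, "=")                 # position of the first "="
        if (eq == 0) next                   # skip lines without a key=value shape
        key = substr($0, 1, eq - 1)
        value = substr($0, eq + 1)
        us = match(key, /_[^_]*$/)          # last underscore separates ID and field
        if (us == 0) next
        id = substr(key, 1, us - 1)
        field = substr(key, us + 1)
        if (!(id in seen)) {                # remember IDs in order of first appearance
            seen[id] = 1
            order[++n] = id
        }
        rec[id, field] = value
    }
    END {
        for (i = 1; i <= n; i++)
            if (rec[order[i], "type"] == "1" && rec[order[i], "location"] != "")
                print rec[order[i], "location"]
    }'
}

result=$(printf '%s\n' "$input" | locations_of_type_1)
printf '%s\n' "$result"
```

    Swapping "location" for "name" in the END block would give the name list instead.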
    I hope that helps a bit.

  3. #3
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    Oh, by the way. If you are sure that the lines containing the data you want come before the "type" lines, you can just use sed and its hold space. Try this:
    Code:
    sed -n '/file_.*_name=/{s/^[^=]*=//; h}; /file_.*_type=1/{g;p}'

  5. #4

    Sorry for the confusion

    Hello-

    I looked at the first solution and it scared me a bit. I have no idea what most of that means - I'm still new and learning but I'll dig through there someday and learn quite a bit. The second solution makes more sense - it looks like you're running a sed within a sed to get the data and I might be able to figure that part out.

    When I run the second solution I get an error:
    Code:
    COMPY:~ robbie$ cat ~/test.file | sed -n '/file_.*_name=/{s/^[^=]*=//; h}; /file_.*_type=1/{g;p}'
    sed: 1: "/file_.*_name=/{s/^[^=] ...": extra characters at the end of h command
    My end goal is to get the "location" for the files but I need the name list for other things in my script and I'll learn more if I have to do the work. I'm just looking for how to take:

    Code:
    file_example_3_type=1
    and get

    Code:
    example_3
    Since the name label changes and is user configurable I want to avoid using cut as it will cause issues down the road.

    I've found a few things that will take the entire file and get the information from between the first "file_" and the last "_type=1" but I need a process that goes through line by line and substitutes the original line for just the name label part.

    I could add a while/do/done loop but it seems that I should be able to do it without a loop and all in one nice string.

    You've given me a good start to work with. Thanks, it looks like gibberish now but after reading the man page again maybe I can make some sense of it.

    I appreciate the support, these tools have been around longer than I have and it may take the rest of my time here to learn how to use them properly. It's an entirely different language like nothing else around. Thanks!

    - robbie -

  6. #5
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    Quote Originally Posted by rkasowan View Post
    When I run the second solution I get an error:
    Code:
    COMPY:~ robbie$ cat ~/test.file | sed -n '/file_.*_name=/{s/^[^=]*=//; h}; /file_.*_type=1/{g;p}'
    sed: 1: "/file_.*_name=/{s/^[^=] ...": extra characters at the end of h command
    That's interesting. Is your distribution not using GNU sed? Try this instead, see if that makes the problem vanish:
    Code:
    sed -n '/file_.*_name=/{s/^[^=]*=//; h;}; /file_.*_type=1/{g; p;}'
    (The difference is the added semicolon after the `h' and `p' commands.)
    If you want to learn more about sed, I would recommend its texinfo manual over the manpage, since it holds more information as to the general operation of sed. Try `info sed'. (Or, in Emacs, C-h i m sed RET )
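    To see what the hold space is doing, here is the same command written with one sed command per line. As far as I know that is the most conservative form, and it should be accepted by both GNU and BSD sed. Treat it as a sketch run against the sample data from the first post; the variable names `input` and `names` are mine:

```shell
#!/bin/sh
# Sample data copied from the first post.
input='file_example_1_name=example_1
file_example_1_type=1
file_example_1_location=/foo/
file_example_2_name=example_2
file_example_2_type=2
file_example_2_location=/bar/
file_example_3_name=example_3
file_example_3_type=1
file_example_3_location=/foobar/'

names=$(printf '%s\n' "$input" | sed -n '
# On a name line: strip everything up to the first "=" and
# copy what is left (the name) into the hold space.
/file_.*_name=/ {
    s/^[^=]*=//
    h
}
# On a type=1 line: fetch the remembered name back and print it.
/file_.*_type=1/ {
    g
    p
}')

printf '%s\n' "$names"
```

    This relies on the "name" line coming before the "type" line of the same record, as in the sample.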

    Quote Originally Posted by rkasowan View Post
    My end goal is to get the "location" for the files but I need the name list for other things in my script and I'll learn more if I have to do the work. I'm just looking for how to take:

    Code:
    file_example_3_type=1
    and get

    Code:
    example_3
    Well, if all you wish is to extract the label from that line alone, then this is all you will need:
    Code:
    sed -n 's/^file_\(.*\)_type=1/\1/p'
    The `\1' in the replacement part of the `s' command tells sed to insert the matched text of the first parenthesized group of the regex.
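    A small sketch of that substitution on a few sample lines. The only change from the command above is an added `$' anchor, which is my assumption about what is wanted: without it, a line such as file_example_4_type=10 (a made-up line, not in the original file) would also match and come out as example_40.

```shell
#!/bin/sh
# The first three lines follow the sample file; the fourth is hypothetical
# and only there to show what the trailing $ anchor guards against.
input='file_example_1_type=1
file_example_2_type=2
file_example_3_type=1
file_example_4_type=10'

# \( ... \) captures the label between "file_" and "_type=1";
# \1 in the replacement puts just that captured text back.
labels=$(printf '%s\n' "$input" | sed -n 's/^file_\(.*\)_type=1$/\1/p')

printf '%s\n' "$labels"
```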

  7. #6

    Thank you!

    Your solution was spot on!

    I'm running Darwin on a Mac so it's a UNIX variant. I'm sorry I wasn't clear enough on the original post, I think I just overcomplicated the question.

    Thanks for the help - I've been stuck on this for a few days and finally can move on. I thought I was close but it would have taken me forever to figure this out. You rock!

    - robbie -
