  1. #1
    Just Joined!
    Join Date
    Nov 2006
    Posts
    3

    Bash scripting - string searching

    I am trying to extract text from a file, say the file looks like this:

    </tr><tr><td>domain1.com</td><td>2006-11-06 14:43:37</td>

    <td><a href="index.php?_host_id=47">1</a></td>
    <td><a href="index.php?_host_id=47">2</a></td>
    <td><a href="index.php?_host_id=47">3</a></td>

    </tr><tr><td>domain2.com</td><td>2006-11-06 14:43:37</td>

    <td><a href="index.php?_host_id=48">1</a></td>
    <td><a href="index.php?_host_id=48">2</a></td>
    <td><a href="index.php?_host_id=48">3</a></td>

    and many other lines like this. What I need is a way to get each host_id only once, one per line, like this:

    47
    48

    I have found the domain string with this already:

    awk '/domain/' file | cut -c 15-25

  2. #2
    Linux Engineer Javasnob's Avatar
    Join Date
    Jul 2005
    Location
    Wisconsin
    Posts
    942
    I would look into grep and sed.
    Flies of a particular kind, i.e. time-flies, are fond of an arrow.

    Registered Linux User #408794

  3. #3
    Linux Newbie
    Join Date
    Aug 2006
    Posts
    226
    If all the lines follow that same format, then one possibility would be the following:

    grep '_host_id=' file | cut -d '=' -f3 | sed 's/\".*$//'

    grep only returns the lines with the "_host_id" string
    cut returns the text following the second = sign
    sed strips everything from the " and beyond

    You could probably get rid of cut and do the line processing with just sed, but I don't have the time to figure it out. I also haven't tested this so it may be slightly off.
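
    For what it's worth, here is a rough sketch of the sed-only route mentioned above. It assumes the IDs are purely numeric and that each one is followed directly by a closing double quote, as in the sample lines; the temp file and the host_ids variable are just for illustration. A sort -u is tacked on at the end so each ID shows up once.

```shell
# Sample input modeled on the lines from the first post (written to a temp file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
</tr><tr><td>domain1.com</td><td>2006-11-06 14:43:37</td>
<td><a href="index.php?_host_id=47">1</a></td>
<td><a href="index.php?_host_id=47">2</a></td>
</tr><tr><td>domain2.com</td><td>2006-11-06 14:43:37</td>
<td><a href="index.php?_host_id=48">1</a></td>
EOF

# -n suppresses default output; the trailing p prints only the lines where the
# substitution matched, so no separate grep is needed.
# Assumption: IDs are digits only and are followed by a double quote.
host_ids=$(sed -n 's/.*_host_id=\([0-9][0-9]*\)".*/\1/p' "$sample" | sort -u)
echo "$host_ids"
rm -f "$sample"
```

    Note that sort -u orders the IDs as text, so 100 would come out before 47; sort -un compares them as numbers instead.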

  4. #4
    Just Joined!
    Join Date
    Nov 2006
    Posts
    3
    Figured it out, thanks for the help.

    Didn't think to just 'cut' it up.

    Something useful too:

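    'sort -um' will give you unique instances of strings within a file without sorting the results by value. It only merges, so it removes a duplicate only when it sits directly after the line it repeats, which is the case for the host_id lines here.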
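
    If the repeats might be scattered through the input rather than grouped together, one alternative that should also keep the original order is awk's "seen" array idiom. This is only a sketch; the ID list below is made up for illustration.

```shell
# Hypothetical ID list with repeats that are not adjacent to their first occurrence.
ids=$(printf '%s\n' 48 47 48 47 49)

# seen[$0]++ evaluates to 0 (false) the first time a line appears, so
# !seen[$0]++ is true only for first occurrences; awk prints those lines
# and skips every later repeat, leaving the order untouched.
unique_ids=$(printf '%s\n' "$ids" | awk '!seen[$0]++')
echo "$unique_ids"
```

    This keeps every distinct line in memory, which is fine for a list of host IDs but worth knowing for very large files.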
