  1. #1
    Just Joined!
    Join Date: Dec 2010
    Posts: 4

    Help extracting using grep or similar

    Hey guys, I am writing a script and need help extracting movie titles from this HTML:
    Code:
    <a title="Faster" href="/browse/catalog/movieDetails/461027" onmouseover="if(DndUtil.windowLoaded){ new MovieRollover(this); }"><img src="url_blocked_bb/title/461027.jpeg?wid=100&hei=143" height="143" width="100" alt=""></a>
    I need the output to be just Faster, without the quotes.
    I have tried various grep commands, but I can't get any of them to extract it properly.

    Any ideas? I don't mind if it's grep, awk, or any other utility.

    EDIT1: Oh, and this is going to run in a loop, so the movie titles will vary in length.

    EDIT2: Actually, I figured it out. For my specific case it looks like this:

    Code:
    grep -o "\<a\ title=\"[A-Z][a-z]*" | cut -d\" -f2 | uniq
    EDIT3: This works, but I just realized it only handles single-word movie titles, because [A-Z][a-z]* stops at the first space. Any help handling titles with spaces?

  2. #2
    Just Joined!
    Join Date: Dec 2010
    Posts: 4
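    I started thinking about this a bit differently, and I got it. Instead of matching the letters of the title, match everything between title=" and the closing quote before href.

    Here is the code: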
    Code:
    grep -o \<a\ title=\".*\"\ href | cut -d\" -f2
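One caveat with the command above: `.*` is greedy, so if two links ever share a line, the match runs through to the last `" href` and only the first title survives the `cut`. A possible variant, sketched here under the assumption that every link has the `<a title="..." href` shape shown in post #1, uses `[^"]*` so the match stops at the closing quote. The function name `extract_titles` and the second sample title are made up for illustration, and single quotes around the pattern remove the need for backslashes.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reads HTML on stdin and
# prints one title per line.
extract_titles() {
    # [^"]* matches any run of characters that are not double quotes, so
    # titles with spaces, digits, or punctuation are captured and the match
    # cannot run past the closing quote into a second link on the same line.
    # cut then keeps the text between the first pair of double quotes.
    # uniq drops adjacent duplicates, as in the command from post #1.
    grep -o '<a title="[^"]*" href' | cut -d'"' -f2 | uniq
}

# Sample input modeled on the snippet in post #1. The second link (title
# and id) is invented for this example.
printf '%s\n' \
  '<a title="Faster" href="/browse/catalog/movieDetails/461027"><img alt=""></a>' \
  '<a title="The Social Network" href="/browse/catalog/movieDetails/000000"><img alt=""></a>' \
  | extract_titles
```

In a script this would typically be fed from a file or a download, for example `extract_titles < page.html`, where `page.html` is a placeholder name.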
