  1. #1
    Linux Newbie
    Join Date
    Dec 2005
    Posts
    104

    Problem with extracting links from a file...


    Hey,
    I have a series of files that I'm trying to extract .wmv links out of.
    I have this so far...
    Code:
    issue='tempurl/issue9'
    
    # Get the URL for the Text files containing the links and download them.
    for i in $( cat $issue.txt | grep -A1 '+ (Video)' | grep -v '+ (Video)' | grep -v '\-\-'); do
        cd /home/funny/tempurl/
       wget $i
    done
    
    
    ##Go through the text files and extract the links
    for i in $( ls -lh /home/funny/tempurl/ | awk ' {print $9}' ); do
        echo hello
        cat $i | grep 'src=' |grep 'wmv'| awk 'BEGIN { FS = "[\"]" } ; { print $6 }'
        cat $i | grep '<a href=' |grep  'http://media.ebaumsworld.com/' | awk 'BEGIN { FS = "[\"]" } ; { print $4 }'
    done ## for i in $( ll | awk ' {print $9}' ); do
    Some of the links are like http://media.ebaumsworld.com/LIKE TO WMV...

    but others are like src="MOVIE.WMV"
    With the commands I have above, it works, but it displays the complete links twice, because those lines contain both WMV and SRC...
    So I get this:
    hello
    christmas.wmv
    hello
    copierprank.wmv
    hello
    drunkwalk.wmv
    hello
    hello
    hello
    keepaway.wmv
    hello
    4
    hello
    hello
    pulpdragon.wmv
    hello
    rockhard.wmv
    hello
    soccer.wmv
    hello
    http://media.ebaumsworld.com/wmv/tit...ojacksback.wmv
    http://media.ebaumsworld.com/wmv/tit...ojacksback.wmv
    I only want one copy of each link, as I will pass them to wget to actually download the files...

    help??
    cheers.

  2. #2
    Linux Newbie
    Join Date
    Dec 2005
    Posts
    104
    Something like this??
    Code:
    # Loop over the files directly; parsing "ls -lh" output yields bare names
    # that only resolve when the script runs from inside /home/funny/tempurl/.
    for i in /home/funny/tempurl/*; do
    
        echo -------------------------------------------------------
    
        # [ "$(...)" = "" ] is true when the pipeline prints nothing.
        if [ "$(grep 'src=' "$i" | grep 'mp3' | awk 'BEGIN { FS = "[\"]" } ; { print $6 }')" = "" ]; then
            found=""
            ## Do nothing -- we don't have a match
        else
            grep 'src=' "$i" | grep 'mp3' | awk 'BEGIN { FS = "[\"]" } ; { print $6 }'
        fi
    
        if [ "$(grep '<a href=' "$i" | grep 'http://media.ebaumsworld.com/' | awk 'BEGIN { FS = "[\"]" } ; { print $4 }')" = "" ]; then
            found=""
            ## Do nothing -- we don't have a match
        else
            grep '<a href=' "$i" | grep 'http://media.ebaumsworld.com/' | awk 'BEGIN { FS = "[\"]" } ; { print $4 }'
        fi
    
        if [ "$(grep 'src=' "$i" | grep 'wmv' | awk 'BEGIN { FS = "[\"]" } ; { print $6 }')" = "" ]; then
            found=""
            ## Do nothing -- we don't have a match
        else
            grep 'src=' "$i" | grep 'wmv' | awk 'BEGIN { FS = "[\"]" } ; { print $6 }'
        fi
    
    done ## for i in /home/funny/tempurl/*
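
Since the goal is one copy of each link, another option might be to collect everything the loop prints and pass it through sort -u. The sketch below is only a rough idea, not a tested solution for these pages: extract_links is a made-up helper name, and it swaps the awk field numbers for grep -o (available in GNU and BSD grep), which assumes the links always appear as src="....wmv" or href="http://media.ebaumsworld.com/...".

```shell
# Hypothetical helper: print every media link found in the given files, once each.
extract_links() {
    for f in "$@"; do
        # Relative links such as src="MOVIE.WMV" (-i ignores case)
        grep -Eio 'src="[^"]*\.wmv"' "$f"
        # Absolute links such as href="http://media.ebaumsworld.com/..."
        grep -Eo 'href="http://media\.ebaumsworld\.com/[^"]*"' "$f"
    done |
        cut -d'"' -f2 |   # keep only the text between the first pair of quotes
        sort -u           # drop duplicate links
}
```

It could be called as extract_links /home/funny/tempurl/* > links.txt, and the resulting file handed to wget -i links.txt. Bare names like christmas.wmv would still need their base URL put in front before wget can fetch them.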
