  1. #1

    Help with Grep to extract a substring in html page

    Please, I'm only looking to use grep. I know there are other, better ways to do this with awk, sed, and so on, but I need to use just grep. How do I extract the title from an HTML page? For example,
    I want to extract just "MAIN PAGE" from <TITLE>MAIN PAGE</TITLE> using a regex. I can't use any pipes or anything like that.

  2. #2
    I don't know how you are going to get the lines matching the pattern you need to extract from, but once you have them, could you do this:

    [user@localhost]$ echo "<TITLE>MAIN PAGE</TITLE>"|cut -d'>' -f2|cut -d'<' -f1

  3. #3
    I can't use pipes or cut, just a grep regex.

  4. #4
    You're right, sorry for ignoring the regex part. Here it goes again, try this:

    echo "<TITLE>MAIN PAGE</TITLE>"|egrep -o '>[A-Z ]{3,25}'|egrep -o '[A-Z ]{3,25}'

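    Note the number 3: it is the minimum number of characters the string must have, and 25 is the maximum. You can change both to suit the strings you expect to match. The class [A-Z ] only matches uppercase letters and spaces, so adjust it too if the title contains anything else.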
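
Since the question asked for a single grep with no pipes, here is a minimal sketch of one way that might work. It assumes GNU grep built with PCRE support (the -P option, which is not available in every grep), and page.html is a hypothetical sample file created only for illustration. The lookbehind and lookahead match the tags without including them in the output.

```shell
# Hypothetical sample file, created here only so the example is self-contained.
printf '%s\n' '<html><head><TITLE>MAIN PAGE</TITLE></head></html>' > page.html

# Assumption: GNU grep with PCRE support (-P).
# -o prints only the matched text. (?<=<TITLE>) and (?=</TITLE>) require the
# tags around the match but leave them out of it; .*? keeps the match short.
title=$(grep -oP '(?<=<TITLE>).*?(?=</TITLE>)' page.html)
echo "$title"
```

This only handles a title that sits on one line with the tags in uppercase, as in the example; adding -i would make the tag match case-insensitive.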
