  1. #1
    Just Joined!
    Join Date
    Jun 2011
    Posts
    17

    Problem removing new lines


    I am trying to remove all newline characters from a document downloaded with wget, but wherever there is a <br> tag, the newline after it does not get removed. I have tried removing the <br> tags before removing the newlines, and I have tried these methods:

    Code:
    cat days.txt | tr -d '\n'
    cat days.txt | awk '{ printf "%s", $0 }'
    cat days.txt | sed ':a;N;$!ba;s/\n//g'
    None of these worked. Please help.

  2. #2
    Just Joined!
    Join Date
    Aug 2011
    Posts
    51
    I'm not at my computer to test it but try:
    awk -F\n '{print $1}' days.txt

  3. #3
    Just Joined!
    Join Date
    Jun 2011
    Posts
    17
    Quote Originally Posted by histrungalot View Post
    I'm not at my computer to test it but try:
    awk -F\n '{print $1}' days.txt
    negative, no effect

  5. #4
    Just Joined!
    Join Date
    Sep 2007
    Posts
    4
    Can you post the result of:

    [user@localhost]$ file days.txt

  6. #5
    Just Joined!
    Join Date
    Aug 2011
    Posts
    51
    You don't have a dos2unix issue? 0x0a vs 0x0d 0x0a

  7. #6
    Just Joined!
    Join Date
    Jun 2011
    Posts
    17
    It's actually not days.txt, it's yanswer, and "file ~/yanswer" returned this:
    yanswer: HTML document text

    I don't believe it's a dos2unix issue. I think it's just something invisible that gets added when there is a <br> present in the code of the file.

  8. #7
    Just Joined!
    Join Date
    Aug 2011
    Posts
    51
    Use od -c -A x -tx1 yanswer to see if there is something there

  9. #8
    Just Joined!
    Join Date
    Jun 2011
    Posts
    17
    it returned a very long list of characters, what exactly am i looking for?

  10. #9
    Just Joined!
    Join Date
    Aug 2011
    Posts
    51
    Look for the <br> and see what values come after it. On Linux, '\n' is 0x0a; on Windows, a line ending is 0x0d 0x0a ('\r\n').
    So you are just checking that the '\n' is a single character.
    You should see something like:

    0000: ... 3C 62 72 3E ?? ??   <- where the ?? are whatever comes after the <br> tag
               <  b  r  >
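
    A minimal sketch of what that check looks like, assuming a made-up one-line sample that ends in a Windows-style line ending rather than the real yanswer file. If a carriage return is present, the dump should show 0d 0a right after the bytes 3c 62 72 3e:

```shell
# Hypothetical sample: a <br> tag followed by a CRLF line ending.
# Dump it as hex bytes with hexadecimal offsets.
dump=$(printf '<br>\r\n' | od -A x -t x1)
printf '%s\n' "$dump"
```

    On the real file, the same od options can be pointed at the file itself and the output searched for 3c 62 72 3e.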

  11. #10
    Just Joined!
    Join Date
    Jun 2011
    Posts
    17
    Thank you. It seems there was an extra "\r" character that was causing the line to end and a new one to begin. A simple sed command removed it and all is well. Thanks for the help, guys.
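
    The exact sed command was not posted. As a rough sketch of one way to do the same job, assuming the file has CRLF line endings (the sample file and its contents below are made up for illustration), tr can delete the carriage returns and the newlines in one pass:

```shell
# Hypothetical sample file with a CRLF line ending after each <br> tag.
sample=$(mktemp)
printf 'Monday<br>\r\nTuesday<br>\r\n' > "$sample"

# Delete every carriage return (\r) and line feed (\n).
result=$(tr -d '\r\n' < "$sample")
printf '%s\n' "$result"

# Clean up the temporary file.
rm -f "$sample"
```

    With GNU sed, something like sed 's/\r$//' should strip only the trailing carriage returns and leave the newlines in place, which is closer to what dos2unix does.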
