  1. #1
    Just Joined! | Join Date: Nov 2012 | Posts: 9

    Separating a file by a token


    Hello Linux Gurus:

    I have an Open Office Document.

    In this file, a "start token line" appears many times: it starts with the word "MODEL", followed by some spaces, followed by a number (for example: MODEL 1, MODEL 2, MODEL 3, MODEL 4, MODEL 5, etc.).

    After each "start token line" come many lines, ending with an "end token line" that consists only of the word "ENDMDL".

    I would like to parse the file so that all lines from the "start token line" through the "end token line" (both included) are written to a new output file.

    In other words, if I ran this on a file containing 100 of these "start token line" / "end token line" pairs, I would like to produce 100 files.

    Any suggestions would be appreciated! I have parsed files before, but I have not tried a two-token approach...

  2. #2
    Just Joined! | Join Date: Nov 2012 | Posts: 9
    Specifically, I am able to get the following awk command to work:

    awk '/MODEL/ {flag=1;next} /ENDMDL/{flag=0} flag {print}' 1KZS.pdb > TEST

    And indeed, it copies all lines between each MODEL and ENDMDL pair. However, it concatenates them all into *one* output file, whereas I would like each block written to a *different* output file.

  3. #3
    Just Joined! | Join Date: Dec 2013 | Posts: 6
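    Check out the section on redirecting output in the gawk manual on the GNU site (Linuxforums doesn't allow me to post the URL until I have acquired a minimum of seniority): go to gnu.org, then /software/gawk/manual/gawk.html#Redirection. A print statement can be redirected to a file whose name is built while the script runs, so each MODEL block can go to its own file.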
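
    To make the redirection idea concrete, here is a minimal sketch rather than a tested solution for your exact file. It assumes the first field of a start line is exactly "MODEL", the second field is the model number, and the first field of an end line is "ENDMDL". The function name split_models, the default model_ prefix, and the .pdb suffix are made up for this example.

    ```shell
    # split_models INPUT [PREFIX]
    # Hypothetical helper: writes each MODEL ... ENDMDL block of INPUT
    # to its own file named PREFIX<number>.pdb (naming is an assumption).
    split_models() {
        input=$1
        prefix=${2:-model_}
        awk -v prefix="$prefix" '
            # Start token line: begin a new output file named after the model number.
            $1 == "MODEL" && $2 ~ /^[0-9]+$/ {
                out = prefix $2 ".pdb"
                writing = 1
            }
            # Inside a block: print the line, including the MODEL and ENDMDL lines.
            writing { print > out }
            # End token line: it has already been printed, so close the file and stop.
            $1 == "ENDMDL" {
                if (writing) close(out)
                writing = 0
            }
        ' "$input"
    }
    ```

    Called as split_models 1KZS.pdb model_, this should leave model_1.pdb, model_2.pdb, and so on in the current directory. There is no "next" on the MODEL rule, so the start line is kept, and the ENDMDL rule comes after the print rule so the end line is kept too. The close(out) call is there so that a file with many models does not run into the limit on simultaneously open files.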
