  1. #1
    Just Joined!
    Join Date
    Dec 2012
    Posts
    3

    Insert new line help


    So I have a file that contains

    Code:
    >NM_#########AUGCAUCGUAGCUAGUCGAUACUGGACUG>NM_########AUGAGUAUGUAUGAUGUAUGUAUGA
    where # is any digit 0-9 (the file is many repetitions of this pattern, all on a single line), and I want it to look like this:

    Code:
    >NM_#########
    AUGUAGUGCUAGCUGAUCGAUGCUAGUCGUAGC
    >NM_########
    AGUGAGUCGUCGUGACUGACUGUGGCAUCGUA
    Basically I need to add a new line before every > and between a number and a letter.

    OR

    If it's easier, I have something like this

    Code:
    >NM_#########
    AUGCUGAC
    GACGUAGC
    ACGUGUAG
    >NM_########
    AGUGCUGA
    ACGUAGCU
    ACGUGCUA
    and I want to condense all the letter-only lines under each header into a single line, like the output shown above.

    If someone could show me how to do this with a simple command, it would really help.

    Thank you!

  2. #2
    Just Joined!
    Join Date
    Dec 2012
    Posts
    4
    Assuming that the file containing the text is named original.txt, try the following:
    - open a terminal
    - cd to the directory containing original.txt
    - execute this: sed -e "s/\(>NM_[0-9]*\)\([^>]*\)/\1\n\2\n/g" original.txt > modified.txt

    This will leave original.txt untouched and write the output to a second file (modified.txt) in the same directory.
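
    To see what the substitution does before running it on the real file, here is a small sketch on made-up data. The accession numbers and the function name split_fasta are only for illustration, and it assumes GNU sed, where \n in the replacement is written out as a newline (other sed implementations may not do this). The first group captures the header (>NM_ followed by digits), the second captures everything up to the next >.

```shell
# Hypothetical wrapper around the sed command from the post; reads stdin.
split_fasta() {
    # GNU sed assumed: \n in the replacement becomes a real newline.
    # \1 = header (>NM_ plus digits), \2 = sequence up to the next '>'.
    sed -e 's/\(>NM_[0-9]*\)\([^>]*\)/\1\n\2\n/g'
}

# Made-up one-line input in the same shape as the file from the question.
printf '%s\n' '>NM_000014AUGCAUCG>NM_000015AUGAGUAU' | split_fasta
```

    Each header and each sequence ends up on its own line. Because a newline is also added after the last sequence, the output ends with one empty line; if that matters, piping the result through sed '/^$/d' should remove it.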

    If you are interested in (or in need of) string manipulation, I suggest you have a look at regular expressions.
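
    For the second layout in the question (each sequence already split over several lines), the sed command above does not apply directly. One possible approach, as a rough sketch rather than the only way, is to let awk collect the letter-only lines and print them joined whenever the next header starts. The function name join_sequences and the sample data are made up for illustration; it assumes every header line begins with >.

```shell
# Hypothetical helper: joins the sequence lines under each '>' header; reads stdin.
join_sequences() {
    awk '
        /^>/ {                        # header line: flush the previous sequence first
            if (seq != "") print seq
            seq = ""
            print                     # then print the header itself
            next
        }
        { seq = seq $0 }              # sequence line: append to the current record
        END { if (seq != "") print seq }   # flush the last sequence
    '
}

# Made-up input in the multi-line shape from the question.
printf '%s\n' '>NM_000014' 'AUGCUGAC' 'GACGUAGC' '>NM_000015' 'AGUGCUGA' 'ACGUAGCU' | join_sequences
```

    With a real file this would be run as join_sequences < original.txt > modified.txt, which again leaves the original untouched.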
