  1. #1
    Just Joined!
    Join Date
    Apr 2007
    Posts
    59

    Remove entries found in file2 from file1 using sed, awk

    Hi,
    I have a very large XML file (let's say file1.xml) and another XML file (file2.xml) that contains some of the same entries as file1.xml.
    My goal is to remove those common entries from file1.xml and store the result in file3.xml.

    I can't sort the file. How can I do it?
    Please help!

  2. #2
    Linux Newbie
    Join Date
    Jul 2008
    Posts
    181
    Why can't you use "sort"?

  3. #3
    Just Joined!
    Join Date
    Sep 2008
    Posts
    20
    I don't really understand what you are trying to do.
    Can you post an example?
    I think you should use a scripting language like Perl or PHP for this.

  4. #4
    Just Joined!
    Join Date
    Apr 2007
    Posts
    59
    I can't use sort because it's an installer project file and I do not want to break it by reordering its lines.
    Let's take an example of text files:

    file1.txt contains:
    Code:
    line1
    line2
    line3
    line4
    line5
    file2.txt contains:
    Code:
    line3
    line4
    My purpose is to have a file3.txt which should contain:
    Code:
    line1
    line2
    line5

  5. #5
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Show samples of your XML files.

  6. #6
    Just Joined!
    Join Date
    Sep 2008
    Posts
    20
    For your simple example above, you can do:

    grep -v -f file2.txt file1.txt > file3.txt
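
A slightly more defensive variant of the grep idea, as a sketch only: it assumes GNU or POSIX grep and that an entry counts as "common" only when the whole line is identical in both files. The pattern file (file2.txt, the lines to remove) goes after -f, and the file being filtered (file1.txt) comes last. The sample data is the one from post #4.

```shell
# Recreate the sample files from the thread.
printf 'line1\nline2\nline3\nline4\nline5\n' > file1.txt
printf 'line3\nline4\n' > file2.txt

# -F  treat each pattern as a fixed string, not a regular expression
# -x  a pattern must match the whole line
# -v  print only the lines that do NOT match
# -f  read the patterns (the lines to remove) from file2.txt
grep -F -x -v -f file2.txt file1.txt > file3.txt

cat file3.txt
```

The order of file1.txt is preserved, so no sorting is needed. Without -F, characters such as "." or "[" in the entries are interpreted as regular expression syntax, which can remove more lines than intended.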

  7. #7
    Just Joined!
    Join Date
    Apr 2007
    Posts
    59
    Thanks, it works for text files, but it does not seem to work for XML files.

    file1.xml
    Code:
    <fileEntry mountPoint="559" file="${compiler:TOOLS_ROOT}/lib/embed.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
    <fileEntry mountPoint="559" file="${compiler:ICU_ROOT}/lib/icuin.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
    <fileEntry mountPoint="559" file="${compiler:ICU_ROOT}/lib/icuuc.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
    <fileEntry mountPoint="559" file="${compiler:CXX_ROOT}/lib/mscvcch.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
    file2.xml
    Code:
    <fileEntry mountPoint="559" file="${compiler:TOOLS_ROOT}/lib/embed.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
    <fileEntry mountPoint="559" file="${compiler:CXX_ROOT}/lib/mscvcch.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />

  8. #8
    Just Joined!
    Join Date
    Sep 2008
    Posts
    20
    Hello,

    For your tiny XML example above, it still works perfectly.
    I tried it!

    grep -v -f file2.xml file1.xml > file3.xml
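
Since the thread title asks about awk, here is a minimal awk sketch of the same filtering, using the XML sample from post #7. It assumes one entry per line and that matching lines are byte-for-byte identical; the array name drop is just an illustrative choice.

```shell
# Recreate the sample files from post #7 (quoted 'EOF' keeps ${...} literal).
cat > file1.xml <<'EOF'
<fileEntry mountPoint="559" file="${compiler:TOOLS_ROOT}/lib/embed.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
<fileEntry mountPoint="559" file="${compiler:ICU_ROOT}/lib/icuin.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
<fileEntry mountPoint="559" file="${compiler:ICU_ROOT}/lib/icuuc.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
<fileEntry mountPoint="559" file="${compiler:CXX_ROOT}/lib/mscvcch.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
EOF
cat > file2.xml <<'EOF'
<fileEntry mountPoint="559" file="${compiler:TOOLS_ROOT}/lib/embed.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
<fileEntry mountPoint="559" file="${compiler:CXX_ROOT}/lib/mscvcch.lib" overwrite="4" shared="false" mode="644" uninstallMode="0" />
EOF

# First pass (NR == FNR is true only while reading file2.xml):
#   remember every line that should be removed.
# Second pass (file1.xml): print a line only if it was not remembered.
awk 'NR == FNR { drop[$0] = 1; next } !($0 in drop)' file2.xml file1.xml > file3.xml

cat file3.xml
```

If this kind of exact-line filtering seems to miss entries in the real files, one possible cause is that the lines differ in leading indentation, trailing spaces, or line endings (CRLF versus LF), because then they are no longer identical. Also note that the NR == FNR idiom misbehaves when file2.xml is empty, since file1.xml is then treated as the first file.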

  9. #9
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    man comm will tell you all you need to know.
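
For reference, a rough sketch of what the comm route would look like on the sample from post #4. Keep in mind that comm expects both inputs to be sorted, so this works on sorted copies and the result comes out in sorted order rather than in the original file order, which is the restriction raised earlier in the thread. The .sorted file names are only illustrative.

```shell
# Recreate the sample files from the thread.
printf 'line1\nline2\nline3\nline4\nline5\n' > file1.txt
printf 'line3\nline4\n' > file2.txt

# comm needs sorted input, so work on sorted copies and leave the originals alone.
sort file1.txt > file1.sorted
sort file2.txt > file2.sorted

# -2 suppresses lines found only in the second file,
# -3 suppresses lines common to both,
# leaving only the lines unique to the first file.
comm -23 file1.sorted file2.sorted > file3.txt

cat file3.txt
```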
