  1. #1
    Just Joined!
    Join Date
    May 2013
    Posts
    2

    zip vs. find + xargs


    I've inherited a bash script, and I'm not sure why the original author chose the method he did to get the desired results. Being fairly new to Linux scripting, I'd like to work out why he wrote it this way. Was it for performance or to save resources, or "just because"? The script contains the following code:

    Code:
    find . -name '*.gz' | xargs -s 20000 zip -r -q -j myzip.zip
    I'm wondering why the author bothered piping in file names from the find command when he could have easily used a wildcard:

    Code:
    zip -r -q -j myzip.zip *.gz

    I see too that you can pipe stdin into the zip command, so this too would work:

    Code:
    find . -name '*.gz' | zip -r -q -j myzip.zip
    Is there any advantage or disadvantage to any of the above methods?

    A couple of advantages I see with the second method are that it is simpler and that it avoids the first method's risk of truncating args (due to the -s 20000 parameter). But I don't want to change it only to discover there was a good reason it was written the way it was.

  2. #2
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,345
    Hi and welcome

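    With xargs and its -s option the author established a safeguard.
    There is a limit on how long a command line can be.
    For example, this can overflow:
    Code:
    zip -r -q -j myzip.zip *.gz
    because the shell may expand the wildcard into a line that is too long. The wildcard also matches only the current directory, while find descends into subdirectories.

    The piped variant does not run into that limit, because the names arrive on stdin rather than on the command line. However, as far as I know, Info-ZIP's zip reads file names from stdin when it is given -@, so it would need to look like this:
    Code:
    find . -name '*.gz' | zip -q -j myzip.zip -@

    With xargs the arbitrary number of .gz filenames is split into chunks, and each chunk is passed to its own zip invocation as command-line arguments. Nothing is truncated by -s 20000: when a chunk would exceed 20000 characters, xargs simply starts another zip run, and zip adds those files to the existing myzip.zip.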
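
    If you want to watch xargs do the chunking, here is a small sketch. It is only an illustration: the five file names are made up, echo stands in for zip so nothing is written, and -n 2 (at most two names per command) replaces -s 20000 so the batches are easy to see.

```shell
# Feed five hypothetical file names to xargs, at most two per command line.
# echo stands in for zip, so this only prints the commands xargs would run.
batches=$(printf '%s\n' a.gz b.gz c.gz d.gz e.gz | xargs -n 2 echo zip -q -j myzip.zip)
printf '%s\n' "$batches"
# prints:
# zip -q -j myzip.zip a.gz b.gz
# zip -q -j myzip.zip c.gz d.gz
# zip -q -j myzip.zip e.gz
```

    Every name ends up in exactly one batch, which is the same thing -s does, just measured in characters instead of a count.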

    TL;DR: The author knew what he was doing.
    One could add -type f as an additional parameter. This way only files are matched and not e.g. directories with the suffix .gz.
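
    A rough sketch of that suggestion, run against a throwaway directory. The file names are invented, and I have also added -print0 and xargs -0 (not in the original script, but supported by GNU and BSD find/xargs) so that names containing spaces stay in one piece. printf stands in for zip here so the example does not create an archive.

```shell
# Build a throwaway tree: two regular .gz files (one with a space in its name)
# and a directory whose name also ends in .gz. All names are made up.
demo=$(mktemp -d)
mkdir "$demo/logs.gz"
touch "$demo/a.gz" "$demo/b c.gz" "$demo/logs.gz/notes.txt"

# -type f skips the logs.gz directory; -print0 / -0 keep "b c.gz" as one argument.
# printf stands in for zip; in the real script the end of the pipeline would be
#   xargs -0 -s 20000 zip -q -j myzip.zip
matched=$(cd "$demo" && find . -name '*.gz' -type f -print0 | xargs -0 printf '[%s]\n' | sort)
printf '%s\n' "$matched"
rm -rf "$demo"
```

    Without -type f the logs.gz directory would be handed to zip as well, and with -r zip would then pull in everything inside it.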

  3. #3
    Just Joined!
    Join Date
    May 2013
    Posts
    2
    Thanks for the information! That is a big help.

    Thanks again.
