  1. #1
    Just Joined!
    Join Date
    Apr 2011
    Posts
    2

    Some manipulations with files and folders. (loop, find, create and rm)

    Hello!

    I need to carry out the following task.

    1. In my user's home dir I have folder1;
    2. In folder1 I have a varying number of subfolders with random names;
    3. Each subfolder contains one file anyname.pdf (the name differs in each subfolder) and a file content.txt (the same name in every subfolder).
    ## If a subfolder contains more than one .pdf or more than one .txt file, I must skip it (and move it to Folder3, another folder in my user's home dir);

    So,
    4. I must scan every subfolder in folder1 and enter each one;
    5. If the subfolder has more than one .pdf file or more than one .txt file, I must move it to Folder3 and go on to the next subfolder;
    6. In "good" subfolders I must take the file content.txt;
    7. It has this structure:

    dfdf{some trash}wqwq
    begin_of_useful_info
    info info ... info
    end_of_useful_info
    dfdf{some trash}wqwq


    8. From this content.txt I must extract only the "useful_info" (everything between begin_of_useful_info and end_of_useful_info, including both markers):

    begin_of_useful_info
    info info ... info
    end_of_useful_info

    (begin_of_useful_info and end_of_useful_info are keywords in every content.txt; they must be extracted too)

    9. I must put this info into a new .txt file named after the .pdf file in this subfolder!

    Example:
    In folder1/4323353/ I have the files GHTY34.pdf and content.txt. So, after the operation I must get a file GHTY34.txt containing the useful_info.

    10. Delete the parsed file content.txt
    (so now the subfolder again holds two files, somename.pdf and somename.txt)

    11. Then I must copy the .pdf file and the new .txt file with the matching name somewhere, for example to folder2 (in my user's home dir), and delete the analyzed subfolder.
    12. Go to the next subfolder.

    That's all
    Thanks!

  2. #2
    Linux Newbie
    Join Date
    Nov 2008
    Location
    Tokyo, Japan
    Posts
    243
    You need to do your own work so I will not solve your exact problem, but we are here to help. Collaboration is important in any job.

    So here are some shell tricks that can help you get started. Please ask many questions about what the script means, and how to make it work better.

    Code:
    #!/bin/bash
    # This script will not solve your problem
    # but it will hopefully give you ideas.
    
    # The "find" command can limit the depth of
    # its search with the "-maxdepth N" option.
    # The "-type d" option will select only directories.
    # "-mindepth 1" leaves folder1 itself out of the list.
    find folder1 -mindepth 1 -maxdepth 1 -type d >sub-directory.list
    
    # The "-name" option can take
    # wildcards like "*.txt" and "*.pdf":
    find folder1 -maxdepth 1 -name '*.txt' >./text-files.list
    find folder1 -maxdepth 1 -name '*.pdf' >./pdf-files.list
    
    # "bash" can execute loops
    # It can run a command for every line in a file:
    ( while IFS= read -r SUB_DIRECTORY
      do
      # the "wc -l" command can count lines
      # so you can use it to count items in a sub directory
        TXT_FILE_COUNT=$(find "$SUB_DIRECTORY" -maxdepth 1 -name '*.txt' | wc -l)
        PDF_FILE_COUNT=$(find "$SUB_DIRECTORY" -maxdepth 1 -name '*.pdf' | wc -l)
      # the "expr" command can do simple arithmetic
        FILES_COUNT=$(expr $TXT_FILE_COUNT + $PDF_FILE_COUNT)
      # "bash" also has "if" statements
        # if FILES_COUNT is greater than 2
        # (a total of exactly 2 can still hide two .pdf files and no .txt file):
        if [ "$FILES_COUNT" -gt 2 ]
        # "folder3" must already exist; the sub directory is moved into it
        then mv "$SUB_DIRECTORY" folder3/
        fi
      done
    ) <./sub-directory.list
    # ^here the "sub-directory.list" file is used as input for
    # the above loop. You could also just do this:
    find folder1 -mindepth 1 -maxdepth 1 -type d | \
    ( while IFS= read -r SUB_DIRECTORY
      do : #... same as above
      done
    )
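    As a follow-up sketch of the loop above, the two counts can also be tested separately, so that a sub directory with two .pdf files and no .txt file is still caught. The helper names "count_by_ext" and "has_too_many" and the scratch tree created with "mktemp -d" are assumptions made for illustration; they are not part of the original script. File names containing newlines would be miscounted by "wc -l".

```shell
#!/bin/bash
# count_by_ext DIR EXT: count regular files directly inside DIR named *.EXT
# (hypothetical helper, not from the original post)
count_by_ext() {
  find "$1" -maxdepth 1 -type f -name "*.$2" | wc -l
}

# has_too_many DIR: succeeds when DIR holds more than one .pdf
# or more than one .txt file
has_too_many() {
  [ "$(count_by_ext "$1" pdf)" -gt 1 ] || [ "$(count_by_ext "$1" txt)" -gt 1 ]
}

# Build a scratch tree so the sketch does not touch real data
demo=$(mktemp -d)
mkdir -p "$demo/folder1/good" "$demo/folder1/bad" "$demo/folder3"
touch "$demo/folder1/good/a.pdf" "$demo/folder1/good/content.txt"
touch "$demo/folder1/bad/a.pdf" "$demo/folder1/bad/b.pdf"

# Walk every sub directory of folder1 and move the "bad" ones to folder3
for dir in "$demo"/folder1/*/
do
  dir=${dir%/}                      # drop the trailing slash left by the glob
  if has_too_many "$dir"
  then mv "$dir" "$demo/folder3/"
  fi
done

ls "$demo/folder3"
```

    In this scratch tree only "bad" ends up in folder3, because it holds two .pdf files; "good" stays in folder1.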
    You can use the same technique to sift through "content.txt", and you can use "sed" to filter out the trash. You could then pipe the sed output to an "awk" script. "awk" works a bit differently from "bash" and is a bit more efficient, but I will write this part in bash so you can see how it is done:
    Code:
    # This "sed" regular expression will delete
    # all trash from lines that start with "dfdf" and
    # end with "wqwq". The trash is deleted, leaving
    # only "dfdfwqwq" on the line.
    # Look up "sed regular expressions"
    # on Google for more information
    sed -e 's/^dfdf.*wqwq$/dfdfwqwq/' content.txt | \
    ( while IFS= read -r LINE_OF_CONTENT
      do
        case "$LINE_OF_CONTENT" in
          ("begin_of_useful_info")
            if [ "$ST" = "open" ]
            then ST="begin"
            else ST="error"
            fi
          ;;
          ("end_of_useful_info")
            if [ "$ST" = "begin" ]
            then ST="end"
            else ST="error"
            fi
          ;;
          ("dfdfwqwq")
            # the first trash line opens the structure,
            # the one after "end_of_useful_info" closes it
            case "$ST" in
              ("") ST="open" ;;
              (end) ST="close" ;;
              (*) ST="error" ;;
            esac
          ;;
        esac
        case "$ST" in
          (open|begin|end) echo "$LINE_OF_CONTENT" ;;
          (close) echo "$LINE_OF_CONTENT" ; ST="" ;;
          (error) echo "Failed, content does not match expected structure." >&2; exit 1 ;;
        esac
      done
    )
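    If only the lines between the two keywords are wanted, a "sed" address range is a shorter route than a state machine. The following is a minimal sketch, assuming each keyword sits alone on its own line as in the sample from the first post; "extract_useful" is a hypothetical function name and the sample file is created only for the demonstration.

```shell
#!/bin/bash
# extract_useful FILE: print every line from the begin keyword
# through the end keyword, both included (hypothetical helper)
extract_useful() {
  sed -n '/^begin_of_useful_info$/,/^end_of_useful_info$/p' "$1"
}

# Recreate the sample structure in a temporary file
sample=$(mktemp)
printf '%s\n' \
  'dfdf{some trash}wqwq' \
  'begin_of_useful_info' \
  'info info ... info' \
  'end_of_useful_info' \
  'dfdf{some trash}wqwq' >"$sample"

extract_useful "$sample"
```

    Unlike the loop, this does not check that the file has the expected structure; a file with no begin keyword simply produces empty output.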
    But you can replace the whole "sed ... content.txt | while read LINE_OF_CONTENT; do ... done" pipeline with an "awk" program that does the same thing (let's call the awk program "my-line-filter.awk"). Then the whole script above could be replaced with this:
    Code:
    awk -f my-line-filter.awk content.txt >new-content.txt
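    The contents of "my-line-filter.awk" are not given here. One possible version, offered only as a sketch, prints the block between the two keywords and skips the structure checks, so it is simpler than the bash loop rather than an exact translation of it. The scratch directory and the way the output name is derived from the .pdf name with "${pdf%.pdf}.txt" are assumptions for illustration; the names 4323353 and GHTY34 come from the example in the first post.

```shell
#!/bin/bash
work=$(mktemp -d)

# A possible my-line-filter.awk: start printing at the begin keyword,
# stop after the end keyword (both lines are printed)
cat >"$work/my-line-filter.awk" <<'EOF'
/^begin_of_useful_info$/ { inside = 1 }
inside                   { print }
/^end_of_useful_info$/   { inside = 0 }
EOF

# Scratch sub directory modelled on the example in the first post
mkdir "$work/4323353"
touch "$work/4323353/GHTY34.pdf"
printf '%s\n' \
  'dfdf{some trash}wqwq' \
  'begin_of_useful_info' \
  'info info ... info' \
  'end_of_useful_info' \
  'dfdf{some trash}wqwq' >"$work/4323353/content.txt"

# Name the output after the .pdf file: GHTY34.pdf -> GHTY34.txt
for pdf in "$work"/4323353/*.pdf
do
  txt=${pdf%.pdf}.txt
  awk -f "$work/my-line-filter.awk" "$work/4323353/content.txt" >"$txt"
done

cat "$work/4323353/GHTY34.txt"
```

    The "%.pdf" in the parameter expansion strips the suffix from the path, so the new .txt file lands next to the .pdf it is named after.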
    Last edited by ramin.honary; 04-01-2011 at 09:51 AM.
