- 04-01-2011 #1 Just Joined!
- Join Date: Apr 2011
- Posts: 2
Some manipulations with files and folders. (loop, find, create and rm)
Hello!
I need to accomplish the following task.
1. In my user's home dir I have folder1;
2. In folder1 I have a varying number of subfolders with random names;
3. Each subfolder contains one file anyname.pdf (the name differs in each subfolder) and a file content.txt (the same name in every subfolder).
## If a subfolder contains more than one .pdf or more than one .txt file, I must skip that subfolder (and move it to Folder3, another folder in my user's home dir);
So,
4. I must scan every subfolder in folder1 and enter each one;
5. If the subfolder has more than one .pdf file or more than one .txt file, I must move it to Folder3 and go on to the next subfolder;
6. In "good" subfolders I must take the file content.txt;
7. It has the following structure:
dfdf{some trash}wqwq
begin_of_useful_info
info info ... info
end_of_useful_info
dfdf{some trash}wqwq
8. In this file content.txt I must find and cut out only the "useful_info" (everything between begin_of_useful_info and end_of_useful_info, inclusive):
begin_of_useful_info
info info ... info
end_of_useful_info
(begin_of_useful_info and end_of_useful_info are keywords present in every content.txt; they must be cut out too)
9. I must put this info into a new .txt file named after the .pdf file in this subfolder!
Example:
In folder1/4323353/ I have the files GHTY34.pdf and content.txt. So, after the operation I must get a file GHTY34.txt containing the useful_info.
10. Delete the parsed file content.txt
(so now the subfolder again holds two files: somename.pdf and somename.txt)
11. Then I must copy the .pdf and the new .txt file (which share the same base name) somewhere, for example to folder2 (in my user's home dir), and delete the processed subfolder.
12. Go to the next subfolder.
That's all
Thanks!
- 04-01-2011 #2 Linux Newbie
- Join Date: Nov 2008
- Location: Tokyo, Japan
- Posts: 243
You need to do your own work so I will not solve your exact problem, but we are here to help. Collaboration is important in any job.
So here are some shell tricks that can help you get started. Please ask many questions about what the script means, and how to make it work better.
Code:
    #!/bin/bash
    # This script will not solve your problem
    # but it will hopefully give you ideas.

    # The "find" command can limit the depth of
    # its search with the "-maxdepth N" option.
    # The "-type d" option will select only directories.
    # ("-mindepth 1" keeps folder1 itself out of the list.)
    find folder1 -mindepth 1 -maxdepth 1 -type d >sub-directory.list

    # The "-name" option can take
    # wildcards like "*.txt" and "*.pdf":
    find folder1 -maxdepth 1 -name '*.txt' >./text-files.list
    find folder1 -maxdepth 1 -name '*.pdf' >./pdf-files.list

    # "bash" can execute loops
    # It can run a command for every line in a file:
    ( while read -r SUB_DIRECTORY
      do
        # the "wc -l" command can count lines
        # so you can use it to count items in a sub directory
        TXT_FILE_COUNT=$(find "$SUB_DIRECTORY" -maxdepth 1 -name '*.txt' | wc -l)
        PDF_FILE_COUNT=$(find "$SUB_DIRECTORY" -maxdepth 1 -name '*.pdf' | wc -l)

        # the "expr" command can do simple arithmetic
        FILES_COUNT=$(expr $TXT_FILE_COUNT + $PDF_FILE_COUNT)

        # "bash" also has "if" statements
        # if FILES_COUNT is greater than 2:
        if [ "$FILES_COUNT" -gt 2 ]
        then
          # move the sub directory into folder3, keeping its name
          mv "$SUB_DIRECTORY" folder3/
        fi
      done
    ) <./sub-directory.list
    # ^here the "sub-directory.list" file is used as input for
    # the above loop. You could also just do this:
    find folder1 -mindepth 1 -maxdepth 1 -type d | \
    ( while read -r SUB_DIRECTORY
      do
        #... same as above
        :
      done
    )

You can use the same technique for sifting through "content.txt", and you can use "sed" to filter out trash. Then you can pipe the sed output to an "awk" script. "awk" is a bit different and a bit more efficient than "bash", but I will write it as bash so you can see how it is done:

Code:
    # This "sed" regular expression will delete
    # all trash from lines that start with "dfdf" and
    # end with "wqwq". The trash is deleted, leaving
    # only "dfdfwqwq" on the line.
    # Look up "sed regular expressions"
    # on Google for more information
    sed -e 's/^dfdf.*wqwq$/dfdfwqwq/' content.txt | \
    ( while read -r LINE_OF_CONTENT
      do
        case "$LINE_OF_CONTENT" in
          ("dfdfwqwq")
            # the first trash line opens the block, the second closes it
            if [ -z "$ST" ]; then ST="open"
            elif [ "$ST" = "end" ]; then ST="close"
            else ST="error"
            fi ;;
          ("begin_of_useful_info")
            if [ "$ST" = "open" ]; then ST="begin"; else ST="error"; fi ;;
          ("end_of_useful_info")
            if [ "$ST" = "begin" ]; then ST="end"; else ST="error"; fi ;;
        esac
        case "$ST" in
          (open|begin|end) echo "$LINE_OF_CONTENT" ;;
          (close) echo "$LINE_OF_CONTENT" ; ST="" ;;
          (error) echo "Failed, content does not match expected structure." >&2; exit 1 ;;
        esac
      done
    )

But you can replace all of the code that starts from (sed ... content.txt | while read LINE_OF_CONTENT; do ... done) with an "awk" program that does the same thing (let's call the awk program "my-line-filter.awk"). Then the whole script above could be replaced with this:

Code:
    awk -f my-line-filter.awk content.txt >new-content.txt
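The post names "my-line-filter.awk" but does not show what goes in it. Here is a rough sketch of what such a filter might look like, wrapped in a bash function so it can be tried directly. The function name extract_useful_info is made up for illustration, and the sketch assumes that each marker sits alone on its own line with no trailing spaces. Unlike the bash loop above, it prints only the marked block and does no error checking.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): print the lines of
# the file given as $1 from begin_of_useful_info through
# end_of_useful_info, markers included.
# Assumption: each marker is a whole line by itself.
extract_useful_info() {
    awk '
        /^begin_of_useful_info$/ { inside = 1 }  # start marker: switch printing on
        inside                   { print }       # print every line while switched on
        /^end_of_useful_info$/   { inside = 0 }  # end marker: switch off after printing it
    ' "$1"
}

# Possible use inside the sub directory loop (variable names are illustrative):
#   PDF_FILE=$(find "$SUB_DIRECTORY" -maxdepth 1 -name '*.pdf')
#   extract_useful_info "$SUB_DIRECTORY/content.txt" >"${PDF_FILE%.pdf}.txt"
```

The three awk rules could just as well be saved in a file and run with "awk -f". The "${PDF_FILE%.pdf}.txt" expansion in the usage comment strips the ".pdf" suffix and appends ".txt", which gives the new file the same base name as the PDF.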
Last edited by ramin.honary; 04-01-2011 at 09:51 AM.

