  1. #1
    Just Joined!
    Join Date
    Sep 2012
    Posts
    5

    Command line word count


    Howdy!

    I am new to linux and trying to accomplish a project by writing a script. I have a bunch of files and they are located in a folder tree inside a directory. I am to traverse the directory and locate all the files (done).

    Capture.PNG

    Next, I am to sort the list (done).

    Capture2.PNG

    Where I am running into problems, is I would like to make the output look like this:

    Capture3.PNG

    The RP# is in the folder path, and the last number is the word count of the file. I've played with piping the results from "find" to "cut" and then piping that to "wc", but I've never gotten it to work.

    Thanks for any help!

  2. #2
    Trusted Penguin
    Join Date
    May 2011
    Posts
    4,353
    try using the -exec flag in find to run the wc command.
    Code:
    find "$DIR" -type f -exec wc -w {} \;
    in bash, you can iterate over each line and parse what you want using cut/awk/sed, e.g.:

    Code:
    while read -r line; do
      echo "do something with $line"
    done < <(find ...)
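
    A minimal sketch of that kind of loop, for anyone following along. It pipes find into the loop instead of using bash's < <(...) form, so it also runs in plain sh. The directory layout, file names, and the count_words_per_file function are made up for illustration and are not from the original poster's screenshots.

```shell
# Hypothetical sample tree, created only so the sketch is self-contained.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/RP1" "$demo_dir/RP2"
printf 'one two three\n' > "$demo_dir/RP1/a.dat"
printf 'four five\n' > "$demo_dir/RP2/b.dat"

# Print "<path>: <word count>" for every regular file under the directory $1.
count_words_per_file() {
  find "$1" -type f | sort | while IFS= read -r line; do
    # Redirecting the file into wc prints only the number, not the file name.
    # tr removes the padding that some wc implementations add.
    words=$(wc -w < "$line" | tr -d ' ')
    echo "$line: $words"
  done
}

count_words_per_file "$demo_dir"

# Remove the sample tree again.
rm -rf "$demo_dir"
```

    One caveat with the pipe form: variables set inside the loop are usually not visible after it, because the loop runs in a subshell. That is the main reason to prefer < <(find ...) when you are sure the script runs under bash.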

  3. #3
    Just Joined!
    Join Date
    Sep 2012
    Posts
    5
    Thanks for the reply!

    I wrote the script using your suggestion:
    Capture.PNG

    And I got this result:
    Capture1.PNG

    I am not familiar with loops, so I wasn't sure how to proceed with that. Is there any way to do this by only piping the output of what I already have to "wc" and then to "cut" to produce the desired output? I'm thinking that I could do all of these tasks independently, assign the results to variables, and then assemble them in the form I need at the end.

  5. #4
    Trusted Penguin
    Join Date
    May 2011
    Posts
    4,353
    you can get the output of a command in Bash by either surrounding the command in backticks, e.g.:

    Code:
    output=`ls`
    or by using dollar sign parenthesis, e.g.:

    Code:
    output=$(ls)
    now the variable $output contains the STDOUT (standard output) from the ls command.

    now you can do whatever you want with the string contained in that variable (cut them, use sed on them, etc.). you can look for examples of cut or sed, for instance, in the initscripts on your system. look at the files in this dir:

    /etc/init.d/
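
    As a rough illustration of capturing output and then cutting it up, here is a small sketch. The path is hypothetical (the real folder names are only visible in the screenshots), and the variable names are just for the example.

```shell
# Hypothetical path, used only for illustration.
path="./SystemRestore/RP12/change.dat"

# Capture the output of a pipeline in a variable with $( ... ).
# cut -d/ -f3 splits the line on "/" and keeps the third field.
folder=$(echo "$path" | cut -d/ -f3)

# sed can rewrite the captured string, here dropping the leading "./".
trimmed=$(echo "$path" | sed 's|^\./||')

echo "$folder"
echo "$trimmed"
```

    With the path above, the first echo prints RP12 and the second prints SystemRestore/RP12/change.dat.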

  6. #5
    Linux User Krendoshazin
    Join Date
    Feb 2005
    Location
    London, England
    Posts
    471
    Perhaps something like this would be more suitable for your needs:
    Code:
    for file in `find . -name '*.dat' | sort -f`
    do
    wordcount="`wc -w $file | awk '{ print $1 }'`"
    echo $file | awk -F"/" -v wc=$wordcount '{ print $2":  " wc }'
    done
    The only problem with it is that $2 is relative to the directory you search for the files from.
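
    One possible way around that, assuming the folder you want is always the file's immediate parent (which may not match your layout): count fields from the end with $(NF-1) instead of from the start. The parent_folder function and both paths are hypothetical.

```shell
# Print the name of the directory that directly contains the file $1.
# NF is the number of "/"-separated fields, so $(NF-1) is the
# next-to-last component no matter how deep the path is.
parent_folder() {
  echo "$1" | awk -F"/" '{ print $(NF-1) }'
}

# Same file reached from two different starting points.
parent_folder "./SystemRestore/RP7/change.dat"
parent_folder "/tmp/export/SystemRestore/RP7/change.dat"
```

    Both calls print RP7, so the result no longer depends on where the search started.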

  7. #6
    Just Joined!
    Join Date
    Sep 2012
    Posts
    5
    Thanks Krendoshazin! You're awesome!

    That produced the output:
    Capture.PNG

    That is a huge step from what I was getting! I simply need it to look like this:
    Capture1.PNG

    It actually needs to be able to search the current directory, so that's perfect. Where are you trimming the line to show only that directory? I did not know that it could be done without using "cut". Essentially I need to make it change the text from "SystemrestorePoints" to RP#.

  8. #7
    Linux User Krendoshazin
    Join Date
    Feb 2005
    Location
    London, England
    Posts
    471
    Try changing
    Code:
    echo $file | awk -F"/" -v wc=$wordcount '{ print $2":  " wc }'
    to
    Code:
    echo $file | awk -F"/" -v wc=$wordcount '{ print $3":  " wc }'
    When AWK explodes the directory structure using /, each component becomes $1, $2, $3, and so on. If you explode ./SystemRestore/ThisDir/AnotherDir then $1 is ., $2 is SystemRestore, $3 is ThisDir, and $4 is AnotherDir. AWK usually separates these out using whitespace, but in this case we tell it to use something else.
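
    To see the splitting for yourself, a small sketch like this prints every field with its number. The split_path function name is made up for the example; the path is the one from the explanation above.

```shell
# Print each "/"-separated component of the path $1 next to its awk field number.
split_path() {
  echo "$1" | awk -F"/" '{ for (i = 1; i <= NF; i++) print "$" i " = " $i }'
}

split_path "./SystemRestore/ThisDir/AnotherDir"
```

    This prints four lines: $1 = . then $2 = SystemRestore then $3 = ThisDir then $4 = AnotherDir.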

  9. #8
    Just Joined!
    Join Date
    Sep 2012
    Posts
    5
    Awesome! That did the trick. I was able to modify the formatting of the output to make it look the way I needed. Thanks again.
