  1. #1
    Just Joined!
    Join Date
    Apr 2007
    Posts
    1

    Sed


    I'm trying to create a script that takes a file full of names of files (mp3's actually) with the following syntax:

    Mon_Da_YEAR_Sub_ject_Sub_ject_Location_side_1_or_2.mp3

    ...and makes folders from those file names in the following syntax:


    Location-YEAR-MN-DY-Subject-BLH

    The "BLH" part needs to be indiscriminately attached to every folder name.

    Basically it'll take some Regex to parse the data from the file name and reorganize it into the folder name...and I think there will have to be a loop statement like do-while...can anyone help me?

    ...I ended up having to use a batch file in DOS for the folder creation part...then I installed GNU Sed for DOS and took a stab at the RegEx...here is where I got stuck:

    Okay...first, here is some of my raw data:

    April_14_1991_Bread_of_life_Vallejo_side_1.mp3
    April_14_1991_Bread_of_life_Vallejo_side_2.mp3
    April_21_1991_Ministry_zadok_priesthood_Vallejo_side_1.mp3
    April_21_1991_Ministry_zadok_priesthood_Vallejo_side_2.mp3
    Apr_05_1992_Matthew_13_Sower_Berkeley_side_1.mp3
    Apr_05_1992_Matthew_13_Sower_Berkeley_side_2.mp3
    Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_1.mp3
    Aug_04_1991B_Moving_out_alive_soul_Vallejo_side_2.mp3
    Aug_04_1992_Daniel_7_Berkeley_side_1.mp3
    Aug_04_1992_Daniel_7_Berkeley_side_2.mp3

    So you have the month (.*) followed by the day ([0-9]{1,2}) followed by the year ([0-9]{4}) followed by a subject (.*) followed by a location (Vallejo,vallejo, Berkeley,berkeley, UC, Union City) followed by the word "side" followed by a 1 or 2

    sed -r 's/(.*)([0-9]{2})\s*_\s*([0-9]{4})\s*_\s*(.*)\s*([vallejo][.*])\s*(side)\s*_\s*([1,2])(.*)/\3-\1-\2-\4/' test2.txt > test3.txt

    The line above changes absolutely nothing: it returns the data exactly as it was read.

    Basically I need to combine this DOS script:

    for /f "usebackq delims=" %A in ("C:\Documents and Settings\David Candy\Desktop\New Text Document.txt" ) do md ".\%A"

    with the sed code above...can anyone help?!

  2. #2
    Linux Newbie Ziplock
    Join Date
    Jan 2009
    Location
    Adelaide
    Posts
    169
    Hi there,

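    .* is greedy - it matches as much as it can while still letting the rest of the pattern match, so a (.*) can swallow far more than the one field you meant it to capture. In Perl you could write (.*?) to match the smallest amount, but sed's regular expressions (even with -r) have no non-greedy form, so use something more restrictive instead, such as ([^_]+) for a single underscore-delimited field. There are several instances of this in your RE that you may want to change.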
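
    To see the greedy behaviour concretely, here is a small sketch using one of the sample lines from the first post. The variable names are just for illustration, and -E is the portable spelling of GNU sed's -r. Because sed has no lazy quantifier, the second command restricts the character class rather than trying to make .* smaller:

```shell
# Sample line copied from the data in the first post.
line='Apr_05_1992_Matthew_13_Sower_Berkeley_side_1.mp3'

# Greedy: (.*) runs to the LAST "_<digits>_" that still lets the rest
# of the pattern match, so it captures much more than the month.
greedy=$(printf '%s\n' "$line" | sed -E 's/^(.*)_([0-9]+)_.*$/\1/')

# Restricted: [^_]+ cannot cross an underscore, so it stops at the month.
restricted=$(printf '%s\n' "$line" | sed -E 's/^([^_]+)_([0-9]+)_.*$/\1/')

echo "greedy:     $greedy"
echo "restricted: $restricted"
```

    The first capture comes out as Apr_05_1992_Matthew (the match slides right to "_13_"), while the second is just Apr.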

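    Also, [vallejo] is a bracket expression: it matches any single one of those characters, not the word, and [.*] likewise matches a single literal dot or asterisk. With your sample data that pair never matches, which is why sed leaves every line unchanged. You are probably looking for ([Vv]allejo.*)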
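
    Putting that together, something along these lines might do the whole job on a Unix-like shell. Treat it as a rough sketch under a few assumptions of mine: the function names are made up, the location list is guessed from your description (with "Union City" assumed to appear as Union_City), a letter after the year such as 1991B is kept as part of the year, and the month is left as written rather than converted to a number.

```shell
#!/bin/sh
# Sketch: turn April_14_1991_Bread_of_life_Vallejo_side_1.mp3
# into    Vallejo-1991-April-14-Bread_of_life-BLH
# Groups: \1 month, \2 day, \3 year (optional letter suffix kept),
#         \4 subject, \5 location (assumed list, see above).
to_folder_names() {
    # -n plus the p flag prints only the lines that actually matched.
    sed -E -n 's/^([A-Za-z]+)_([0-9]{1,2})_([0-9]{4}[A-Za-z]?)_(.*)_([Vv]allejo|[Bb]erkeley|UC|Union_City)_side_[12]\.mp3$/\5-\3-\1-\2-\4-BLH/p'
}

# Sides 1 and 2 map to the same folder, so de-duplicate before mkdir.
make_folders() {
    to_folder_names < "$1" | sort -u | while IFS= read -r dir; do
        mkdir -p -- "$dir"
    done
}
```

    You would then run make_folders test2.txt in the directory where the folders should be created. If the list file was saved on DOS it may have CR line endings, which would stop the trailing $ from matching, so those would need stripping first.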

    RE's are fun
