  1. #1
    Just Joined!
    Join Date
    Oct 2004
    Posts
    7

    Multiline Search-Replace With Perl One-liner

    Hello,

    I recently migrated my free website to my Linux web server. I have to re-code all my .htm files, which were created under Windows.

    I need to do a search and replace with a Perl one-liner. I want to replace backslashes with forward slashes, but only in the relative image path (the src attribute) of each <img> tag, which may span multiple lines.
    So the <img> tag may look like this:
    <html>
    <body>
    <img src="public_html\images\Buster.jpg" height="250" width="128"
    alt=" /\ See above"
    title=" /\
    See above" />
    </body>
    </html>

    Sadly I haven't done RegEx searching in a while now. I'd be very thankful for some help.

    Thank you very much in advance!!!

    Best Regards,
    cibalo

  2. #2
    Linux Guru
    Join Date
    May 2011
    Posts
    1,843
    How about a 56-liner?

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Copy;
    
    # get files from command line arguments
    die "Give me files!\n" unless($#ARGV >= 0);
    my(@files) = (@ARGV);
    
    # loop thru list of files
    for my $file(@files){
    
      # make sure file exists
      die $file,": No such file\n" unless(-f$file);
    
      # temporary copy of file, containing changes
      my $tmpFile = $file.'.tmp';
    
      # remove the temp file, if it exists
      unlink($tmpFile) if(-f$tmpFile);
    
      # open a new temp file
      open(TMP,'>',$tmpFile) or die "can't open '$tmpFile': $!\n";
    
      # read the input file
      open(FH,'<',$file) or die "can't read '$file': $!\n";
      while(<FH>){
        chomp;
    
        # match on the img tag opening and closing
        if(/<img[ \t]+/ .. /[ \t]+\/>/){
    
          # match only the src="" value, stopping at its closing quote
          if(/^((?:.*[ \t])?src=")([^"]*)(".*)$/){
            my $pre = $1;
            my $src = $2;
            my $post = $3;
    
            # here we swap '\' for '/'
            $src =~ s/\\/\//g;
    
            print TMP $pre,$src,$post,"\n";
          }else{
            print TMP $_,"\n";
          }
        }else{
          print TMP $_,"\n";
        }
      }
      close(FH);
      close(TMP);
    
      # if satisfied, write the temp file to original file
    #  move($tmpFile,$file);
    
    }
    Copy it to a script, call it whatever ("foo.pl"), and make it executable:
    Code:
    chmod +x foo.pl
    Then run it and pass it your HTML files as arguments, e.g.:

    Code:
    ./foo.pl $(find . -type f -iname '*.html')
    But really, try it on just one test file first, e.g.:

    Code:
    ./foo.pl test.html
    It creates a temporary file for each input file, containing the actual changes (e.g., "test.html.tmp"). Look at that file. If it looks okay, remove the comment character (#) from line 54 (or thereabouts; the one that says move($tmpFile,$file); ) so that, from then on, the script moves the temp file into place, overwriting the original file.

  3. #3
    Just Joined!
    Join Date
    Oct 2004
    Posts
    7
    Quote Originally Posted by atreyu View Post
    How about a 56-liner?
    Hello atreyu,

    Thank you very much for replying to my post. What I need is a working solution, be it a 56-liner or a one-liner. I tried your suggestion and it works like a charm.

    For testing, I even changed the order of the attributes ("alt" and "src") in the img tag:
    <html>
    <body>
    <img height="250" width="128" alt=" /\ See above"
    src="public_html\images\Buster.jpg"
    title=" /\
    See above" />
    </body>
    </html>

    Then I ran it:
    $ ./foo.pl perltest.html
    $ ls perltest.html*
    perltest.html perltest.html.tmp
    $ diff -s perltest.html*
    4c4
    < src="public_html\images\Buster.jpg"
    ---
    > src="public_html/images/Buster.jpg"

    Best Regards,
    cibalo

  4. #4
    Linux Guru
    Join Date
    May 2011
    Posts
    1,843
    sweet - i'm glad it works. any chance to sling some perl code...
