  1. #1
    Linux Newbie
    Join Date
    May 2013
    Posts
    209

    storing the results of a perl::mechanize script


    Hello, I have a little script that fetches images or screenshots from web pages (or at least it should) and stores them.

    But I do not know where it stores them.

    I want to store the images in a folder. Is this doable?



    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    use WWW::Mechanize::Firefox;
    
    my $mech = new WWW::Mechanize::Firefox();
    
    open(INPUT, "<urls.txt") or die $!;
    
    while (<INPUT>) {
            chomp;
            print "$_\n";
            $mech->get($_);
            my $png = $mech->content_as_png();
            my $name = "$_";
            $name =~s/^www\.//;
            $name .= ".png";
            open(OUTPUT, ">$name");
            print OUTPUT $png;
            sleep (5);
    }

  2. #2
    Just Joined!
    Join Date
    Dec 2008
    Location
    Lund, Sweden
    Posts
    31
    Essentially, the script takes a list of web sites from urls.txt, loads each one, renders the page as a PNG image (I assume), and saves it to a file named after the web site, with the leading 'www.' removed and '.png' appended. For instance, www.butoba.net would become butoba.net.png.

    By manipulating $name you could add a directory (folder) name to the start of it, for instance by adding
    Code:
    $name = "directory/" . $name;
    before the open command in the loop.

    Of course, a hard-coded directory name is a bit simplistic; you could instead pass the directory name as an input to the script.
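
A minimal sketch of the idea above, taking the directory name as an input to the script. The helper name `output_name` and the default directory "images" are assumptions made for illustration; they do not appear in the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the file name the same way the original
# loop does, then prefix it with a directory.
sub output_name {
    my ($dir, $site) = @_;
    my $name = $site;
    $name =~ s/^www\.//;    # drop a leading "www."
    $name .= ".png";        # add the image extension
    return "$dir/$name";
}

# The directory comes from the first command-line argument.
# "images" is an assumed default, not something from the thread.
my $dir = @ARGV ? $ARGV[0] : "images";

print output_name($dir, "www.butoba.net"), "\n";
```

Run without arguments, this prints images/butoba.net.png; run as `perl script.pl shots`, it would use shots/ instead.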

  3. #3
    Just Joined!
    Join Date
    Sep 2008
    Posts
    23
    I would create a separate path variable for the directory. Probably look something like:
    Code:
    # add a variable before the loop
    my $path = "path/to/dir";
    
    ...
    
    # change your open(OUTPUT, ">$name") to
    open(OUTPUT, ">$path/$name");
    Though that ain't much different than Ricard's. Guess I just like my path and name to be separate vars.


    James
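
Keeping the path and the name in separate variables also makes it easy to check that the directory exists before writing into it. The following is a rough sketch using only core modules; the helper name `target_file` is hypothetical, and a temporary directory stands in for the real image folder so the example can be run anywhere.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: make sure $path exists, then join it with $name.
sub target_file {
    my ($path, $name) = @_;
    make_path($path) unless -d $path;            # also creates missing parents
    return File::Spec->catfile($path, $name);    # joins directory and file name
}

# Stand-in for the real folder: a scratch directory removed at exit.
my $base = tempdir(CLEANUP => 1);
my $path = File::Spec->catdir($base, "images");

my $file = target_file($path, "butoba.net.png");
print "$file\n";
```

Whether to create a missing directory or to stop with an error is a design choice; creating it is convenient for a small script like this one.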

  4. #4
    Linux Newbie
    Join Date
    May 2013
    Posts
    209
    Hello James, hello Ricard,


    first of all, many thanks for the answers!


    The script resides in this folder:

    Code:
    martin@linux-70ce:~/perl>
    So what if I create a folder right here, inside this folder, named
    Code:
    "images"
    In other words, I would set the path to that folder as follows.


    This means that the following step:

    Code:
    # add a variable before the loop
    my $path = "path/to/dir";
    becomes this:

    Code:
    $path = "~/perl/images";
    and the following step

    ...

    Code:
    # change your open(OUTPUT, ">$name") to
    open(OUTFILE, ">$path/$name");
    would give me the following expression:

    Code:
    open(OUTFILE, ">$path/$name")

    The full code would look like this:




    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    use WWW::Mechanize::Firefox;
    
    my $mech = new WWW::Mechanize::Firefox();
    
    my $path = "~/perl/images"; 
    
    open(INPUT, "<urls.txt") or die $!;
    
    while (<INPUT>) {
            chomp;
            print "$_\n";
            $mech->get($_);
            my $png = $mech->content_as_png();
            my $name = "$_";
            $name =~s/^www\.//;
            $name .= ".png";
            open(OUTFILE, ">$path/$name");
    	print OUTPUT $png;
            sleep (5);
    }
    Is this correct?

    I will try out the code right now and come back to report my findings.

    Update:

    In the folder images, which is set here

    Code:
    $path = "~/perl/images";
    no images are stored,

    and I get the following results in the terminal.

    What goes wrong here?


    Code:
    martin@linux-70ce:~/perl> perl mech20.pl
    Name "main::OUTPUT" used only once: possible typo at mech20.pl line 24.
    Name "main::OUTFILE" used only once: possible typo at mech20.pl line 23.
    
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 1.
    
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 2.
    
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 3.
    domain-name3
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 4.
    domain-name4
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 5.
    domain-name5
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 6.
    domain-name6
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 7.
    domain-name7
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 8.
    domain-name8
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 9.
    print() on unopened filehandle OUTPUT at mech20.pl line 24, <INPUT> line 10.
    martin@linux-70ce:~/perl>

    domain-name3 is used as a placeholder, since I cannot post URLs.
    I am new here, and all those things are not allowed for me...

  5. #5
    Just Joined!
    Join Date
    Sep 2008
    Posts
    23
    I've used Perl and Mechanize in the past, but I ain't no genius. I did notice that if I google your original code, others have asked the same question as you. Maybe you'll find your answer there.

  6. #6
    Just Joined!
    Join Date
    Dec 2008
    Location
    Lund, Sweden
    Posts
    31
    First of all, yeah J-Dude, I like your solution better, I always forget that it's possible to do variable substitution directly in any string in Perl. It comes to mind naturally for print, but of course works equally well wherever you use a string.

    Secondly, the problem with the latest version is that you use the file handle OUTFILE when opening the file, then refer to it as OUTPUT in the subsequent print command. You need to use the same name in both cases; I'd prefer OUTFILE, but it's a matter of opinion.
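
One way to make this kind of mismatch harder to miss is a lexical file handle: under `use strict`, a misspelled `$out` is a compile-time error rather than a "used only once" warning. The sketch below also checks the result of `open`, so a failed open is reported instead of passing silently. The helper name `save_bytes` and the placeholder bytes are assumptions for illustration; in the real script the data would come from `$mech->content_as_png()`.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write raw bytes to $file and fail loudly on error.
sub save_bytes {
    my ($file, $bytes) = @_;
    open(my $out, '>', $file) or die "Cannot open $file: $!";
    binmode($out);                  # PNG data is binary
    print {$out} $bytes;
    close($out) or die "Cannot close $file: $!";
    return;
}

my $dir   = tempdir(CLEANUP => 1);        # stand-in for the images folder
my $file  = "$dir/example.png";
my $bytes = "\x89PNG placeholder bytes";  # stand-in for the screenshot data

save_bytes($file, $bytes);
print "wrote ", -s $file, " bytes to $file\n";
```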

  7. #7
    Linux Newbie
    Join Date
    May 2013
    Posts
    209
    Hello Ricard, hello J-Dude,

    many thanks for the quick answer and the helping hand.
    I corrected the code to the following:

    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    use WWW::Mechanize::Firefox;
    
    my $mech = new WWW::Mechanize::Firefox();
    
    my $path = "~/perl/images"; 
    
    open(INPUT, "<urls.txt") or die $!;
    
    while (<INPUT>) {
            chomp;
            print "$_\n";
            $mech->get($_);
            my $png = $mech->content_as_png();
            my $name = "$_";
            $name =~s/^www\.//;
            $name .= ".png";
            open(OUTPUT, ">$path/$name");
    	print OUTPUT $png;
            sleep (5);
    }

    Unfortunately, it ended up with no image stored in the folder images.

    Note: the Perl script, called mech20.pl, resides in the folder

    Code:
    perl/
    The folder called "images" resides in the same folder where the script "mech20.pl" is located:

    Code:
    perl/images
    But the sad thing is that no image is written to, or let me say stored in, this folder.
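
A possible cause, offered as a guess since the thread ends here: a "~" inside a Perl string is a literal character. It is the shell that expands it to the home directory, so `open` would look for a directory literally named "~" under the current directory, and because the result of `open` is not checked, the failure goes unreported. The sketch below shows two alternatives using core modules only: building the path from the HOME environment variable, or resolving "images" relative to the script's own directory with FindBin. The variable names are illustrative.

```perl
use strict;
use warnings;
use FindBin qw($Bin);
use File::Spec;

# A "~" in a Perl string stays a literal tilde.
my $literal = "~/perl/images";

# Option 1: build the path from the HOME environment variable
# (falls back to the current directory if HOME is not set).
my $home      = defined $ENV{HOME} ? $ENV{HOME} : File::Spec->curdir;
my $from_home = File::Spec->catdir($home, "perl", "images");

# Option 2: resolve "images" relative to the directory holding the script.
my $from_script = File::Spec->catdir($Bin, "images");

for my $dir ($literal, $from_home, $from_script) {
    printf "%-45s %s\n", $dir, (-d $dir ? "exists" : "not found");
}
```

Since the thread says the images folder sits next to mech20.pl, the second option matches that layout regardless of where the script is started from.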
