Get a 'complete' webpage with wget or curl
Can anyone help?? I have put * in the links so they aren't live.
I am stuck trying to save a complete webpage from the command line, the way the right-click save option in Firefox or Epiphany does.
The page is built with JavaScript, and the data is hidden when viewing the source.
Here is an example (it's not the actual site I want, but it gives the same result).
If you save this page in Firefox:
Code:
h**p://guida.tv.it/guidatv/grid.html
you get a 350+ KB file with all the links in it.
However, with
Code:
wget -p - h**p://guida.tv.it/guidatv/grid.html >> /home/me/out.html
I get an 80 KB file without the links or the data that the JavaScript creates.
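To make the difference concrete, a rough check like the one below counts the href attributes in each saved file. As far as I understand, wget only saves the HTML the server sends and does not run the page's JavaScript, which would explain the smaller file. This is only a sketch: count_links is a made-up helper name, it assumes double-quoted href values, and it relies on grep supporting -o.

```shell
# count_links FILE: print the number of href="..." attributes found in FILE.
# Hypothetical helper; only double-quoted href values are counted.
count_links() {
    # grep -o prints every match on its own line, wc -l counts those lines,
    # and tr strips the padding that some wc implementations add.
    grep -o 'href="[^"]*"' "$1" | wc -l | tr -d ' '
}
```

Running count_links on the Firefox copy and then on /home/me/out.html should show how many links are missing from the wget copy.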
Using Perl also fails:
Quote:
perl -MWLP::Simple -e 'getprint "h**p://guidatv.sky.it/guidatv/grid.html"'
with this error:
Quote:
Can't locate WLP/Simple.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .).
BEGIN failed--compilation aborted.
I have tried combinations of the commands below:
Quote:
(cd cmdline && wget -nd -pHEKk h**p://www.pixelbeat.org/cmdline.html)  # Store local browsable version of a page to the current dir
wget -c h**p://www.example.com/large.file  # Continue downloading a partially downloaded file
wget -r -nd -np -l1 -A '*.jpg' h**p://www.example.com/  # Download a set of files to the current directory
wget f*p://remote/file[1-9].iso/  # FTP supports globbing directly
wget -q -O- h**p://www.pixelbeat.org/timeline.html | grep 'a href' | head  # Process output directly
echo 'wget url' | at 01:00  # Download url at 1AM to current dir
wget --limit-rate=20k url  # Do a low priority download (limit to 20KB/s in this case)
wget -nv --spider --force-html -i bookmarks.html  # Check links in a file
wget --mirror h**p://www.example.com/  # Efficiently update a local copy of a site (handy from cron)
wget -r -l 1 h**p://www.nameofsite.com
wget -O - w*w.google.com | html2text > google.txt
I can get the JavaScript files that create the page, or mirror the site, but I don't want to do that. I just want to replicate saving the complete page as one file and get the data into a text file that I can strip and edit. I want to collect data from several sources for a statistical program.
Does anyone know how to do this??
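For reference, the stripping step I have in mind once I have a complete copy of the page looks roughly like the sketch below. It is not a solution to the download problem itself. extract_links is a made-up helper name, it only handles double-quoted href values, and it assumes grep supports -o.

```shell
# extract_links: read HTML on stdin and print each double-quoted href value
# on its own line. Hypothetical helper for illustration only.
extract_links() {
    # grep -o isolates each href="..." attribute; sed then removes the
    # leading href=" and the trailing quote, leaving just the link target.
    grep -o 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}
```

It would be used along the lines of extract_links < saved_page.html > links.txt, where saved_page.html stands for whatever file the complete page ends up in.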