- 06-16-2010 #1 Just Joined!
- Join Date
- Jun 2010
- Posts
- 5
wget: "following links" vs. "retrieving directory tree"
(Sorry for the xxx below, newbs are not allowed to post certain banned strings.)
I want to mirror a website using wget -r, but even after reading the wget manual, I'm still a little unclear on how it works. With a command like:
Code:
wget -r --level=inf hxxp://xxx.foo.com/bar
will wget automatically follow all html links and recreate the entire directory tree under the bar/ directory? What if I don't want to follow any html links but still want to recreate the entire directory tree under the bar/ directory? What if I want the opposite: I don't need the directory tree but just want to follow all links in bar/index.html, up to say, N links away from index.html? I don't see how these operations could be distinguished using wget's recursive retrieval capabilities. Thanks for any input!
- 06-16-2010 #2 Linux Guru
- Join Date
- Nov 2007
- Posts
- 1,695
Wget doesn't know about a "directory tree" under /bar unless there is an HTTP link to the folder. If you want a backup independent of the web server (folders, files, etc.), then you're better off using something like rsync.
Code:
man wget
--mirror
    Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing.
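To make the distinction concrete, here is a rough sketch of how the cases from the question might map onto wget options. The URL, the depth value, and the helper names (run, mirror_subtree, follow_links_flat) are made up for illustration and are not part of wget. The helpers only print the command unless DRY_RUN=0 is set, so it is safe to try. Over HTTP, wget finds files only by following links, so there is no "directory tree without links" mode; the closest thing is to limit which links it follows and choose how the results are laid out on disk.

```shell
#!/bin/sh
# Illustrative values only (assumptions, not from the thread).
URL="http://www.example.com/bar/"
DEPTH=2

# Print the command by default; set DRY_RUN=0 to actually run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Case 1: follow links recursively, but never climb above the start
# directory, and recreate the remote paths locally.
#   -r           recursive retrieval (follows links)
#   -l inf       no depth limit
#   -N           time-stamping (skip files that have not changed)
#   --no-parent  do not ascend to the parent directory
mirror_subtree() {
  run wget -r -l inf -N --no-parent "$1"
}

# Case 2: follow links up to N hops from the start page, and drop all
# files into the current directory instead of rebuilding the tree.
#   -l N              stop N links away from the start page
#   --no-directories  do not create a directory hierarchy
follow_links_flat() {
  run wget -r -l "$2" --no-directories "$1"
}

mirror_subtree "$URL"
follow_links_flat "${URL}index.html" "$DEPTH"
```

Run as-is, this just prints the two wget command lines. By default recursive wget stays on the starting host, so neither command wanders off to other sites.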
- 06-16-2010 #3 Just Joined!
- Join Date
- Jun 2010
- Posts
- 5


