- 06-05-2005 #1 Linux Guru
- Join Date
- Apr 2003
- Location
- London, UK
- Posts
- 3,284
"Downloading" LinuxForums for "Offline Browsing"
A reminder that using automated software to produce a local copy of this website for offline browsing (or any other reason) is strictly forbidden.
Running any automated software against this site specifically, or my servers in general, is viewed as an attempted denial of service attack, and will be reported to your ISP.
Thanks for your co-operation.
- 06-06-2005 #2 Linux Newbie
- Join Date
- May 2005
- Posts
- 108
That's nice, and what, pray tell, is the ISP obligated to do? What crime has been committed?
- 06-06-2005 #3 Linux Guru
- Join Date
- Apr 2003
- Location
- London, UK
- Posts
- 3,284
It is an automated DoS against my server.
Originally Posted by eatinglemur
One individual yesterday sent 20,000 HTTP requests for PHP pages in 1 hour, seriously degrading the performance of this site. Such floods are no different from a SYN or ICMP flood, or any other form of malicious denial of service attack that ties up the resources of my server and prevents legitimate traffic from getting to the site. Many ISPs state in their terms of service that such activity is not allowed.
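To put that figure in perspective, here is a rough Python sketch of how a server operator might flag such a flood. It is not what this site actually runs; the class name `SlidingWindowCounter` and the threshold values are assumptions made up for illustration. The only number taken from the post is 20,000 requests in one hour, which averages out to roughly 5.6 requests per second, sustained.

```python
from collections import deque

# Figure quoted in the post: 20,000 requests in one hour.
REQUESTS = 20_000
SECONDS = 3600
average_rate = REQUESTS / SECONDS  # requests per second, averaged over the hour


class SlidingWindowCounter:
    """Hypothetical per-client counter: flags a client that makes more than
    `limit` requests within the last `window_seconds` seconds."""

    def __init__(self, window_seconds, limit):
        self.window = window_seconds
        self.limit = limit
        self.times = deque()  # timestamps of recent requests, oldest first

    def hit(self, now):
        """Record a request at time `now` (in seconds).

        Returns True if the client is over the limit for the current window.
        """
        self.times.append(now)
        # Drop timestamps that have fallen out of the window.
        while self.times and self.times[0] <= now - self.window:
            self.times.popleft()
        return len(self.times) > self.limit
```

For example, `SlidingWindowCounter(60, 120)` would flag any client exceeding 120 requests in a minute; the right threshold depends entirely on the site and is a judgment call.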
- 06-06-2005 #4 Linux Newbie
- Join Date
- May 2005
- Posts
- 108
How rude!
Originally Posted by jasonlambert
- 06-06-2005 #5
Very True
Originally Posted by eatinglemur
"TTFN Taa Taa For Now" by Tigger in Winnie the Pooh
http://www.distrowatch.com Linux Distros
We Live in a Windows World but there is Linux to save the day
- 06-06-2005 #6 Linux Enthusiast
- Join Date
- Jan 2005
- Posts
- 575
Re: "Downloading" LinuxForums for "Offline Browsing"
Out of curiosity, isn't that what Google does?
Originally Posted by jasonlambert
- 06-07-2005 #7
Not exactly. Google uses a 'crawler' that follows links and behaves much more like a human (i.e. viewing one page at a time). Automated downloaders tend to open several simultaneous connections to a server and download as much content as possible. In addition, search engines respect the robots.txt file, which may list which files should and should not be indexed (linuxforums.org has one, should you want to see an example). It's a fine line, but in the end the only *serious* difference that I know of is that Google does this in moderation and behaves more like a fast reader would, as opposed to a DoS-style script, which has no regard for moderation.
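As a rough illustration of the "moderate" behaviour described above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It checks robots.txt rules, fetches one URL at a time, and pauses between requests. The robots.txt text, the `example-bot` user agent, the example.org URLs, and the `polite_crawl` helper are all made up for the example; this is not the real linuxforums.org file, and not how Google's crawler is actually implemented.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /forum/search.php
Crawl-delay: 2
"""


def build_parser(robots_text):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def polite_crawl(urls, parser, fetch, user_agent="example-bot",
                 default_delay=1.0, sleep=time.sleep):
    """Fetch URLs sequentially (a single connection at a time), skipping
    anything robots.txt disallows and sleeping between requests.

    `fetch` is whatever function actually downloads a page; it is passed in
    so the sketch stays independent of any particular HTTP library.
    """
    # Honour Crawl-delay if the site sets one, else fall back to our default.
    delay = parser.crawl_delay(user_agent) or default_delay
    fetched = []
    for url in urls:
        if not parser.can_fetch(user_agent, url):
            continue  # the site asked crawlers not to request this path
        fetch(url)
        fetched.append(url)
        sleep(delay)
    return fetched
```

The contrast with a bulk downloader is in the loop: one request at a time, a pause after each, and a check against the site's stated wishes before every fetch.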
- 06-07-2005 #8
It also has to do with the frequency of the indexing. Somebody using one of the aforementioned programs will most likely index a site more often than something like a Google crawler would (with the billions of pages indexed by Google, I think the reason is self-explanatory), so bandwidth is adversely affected more, and more often.
- 06-07-2005 #9 forum.guy
- Join Date
- May 2004
- Location
- arch linux
- Posts
- 17,782
Hmm... maybe that's what is causing the hiccup I talked about in this thread:
Originally Posted by jasonlambert
http://www.linuxforums.org/forum/topic-45213.html
- 06-08-2005 #10 Linux Guru
- Join Date
- Apr 2003
- Location
- London, UK
- Posts
- 3,284
It's certainly possible.



