- 07-05-2005 #11 Linux Newbie
- Join Date
- Apr 2005
- Posts
- 110
Ermm... Is it OK to save a copy of a page for reference?
Registered Linux user #393668
- 07-05-2005 #12 Linux Guru
- Join Date
- Apr 2003
- Location
- London, UK
- Posts
- 3,284
If you mean using your browser's "File" -> "Save Page As.." function, yes that is perfectly OK
Originally Posted by greenpenguin
- 07-05-2005 #13 Linux Guru
- Join Date
- Nov 2004
- Posts
- 6,110
While we're on site issues, I haven't seen this mentioned, so apologies if I've missed a thread: how come all of the recent articles on the home page are stuck? Is this a knock-on effect of the server switch?
- 07-06-2005 #14 Linux Guru
- Join Date
- Apr 2003
- Location
- London, UK
- Posts
- 3,284
yep, fixed now
Originally Posted by bigtomrodney 
Thanks,
J.
- 07-06-2005 #15 Linux Guru
- Join Date
- Nov 2004
- Posts
- 6,110
Cheers dude!
- 09-08-2005 #16 Just Joined!
- Join Date
- Jul 2005
- Location
- NH
- Posts
- 41
Noted and am glad that I haven't done that in about a year to other sites. That area of violation seems really "grey" though imo. At least there was a memo going out about it.
- 10-13-2005 #17 Just Joined!
- Join Date
- Aug 2003
- Location
- Sydney, Australia
- Posts
- 52
Howdy jasonlambert,
It's been donks since I've been here. Good to see things are still kicking.
I can understand what you're getting at. I do know of one popular Linux site that had to shut down earlier this year, then reformat their site and install a more powerful server due to flood attacks. But they were actual "attacks", not just a few benign wgets on a thread or two.
jasonlambert wrote:
It is an automated DoS against my server.
One individual yesterday sent 20,000 HTTP requests for PHP pages in 1 hour, seriously degrading the performance of this site. Such floods are no different to using a SYN or ICMP flood or any other form of malicious denial of service attack that ties up the resources of my server, preventing legitimate traffic getting to the site. Many ISPs state in their terms of service that such activity is not allowed.
Yes, well ... that is a bit much of course. Can't you just disconnect/drop a connection like that? Even so, it wouldn't stop the packets coming in.
I sometimes use "wget" to extract a thread if it contains a number of pages (by that I mean in the order of 30 to 40 or so) and if it is a topic that I'm trying to study and so need to spend some time going over.
I have my script set up to allow for a couple of seconds' delay between each request, as per below.
The robots.txt is always respected.
Code:
#!/bin/sh
for f in $1
do
    wget -v -np -N -x $f 2>&1 |tee -a $LOGFILE
    #wget -v -np -N --wait=2 -x -r -l 2 --reject=swf,pdf,ps,wav,Z,jar,zip,deb,rpm,tar,gz,bz2 $f 2>&1 |tee -a $LOGFILE
    #wget -v -np -N --wait=2 -r -l 1 --reject=pdf,ps -x $f 2>&1 |tee -a $LOGFILE
    #wget -v -np -N --wait=2 -x -r -l 3 --reject=swf,gz,tar,Z,tar.gz,jar,pdf,ps,wav,deb,rpm $f 2>&1 |tee -a $LOGFILE
done
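For what it's worth, the pause between requests can also be put in the loop itself. wget's --wait only spaces out retrievals within a single wget run, so a loop over several URLs gets no delay from it. Below is a minimal sketch of that idea, not the script above: the names polite_fetch, DELAY and FETCH, and the default log file name, are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: polite_fetch, DELAY, FETCH and the fetch.log default are
# hypothetical names chosen for this example.
DELAY="${DELAY:-2}"                   # seconds to pause between requests
LOGFILE="${LOGFILE:-fetch.log}"       # where the combined output is appended
FETCH="${FETCH:-wget -v -np -N -x}"   # command used to fetch one URL

polite_fetch() {
    first=1
    for url in "$@"; do
        # Pause before every request except the first one.
        if [ "$first" -eq 0 ]; then
            sleep "$DELAY"
        fi
        first=0
        # $FETCH is left unquoted on purpose so its options split into words.
        $FETCH "$url" 2>&1 | tee -a "$LOGFILE"
    done
}

# Only fetch when URLs are passed on the command line.
if [ "$#" -gt 0 ]; then
    polite_fetch "$@"
fi
```

Saved as, say, polite.sh, it would be called with the page URLs as arguments; at the default 2 second delay a 40 page thread takes at least 78 seconds of waiting, which is nowhere near 20,000 requests in an hour.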
In my case, I'm on a limited budget, and a few wgets save me lots of time/money. But if you consider the above script too much weight for the site, I will respect your wishes.
But the 20,000 page requests you mention are something in a completely different context really. I mean --- that is obviously an abuse by someone.
Pity -=-=-
I've never had any complaint feedback in the past.
jm



