  1. #1

    Wget - URL hit generator


    I am trying to find out whether wget can be used to generate actual URL hits on a web page. So far it does not look good.

    I changed the following lines in /etc/wgetrc to:
    http_proxy = http://<proxy_ip>:<port>/
    use_proxy = on
    Output:
    root# wget -c <url>/ > /dev/null 
    --2011-01-16 12:26:38--  <url>
    Connecting to <proxy_ip>:<port>... connected.
    Proxy request sent, awaiting response... 200 OK
    Length: unspecified [text/html]
    Saving to: `index.html.3'
        [   <=>                                 ] 50 548      88,9K/s   in 0,6s    
    2011-01-16 12:26:39 (88,9 KB/s) - `index.html.3' saved [50548]
    This does NOT generate a hit on the actual web page!
    The > /dev/null part does not seem to be working either: the page is still saved to disk and the log is still printed.

    How can I get this to work?
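
On the `> /dev/null` part: wget saves the page to a file (here `index.html.3`) and prints its progress log on stderr, so redirecting stdout has nothing to discard. Below is a minimal sketch of one way to drop both, assuming only wget's standard `-q` and `-O` options. The function name `quiet_hit` is a hypothetical helper, not part of wget.

```shell
#!/bin/sh
# Sketch: request a page only for the sake of the request, keeping nothing on disk.
# "quiet_hit" is a hypothetical helper name chosen for this example.
quiet_hit() {
    # -q            suppress the log that wget normally prints on stderr
    # -O /dev/null  write the downloaded document to /dev/null instead of index.html.N
    wget -q -O /dev/null "$1"
}

# Usage (replace <url> with the real address):
#   quiet_hit "<url>"
```

This only stops the `index.html.N` files and the log output from piling up. Whether the request is counted as a hit still depends on whatever is doing the counting.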

  2. #2
    Which counter do you want to trigger?

    Is the proxy counting the hits?!

  3. #3
    I am not sure which counter this affects, but when I use the exact same proxy in Firefox (Preferences -> Advanced -> Network) I do get a hit on that particular web page.

    I would really like the same "mechanism" to be triggered by using wget from the command line.

  5. #4
    What does a "hit" mean for you? Do you see it in a tool like Google Analytics?

    Then there are at least two possibilities:

    1] Google Analytics doesn't treat requests carrying the Wget User-Agent (an HTTP header) as count-worthy, and you won't be able to do much about that unless you change the headers you send.

    2] Google Analytics has already counted the request and regards any further requests as follow-ups to the first. In that case you can't do much, because these checks use some data to identify you (usually your IP address together with the User-Agent and other data usable by heuristic algorithms), and you would have to change at least one part of that data (e.g. if you have a dial-up connection and your ISP supports it, reconnect to get a new IP address).

    If you use an anonymizing proxy like Tor and the hits still don't count up, you are probably out of luck: the analytics is well designed and doesn't rely on IP addresses alone to identify users. Good luck in that case!

  6. #5
    Most likely whatever is counting hits checks the User-Agent field, as well as a couple of others, of the HTTP request before deciding whether or not to consider it a hit. By default wget identifies itself as Wget/<version> in this field, which is easy for a counter to filter out. Check out the --user-agent and --header options in wget's manpage.
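
As a rough sketch of those two options, the helper below sends a browser-like User-Agent plus one extra header. The name `browser_like_hit`, the User-Agent string, and the Accept-Language value are illustrative assumptions; nothing in this thread says which headers the counter actually checks.

```shell
#!/bin/sh
# Example value only: a Firefox-style User-Agent string (assumption, not known to be counted).
UA='Mozilla/5.0 (X11; Linux x86_64; rv:2.0) Gecko/20100101 Firefox/4.0'

# Hypothetical helper: request a page while presenting browser-like headers.
browser_like_hit() {
    # --user-agent replaces wget's default "Wget/<version>" identification
    # --header adds one extra request header; repeat the option to add more
    wget -q -O /dev/null \
         --user-agent="$UA" \
         --header='Accept-Language: en-us,en;q=0.5' \
         "$1"
}

# Usage (replace <url> with the real address):
#   browser_like_hit "<url>"
```

Copying the exact User-Agent string of the Firefox installation that does register hits would make for a closer comparison than the example value used here.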

    This link will be able to answer more of your questions:
    Last edited by mharv87; 01-25-2011 at 03:23 AM.
