  1. #1
    Just Joined!
    Join Date
    Jan 2010
    Location
    Sydney, Australia
    Posts
    68

    Dnsmasq as a DNS Blacklist server


    G'day everyone,

    I think I've painted myself into a corner and I'm not sure where to go from here...
    I've set up a DNS blacklist server using dnsmasq. I did this by downloading a domain category list from Shalla Secure Services KG, and I wrote a script that extracts the inappropriate domains and adds them to dnsmasq.conf using the option:
    address=/blocked.domain/ip.it.resolves.to
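
For reference, a conversion script of this kind might look roughly like the sketch below. It is not the script used here. It assumes the category list is a plain text file with one domain per line, and the function name `to_dnsmasq_lines` and the sinkhole address `192.0.2.1` are made up for illustration.

```python
import ipaddress


def to_dnsmasq_lines(domains, sinkhole_ip):
    """Turn an iterable of domain names into dnsmasq address= lines.

    Assumption: `domains` yields one domain per item (e.g. lines of a file).
    Blank lines, '#' comments and exact duplicates are skipped.
    """
    # Raises ValueError early if the sinkhole address is not a valid IP.
    ipaddress.ip_address(sinkhole_ip)

    seen = set()
    lines = []
    for raw in domains:
        domain = raw.strip().lower().rstrip(".")
        if not domain or domain.startswith("#"):
            continue
        if domain in seen:
            continue
        seen.add(domain)
        lines.append(f"address=/{domain}/{sinkhole_ip}")
    return lines


if __name__ == "__main__":
    sample = ["Example.com\n", "", "example.com", "bad.test."]
    for line in to_dnsmasq_lines(sample, "192.0.2.1"):
        print(line)
```

Since dnsmasq's address=/domain/ form also answers for hosts underneath that domain, entries already covered by a listed parent domain could in principle be dropped to shrink the file. The sketch does not attempt that.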

    It is very effective. However, after it has been running for a little while the load seems to be too much for it, and the process ends up pinning one of the server's cores at 100% (the server is a VMware ESX guest).

    If I stop the requests to the process for a little while it returns to 0% usage, and it takes a few minutes to rise again after I put the network back onto it.

    The server is used as a DNS cache and forwarder. It sits at the gateway to the network and resolves queries forwarded from the Win2k8 DNS servers; the workstations are set to use the Win2k8 servers for DNS.

    The config is more than 1,000,000 lines (yep, nearly 50MB of config file). Originally I was surprised to see it run at all.

    The server is running a command-line install of Ubuntu 12.04, with Nginx also installed (to serve the catch-all block page).
    I'm aware that dnsmasq was designed for small networks and that this isn't exactly the use the developers intended, but I'd like to know if there are any tweaks people can suggest, for example sysctl settings or dnsmasq configuration settings that might reduce the burden on the CPU.

    Is it possible to run dnsmasq over multiple cores?
    I added the local-ttl option and it seems to help (by default, local addresses are served with a TTL of 0), but it is still not enough for it to run stably.

    Cheers,

    Griffo

  2. #2
    Just Joined!
    Join Date
    Jan 2010
    Location
    Sydney, Australia
    Posts
    68
    I've been reading up on ulimit settings. Is it possible the process has hit its file descriptor limit when this happens?

  3. #3
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,667
    I don't think it will hit the default limit of 1024 open files per process, because the number of addresses inserted into dnsmasq.conf does not change the number of open files.
    You can verify this by looking at the limits of your running dnsmasq process:
    Code:
    cat /proc/<PID>/limits
    If dnsmasq stops responding, you might want to look at what it is doing via strace or perf.
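
If you want to compare the limit against what the process actually has open, something like the sketch below could do it. It only reads the standard Linux files /proc/<PID>/limits and /proc/<PID>/fd; the function names are made up for illustration, and reading another user's process usually needs root.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<PID>/limits.

    A value of 'unlimited' is returned as None.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            soft, hard = fields[0], fields[1]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' line found")


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    pid = os.getpid()  # replace with the PID of the dnsmasq process
    with open(f"/proc/{pid}/limits") as fh:
        soft, hard = parse_max_open_files(fh.read())
    print(f"open fds: {count_open_fds(pid)}, soft limit: {soft}, hard limit: {hard}")
```

If the open count stays far below the soft limit while the CPU is pegged, the descriptor limit is probably not the bottleneck.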

    I would also look at metrics over time to figure out which resource runs low (if any).
    For that you need something to collect them.
    For an easy but rather crude solution, you could install atop and tell it to run as a daemon.
    atop will then record process, CPU and I/O snapshots at 10-minute intervals (as I said: rather crude), which you can look at with atop -r <metricfile>

    A better solution would be to add graphing to your monitoring solution, maybe InGraph if you use Nagios.

    But in general I would suggest using a dedicated graphing solution.
    We use OpenTSDB 2. It is *very* powerful, but not trivial to install and maintain.
    Graphite works similarly and needs only one machine instead of a cluster.

  4. #4
    Just Joined!
    Join Date
    Jan 2010
    Location
    Sydney, Australia
    Posts
    68
    Thanks for the ideas.

    The problem appears to be under control now, but to be honest I'm not sure which change fixed it (I wasn't able to perform individual trial-and-error testing on each setting).

    I disabled the all-servers option in dnsmasq (which forwards each request to all upstream servers and responds with the first answer), and I disabled the cache.

    I also increased the OS send/receive buffer maximums to 25MB (probably a little excessive) after reading that this helped other DNS servers (unbound or maybe bind, I can't remember which now).
    net.core.rmem_max=26214400
    net.core.wmem_max=26214400
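
As a sanity check on those numbers, the sketch below parses sysctl-style lines, converts the byte values to MiB, and maps each key to its file under /proc/sys so the running value can be compared. The helper names are made up for illustration. Note that these two keys set the maximum socket buffer size a program may request, not the default size.

```python
def parse_sysctl_lines(text):
    """Parse 'key=value' lines (sysctl.conf style) into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def bytes_to_mib(value):
    """Convert a byte count (int or numeric string) to MiB."""
    return int(value) / (1024 * 1024)


def proc_path(key):
    """Map a dotted sysctl key to its path under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


if __name__ == "__main__":
    wanted = parse_sysctl_lines(
        "net.core.rmem_max=26214400\nnet.core.wmem_max=26214400\n"
    )
    for key, value in wanted.items():
        try:
            with open(proc_path(key)) as fh:
                current = fh.read().strip()
        except OSError:
            current = "unavailable"  # not Linux, or key not present
        print(f"{key}: wanted {value} ({bytes_to_mib(value):g} MiB), running {current}")
```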

    Anyway, it has been up for half the day now (previously it lasted 5 minutes to an hour before suffering problems). On average the process sits between 10% and 50% and rarely gets over 70%.

    Cheers,
    Brad
