  1. #11
    Linux Enthusiast
    Join Date
    Jun 2002
    Location
    San Antonio
    Posts
    621

    Is there any extreme activity? Are you thrashing your hard drive very much? Is your CPU pegged at 99% all the time? Anything extraordinary like that? How much swap do you have, and how much of it is in use (the free command will tell you)? Umm . . . are these SCSI drives? If so, which module do they use? In fact, list all the modules that are loaded; one of them might be buggy or something. Are you using a non-standard Ethernet card like a tigon3 or something? Is any hardware the same from one machine to the other? RAM, sound card, anything?
    I respectfully decline the invitation to join your delusion.

  2. #12
    Just Joined!
    Join Date
    Mar 2003
    Posts
    12
    The problem occurs even "off peak", and the CPU is at 10-40 percent. Okay, there are some peaks when converting images etc., but that's normal.
    There isn't much hardware activity either. Hmm, at times there are a lot of HTTP connections (because of polling by a chat script; no, I can't change that!) and MySQL queries (also because of the chat, since it involves offline chat etc.). But the crash doesn't occur at the peak times; that's the astonishing part.
    I'm using IDE drives with software RAID 1 on ext3. Oh ... and I almost forgot: it's an on-board Intel EtherExpress 100 Mbit, so quite a common Ethernet adapter. There are no controllers etc. as PCI cards; network, IDE etc. are all directly on the Asus mainboard.

  3. #13
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    Since DNS works, I find it unlikely that the kernel is being evil. The fact that inetd services don't respond but still ACK the SYNs could, however, indicate that some program has eaten all PIDs, so that inetd can't fork new processes. Still, that doesn't explain why the console doesn't work.
    Still, could you try compiling something like the following and run it, then try connecting to it while the server has crashed? (Don't have it running normally since it offers a nice and clean buffer overflow entry =) )

    Note! I just wrote this on the fly, so there are surely typos in it. Until you state otherwise, I'm going to assume that you can program and fix my typos, since you seem to know your way around a Linux system.
    Code:
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    
    char buf[1024], buf2[1024];
    int bufpos;
    
    /* Expands to "str, strlen(str)" for use as send() arguments. */
    #define STRANDLEN(str) str, strlen(str)
    
    int main(int argc, char **argv)
    {
        int ret;
        int sock, sock2, fd;
        struct sockaddr_in name;
        char *p;
    
        if((sock = socket(PF_INET, SOCK_STREAM, 0)) < 0)
        {
            perror("socket");
            exit(1);
        }
        /* Listen on TCP port 5555 on all interfaces. */
        memset(&name, 0, sizeof(name));
        name.sin_family = AF_INET;
        name.sin_port = htons(5555);
        name.sin_addr.s_addr = htonl(INADDR_ANY);
        if(bind(sock, (struct sockaddr *)&name, sizeof(name)) < 0)
        {
            perror("bind");
            exit(1);
        }
        if(listen(sock, 10) < 0)
        {
            perror("listen");
            exit(1);
        }
        while(1)
        {
            if((sock2 = accept(sock, NULL, NULL)) < 0)
            {
                if(errno == EINTR)
                    continue;
                perror("accept");
                exit(1);
            }
            /* Reap children left over from earlier connections so that the
               test program itself does not pile up zombies. */
            while(waitpid(-1, NULL, WNOHANG) > 0)
                ;
            bufpos = 0; /* start every connection with an empty buffer */
            send(sock2, STRANDLEN("Hello\n"), 0);
            /* Note: nothing stops a long line without '\n' from running past
               the end of buf; this is the overflow warned about above. */
            while((ret = read(sock2, buf + bufpos, 1024)) > 0)
            {
                bufpos += ret;
                /* Handle every complete line currently in the buffer. */
                while((p = memchr(buf, '\n', bufpos)) != NULL)
                {
                    *p = 0;
                    if(p > buf && p[-1] == '\r')
                        p[-1] = 0; /* telnet ends lines with "\r\n" */
                    if(!strcmp(buf, "fork"))
                    {
                        ret = fork();
                        if(ret < 0)
                        {
                            send(sock2, STRANDLEN(strerror(errno)), 0);
                            send(sock2, "\n", 1, 0);
                        }
                        if(!ret)
                        {
                            send(sock2, STRANDLEN("Child responding\n"), 0);
                            exit(0);
                        }
                        if(ret > 0)
                            send(sock2, STRANDLEN("Fork successful\n"), 0);
                    }
                    if(!strcmp(buf, "hd"))
                    {
                        send(sock2, STRANDLEN("Testing harddrive (read)\n"), 0);
                        if((fd = open("/etc/termcap", O_RDONLY)) < 0)
                        {
                            send(sock2, STRANDLEN(strerror(errno)), 0);
                            send(sock2, "\n", 1, 0);
                        } else {
                            /* Read the whole file and discard the data. */
                            while(read(fd, buf2, 1024) > 0)
                                ;
                            close(fd);
                        }
                        send(sock2, STRANDLEN("Testing harddrive (write)\n"), 0);
                        /* O_CREAT so the test also works when the file does not exist yet. */
                        if((fd = open("/tmp/hdtest", O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600)) < 0)
                        {
                            send(sock2, STRANDLEN(strerror(errno)), 0);
                            send(sock2, "\n", 1, 0);
                        } else {
                            /* Write 1 MB synchronously. */
                            for(ret = 0; ret < 1024; ret++)
                                write(fd, buf2, 1024);
                            close(fd);
                        }
                        send(sock2, STRANDLEN("Harddrive test finished\n"), 0);
                    }
                    if(!strcmp(buf, "commands"))
                    {
                        if(!fork())
                        {
                            /* Child: point stdout and stderr at the socket. */
                            for(fd = 1; fd <= 2; fd++)
                            {
                                close(fd);
                                dup2(sock2, fd);
                            }
                            execl("/bin/sh", "/bin/sh", "-c", "echo Hello; cat /proc/uptime; free; ps -AHw", (char *)NULL);
                            /* Only reached if execl() failed. */
                            send(sock2, STRANDLEN(strerror(errno)), 0);
                            send(sock2, "\n", 1, 0);
                            exit(0);
                        }
                    }
                    /* Drop the handled line, including its terminator. */
                    p++;
                    bufpos -= (p - buf);
                    memmove(buf, p, bufpos);
                }
            }
            close(sock2);
        }
    }
    As you can see, the program allows for several tests. First, it lives in memory and requires no hard drive access (like the DNS daemon), so it should be able to accept connections and respond, but we don't know yet, so that's the first test. Then, it allows for the "fork" test, the "hd" test and the "commands" test. Try them all a couple of times if you get accept()ed by the server. That way it will be much easier to know what's failing.
    I recommend that you run the tests before the server crashes, too, so that you can see that they all work.

  4. #14
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    When I think about it more closely, maybe you should add a "respond" test as well, just making the program answer Hello or something. Or even better, have it respond once whenever there's a command in the buffer.
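
    A rough sketch of that "respond" idea, in case it helps: the helper below acknowledges each complete line as it arrives and keeps any unfinished tail in the buffer. The name ack_lines and the "Got: " reply are made up for illustration and are not part of the program above. It uses dprintf(), which is POSIX.1-2008, so I'm assuming a reasonably recent libc.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: acknowledge every complete line in buf on fd,
 * then shift the unfinished tail to the front of buf.
 * Returns the number of bytes still pending in buf.
 */
size_t ack_lines(int fd, char *buf, size_t len)
{
    char *nl;

    while ((nl = memchr(buf, '\n', len)) != NULL) {
        size_t linelen = (size_t)(nl - buf);
        size_t used = linelen + 1;          /* the line plus its '\n' */

        if (linelen > 0 && buf[linelen - 1] == '\r')
            linelen--;                      /* telnet sends "\r\n" */

        /* Echo the command back so the client sees the server is alive. */
        dprintf(fd, "Got: %.*s\n", (int)linelen, buf);

        /* Remove the handled line from the front of the buffer. */
        len -= used;
        memmove(buf, buf + used, len);
    }
    return len;
}
```

    In the server loop it would be called as bufpos = ack_lines(sock2, buf, bufpos); right after each read(), before or instead of the strcmp() tests.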

  5. #15
    Just Joined!
    Join Date
    Mar 2003
    Posts
    12
    Thank you for the code. I'll try it out in a few days, when I have the old server standing here beside me. It seems a good way to track down the problem. But how could I provoke it? The idea of "eating all PIDs" seems quite realistic, but is it easy to write such a "destructive" tool? I tried something like that once, but after forking a hundred times I couldn't create any more instances, and there were still many PIDs left.
    Once the server is standing here and no longer in production, I can test as much as I want ... including crashing it intentionally.

    I'm sure I can fix typos in your code myself if needed; thank you for that. But could you maybe also think of some source code to provoke crashes?

    Thank you VERY much.

  6. #16
    Just Joined!
    Join Date
    Mar 2003
    Posts
    12
    And one more thing: eating PIDs? Hmm, can EVERY normal script or user fork an unlimited number of new processes? Is that really true? If so: Oh my God :-(

  7. #17
    Linux Guru
    Join Date
    Oct 2001
    Location
    Täby, Sweden
    Posts
    7,578
    Yeah, it shocked me the first time I realized it as well. Just this could crash your entire system:
    Code:
    #include <unistd.h>
    
    int main(void)
    {
        while(1)
            fork();
    }
    Fortunately for all of us, the Big Guys Out There have thought of that and implemented the ulimit interface (which is superseded by the (get|set)rlimit functions in the kernel, but the shell interface is still called ulimit), which can limit the number of processes per user. It's normally not used, but if it's a problem for you, there is even a PAM module that you can use to set it up.
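
    For reference, a minimal sketch of the setrlimit() side of this, assuming a Linux system where RLIMIT_NPROC is available. The helper name cap_process_count is hypothetical. Note that the kernel does not enforce this limit for root, so it only helps for unprivileged users.

```c
#include <sys/resource.h>

/*
 * Lower the soft per-user process limit of the calling process to
 * max_procs (clamped to the hard limit). Children inherit the limit.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int cap_process_count(rlim_t max_procs)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) < 0)
        return -1;
    /* The soft limit may never exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && max_procs > rl.rlim_max)
        max_procs = rl.rlim_max;
    rl.rlim_cur = max_procs;
    return setrlimit(RLIMIT_NPROC, &rl);
}
```

    Calling cap_process_count(64) before exec()ing a service is roughly what "ulimit -u 64" does in the shell. Once the limit is reached, fork() fails with EAGAIN instead of taking the whole machine down.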

    One thing is important to note: although PID numbers may range up to 32767, the kernel cannot run that many tasks at once. I don't remember the exact limit, but I think it was 512 or so in older kernels, where it was a compile-time constant; newer kernels derive it from the amount of RAM and show it in /proc/sys/kernel/threads-max. It can surely be changed easily, though.
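
    Rather than relying on memory, the ceilings can be read from /proc. This is only a sketch, assuming a Linux kernel that exposes /proc/sys/kernel/pid_max and /proc/sys/kernel/threads-max (older kernels may lack one or both, in which case -1 is printed). The function names are made up.

```c
#include <stdio.h>

/* Read a single integer from a /proc file; returns -1 if unavailable. */
long read_proc_long(const char *path)
{
    long value = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print the kernel-wide ceilings that matter when fork() starts failing. */
void print_task_limits(void)
{
    printf("pid_max:     %ld\n", read_proc_long("/proc/sys/kernel/pid_max"));
    printf("threads-max: %ld\n", read_proc_long("/proc/sys/kernel/threads-max"));
}
```

    Comparing these numbers with the process count from ps on a healthy machine gives a feel for how much headroom there is.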

    It has happened to me once or twice. Once, a ypserv process just went berserk all of a sudden. That may be the kind of thing you are imagining, but there is another possibility that might be far more likely: the second time, it was a program I had written myself, in which I had forgotten to wait for forked processes (actually I hadn't forgotten; it was really a bug in the LinuxThreads pthreads implementation in libc, but anyway), so their zombie process descriptors were left in the system. Although no more processes than usual were running or even active, those zombies can still prevent new processes from being forked. After that happened, I quickly fixed the bug... I think I lost at least 20 days of uptime in that incident. =)
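
    To illustrate the zombie case: a parent that never calls wait() leaves one process table entry behind per exited child. A minimal sketch of the usual cure follows; the helper name reap_children is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Collect every child that has already exited, without blocking.
 * Returns how many zombies were reaped.
 */
int reap_children(void)
{
    int reaped = 0;

    /* WNOHANG makes waitpid() return 0 when no child has exited yet,
       and -1 when there are no children at all. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

    A long-running server can call this from its main loop or from a SIGCHLD handler. Zombies show up with state Z in ps output, which is a quick way to spot a program that forgets to do this.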

  8. #18
    Just Joined!
    Join Date
    Mar 2003
    Posts
    12
    Well I wasn't able to try it yet on the old server but I tried it on the "Play-Linux-PC" beside me. At first I must say: It worked :-) But taking a deeper look the symptoms of the "blocked" system were different.

    On my play-system (with the endless fork) I was still able to enter username and password (although I wasn't able to get a shell because there was no pid left for the shell). In contrast on the crashed server I wasn't.
    When I press "Ctrl-Alt-Del" on my play-PC it says "INIT: cannot fork; try" and retries several times. On the crashed server it simply did nothing.
    On my play-PC I wasn't even able to connect to the ftp-port ... but this still worked on the crashed server, even though the service didn't respond but ...

    So I must say thank you very much for the PID-eating idea ... but that wasn't the case. Are there any other "things" that can be eaten up? And how can I test for them?

    Or could I possibly be wrong with my assumption that PID-eating can't have been the case? I just wonder right now because for sure on the production-system I have some cronjobs doing things, other tasks writing syslog-entries etc. - could THIS be the cause for the system-lockup? So PID-eating was just the "first stone" and the hang was caused by something else?

    How could I track down if such a PID-eating-problem occurs? And if the process-limit is reached - are there syslog-entries for this?

    Can I even set PID-limits per User if e.g. Apache changes to a different user and executes CGIs or can I just set limits for users that log in via console / SSH?

    I am running several services (Apache, MySQL, DNS, Shoutcast-Server) on my server. In this environment: Can I really set limits? Or might the "complexity" of this system make everything unusable? E.g. its just ONE Apache that I start but it forks several instances, forks CGIs etc. Same for MySQL which also forks several instances.


    Please tell me if I'm wrong or right with the PID-eating-way. What do you think?

  9. #19
    Linux Engineer
    Join Date
    Jan 2003
    Location
    Lebanon, pa
    Posts
    994
    By default on Red Hat, every user can fork unlimited processes, which leaves you open to fork bombs. You can use ulimit to limit the number of processes per user. I strongly suggest using that. Anyone with a shell on your box can just run:
    Code:
    perl -e 'while(fork) {`ls -w10000`;}'
    With ulimit set correctly, no single user should be able to take down the system that way.

  10. #20
    Just Joined!
    Join Date
    Mar 2003
    Posts
    12
    Some questions:
    a) What about users without a shell? Can they be limited as well?
    b) What about programs that switch user rights and then execute a "fork bomb"? Will they be limited too, with the rights of the "new" user?

    Finally:
    c) How do I limit users? And what are good limits for the special users (mysql, apache, ...)?
