  1. #1
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277

    Continuous error at the command prompt

    Hello

    I've installed FC11 on a computer and everything works fine, but when I switch to the command prompt using Ctrl+Alt+F1, I continuously get the following error.

    Code:
    EDAC MC0: CE row3, channel 0, label "": (Branch=0 DRAM-Bank=0 RDWR=Read RAS=570 CAS=136 CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
    Does anyone know what it means? The error keeps coming up with different values for Bank, RAS and CAS (I have no idea what these are).

    I searched the net about it; some people say it points to a bad memory slot or to bad sectors on the disk. Memtest didn't find anything, and I used smartmontools to check the disk for bad sectors, but nothing was found.

    Has anyone seen this before?

    Many thanks
    One Love!!!

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    It means that you have a failing memory module. However, because your system has ECC (Error Correction Code) RAM, it can continue to run until too many bits are fubar. Run hardware diagnostics and replace the bad memory stick.
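
    One way to narrow down which module is involved is to look at the per-row corrected-error counters that the EDAC driver keeps. The sketch below assumes the kernel exposes the legacy csrow layout under /sys/devices/system/edac/mc (this depends on the kernel version and the EDAC driver in use), and the function name is made up for illustration. How a row maps to a physical DIMM slot depends on the motherboard.

```python
from pathlib import Path


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {"mc0/csrow3": corrected_error_count, ...} from EDAC sysfs.

    Assumes the legacy mc*/csrow*/ce_count layout; returns an empty dict
    if no such files are found.
    """
    counts = {}
    for ce_file in sorted(Path(edac_root).glob("mc*/csrow*/ce_count")):
        try:
            value = int(ce_file.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or malformed counter: skip it rather than guess.
            continue
        row_dir = ce_file.parent          # .../mc0/csrow3
        counts[f"{row_dir.parent.name}/{row_dir.name}"] = value
    return counts


def busiest_row(counts):
    """Return the key with the highest corrected-error count, or None."""
    nonzero = {key: n for key, n in counts.items() if n > 0}
    return max(nonzero, key=nonzero.get) if nonzero else None
```

    Calling busiest_row(read_ce_counts()) a few times over a day shows whether one row keeps accumulating errors, which is a hint about where to start swapping sticks.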
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277
    Do you know of any other tools for locating bad memory, or should I run memtest again? I ran memtest before, but it found no errors.
    One Love!!!

  4. #4
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277
    I followed the instructions from here:
    Linux server memory check

    According to this, if the checksums do not match, faulty memory is guaranteed. My checksums do match, so I guess my RAM is fine?
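
    The linked article is not reproduced here, so the following is only a guess at the general idea: put a block of data in memory, checksum it several times, and compare the results. The function name and sizes are made up for illustration. Note that the check is one-sided. A mismatch points at a hardware problem, but matching checksums do not prove the RAM is healthy, and with ECC memory a corrected error never reaches the program at all, so this kind of test cannot see the errors EDAC is reporting.

```python
import hashlib
import os


def repeated_checksum(size_bytes=64 * 1024 * 1024, passes=5):
    """Hash one in-memory buffer several times; return the set of digests.

    A result with more than one digest means the same bytes hashed
    differently between passes, which should never happen on sound hardware.
    """
    buf = os.urandom(size_bytes)  # random data held in RAM for all passes
    return {hashlib.sha256(buf).hexdigest() for _ in range(passes)}


def looks_consistent(size_bytes=64 * 1024 * 1024, passes=5):
    """True if every pass produced the same digest."""
    return len(repeated_checksum(size_bytes, passes)) == 1
```

    A buffer this small only touches a tiny part of 16 GB, which is one more reason a clean result says little.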
    One Love!!!

  5. #5
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    What I do in these situations is boot into the BIOS and make sure that the POST (power-on self test) is set to full rather than fast. You may also be able to change the memory settings so that error correction is not performed (this depends on the motherboard and/or BIOS). Reboot the system, watch whether the POST detects any errors, and then rerun memtest with continuous looping, set to stop on error. Make sure you run it for several full passes. I know that this takes time, so you might want to do it overnight instead of during the work day if this is a work system.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  6. #6
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277
    The only place I saw a POST option in the BIOS is under Bios, where "halt on POST error" is enabled.
    Under Advanced there is an option called "SERR signal condition", which is set to single bit (the alternatives are None, multiple bit and both).
    The BIOS description for it on the right-hand side says: "Select ECC error conditions that SERR# be asserted."

    If I select "None" would that disable the error correction?

    Memtest runs forever until you stop it. Do you know how to set memtest to stop when it finds an error?
    Last time I let it run for 2 whole days and it didn't find any errors. This machine has 16 Gigs of RAM in it, though. Do you think I should let it run longer?



    Thank you
    One Love!!!

  7. #7
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277
    I set the SERR signal condition to none, but the error keeps coming up.
    I'm thinking of re-installing.
    One Love!!!

  8. #8
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    If I select "None" would that disable the error correction?
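    As far as I know, yes: setting SERR to none should stop the board from acting on ECC errors, which is what you would want in order to run memtest most effectively. Going by the BIOS help text, though, the option selects which ECC error conditions assert SERR#, so it may only turn off the signalling rather than the correction itself; check the motherboard manual to be sure. For normal operation you would leave it set to single bit or both.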

    In any case, Linux is detecting an interrupt from the board that is telling it that there is an error happening (but corrected). This should also show up in the system logs (/var/log/messages most likely). It is possible that a fault in the motherboard or CPU is causing this. I would recommend getting a diagnostic program from your system vendor for best results.
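
    To see whether the logged errors cluster on one row and channel, the messages can be tallied. This is a minimal sketch: the pattern is based only on the one message quoted at the top of this thread (other EDAC drivers format their lines differently), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the start of lines such as:
#   EDAC MC0: CE row3, channel 0, label "": (Branch=0 ...)
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row\s*(?P<row>\d+), channel\s*(?P<channel>\d+)"
)


def tally_ce(lines):
    """Count corrected-error messages per (controller, row, channel)."""
    counts = Counter()
    for line in lines:
        match = EDAC_CE.search(line)
        if match:
            key = (int(match["mc"]), int(match["row"]), int(match["channel"]))
            counts[key] += 1
    return counts


def tally_file(path="/var/log/messages"):
    """Tally a log file; reading it usually requires root."""
    with open(path, errors="replace") as handle:
        return tally_ce(handle)
```

    If nearly every hit lands on the same row and channel, that points at one module or slot; errors spread across many rows would fit the motherboard or CPU explanation better.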
    Last edited by Rubberman; 05-19-2010 at 03:21 PM.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  9. #9
    Linux User
    Join Date
    Jul 2007
    Location
    Greece
    Posts
    277
    Thank you for your time Rubberman.

    I will disable Error Correction and run memtest again and see what happens.

    Thanks again.
    One Love!!!
