- 07-28-2010 #1 Just Joined!
- Join Date
- Jun 2010
- Location
- silicon valley
- Posts
- 6
machine check exception question (mcelog)
Hi,
I'm new to mcelog as of this morning, but I'd like to help out a friend on a project and am hoping someone can give me a push in the right direction.
I've tried Googling but I can't find anything that explains what the entries in the mcelog mean. If anyone could point me to a good resource, even just salient points in code, that would be great. An example entry from my test box is as follows:
Code:
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 4 northbridge TSC 109db45d6c65
ADDR 2b1bc4190
  Northbridge Chipkill ECC error
  Chipkill ECC syndrome = be21
       bit40 = error found by scrub
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node response, request didn't time out
      generic read mem transaction
      memory access, level generic'
STATUS d410c100be080a13 MCGSTATUS 0
I'm mostly interested in the first line (does MCE 0 indicate a status level or just the first MCE error found?), the fourth line (what does CPU 2 4 mean?), and how to distinguish between different types of MCE/CMCI stuff. If anyone even just knows how to annotate the above entry with a short description of meaning, that would be incredibly awesome.
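For what it's worth, the three bitNN lines in the entry do appear to be plain bit positions in the 64-bit STATUS value. Here is a minimal sketch that checks this against the STATUS word from the entry above. The helper name is_set is made up for illustration, and the sketch only tests whether each bit is set; it does not claim to know what the remaining bits mean.

```python
# STATUS value copied from the mcelog entry above.
STATUS = 0xd410c100be080a13

def is_set(value, bit):
    """Return True if the given bit position (0 = least significant) is set."""
    return bool((value >> bit) & 1)

# The bit numbers that mcelog printed for this entry.
for bit in (40, 46, 62):
    state = "set" if is_set(STATUS, bit) else "clear"
    print(f"bit{bit}: {state}")
```

All three bits come out as set, which matches the decoded lines in the entry.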
Additionally, I am looking at the 1.0-pre2 tarball and am confused re: diskdb.c and memdb.c. My project is supposed to be combing these logs and making them into a simplistic database. My understanding is that attempts with the diskdb had problems; however I can't quite understand how to trace memdb -- what it is, where it might be created, how I might look at it, etc.
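To make the goal concrete, here is a rough sketch of the kind of thing I have in mind: pull the labelled fields out of one text entry and store them in a small SQLite table. This does not use mcelog's memdb or diskdb code at all. The table layout, the field names, and the functions parse_entry and store are my own placeholders, and it assumes the entry is laid out one field per line as in my example. The CPU line is stored as raw text because I don't yet know how to interpret it.

```python
import re
import sqlite3

# One entry, assumed to be laid out the way mcelog printed it on my test box.
SAMPLE = """MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 4 northbridge TSC 109db45d6c65
ADDR 2b1bc4190
  Northbridge Chipkill ECC error
  Chipkill ECC syndrome = be21
STATUS d410c100be080a13 MCGSTATUS 0
"""

# Hypothetical field names mapped to patterns for the labelled values.
FIELDS = {
    "mce": r"^MCE (\d+)",
    "cpu_line": r"^CPU (.+?) TSC",          # kept as raw text, not interpreted
    "tsc": r"\bTSC ([0-9a-f]+)",
    "addr": r"^ADDR ([0-9a-f]+)",
    "status": r"\bSTATUS ([0-9a-f]+)",      # \b keeps this from matching MCGSTATUS
    "mcgstatus": r"\bMCGSTATUS ([0-9a-f]+)",
}

def parse_entry(text):
    """Extract the labelled fields from one entry; missing fields become None."""
    record = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text, re.MULTILINE)
        record[name] = match.group(1) if match else None
    return record

def store(conn, record):
    """Insert one parsed entry into a simple all-text table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mce "
        "(mce TEXT, cpu_line TEXT, tsc TEXT, addr TEXT, status TEXT, mcgstatus TEXT)"
    )
    conn.execute(
        "INSERT INTO mce VALUES (:mce, :cpu_line, :tsc, :addr, :status, :mcgstatus)",
        record,
    )

conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
rec = parse_entry(SAMPLE)
store(conn, rec)
print(conn.execute("SELECT cpu_line, addr, status FROM mce").fetchall())
```

Hex values are stored as text here so nothing is lost or reinterpreted; converting them to integers could come later once the meaning of each field is clear.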
Any help at all (pointers to good websites, other forums, or email lists) would be appreciated. Thanks so much!


