- 12-23-2012 #1 Just Joined!
- Join Date
- Dec 2012
- Posts
- 1
Random Crashing
Over the last month or so my CentOS server has been crashing and I cannot tell why. It had been running for over a year with regular yum updates and no problems. The load on the server is perfectly normal: CPU usage is at 5-6% and RAM usage is below half of the 32GB installed (multiple smaller game servers run off of this box). I am not sure this is a software issue at all.
I have pasted the part of my /var/log/messages file that leads up to my latest crash. Because I am a CentOS newb, this is gibberish to me. Does anything in the file point to a crash of some kind? Are there other logs I could check and paste? If not, that would lead me to believe there is a hardware issue or overheating.
Here are the messages:
Code:
Dec 21 14:58:03 server1 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Dec 21 14:58:03 server1 kernel: Hardware name: X9SCL/X9SCM
Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Dec 21 14:58:03 server1 kernel: Modules linked in: fuse autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Dec 21 14:58:03 server1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.14.1.el6.x86_64 #1
Dec 21 14:58:03 server1 kernel: Call Trace:
Dec 21 14:58:03 server1 kernel: <IRQ> [<ffffffff8106b7b7>] ? warn_slowpath_common+0x87/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8106b8a6>] ? warn_slowpath_fmt+0x46/0x50
Dec 21 14:58:03 server1 kernel: [<ffffffff81459c0d>] ? dev_watchdog+0x26d/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8108caad>] ? insert_work+0x6d/0xb0
Dec 21 14:58:03 server1 kernel: [<ffffffff814599a0>] ? dev_watchdog+0x0/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8107e937>] ? run_timer_softirq+0x197/0x340
Dec 21 14:58:03 server1 kernel: [<ffffffff810a23c0>] ? tick_sched_timer+0x0/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff81073f61>] ? __do_softirq+0xc1/0x1e0
Dec 21 14:58:03 server1 kernel: [<ffffffff81096d60>] ? hrtimer_interrupt+0x140/0x250
Dec 21 14:58:03 server1 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Dec 21 14:58:03 server1 kernel: [<ffffffff81073d45>] ? irq_exit+0x85/0x90
Dec 21 14:58:03 server1 kernel: [<ffffffff81506450>] ? smp_apic_timer_interrupt+0x70/0x9b
Dec 21 14:58:03 server1 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
Dec 21 14:58:03 server1 kernel: <EOI> [<ffffffff812cddbe>] ? intel_idle+0xde/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff812cdda1>] ? intel_idle+0xc1/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff8109929d>] ? sched_clock_cpu+0xcd/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff81407c27>] ? cpuidle_idle_call+0xa7/0x140
Dec 21 14:58:03 server1 kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff814f754f>] ? start_secondary+0x22a/0x26d
Dec 21 14:58:03 server1 kernel: ---[ end trace c6b419e0a29214c3 ]---
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:04 server1 abrtd: Directory 'oops-2012-12-21-14:58:04-2219-0' creation detected
Dec 21 14:58:04 server1 abrt-dump-oops: Reported 1 kernel oopses to Abrt
Dec 21 14:58:04 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:06 server1 kernel: Bridge firewalling registered
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:14 server1 abrtd: Sending an email...
Dec 21 14:58:14 server1 abrtd: Email was sent to: root_localhost
Dec 21 14:58:14 server1 abrtd: New problem directory /var/spool/abrt/oops-2012-12-21-14:58:04-2219-0, processing
Dec 21 14:58:14 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:23 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:33 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:43 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:53 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 15:03:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Thanks in advance.
- 12-23-2012 #2 Linux Guru
- Join Date
- Nov 2007
- Posts
- 1,722
You have an error in what appears to be the network scheduler code:
Code:
Dec 21 14:58:03 server1 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Dec 21 14:58:03 server1 kernel: Hardware name: X9SCL/X9SCM
Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Followed by lots of errors relating to the eth2 NIC:
Code:
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:23 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:33 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
If you want a shotgun approach, replace the NIC.
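Before swapping hardware, it may help to see how often each interface is timing out and resetting. The following is a minimal Python sketch, assuming syslog lines in the same format as the excerpts quoted in this thread. The function name `summarize_nic_events` and the regular expressions are illustrative only and are not part of any CentOS or e1000e tooling.

```python
import re
from collections import Counter

# Patterns derived only from the message formats quoted in this thread.
# Other drivers or kernel versions may word these messages differently.
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+) \(\S+\): transmit queue \d+ timed out")
RESET_RE = re.compile(r"(\S+): Reset adapter")
PHY_RE = re.compile(r"(\S+): Error reading PHY register")


def summarize_nic_events(lines):
    """Count watchdog timeouts, adapter resets, and PHY read errors per interface."""
    counts = {"watchdog": Counter(), "reset": Counter(), "phy_error": Counter()}
    patterns = (
        ("watchdog", WATCHDOG_RE),
        ("reset", RESET_RE),
        ("phy_error", PHY_RE),
    )
    for line in lines:
        for kind, pattern in patterns:
            match = pattern.search(line)
            if match:
                # group(1) is the interface name, e.g. "eth2"
                counts[kind][match.group(1)] += 1
                break
    return counts


# A few lines copied from the log in the first post, used as sample input.
SAMPLE = [
    "Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register",
    "Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
]

if __name__ == "__main__":
    for kind, per_iface in summarize_nic_events(SAMPLE).items():
        for iface, n in per_iface.items():
            print(kind, iface, n)
```

To run it against the real log, pass an open file object for /var/log/messages instead of `SAMPLE` (reading that file usually requires root). A steady count of resets on a single interface would be consistent with the NIC suggestion above, though it cannot by itself tell a bad card from a cabling or driver problem.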


