Hello,

We have 50+ servers with a common configuration (CentOS 5.x 64-bit + Xen 3.4.3) in different data centers. Recently, some of these servers have started rebooting after a kernel panic.

Please see below for the dmesg output captured during the panic.

I would appreciate any ideas on how to solve this issue. Thank you.

# xm info :
release : 2.6.18-274.3.1.el5xen
version : #1 SMP Tue Sep 6 20:57:11 EDT 2011
machine : x86_64
nr_cpus : 8
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 2
cpu_mhz : 2533
hw_caps : bfebfbff:28100800:00000000:00000340:0098e3fd:00000000:00000001:00000000
virt_caps : hvm
total_memory : 32759
free_memory : 16822
node_to_cpu : node0:0-7
node_to_memory : node0:16822
xen_major : 3
xen_minor : 4
xen_extra : .3
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)
cc_compile_by : root
cc_compile_domain : gitco.tld
cc_compile_date : Sun Jun 19 13:52:00 CEST 2011
xend_config_format : 4
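
Since the same build runs on 50+ hosts, it may help to diff this output between servers that panic and servers that do not. A minimal Python sketch, assuming each host's `xm info` output has been saved as text in the `key : value` form shown above. The names `parse_xm_info` and `diff_hosts` are made up for illustration and are not part of any Xen tool:

```python
def parse_xm_info(text):
    """Parse `xm info` style 'key : value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as hw_caps contain colons.
        key, sep, value = line.partition(":")
        if not sep or line.lstrip().startswith("#"):
            continue  # skip blank lines and the '# xm info :' header
        info[key.strip()] = value.strip()
    return info


def diff_hosts(a, b):
    """Return {key: (value_a, value_b)} for every key whose value differs."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


# A few lines copied from the output above, used as sample input.
sample = """# xm info :
release : 2.6.18-274.3.1.el5xen
xen_major : 3
xen_minor : 4
xen_extra : .3
node_to_cpu : node0:0-7
"""
host = parse_xm_info(sample)
print(host["release"], host["node_to_cpu"])
```

Calling `diff_hosts` on the parsed output of an affected and an unaffected server would list only the fields that differ, for example the kernel release or the amount of free memory.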

Here's the dmesg output:

Oct 4 16:46:00 server Unable to handle kernel NULL pointer dereference at 00000000000003c8 RIP:
Oct 4 16:46:00 server [<ffffffff88652383>] :bridge:__br_deliver+0xcd/0xfb
Oct 4 16:46:00 server PGD 1a8bd067 PUD 132e9067 PMD 0
Oct 4 16:46:00 server Oops: 0000 [1] SMP
Oct 4 16:46:00 server
Oct 4 16:46:00 server last sysfs file: /devices/xen-backend/vbd-2-768/statistics/wr_sect
Oct 4 16:46:00 server CPU 1
Oct 4 16:46:00 server
Oct 4 16:46:00 server Modules linked in: netconsole ebtable_filter ebtables tun bridge netloop netbk blktap blkbk ipt_REDIRECT ipt_owner ip_nat_ftp ip_conntrack_ftp xt_state xt_length ipt_ttl xt_tcpmss ipt_TCPMSS xt_multiport xt_limit ipt_LOG ipt_TOS ipt_tos ipt_REJECT iptable_filter iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables loop dm_mirror dm_multipath scsi_dh video backlight sbs
Oct 4 16:46:01 server power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi ac parport_pc lp parport sr_mod cdrom sg snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc tpm_tis snd_hwdep i2c_i801 snd e1000e tpm serial_core soundcore i2c_core pcspkr serio_raw tpm_bios dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Oct 4 16:46:01 server
Oct 4 16:46:01 server Pid: 26388, comm: qemu-dm Not tainted 2.6.18-274.3.1.el5xen #1
Oct 4 16:46:01 server RIP: e030:[<ffffffff88652383>] [<ffffffff88652383>] :bridge:__br_deliver+0xcd/0xfb
Oct 4 16:46:01 server RSP: e02b:ffff8800065d3b18 EFLAGS: 00010296
Oct 4 16:46:01 server RAX: 0000000000000000 RBX: ffff880034831000 RCX: ffff880009b95bc0
Oct 4 16:46:01 server RDX: ffffffff80000000 RSI: ffff880009b95bc0 RDI: ffff88003846be00
Oct 4 16:46:01 server RBP: ffff880037dfa500 R08: 0000000002000000 R09: ffffffff88652195
Oct 4 16:46:01 server R10: 0000000080000000 R11: ffffffff8853f9c0 R12: ffff880009b95c18
Oct 4 16:46:01 server R13: ffffffff80567f00 R14: ffff880037dfa000 R15: 0000000000000000
Oct 4 16:46:01 server FS: 00002ba775bc9bf0(0000) GS:ffffffff8062e080(0000) knlGS:0000000000000000
Oct 4 16:46:01 server CS: e033 DS: 0000 ES: 0000
Oct 4 16:46:01 server Process qemu-dm (pid: 26388, threadinfo ffff88003e064000, task ffff88001b614080)
Oct 4 16:46:01 server Stack: ffff880009b95bc0 ffff880009b95bc0 ffff880037dfa500 ffffffff88651294
Oct 4 16:46:01 server ffff88001df131c0 ffff880009b95bc0 ffffffff808d3870 ffffffff80424b93
Oct 4 16:46:01 server ffff880033474460 ffff880009b95bc0
Oct 4 16:46:01 server
Oct 4 16:46:01 server Call Trace:
Oct 4 16:46:01 server <IRQ>
Oct 4 16:46:01 server [<ffffffff88651294>] :bridge:br_dev_xmit+0xc7/0xdb
Oct 4 16:46:01 server [<ffffffff80424b93>] dev_hard_start_xmit+0x1b7/0x28a
Oct 4 16:46:01 server [<ffffffff80230b81>] dev_queue_xmit+0x31f/0x3ef
Oct 4 16:46:01 server [<ffffffff80233073>] ip_output+0x29a/0x2dd
Oct 4 16:46:01 server [<ffffffff80441cb7>] ip_forward+0x24f/0x2bd
Oct 4 16:46:01 server [<ffffffff80236cdc>] ip_rcv+0x539/0x57c
Oct 4 16:46:01 server [<ffffffff80221590>] netif_receive_skb+0x495/0x4c4
Oct 4 16:46:01 server [<ffffffff88652efd>] :bridge:br_handle_frame_finish+0x1bc/0x1d3
Oct 4 16:46:01 server [<ffffffff88657123>] :bridge:br_nf_pre_routing_finish+0x2e9/0x2f8
Oct 4 16:46:02 server [<ffffffff88656e3a>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
Oct 4 16:46:02 server [<ffffffff80258542>] nf_hook_slow+0x58/0xbc
Oct 4 16:46:02 server [<ffffffff88656e3a>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
Oct 4 16:46:02 server [<ffffffff8028829d>] find_busiest_group+0x1db/0x44a
Oct 4 16:46:02 server [<ffffffff88657d20>] :bridge:br_nf_pre_routing+0x600/0x61c
Oct 4 16:46:02 server [<ffffffff802352d4>] nf_iterate+0x41/0x7d
Oct 4 16:46:02 server [<ffffffff88652d41>] :bridge:br_handle_frame_finish+0x0/0x1d3
Oct 4 16:46:02 server [<ffffffff80258542>] nf_hook_slow+0x58/0xbc
Oct 4 16:46:02 server [<ffffffff88652d41>] :bridge:br_handle_frame_finish+0x0/0x1d3
Oct 4 16:46:02 server [<ffffffff88653082>] :bridge:br_handle_frame+0x16e/0x1a4
Oct 4 16:46:02 server [<ffffffff802214a3>] netif_receive_skb+0x3a8/0x4c4
Oct 4 16:46:02 server [<ffffffff802488d3>] try_to_wake_up+0x392/0x3a4
Oct 4 16:46:02 server [<ffffffff802319a4>] process_backlog+0x9b/0x104
Oct 4 16:46:02 server [<ffffffff8020d0a1>] net_rx_action+0xb4/0x1c6
Oct 4 16:46:02 server [<ffffffff80212f06>] __do_softirq+0x8d/0x13b
Oct 4 16:46:02 server [<ffffffff8025fda4>] call_softirq+0x1c/0x278
Oct 4 16:46:02 server <EOI>
Oct 4 16:46:02 server [<ffffffff8026db69>] do_softirq+0x31/0x90
Oct 4 16:46:02 server [<ffffffff8024fc30>] netif_rx_ni+0x19/0x1d
Oct 4 16:46:02 server [<ffffffff8866d51d>] :tun:tun_chr_writev+0x3b4/0x402
Oct 4 16:46:02 server [<ffffffff8020bf28>] free_hot_cold_page+0x133/0x175
Oct 4 16:46:02 server [<ffffffff8866d585>] :tun:tun_chr_write+0x1a/0x1f
Oct 4 16:46:02 server [<ffffffff8021747d>] vfs_write+0xce/0x174
Oct 4 16:46:02 server [<ffffffff80217cc6>] sys_write+0x45/0x6e
Oct 4 16:46:02 server [<ffffffff8025f106>] system_call+0x86/0x8b
Oct 4 16:46:02 server [<ffffffff8025f080>] system_call+0x0/0x8b
Oct 4 16:46:02 server
Oct 4 16:46:02 server
Oct 4 16:46:02 server Code: 48 8b 80 c8 03 00 00 48 85 c0 74 1e 48 8b 50 38 48 8b 45 18
Oct 4 16:46:02 server
Oct 4 16:46:02 server RIP [<ffffffff88652383>] :bridge:__br_deliver+0xcd/0xfb
Oct 4 16:46:02 server RSP <ffff8800065d3b18>
Oct 4 16:46:02 server CR2: 00000000000003c8
Oct 4 16:46:02 server
Oct 4 16:46:02 server Kernel panic - not syncing: Fatal exception
Oct 4 16:46:02 server
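
For anyone reading the trace: the first seven bytes of the Code: dump, `48 8b 80 c8 03 00 00`, encode the x86-64 instruction `mov 0x3c8(%rax),%rax`. RAX is 0 in the register dump, so the load targets address 0x3c8, which is the value reported in CR2. It therefore looks as if `__br_deliver` reads a field at offset 0x3c8 through a NULL pointer. The Python sketch below reproduces that check. It assumes the Code: bytes start at the faulting instruction (consistent with CR2 here), handles only this one encoding (REX.W `8B` with a base register and a 32-bit displacement), and is not a general disassembler. The function name `decode_mov_disp32` is made up for illustration:

```python
import struct

# 64-bit register names indexed by their 3-bit ModRM encoding (no REX.R/REX.B).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp32(code):
    """Decode 'REX.W 8B /r' with mod=10 (base register + disp32, no SIB)."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a plain 64-bit MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:
        raise ValueError("only the base+disp32 form without SIB is handled")
    # The 32-bit displacement follows the ModRM byte, little-endian, signed.
    (disp,) = struct.unpack_from("<i", code, 3)
    return REGS[reg], REGS[rm], disp


code = bytes.fromhex("48 8b 80 c8 03 00 00")  # start of the Code: dump
registers = {"rax": 0x0}                      # RAX from the register dump
cr2 = 0x3C8                                   # faulting address from CR2

dest, base, disp = decode_mov_disp32(code)
fault_address = registers[base] + disp
print(f"mov {disp:#x}(%{base}),%{dest} reads address {fault_address:#x}")
print("matches CR2:", fault_address == cr2)
```

The next bytes, `48 85 c0` and `74 1e`, are a `test %rax,%rax` followed by a `je`, so the loaded value is NULL-checked while the pointer it is loaded through is not. Which structure field sits at offset 0x3c8 cannot be determined from the dump alone; that would need the source or debuginfo for this exact kernel build.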