RHEL 3, bnx2 v1
ISSUE: HP DL380 G5s are seizing up and not allowing network activity past a certain point. The G5s are running RedHat Ent. Linux 3 AS/ES for x86. A "netstat -tan" on one of the locked-up hosts (already in a quiescent state) indicates that the kernel has loads of data queued up but will not transmit it.
DL380 G4s using the tg3 driver for Tigon3 gigE NICs behave correctly, while DL380 G5s using the bnx2 driver for the Broadcom NetXtreme II BCM5708 NICs lock up randomly. The release notes for bnx2-1.4.43f-1.src.rpm state that it supports RedHat Ent. Linux 3 AS/ES for x86 but later say its LIMITATIONS are that the DRIVER DOES NOT WORK PROPERLY FOR THE KERNEL VERSION LESS THAN 2.4.24.
The problem is that the latest kernel available from RedHat on RHN is 2.4.21-53, which is below that 2.4.24 minimum.
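For anyone wanting to script that check across a batch of hosts, here is a minimal sketch that compares the base kernel version against the 2.4.24 minimum from the release notes. The function names are my own, and it only looks at the upstream base version: it ignores the vendor release suffix (-53) and cannot tell whether RedHat has backported whatever the driver actually depends on.

```python
import re


def kernel_tuple(version):
    """Turn '2.4.21-53' into (2, 4, 21), ignoring the vendor release suffix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised kernel version: %r" % version)
    return tuple(int(part) for part in match.groups())


def below_driver_minimum(version, minimum="2.4.24"):
    """True if the base kernel version is older than the stated minimum."""
    return kernel_tuple(version) < kernel_tuple(minimum)


if __name__ == "__main__":
    # 2.4.21-53 is the latest RHEL 3 kernel mentioned above.
    print(below_driver_minimum("2.4.21-53"))
```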
I found this in the release notes for bnx2-1.7.1c:
Problem: A rare tx race window exists in the tx path.
Cause: CPU re-ordering can cause the tx queue to be stopped forever when the tx ring is full in a very rare condition.
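To make that description concrete, here is a rough sequential model of how a tx queue can end up stopped forever. This is NOT bnx2 source code. The release notes only say "CPU re-ordering", so the exact interleaving below, the TxRing class, and the re-check fix are my assumptions about the general class of bug (a lost wakeup between the transmit path and the completion path). A real driver would also need memory barriers, which plain Python cannot model.

```python
class TxRing:
    """Toy model of a transmit ring with a stop/wake flag (hypothetical)."""

    def __init__(self, size):
        self.size = size
        self.in_flight = 0
        self.stopped = False

    def is_full(self):
        return self.in_flight >= self.size


def completion_path(ring):
    # The NIC finished sending everything: free the slots, then wake the
    # queue only if it is currently marked as stopped.
    ring.in_flight = 0
    if ring.stopped and not ring.is_full():
        ring.stopped = False


def lost_wakeup_interleaving():
    ring = TxRing(size=4)
    ring.in_flight = 4                 # ring is full
    xmit_saw_full = ring.is_full()     # transmit path observes "full"
    completion_path(ring)              # runs before the stop flag is set
    if xmit_saw_full:
        ring.stopped = True            # acts on the stale observation
    return ring


def with_recheck():
    ring = TxRing(size=4)
    ring.in_flight = 4
    xmit_saw_full = ring.is_full()
    completion_path(ring)
    if xmit_saw_full:
        ring.stopped = True
        if not ring.is_full():         # re-check after stopping
            ring.stopped = False
    return ring


if __name__ == "__main__":
    bad = lost_wakeup_interleaving()
    print("stopped:", bad.stopped, "in flight:", bad.in_flight)
    print("stopped with re-check:", with_recheck().stopped)
```

In the first case the queue is stopped with nothing in flight, so no further completion will ever arrive to wake it, which would match a hang that only a reboot clears.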
bnx18.104.22.168 & above may fix this issue but which versions are supported by HP for RHEL 3 update 9?
The customer has proceeded with a release for RHEL 3 update 9 due to statements in the release notes that RHEL 3 is supported, so RHEL 4 is out of the question for this release.
I'm looking for an HP SUPPORTED version of the driver for RHEL 3 and DL380 G5s.
An update to bnx2-1.4.52d, the latest supported version from HP, HAS NOT fixed this issue.
At an arbitrary point systems COMPLETELY STOP sending data through the bnx2 driver and ONLY a reboot of the system allows data to once again pass. Rerouting the data through 127.0.0.1 works during the PROBLEM timeframe but data will not pass through the bnx2 NIC until a reboot. Netstat shows data piling up in the queue.
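To put numbers on the queue build-up, a small script along these lines could total the Send-Q column from saved "netstat -tan" output. This is only a sketch: the function name is mine, the sample text is invented for illustration and is not from the affected hosts, and it assumes the usual six-column Linux netstat layout.

```python
def queued_send_bytes(netstat_output):
    """Sum the Send-Q column per TCP state from `netstat -tan` text."""
    totals = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue
        state = fields[5]
        totals[state] = totals.get(state, 0) + send_q
    return totals


# Invented sample output, for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0  65160 10.0.0.5:8080           10.0.0.9:41234          ESTABLISHED
tcp        0  32580 10.0.0.5:8080           10.0.0.7:51800          ESTABLISHED
"""

if __name__ == "__main__":
    print(queued_send_bytes(SAMPLE))
```

Running it periodically and seeing the ESTABLISHED total only ever grow would be consistent with the kernel queueing data that the NIC never sends.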