  1. #1

    Multicast routing issue on kernel >= 3.2


    This is my problem:

    I have set up a multicast router on Debian, using XORP 1.8.5 for PIM-SMv2 routing and as the IGMP querier.

    When I upgraded from Debian 5 (kernel 2.6.26, 686-pae) to Wheezy and got kernel 3.2, a problem appeared. Let me explain:

    I have about 24 multicast streams (HD 1080p), which adds up to about 400 Mbps of multicast input.
    Four Ethernet ports are bridged together, and the video encoders are connected to those ports. The bridge interface is the multicast source port for my setup.
    The multicast receivers are connected to another network, on a different network adapter in the server.

    Routing and multicast work fine on 2.6.26 and are rock stable (low resource usage, zero packet loss, etc.).

    However, after upgrading to kernel 3.2 (I can also confirm this for kernel 3.10), multicast routing mysteriously stops working after about 17 hours.
    No kernel log message or other log event appears when multicast routing stops.
    "ip mroute show" is empty, but tcpdump confirms that the multicast traffic is still arriving. If I make a pcap capture, I can tcpreplay it on another host and watch the stream in VLC.
    The only way to recover is to reboot the server. (Memory and CPU usage are very low, so it does not seem to be a resource issue.)
    (I have tried restarting XORP, removing and reconfiguring the bridge, and re-modprobing the network driver modules.)

    Here are the statistics for the input bridge adapter right after the "stop":
    br0    Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX  
              inet addr:  Bcast:  Mask:
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:4295483655 errors:0 dropped:0 overruns:0 frame:0
              TX packets:4210 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0 
              RX bytes:3350477250900 (3.0 TiB)  TX bytes:240828 (235.1 KiB)
    "/proc/net/ip_mr_cache" right before the "stop" (a couple of seconds before):
    Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
    0A0202EF 0BFFFF0A 1   181610713 4217402668        0
    090202EF 0BFFFF0A 1   181610715 4217404228        0
    080202EF 0BFFFF0A 1   181610716 4217405008        0
    070202EF 0BFFFF0A 1   181610716 4217405008        0
    180202EF 0DFFFF0A 1   181611351 4217900308        0
    170202EF 0DFFFF0A 1   181611353 4217901868        0
    120202EF 0CFFFF0A 1   170885816 146950304        0
    020202EF 0AFFFF0A 1   181609636 4216562608        0
    160202EF 0DFFFF0A 1   181611358 4217905768        0
    150202EF 0DFFFF0A 1   181611359 4217906548        0
    060202EF 0AFFFF0A 1   181609641 4216566508        0
    140202EF 0DFFFF0A 1   181611364 4217910448        0
    050202EF 0AFFFF0A 1   181609644 4216568848        0
    130202EF 0DFFFF0A 1   181611366 4217912008        0
    040202EF 0AFFFF0A 1   181609649 4216572748        0
    030202EF 0AFFFF0A 1   181609651 4216574308        0  0:1  
    110202EF 0CFFFF0A 1   170885832 146962784        0
    010202EF 0AFFFF0A 1   181609652 4216575088        0  0:1  
    100202EF 0CFFFF0A 1   170885837 146966684        0
    0F0202EF 0CFFFF0A 1   170885838 146967464        0
    0E0202EF 0CFFFF0A 1   170885841 146969804        0
    0D0202EF 0CFFFF0A 1   170885842 146970584        0
    0C0202EF 0BFFFF0A 1   181610767 4217444788        0
    0B0202EF 0BFFFF0A 1   181610772 4217448688        0
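    For anyone reading the table above: the Group and Origin columns are IPv4 addresses printed as hex. A minimal sketch for decoding them, assuming a little-endian (x86) host, where the bytes appear reversed; the helper name decode_proc_addr is made up for illustration:

```python
import socket
import struct


def decode_proc_addr(hex_addr: str) -> str:
    """Turn an 8-digit hex address from /proc/net/ip_mr_cache into dotted quad.

    Assumption: the value was printed on a little-endian host, so the
    network-order bytes show up reversed in the hex string.
    """
    # Re-pack the integer as little-endian to restore network byte order.
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


# First row of the table above.
print(decode_proc_addr("0A0202EF"))  # group
print(decode_proc_addr("0BFFFF0A"))  # origin (source)
```

    Under that assumption the first row is group 239.2.2.10 with source 10.255.255.11.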
    "/proc/net/ip_mr_vif" right before "stop"
    Interface      BytesIn  PktsIn  BytesOut PktsOut Flags Local    Remote
     0 eth7             0       0  -150375936 363227520 00000 0100280A 00000000
     1 br0     -920561460 -1179941         0       0 00000 01FFFF0A 00000000
     2 pimreg            0       0         0       0 00004 01FFFF0A 00000000
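    The negative numbers in this table look like 32-bit counters printed as signed values. That is an assumption on my part, not something confirmed from the kernel source. A small sketch that reinterprets them as unsigned and shows how far each one is from the 2^32 limit (as_unsigned32 is a hypothetical helper name):

```python
WRAP = 2**32  # number of distinct values in a 32-bit counter


def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit reading as the unsigned counter value."""
    return value & 0xFFFFFFFF


# Values copied from the /proc/net/ip_mr_vif output above.
readings = {
    "eth7 BytesOut": -150375936,
    "br0 BytesIn": -920561460,
    "br0 PktsIn": -1179941,
}

for name, raw in readings.items():
    unsigned = as_unsigned32(raw)
    print(f"{name}: {unsigned} ({WRAP - unsigned} below 2^32)")
```

    Read this way, the br0 PktsIn counter is only about 1.18 million packets short of 2^32 a couple of seconds before the stop. That fits the counter-limit suspicion, but it does not prove it.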

    (By the way, reducing the number of multicast inputs or the video bitrate only extends the time before multicast routing stops.)

    It's not a rare issue; it happens every time.

    Please help me figure out what's causing the issue.
    Or does anyone know of a fix for this?

    It is very hard to debug, as dmesg does not have any recent messages (other than the messages displayed during boot),
    and it only occurs after about 17 hours.

    I'm also not a programmer, so I don't know how to debug this properly.
    (I'm more of a network guy.)

    However, it looks like the RX packet count is at the limit of a 32-bit unsigned integer (4294967295) every time multicast routing stops on the recent kernels. (I don't know if this is really the case, but it would be a strange coincidence. On the 2.6.26 kernel, the RX packet counter just restarts at 0 once it reaches this number.)
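    As a rough sanity check of the counter idea, here is a sketch that estimates how long a 32-bit packet counter takes to wrap at a given bitrate. It assumes a constant input rate and uses the average packet size from the br0 statistics (3350477250900 bytes / 4295483655 packets = 780 bytes); the function name hours_until_wrap is hypothetical:

```python
def hours_until_wrap(bitrate_bps: float, avg_packet_bytes: float,
                     counter_bits: int = 32) -> float:
    """Estimate the hours until a packet counter of the given width wraps.

    Assumes a constant bitrate and a constant average packet size.
    """
    packets_per_second = bitrate_bps / 8 / avg_packet_bytes
    return (2**counter_bits / packets_per_second) / 3600


# Average packet size taken from the br0 ifconfig output.
avg_packet_bytes = 3350477250900 / 4295483655

# "About 400 Mbps" of multicast input.
print(round(hours_until_wrap(400e6, avg_packet_bytes), 1))
```

    With these inputs the estimate is between 18 and 19 hours, in the same range as the roughly 17 hours observed given that 400 Mbps is only approximate. Halving the bitrate doubles the estimate, which matches the observation that fewer inputs or a lower bitrate only delay the stop.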

    I reinstalled as a 64-bit system and got the same result.

    Does anyone know of a solution to this?

    I suspect there is something fishy in the kernel related to mroute, as normal IPv4 traffic otherwise works just fine.

    Thanks for reading.

  2. #2
    So I switched to pimd, just to check whether it is an issue with XORP.

    I can still reproduce the issue with pimd. Therefore I believe this is a kernel bug.
