  1. #1
    Just Joined!
    Join Date
    Nov 2010
    Posts
    11

    system unstable when using MMAP and REMAP_PFN_RANGE


    Hi,
    I hope to get some clues on my problem.
    My system: 2GB RAM, Fedora 12, a PCIe card with a DMA write engine, a PCIe Linux driver exported as a character device with ioctl and mmap operations, and a test application which checks the received data.

    - System memory is limited to 1GB by specifying "mem=1024M" in the /boot/grub/grub.conf file. The upper half is used for DMA transfers. The DMA engine uses only 512MB of it, i.e. it writes from 1GB up to 1.5GB and then wraps back to 1GB, streaming continuously.

    - After a certain amount of data, the PCIe card interrupts the driver, and the driver updates the total amount of data available, called "total".

    - The application reads "total" via ioctl and accesses the transferred data via mmap.

    - The mmap operation in the character-device PCIe driver calls remap_pfn_range and passes the physical address directly to it:
    remap_pfn_range(vma, vma->vm_start, (physical_address>>PAGE_SHIFT), vma->vm_end-vma->vm_start, vma->vm_page_prot)

    - The application checks the data by reading the memory returned by mmap and looking for known signatures at certain locations.

    - The DMA engine always writes data into main memory, regardless of whether the application is running.
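
    The address arithmetic implied by the setup above can be sketched in plain user-space C. This is only an illustration, not the poster's driver code: the 1GB base and 512MB ring size come from the description, while the 4 KiB page size and the helper names dma_write_phys and dma_map_pfn are assumptions made for the example. The second helper shows the kind of bounds check a driver mmap handler would typically want before handing a page frame number to remap_pfn_range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_PHYS_BASE  0x40000000ULL  /* 1GB: first byte above the mem=1024M limit */
#define DMA_RING_SIZE  0x20000000ULL  /* 512MB window, so the ring ends at 1.5GB */
#define PAGE_SHIFT_4K  12             /* assumption: 4 KiB pages (x86) */
#define PAGE_SIZE_4K   (1ULL << PAGE_SHIFT_4K)

/* Physical address the DMA engine writes next, given the running byte count. */
static uint64_t dma_write_phys(uint64_t total)
{
    return DMA_PHYS_BASE + (total % DMA_RING_SIZE);
}

/*
 * Validate a mapping request (byte offset into the ring plus length) and
 * compute the page frame number a driver would pass to remap_pfn_range().
 * Returns 0 on success, -1 if the request is empty, unaligned, or would
 * reach outside the 512MB ring.
 */
static int dma_map_pfn(uint64_t offset, uint64_t length, uint64_t *pfn)
{
    assert(pfn != NULL);

    if (length == 0 || offset >= DMA_RING_SIZE)
        return -1;
    if (length > DMA_RING_SIZE - offset)          /* would run past 1.5GB */
        return -1;
    if ((offset | length) & (PAGE_SIZE_4K - 1))   /* must be page aligned */
        return -1;

    *pfn = (DMA_PHYS_BASE + offset) >> PAGE_SHIFT_4K;
    return 0;
}
```

    With these definitions, a request for the whole ring starting at offset 0 is accepted, while the same length starting one page later is rejected because it would reach past 1.5GB.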
    ================

    Everything works fine: the data is valid, and the interrupt, poll, and mmap functions work perfectly. However, the system (running X11) is very unstable. Symptoms:
    - the ssh daemon, sshd, gets a segmentation fault
    - an audit program gets a segmentation fault
    - the terminal used to launch the test application quits
    - the X11 GUI quits (drops back to the login screen) or blacks out

    I've experimented and found that the system is stable when:
    - I do not run the test application, i.e. only the driver and the DMA engine are running.
    - OR the test application calls mmap but never accesses the memory, i.e. it just calls mmap and then sits in an infinite loop doing nothing.

    I really have no clue what's wrong. Why does accessing the memory make the system and other applications unstable?

    I'm seeking help from the community and experts. Any hint would help a lot.

    Thanks for reading,

    JL

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,392
    This thread would be better off if you moved it to the Kernel forum. However, my guess is that you have mapped memory used by the video system. What video hardware are you running? Remember that Intel GPUs "steal" some system memory for video purposes (shared RAM). If you have mapped some memory used by that, then it would definitely destabilize the video system.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Nov 2010
    Posts
    11
    Quote: Originally Posted by Rubberman
    This thread would be better off if you moved it to the Kernel forum. However, my guess is that you have mapped memory used by the video system. What video hardware are you running? Remember that Intel GPUs "steal" some system memory for video purposes (shared RAM). If you have mapped some memory used by that, then it would definitely destabilize the video system.
    Thank you, Rubberman.
    The system uses an Intel 82Q35 chipset with an integrated VGA/memory controller.
    I'll follow your hint.

  4. #4
    Just Joined!
    Join Date
    Nov 2010
    Posts
    11

    some findings

    I did some "experiments" on my system and found that the way I access the physical memory somehow causes the problem.

    A brief summary of my system:
    - My PCIe card receives 1Gb Ethernet traffic and DMAs it to system memory.
    - My PCIe card interrupts after every 2048 packets; the ISR wakes the test application up.
    - My test application accesses physical system memory and checks the data.

    My system seems to be stable when I access the physical data less frequently. I still receive data at full rate but check only 1 packet in every 64, i.e. I verify the data in one packet and skip the next 63. My system is very stable in this case and runs for hours without any segfault.

    However, whenever I access every packet (regardless of the actual incoming rate; I've tried lowering it to as low as 3Mb/s), many applications and daemons segfault (my test application is always fine).
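
    The two access patterns compared above (every packet versus one packet in 64) might look roughly like this on the application side. This is a hedged sketch, not the poster's test application: the packet layout, the SIGNATURE value, the position of the signature at the start of each packet, and the function names are all hypothetical. The stride parameter selects how many packets to step over between checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIGNATURE 0xA5A5A5A5u  /* hypothetical marker at the start of each packet */

/*
 * Read a 32-bit little-endian value one byte at a time. The pointer is
 * volatile because the DMA engine changes this memory behind the
 * compiler's back.
 */
static uint32_t read_le32(const volatile uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Check the signature of every `stride`-th packet in a buffer holding
 * `npkts` packets of `pkt_size` bytes each. stride = 1 checks every packet,
 * stride = 64 checks one packet and skips the next 63.
 * Returns the number of mismatches; *checked receives the number of
 * packets actually inspected.
 */
static size_t check_signatures(const volatile uint8_t *buf, size_t pkt_size,
                               size_t npkts, size_t stride, size_t *checked)
{
    size_t bad = 0;
    size_t seen = 0;

    assert(pkt_size >= 4);  /* room for the 4-byte signature */
    if (stride == 0)
        stride = 1;

    for (size_t i = 0; i < npkts; i += stride) {
        if (read_le32(buf + i * pkt_size) != SIGNATURE)
            bad++;
        seen++;
    }
    if (checked != NULL)
        *checked = seen;
    return bad;
}
```

    Calling check_signatures with stride 1 touches the start of every packet in the mapped region, while stride 64 touches far fewer locations, which matches the difference between the unstable and the stable run described above.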

    Thanks a bunch for spending time reading this.

    Jeff

  5. #5
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,392
    Are you doing a "fast" IRQ in your driver (non-nestable, non-preemptable), or regular (can nest or be preempted)? If it is a "fast" interrupt, then you may be doing too much work in the interrupt handler and this is causing interference with other (video) hardware. In such a case, the frequency that you are hitting the isr, combined with the amount of time you are in the isr, may be contributory to your problem. This is something you need to analyze since what you are doing should not cause such problems. The term is "race condition"...

  6. #6
    Just Joined!
    Join Date
    Nov 2010
    Posts
    11
    Quote: Originally Posted by Rubberman
    This is something you need to analyze since what you are doing should not cause such problems.
    Yeah, true. Thanks R, let me zoom into that area.

  7. #7
    Just Joined!
    Join Date
    Nov 2010
    Posts
    11
    After checking again, I found nothing wrong with the driver.
    I'm sure that the interrupt is not nested (12ms per interrupt), and the ISR doesn't do much: it reads hardware registers and calls wake_up_interruptible for the poll operation.
    Last edited by jefflieu; 12-01-2010 at 06:58 AM.
