  1. #1
    Just Joined!
    Join Date
    Jul 2009
    Posts
    3

    RHEL Kernel booting error


    Dear All,

    I have two machines running Red Hat Linux, attached to EMC SAN storage and running an active-active Oracle cluster. Yesterday, one of the machines went down
    due to a power fluctuation. It is a multiprocessor PowerEdge 6800 and uses the Red Hat Enterprise Linux AS (2.4.21-37.ELsmp) kernel by default.
    With that kernel the machine hangs at the messages below. It looks to me like an HBA driver loading problem. When I use a different kernel (Red Hat Enterprise Linux AS-UP (2.4.21-37.EL)),
    the machine boots and attaches to the storage, but I cannot run my Oracle cluster on it because the kernel has to be Red Hat Enterprise Linux AS (2.4.21-37.ELsmp).
    Please help me get through the boot process with the 2.4.21-37.ELsmp kernel. The error details
    when using the ELsmp kernel are below.

    error message detail
    --------------------

    loading megaraid_sas.o module
    /lib/megaraid_sas.o

    Hint: insmod errors can be caused by incorrect module parameters, including invalid
    I/O or IRQ parameters. You may find more information in syslog or the output from dmesg.

    error: /bin/insmod exited abnormally loading lpfc.o module

    Machine Exception 0000000000000000000004



    I will be very thankful to you.
    Regards,
    Amir

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,754
    You very much need to run a full diagnostic suite on this system. Since the single-processor kernel works but the SMP one does not, a couple of possibilities come to mind.

    1. The power fluctuation damaged one of your processors.
    2. The power fluctuation damaged either the kernel or some of the components that need to be loaded.

    Of these two, I think #2 is the more likely scenario. However, do run a diagnostic suite on the hardware just to be certain there was no damage. Then, if you have a backup of the /boot partition, restore that and see what happens. You do have a backup for what seems to be a production system, don't you?
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Jul 2009
    Posts
    3
    Dear,

    We ran the diagnostic suite and everything is fine, so your point #2 is the more likely scenario. I don't have a backup of the /boot partition. But why would I restore the boot partition? I have two kernels, ELsmp and EL, and the EL kernel works fine. The ELsmp kernel tries to load the lpfc.o module, insmod terminates abnormally, and that is why I cannot boot the ELsmp kernel.
    Please guide me on how to reinstall the lpfc.o module for the ELsmp kernel while booted into the EL kernel.

    Thanks.
    Regards,
    Amir

  5. #4
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,754
    What do the boot diagnostic messages say when it tries to load the module? It (lpfc.o) could be corrupted, or some other component that it depends upon could be. Do you have another installation that has that component which you could copy to see if it works? Do you have the kernel source to rebuild? Backup? I always keep an off-line bit-image copy of my system disc(s) that I can restore with any Linux recovery CD/DVD/USB drive. It has saved my bacon any number of times in just such scenarios. I try to keep the copy up-to-date every couple of weeks or just before I do any major updates (and again after the updates are verified as healthy) so I don't have too much patching to do when it is restored. Which reminds me that I need to do that again this weekend since I installed a new kernel yesterday...
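
    One way to test the "corrupted module" theory without guessing is to checksum the module files on the suspect system against the same files from a known-good installation of the identical kernel version. The following is only a minimal sketch under some assumptions: it needs a Python 3 interpreter (so it would be run from a rescue environment or another host that can see both trees), the function names are made up for illustration, and the directory paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to keep memory use low."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def compare_module_trees(suspect_root, reference_root, suffix=".o"):
    """Compare module files under two directory trees.

    Every file ending in `suffix` under `reference_root` is looked up at the
    same relative path under `suspect_root`. The result maps each relative
    path to "ok", "differs", or "missing".
    """
    suspect_root = Path(suspect_root)
    reference_root = Path(reference_root)
    report = {}
    for reference_file in sorted(reference_root.rglob("*" + suffix)):
        relative = reference_file.relative_to(reference_root)
        candidate = suspect_root / relative
        if not candidate.is_file():
            report[relative.as_posix()] = "missing"
        elif file_digest(candidate) != file_digest(reference_file):
            report[relative.as_posix()] = "differs"
        else:
            report[relative.as_posix()] = "ok"
    return report
```

    As a usage example, assuming the good machine's module tree were mounted at a hypothetical path such as /mnt/good, calling compare_module_trees("/lib/modules/2.4.21-37.ELsmp", "/mnt/good/lib/modules/2.4.21-37.ELsmp") and printing every entry that is not "ok" would show which files, if any, changed. An all-"ok" report would not prove the system is healthy; it would only make on-disk corruption of those particular files less likely.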

  6. #5
    Just Joined!
    Join Date
    Jul 2009
    Posts
    3
    Dear,

    The diagnostic messages did not say anything; everything works fine with the diagnostic suite. I also tried copying the lpfc.o module from another working machine, but it gave me the same error.
    I am not very experienced with Linux and do not have a bit-image copy of the system.

    Please advise me.

    Thanks.
    Regards,
    Amir

  7. #6
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,754
    We are trying to help you, but you have to take some steps to find the root cause of this yourself. The error indicating lpfc.o as causing the error might simply be a red herring, in that it is the symptom, not the cause of the problem. If you have no backups of your boot image, then you have no way to determine if this is a software failure, or a hardware one even though your diagnostics indicate the hardware is ok. It might be a problem with a peripheral that the diagnostics don't check.

    So, I will tell you what I tell all my clients that have these sorts of problems and didn't keep a backup of their critical systems. "I'm sorry, but my magic wand doesn't cover that - my consulting rates are $200 USD per hour + expenses and I have no idea how long it will take to get you back online - at least a day, and hopefully less than a week..."
