  1. #1
    Just Joined!
    Join Date
    Oct 2009
    Posts
    3

    Unable to access file system on Fedora Core 6

    Hello all!

    Some really weird things are happening to my server.
    It works fine for several days and then goes down, but in a really strange way: it looks like the file system collapses or something.
    The symptoms are the following:

    1) reboot returns an error:
    shutdown: no such file or directory

    2) tomcat cannot load servlet classes

    3) wget returns input/output error

    4) cannot download file via SFTP, cannot upload file via SFTP, cannot create a directory

    5) Despite the fact that almost all file operations are inoperative, "ls" works.

    Reboot using sys rq works fine
    echo 1 > /proc/sys/kernel/sysrq
    echo b > /proc/sysrq-trigger

    After this reboot the whole system and all applications (Tomcat, Postgres) are working fine, until next crash.

    I have no idea what's happening or what I should do to understand this situation.
    Can someone help me with this issue, or at least give some advice on how to properly diagnose Fedora Core 6? I have no idea where to start.

    Thank you in advance!

  2. #2
    Linux Newbie grishi_111
    Join Date
    Oct 2007
    Location
    Jafarpur Sitharra(U.P.)/New Delhi, India
    Posts
    171
    I don't have any idea about your problem.
    Just one thing:
    FC6 is very old now.
    You should upgrade to Fedora 11 or something else.
    Even Fedora 12 is scheduled to be released in mid-November.

  3. #3
    Linux Guru coopstah13
    Join Date
    Nov 2007
    Location
    NH, USA
    Posts
    3,149
    Or, if you want something Red Hat based that is stable and supported for a long time, you should use CentOS.

    My guess is that you are running out of disk space, or do not have enough memory/swap.
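
    One way to test the disk-space and memory guess is a small script like the sketch below. It assumes a Python interpreter is available on the server and sticks to old-style syntax in the hope that an older interpreter accepts it, which is untested. The function names and the sample meminfo values are made up for illustration; on the real machine you would pass the contents of /proc/meminfo instead.

```python
import os

def disk_free_percent(path):
    # Share of the filesystem's blocks still available to unprivileged users.
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0
    return 100.0 * st.f_bavail / st.f_blocks

def parse_meminfo(text):
    # Parse /proc/meminfo-style "Key:   value kB" lines into {key: int}.
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

# Hypothetical sample; replace with open("/proc/meminfo").read() on the server.
sample_meminfo = (
    "MemTotal:  2074112 kB\n"
    "MemFree:     51200 kB\n"
    "SwapTotal: 1048568 kB\n"
    "SwapFree:  1048000 kB\n"
)

mem = parse_meminfo(sample_meminfo)
print("root filesystem free: %.1f%%" % disk_free_percent("/"))
print("MemFree: %d kB, SwapFree: %d kB" % (mem["MemFree"], mem["SwapFree"]))
```

    If the root filesystem still has free space and swap is mostly unused at the moment of the failure, this guess becomes less likely.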

  4. #4
    Just Joined!
    Join Date
    Oct 2009
    Posts
    3
    Unfortunately I can't reinstall or upgrade the operating system on this server. I'm renting it, and all I have is an SSH connection to it.

    I found some information in the server's messages file that corresponds to the time when this error occurred:

    Oct 26 15:00:43 rbi0104 kernel: audit(1256565643.414:3): avc: denied { execmod } for pid=1989 comm="jsvc" name="libjvm.so" dev=sda3 ino=2588918 scontext=system_u:system_r:initrc_t:s0 tcontext=root:object_r:usr_t:s0 tclass=file
    Oct 26 15:00:51 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:51 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:51 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:52 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:52 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:52 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:52 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 26 15:00:52 rbi0104 kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
    Oct 26 15:00:52 rbi0104 kernel: ata1: EH complete
    Oct 26 15:00:52 rbi0104 kernel: SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
    Oct 26 15:00:52 rbi0104 kernel: sda: Write Protect is off
    Oct 26 15:00:52 rbi0104 kernel: SCSI device sda: drive cache: write back
    Oct 26 15:00:52 rbi0104 kernel: SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
    Oct 26 15:00:52 rbi0104 kernel: sda: Write Protect is off
    What could this mean?

    kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)

    Some kind of hardware error?
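
    For anyone trying to read that line, here is a minimal sketch that decodes the cmd, stat and err fields. The bit names follow the classic ATA Status and Error register layout as I understand it, and the function names are my own, so treat it as an aid for reading the log rather than an authoritative decoder.

```python
import re

# Bit names for the ATA Status register (DSC is the legacy "seek complete" bit).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}

# Bit names for the ATA Error register.
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF",
}

# A few ATA command opcodes; 0xb0 is the SMART command.
COMMANDS = {0xB0: "SMART", 0xEC: "IDENTIFY DEVICE", 0xE7: "FLUSH CACHE"}

LINE_RE = re.compile(r"cmd 0x([0-9a-f]+) .*stat 0x([0-9a-f]+) err 0x([0-9a-f]+)")

def decode_bits(value, names):
    # List the names of all set bits, highest bit first.
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]

def decode_ata_line(line):
    # Return the decoded fields, or None if the line has no cmd/stat/err triple.
    m = LINE_RE.search(line)
    if m is None:
        return None
    cmd, stat, err = (int(g, 16) for g in m.groups())
    return {
        "command": COMMANDS.get(cmd, hex(cmd)),
        "status": decode_bits(stat, STATUS_BITS),
        "error": decode_bits(err, ERROR_BITS),
    }

sample = "kernel: ata1.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)"
print(decode_ata_line(sample))
```

    Read this way, the line says that a SMART command (0xb0) was sent and the drive answered with the ERR status bit set and ABRT in the error register, i.e. the drive itself aborted the command.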

    In case someone is kind enough to help and would like to look through the whole messages file, I published it here:

    http: / / heroes . kz/manager/messages.zip

    The quoted messages are from the file messages.1.
    I'm trying to google it myself right now, but I would really appreciate it if someone could give me some ideas.

    Thank you in advance!

  5. #5
    Linux Guru Lakshmipathi
    Join Date
    Sep 2006
    Location
    3rd rock from sun - Often seen near moon
    Posts
    1,568
    - Lakshmipathi.G
    -------------------
    FOSS India Award winning ext3fs Undelete tool and tutorials www.giis.co.in
    First they criticize you,Then they laugh at you,Then they fight with you,Then you win. - M.K.Gandhi
    -------------------

  6. #6
    Just Joined!
    Join Date
    Oct 2009
    Posts
    3
    Thank you very much, Lakshmipathi. These links describe my problem pretty well, and it looks like an issue with some SMART capabilities.

    But I still have no idea how to fix it.
    I ran the tests described in "Linux-Kernel Archive: FYI: strange libata EH lines in dmesg once after every bootup" and they produced very similar results.

    1)# smartctl --smart=on
    works fine

    2)# smartctl --saveauto=on -d ata /dev/sda
    produces the following error

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    Error SMART Enable Auto-save failed: Input/output error
    Smartctl: SMART Enable Attribute Autosave Failed.

    A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
    3)# smartctl --offlineauto=on -d ata /dev/sda
    produces the following error

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    Error SMART Enable Automatic Offline failed: Input/output error
    Smartctl: SMART Enable Automatic Offline Failed.
    In my opinion the problem occurs when the server tries to go into sleep mode or tries to put the HDD into sleep mode. Probably Fedora then tries to use the auto-save feature, the automatic offline feature, or both, and since they are unavailable, everything crashes and the file system becomes inoperative.

    Another thing that makes me think so is the log of the 5 latest disk errors:

    Error 7495 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
    When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

    ...

    Error 7494 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
    When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

    ...

    Error 7493 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
    When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

    ...

    Error 7492 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
    When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

    ...

    Error 7491 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
    When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
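
    To check a longer error log than these five entries, a rough sketch like the one below could tally the device state reported for each error. It assumes the log has been saved as text in the same two-line shape as the entries quoted above; the function name and the shortened sample are mine.

```python
import re
from collections import Counter

ERROR_RE = re.compile(r"Error (\d+) occurred at disk power-on lifetime: (\d+) hours")

def summarize(log_text):
    # Collect one dict per "Error N occurred ..." entry, with the device state
    # taken from the "the device was ..." line that follows it.
    entries = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        m = ERROR_RE.search(line)
        if m:
            current = {"number": int(m.group(1)), "hours": int(m.group(2)), "state": None}
            entries.append(current)
        elif current is not None and "the device was " in line:
            current["state"] = line.split("the device was ", 1)[1].rstrip(".")
    return entries

# Two of the entries quoted in the post, used as sample input.
sample = """\
Error 7495 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
Error 7494 occurred at disk power-on lifetime: 14194 hours (591 days + 10 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
"""

entries = summarize(sample)
states = Counter(e["state"] for e in entries)
days, hours = divmod(entries[0]["hours"], 24)

print(states)
print("power-on lifetime: %d days + %d hours" % (days, hours))
```

    If every entry reports the same state and the same power-on hour, as in the five errors above, that points at one burst of failing SMART commands rather than errors spread over the life of the disk.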
    Now the main question is: what should I do to make my OS run well?

    There is no definite answer to this question that matches my situation.

    1) "Linux-Kernel Archive: FYI: strange libata EH lines in dmesg once after every bootup" advises changing the startup scripts.

    But in my case the problem appears not during startup but usually several days after it, so this clearly will not help.

    2) "Problems with SATA2 harddrive? | KernelTrap" gives several pieces of advice:
    a) disable smartd (the SMART daemon)
    b) enable S.M.A.R.T. in the BIOS
    c) add "noapic" and "nosmp" to the kernel boot parameters

    Since "# smartctl --smart=on" works successfully, I don't think (b) applies.
    I'm a little afraid to disable smartd, and I don't think it's a good idea at all, so (a) is probably not the best choice either.

    As for (c), I'm not sure exactly how I should change the boot parameters, and I'm very afraid of breaking the system completely with my unqualified actions.
    What do you think about (c)? Could it really help?

    In fact, I ran the following:

    # smartctl --smart=on --offlineauto=off --
    and the output was:

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    SMART Enabled.
    SMART Attribute Autosave Disabled.
    SMART Automatic Offline Testing Disabled.
    Could these settings stop the malfunctions?
    I personally doubt it, but I don't understand how all these things work internally. When the HDD goes into sleep mode, will the kernel perhaps read these settings and skip the actions that lead to the crash?

    Any ideas appreciated!
    Thank you in advance!

  7. #7
    Linux Guru Lakshmipathi
    Join Date
    Sep 2006
    Location
    3rd rock from sun - Often seen near moon
    Posts
    1,568
    I have used smartctl very rarely; the following links provide more insight into it:
    Monitoring Hard Disks with SMART
    Linux Harddisk Monitoring with SmartMonTools (smartctl)

    I might be wrong, but I guess your hard disk may have started to fail. That is just my assumption.

    ACPI is the Advanced Configuration and Power Interface. Check this to learn more about
    kernel parameters:
    http://www.kernel.org/pub/linux/kern...n_pdf/ch09.pdf
