  1. #1
    Linux Newbie humbletech99's Avatar
    Join Date
    Nov 2005
    Posts
    225

    Trouble with SCSI, Trouble with Sata


    Hi,
    I've got a really troublesome machine. It was having a lot of SCSI I/O errors: replacing the disks didn't help, replacing the SCSI cable didn't help, and even replacing the SCSI controller didn't help (the disks were a JBOD). So I've put in large SATA disks instead to test how much slower they are, and again I'm getting I/O errors, shown below, on newly created ext3 filesystems when I try to use them:
    Code:
    hostname	kern	20:29:48	kernel: EXT3-fs error (device sdc1) in start_transaction: Journal has aborted
    hostname	kern	20:29:48	kernel: EXT3-fs error (device sdc1) in start_transaction: Journal has aborted
    hostname	kern	20:29:48	kernel: ext3_splice_branch: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sdc1) in ext3_prepare_write: Journal has aborted
    hostname	kern	20:29:48	kernel: Remounting filesystem read-only
    hostname	kern	20:29:48	kernel: EXT3-fs abort (device sdc1): ext3_journal_start: Detected aborted journal
    hostname	kern	20:29:47	kernel: ext3_abort called.
    hostname	kern	20:29:44	kernel: EXT3-fs error (device sdc1) in ext3_ordered_writepage: IO failure
    hostname	kern	20:29:44	kernel: Aborting journal on device sdc1.
    hostname	kern	20:29:44	kernel: EXT3-fs error (device sdc1): ext3_new_block: Allocating block in system zone - block = 112263168
    hostname	kern	17:46:30	kernel: EXT3-fs error (device sdb1): ext3_readdir: bad entry in directory #53710740: directory entry across blocks - offset=0, inode=0, rec_len=13056, name_len=20
    hostname	kern	17:46:30	kernel: EXT3-fs error (device sdb1): ext3_readdir: bad entry in directory #53120466: directory entry across blocks - offset=0, inode=0, rec_len=6948, name_len=91
    hostname	kern	17:46:28	kernel: EXT3-fs error (device sdb1): ext3_readdir: bad entry in directory #41791555: directory entry across blocks - offset=0, inode=0, rec_len=6824, name_len=94
    hostname	kern	17:46:28	kernel: Remounting filesystem read-only
    hostname	kern	17:46:28	kernel: EXT3-fs abort (device sdb1): ext3_journal_start: Detected aborted journal
    hostname	kern	17:46:28	kernel: ext3_abort called.
    hostname	kern	17:46:28	kernel: Aborting journal on device sdb1.
    hostname	kern	17:46:28	kernel: EXT3-fs error (device sdb1): ext3_readdir: bad entry in directory #41791536: directory entry across blocks - offset=0, inode=0, rec_len=15696, name_len=50
    hostname	kern	16:31:33	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:31:33	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:31:33	kernel: ext3_splice_branch: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sdb1) in ext3_prepare_write: Journal has aborted
    hostname	kern	16:31:33	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:31:33	kernel: Remounting filesystem read-only
    hostname	kern	16:31:33	kernel: EXT3-fs abort (device sdb1): ext3_journal_start: Detected aborted journal
    hostname	kern	16:31:33	kernel: ext3_abort called.
    hostname	kern	16:31:33	kernel: Aborting journal on device sdb1.
    hostname	kern	16:31:33	kernel: EXT3-fs error (device sdb1): ext3_new_block: Allocating block in system zone - block = 6651906
    hostname	kern	16:18:44	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:43	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:43	kernel: ext3_splice_branch: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sdb1) in ext3_prepare_write: Journal has aborted
    hostname	kern	16:18:43	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:43	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:41	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:41	kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
    hostname	kern	16:18:39	kernel: Remounting filesystem read-only
    hostname	kern	16:18:36	kernel: EXT3-fs abort (device sdb1): ext3_journal_start: Detected aborted journal
    hostname	kern	16:18:34	kernel: ext3_abort called.
    I'm beginning to think this machine is cursed. I've already changed the motherboard, which inexplicably worked until I shut the machine down; when I tried to switch it back on, it just wouldn't start. No static discharge or anything like that, it just wouldn't play ball...

    What on earth could be causing all these problems? Practically everything in the whole case is new... The last thing to try is changing the PSU, which I've just done, and I'll have to see if the errors recur...
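
With this many repeated messages, it can help to tally them per device before guessing at a cause. The following is a minimal sketch, not a diagnostic tool: it assumes syslog lines shaped like the excerpt above, and the function name `count_messages_per_device` and the regular expression are illustrative choices, not part of any existing utility.

```python
import re
from collections import Counter

# Device names appear in two forms in the excerpt above:
#   "EXT3-fs error (device sdc1) in start_transaction: ..."
#   "Aborting journal on device sdb1."
# This pattern is an assumption based only on those two forms.
DEVICE_RE = re.compile(r"\(device (\w+)\)|on device (\w+)")


def count_messages_per_device(lines):
    """Return a Counter mapping device name -> number of messages naming it."""
    counts = Counter()
    for line in lines:
        for match in DEVICE_RE.finditer(line):
            # Exactly one of the two groups is set for any given match.
            device = match.group(1) or match.group(2)
            counts[device] += 1
    return counts


sample = [
    "hostname\tkern\t20:29:48\tkernel: EXT3-fs error (device sdc1) in start_transaction: Journal has aborted",
    "hostname\tkern\t20:29:44\tkernel: Aborting journal on device sdc1.",
    "hostname\tkern\t17:46:28\tkernel: Remounting filesystem read-only",
    "hostname\tkern\t16:31:33\tkernel: EXT3-fs abort (device sdb1): ext3_journal_start: Detected aborted journal",
]

for device, n in count_messages_per_device(sample).most_common():
    print(f"{device}: {n}")
```

If errors show up on every disk attached to the box, as they do here for both sdb1 and sdc1, that points away from any single drive and toward something the drives share.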

  2. #2
    Linux Guru antidrugue's Avatar
    Join Date
    Oct 2005
    Location
    Montreal, Canada
    Posts
    3,211
    Quote Originally Posted by humbletech99
    What on earth could be causing all these problems? Practically everything in the whole case is new... The last thing to try is changing the PSU, which I've just done, and I'll have to see if the errors recur...
    Wow, that's really something. Did you try a BIOS upgrade before changing the motherboards? Which Linux distro are you using? Which kernel? Which motherboard? Which hard drive model?
    "To express yourself in freedom, you must die to everything of yesterday. From the 'old', you derive security; from the 'new', you gain the flow."

    -Bruce Lee

  3. #3
    Linux Newbie humbletech99's Avatar
    Join Date
    Nov 2005
    Posts
    225
    SUSE 9.0 (crap, I know; it wasn't my choice), kernel 2.6.4, MSI K8 Master-F. The SCSI hard drives are from Fujitsu, 7073NP and 3036NP I think, and the SATA drives are 500 GB Seagate Barracudas. No, I didn't try changing the BIOS; I don't think that will make a difference, and it risks the motherboard...

  5. #4
    Linux Guru antidrugue's Avatar
    Join Date
    Oct 2005
    Location
    Montreal, Canada
    Posts
    3,211
    Quote Originally Posted by humbletech99
    no I didn't try changing the bios, I don't think that will make a difference and it risks the motherboard...
    If you know what you are doing, it shouldn't. Just do it safely.

    Whenever something doesn't work right, that's one of the things I try first: upgrade the BIOS. It's "easy" and it always does some good.

    I didn't get your exact board model, but it should be there:
    http://www.msi.com.tw/program/suppor...20K:cool:&ID=2

    You could try a newer kernel too, and perhaps Suse 10.1.

    It's possible that this problem was fixed in a newer version of the Linux Kernel.

  6. #5
    Linux Newbie humbletech99's Avatar
    Join Date
    Nov 2005
    Posts
    225
    But I have 3 other machines doing the same job with the same version of SUSE, 2 of which ran the exact same SCSI cards and the exact same disk models but didn't have SCSI errors.

    Flashing the BIOS doesn't always do good. I did this on an old motherboard to get large disk support, but instead of simply not detecting the disk, it got worse: the machine actually hung when that disk was plugged in. So I flashed it back to the original using the save I had, only for the motherboard to never start up again; it just made a continuous high-pitched beep when switched on.

    So I'm reluctant to do that again, considering it's a negligible gain for more risk. Luckily it was an old board in my own old machine that I never paid for...

  7. #6
    Linux Guru antidrugue's Avatar
    Join Date
    Oct 2005
    Location
    Montreal, Canada
    Posts
    3,211
    Quote Originally Posted by humbletech99
    but I have 3 other machines doing the same job with the same version of suse, 2 of which ran the exact same scsi cards and the exact same disk models but didn't have scsi errors.
    Ok, I understand your concerns then.

    Unfortunately I don't have a solution for you on this one. Hang in there. I'm sure someone will suggest something useful.

  8. #7
    drl
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,303
    Hi, humbletech99.

    I've used SCSI for a number of years, but mostly as someone who simply installs the disks and occasionally the SCSI controller.

    You mentioned that you had replaced the cable and controller. When you wrote jbod, were you talking about a RAID, or just that the disk was nothing special?

    Did you swap cables, controllers, and disks to assure yourself that the equipment actually works?

    Does anything run on that one troublesome box (Windows, FreeBSD, etc.)?

    For the disk itself, some time ago I purchased SpinRite and I have used it to look at my disks. It's not cheap, but it can give you a feeling of confidence that your disks (and controller, MB, BIOS, etc.) are functioning correctly. It seems to do numerous reads on the theory that this refreshes the surface of the disk, but it will also attempt recovery and replacement of sectors. I have run it on IDE, SCSI, and SATA disks. It comes with a DOS-like OS, so you don't rely on MS, Linux, etc., complexities and problems. (Hmm, sounds like a sales pitch, but I'm just a satisfied customer.)

    Keep us posted ... cheers, drl

    ( edit 1: typo )
    Welcome - get the most out of the forum by reading forum basics and guidelines: click here.
    90% of questions can be answered by using man pages, Quick Search, Advanced Search, Google search, Wikipedia.
    We look forward to helping you with the challenge of the other 10%.
    ( Mn, 2.6.n, AMD-64 3000+, ASUS A8V Deluxe, 1 GB, SATA + IDE, Matrox G400 AGP )
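
A rough, read-only version of the "read everything and see what fails" idea can be sketched in a few lines. This is not how SpinRite works and is no substitute for it; it only reads a device or file sequentially and notes the offsets where the operating system reports an I/O error. The function name `scan_readable` and the chunk size are assumptions made for illustration. It relies on `os.pread`, which is available on Unix-like systems, and reads may be served from the page cache rather than from the disk itself.

```python
import os


def scan_readable(path, chunk_size=1024 * 1024):
    """Read `path` start to end without writing anything.

    Returns (bytes_read, failed_offsets), where failed_offsets lists the
    starting offset of every chunk whose read raised an OSError.
    """
    failed_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file and, on
        # Linux, of a block device as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                chunk = os.pread(fd, want, offset)
            except OSError:
                # Note the failing chunk and move on to the next one.
                failed_offsets.append(offset)
                offset += want
                continue
            if not chunk:
                break  # unexpected end of data
            bytes_read += len(chunk)
            offset += len(chunk)
    finally:
        os.close(fd)
    return bytes_read, failed_offsets
```

Pointed at a whole device (a hypothetical `/dev/sdb`, run as root), a non-empty `failed_offsets` list would suggest read errors somewhere between the platters and the kernel, though it cannot say which component is at fault.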

  9. #8
    Linux Newbie humbletech99's Avatar
    Join Date
    Nov 2005
    Posts
    225
    JBOD means plain disks, no RAID.

    I've changed just about everything on that machine...

    It's only got Linux on it and I don't intend to put anything else on it.

  10. #9
    drl
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,303
    Hi.

    OK, so you're saying that Linux is now running on the box, but from some disk other than a SCSI, right?

    When you changed the SCSI controller did you try more than one different slot? ... cheers, drl

  11. #10
    Linux Newbie humbletech99's Avatar
    Join Date
    Nov 2005
    Posts
    225
    Different controllers in different slots, indeed...

    I've gone back to trying a brand new set of SCSI disks to see how it goes...
