  1. #1

    EXT4 Data Corruption Bug Hits Stable Linux Kernels

    [Phoronix] EXT4 Data Corruption Bug Hits Stable Linux Kernels

    Read this before panicking, though.

    It looks like fscking everything will fix it (it'll replay the buggered
    journal, mangling the metadata, but then fix up the scrambled metadata
    and fix the journal's starting block). So I probably don't need to worry
    about latent corruption hiding waiting to pounce. Phew.
    Link to quote

    So don't
    "So it will only show up if the system has been rebooted twice in fairly quick succession. A full conventional distro install probably wouldn't have triggered a bug... although someone who habitually reboots their laptop instead of using suspend/resume or hibernate, or someone who is trying to bisect the kernel looking for some other bug could easily trip over this --- which I guess is how you got hit by it."
    It's probably also wise not to mount and unmount external ext4 drives in quick succession.
    The bugfix should come quickly. My antiX 64-bit test install (3.6 kernel) uses ext4, with /home on the root partition.

    I am not worried though.

  2. #2
    Jonathan183 (Linux Guru, joined Oct 2007)
    I have data partitions that several distros mount read/write at the moment, I never suspend to disk, and I would struggle to archive all my data on a regular basis. I don't care about the root partition, but I do care about the data partitions!

    I'll have to have a proper think about this one ... thanks for the heads-up, roky.

    Ed: I've decided to fsck the data partition every time I start the system, by setting the maximum mount count to 1:
    tune2fs -c 1 /dev/sda_data_partition_number
    with fs_passno in /etc/fstab set to 2. It adds about 15 to 30 seconds to boot time for me.
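
    For anyone wanting to double-check which ext4 entries in /etc/fstab will actually be picked up by fsck at boot, here is a rough Python sketch. It only reads the sixth fstab field (fs_passno) and assumes the standard whitespace-separated layout; the function name and the device names in the sample are made up for illustration and are not from this thread.

```python
def ext4_fsck_order(fstab_text):
    """Return (device, mount_point, fs_passno) for each ext4 entry in fstab text.

    A missing sixth field is treated as 0, which means "never fsck at boot".
    """
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext4":
            continue  # not a complete entry, or not ext4
        passno = int(fields[5]) if len(fields) > 5 else 0
        entries.append((fields[0], fields[1], passno))
    return entries


# Hypothetical sample; device names are placeholders.
SAMPLE_FSTAB = """\
# <device>  <mount>   <type>  <options>  <dump>  <pass>
/dev/sda1   /         ext4    defaults   0       1
/dev/sda5   /data     ext4    defaults   0       2
/dev/sda6   /scratch  ext4    defaults
/dev/sda2   none      swap    sw         0       0
"""

if __name__ == "__main__":
    with open("/etc/fstab") as fh:
        for device, mount_point, passno in ext4_fsck_order(fh.read()):
            status = "checked at boot" if passno > 0 else "NOT checked at boot"
            print(f"{device} on {mount_point}: fs_passno={passno} ({status})")
```

    An entry with fs_passno 0 is skipped at boot regardless of the tune2fs mount count, so both settings need to be in place.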
    Last edited by Jonathan183; 10-25-2012 at 12:09 PM.

  3. #3
    Update 25-10-12:

    Theodore Ts'o has continued his investigation of the bug and has found that the problem was more esoteric than was first thought. The user who reported the problem was using umount -l, which immediately unmounts the filesystem without waiting for it to stop being busy. The bug is now thought to be caused when the machine is being shut down while it is in the process of unmounting the filesystem with an already compromised journal.

    The developers are still working to pinpoint the exact problem and it might actually involve more kernel components than just the ext4 drivers. In any case, it has become clear that the bug needs a very specific configuration to surface and is unlikely to affect most users.
    Stable Linux kernel hit by ext4 data corruption bug - Update - The H Open: News and Features

    Just more info for you..

  5. #4
    Join Date
    May 2004
    arch linux
    I have 8 different boxes all booting from ext4 filesystems and haven't seen a trace of the aforementioned bug on any of them so far, so hopefully things will remain that way.

    It's still good to know about potential problems like this just in case things should go weird one day, so thanks for the alert, roky!

  6. #5
    elija (Penguin of trust, joined Jul 2004, location: Either at home or at work or down the pub)
    Luckily I have a backup every hour... onto a disk using ext4
    Should you be sitting wondering,
    Which Batman is the best,
    There's only one true answer my friend,
    It's Adam Bloody West!

    The Fifth Continent

  7. #6
    My favourite take on this is that the kernel developers don't hide anything from me. They admit a mistake the day it happens. They jump right on it and try to troubleshoot and rectify the problem. All of this I can see in real time. This is the first time I have ever monitored a bug that might affect me. The speed and honesty with which this bug is being approached blows me away.

    Not like some other operating systems we are familiar with.

  8. #7
    Over 50 rounds in this fight and still counting. I lost count after 50. Almost ready for the knockout.

    Date: 2012-10-27 03:11:47 GMT (1 hour and 51 minutes ago)

    Theodore Ts'o wrote:

    The problem is this code isn't done yet, and journal_checksum is
    really not ready for prime time. When it is ready, my plan is to wire
    it up so it is enabled by default; at the moment, it was intended for
    developer experimentation only. As I said, it's my fault for not
    clearly labelling it "Not for you!", or putting it under an #ifdef to
    prevent unwary civilians from coming across the feature and saying,
    "oooh, shiny!" and turning it on.

    Perhaps a word or two in the mount man page would be appropriate?
    Gmane Loom
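
    Since the quote above says journal_checksum was meant for developer experimentation only, it may be worth checking whether any of your ext4 filesystems are mounted with it. Below is a minimal Python sketch, assuming the kernel lists the option in /proc/mounts when it is enabled (I have not verified that for every kernel version). The function name and the sample devices are hypothetical.

```python
def ext4_mounts_with_option(mounts_text, option="journal_checksum"):
    """Return mount points of ext4 filesystems whose options include `option`.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext4":
            continue  # malformed line or not ext4
        if option in fields[3].split(","):
            hits.append(fields[1])
    return hits


# Hypothetical sample; device names are placeholders.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime,data=ordered 0 0
/dev/sda5 /data ext4 rw,relatime,journal_checksum,data=ordered 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""

if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        flagged = ext4_mounts_with_option(fh.read())
    if flagged:
        print("ext4 mounts using journal_checksum:", ", ".join(flagged))
    else:
        print("No ext4 mounts list journal_checksum.")
```

    It is also worth looking through the options column of /etc/fstab by hand, since a filesystem that is not currently mounted will not appear in /proc/mounts.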

