linux-ext4 - Re: [PATCH] jbd2: avoid mount failed when commit block is partial submitted


Date: Thu, 11 Apr 2024 15:37:18 +0200
From: Jan Kara <jack@...e.cz>
To: "yebin (H)" <yebin10@...wei.com>
Cc: Jan Kara <jack@...e.cz>, Theodore Ts'o <tytso@....edu>,
	adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] jbd2: avoid mount failed when commit block is partial
 submitted

On Sun 07-04-24 09:37:25, yebin (H) wrote:
> On 2024/4/3 18:11, Jan Kara wrote:
> > On Tue 02-04-24 23:37:42, Theodore Ts'o wrote:
> > > On Tue, Apr 02, 2024 at 03:42:40PM +0200, Jan Kara wrote:
> > > > On Tue 02-04-24 17:09:51, Ye Bin wrote:
> > > > > > We encountered a problem where the file system could not be mounted
> > > > > > after a power-off. Analysis of the file system image shows that
> > > > > > only part of the data was written to the last commit block.
> > > > > > To solve this, if the commit block checksum is incorrect, check whether
> > > > > > the next block has a valid magic and transaction ID. If it does not,
> > > > > > just drop the last transaction and ignore the checksum error.
> > > > > > Theoretically, the transaction ID may wrap around, which may still cause
> > > > > > a mount failure.
> > > > > 
> > > > > Signed-off-by: Ye Bin <yebin10@...wei.com>
> > > > So this is curious. The commit block data is fully within one sector and
> > > > the expectation of the journaling is that either full sector or nothing is
> > > > written. So what kind of storage were you using that it breaks these
> > > > expectations?
> > > I suppose if the physical sector size is 512 bytes, and the file
> > > system block is 4k, I suppose it's possible that on a crash, that part
> > > of the 4k commit block could be written.
> > I was thinking about that as well but the commit block looks like:
> > 
> > struct commit_header {
> >          __be32          h_magic;
> >          __be32          h_blocktype;
> >          __be32          h_sequence;
> >          unsigned char   h_chksum_type;
> >          unsigned char   h_chksum_size;
> >          unsigned char   h_padding[2];
> >          __be32          h_chksum[JBD2_CHECKSUM_BYTES];
> >          __be64          h_commit_sec;
> >          __be32          h_commit_nsec;
> > };
> > 
> > where JBD2_CHECKSUM_BYTES is 8. So all the data in the commit block
> > including the checksum is in the first 60 bytes. Hence I would be really
> > surprised if some storage can tear that...
> This issue has been encountered a few times in the context of eMMC devices.
> The vendor has confirmed that only 512-byte atomicity can be ensured in the
> firmware. Although the valid data is only 60 bytes, the entire commit block
> is used for calculating the checksum.
> jbd2_commit_block_csum_verify:
> ...
> calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
> ...

Ah, indeed. This is the bit I've missed. Thanks for the explanation! Still, I
think trying to somehow automatically deal with a wrong commit block checksum
is too dangerous because it can result in fs corruption in some (unlikely)
cases. OTOH I understand that journal replay failure after a power failure
isn't great either, so we need to think about how to fix this...
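
The failure mode discussed above can be reproduced in a small userspace
sketch. It is not the kernel code: the 4096-byte block size, the 0xFFFFFFFF
seed, host-endian fields, and the helper names build_commit_block(),
commit_block_valid() and torn_write() are assumptions made for illustration.
It only shows that a checksum computed over the whole block fails when just
the first 512-byte sector reaches the disk, even though all 60 bytes of
commit data are intact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE   4096        /* assumed fs block size */
#define SKETCH_SECTOR_SIZE  512         /* atomic write unit of the device */
#define JBD2_CHECKSUM_BYTES 8
#define JBD2_MAGIC_NUMBER   0xC03B3998u
#define JBD2_COMMIT_BLOCK   2

/* Same field layout as quoted in the thread, with fixed-width host types. */
struct commit_header {
    uint32_t      h_magic;
    uint32_t      h_blocktype;
    uint32_t      h_sequence;
    unsigned char h_chksum_type;
    unsigned char h_chksum_size;
    unsigned char h_padding[2];
    uint32_t      h_chksum[JBD2_CHECKSUM_BYTES];
    uint64_t      h_commit_sec;
    uint32_t      h_commit_nsec;
};

/* The last field ends at byte 60, well inside the first sector. */
#define COMMIT_DATA_BYTES \
    (offsetof(struct commit_header, h_commit_nsec) + sizeof(uint32_t))
static_assert(COMMIT_DATA_BYTES == 60, "commit data should span 60 bytes");

/* Plain bitwise CRC32C (reflected polynomial 0x82F63B78). */
uint32_t crc32c(uint32_t crc, const unsigned char *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Checksum over the WHOLE block, with the stored checksum word zeroed. */
uint32_t block_csum(const unsigned char *block)
{
    unsigned char tmp[SKETCH_BLOCK_SIZE];

    memcpy(tmp, block, sizeof(tmp));
    memset(tmp + offsetof(struct commit_header, h_chksum), 0, sizeof(uint32_t));
    return crc32c(0xFFFFFFFFu, tmp, sizeof(tmp)); /* seed is an assumption */
}

/* Build a zero-filled commit block and store its checksum (host-endian). */
void build_commit_block(unsigned char *block, uint32_t sequence)
{
    struct commit_header h;
    uint32_t csum;

    memset(&h, 0, sizeof(h));       /* chksum type/size left zero here */
    h.h_magic = JBD2_MAGIC_NUMBER;
    h.h_blocktype = JBD2_COMMIT_BLOCK;
    h.h_sequence = sequence;
    memset(block, 0, SKETCH_BLOCK_SIZE);
    memcpy(block, &h, COMMIT_DATA_BYTES);
    csum = block_csum(block);
    memcpy(block + offsetof(struct commit_header, h_chksum), &csum, sizeof(csum));
}

/* Returns nonzero if the stored checksum matches the block contents. */
int commit_block_valid(const unsigned char *block)
{
    uint32_t stored;

    memcpy(&stored, block + offsetof(struct commit_header, h_chksum),
           sizeof(stored));
    return stored == block_csum(block);
}

/* Simulate power loss: only the first `sectors` sectors reach the disk. */
void torn_write(unsigned char *disk, const unsigned char *new_block,
                size_t sectors)
{
    memcpy(disk, new_block, sectors * SKETCH_SECTOR_SIZE);
}
```

With a disk block pre-filled with stale bytes, torn_write(disk, blk, 1)
leaves the header intact but makes commit_block_valid(disk) fail, while
writing all 8 sectors makes it pass.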

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR