lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 25 Sep 2023 14:08:22 +0200
From:   Jan Kara <jack@...e.cz>
To:     Ritesh Harjani <ritesh.list@...il.com>
Cc:     Jan Kara <jack@...e.cz>, Gao Xiang <hsiangkao@...ux.alibaba.com>,
        linux-ext4@...r.kernel.org, Theodore Ts'o <tytso@....edu>,
        Matthew Bobrowski <mbobrowski@...browski.org>,
        Christoph Hellwig <hch@....de>,
        Joseph Qi <joseph.qi@...ux.alibaba.com>
Subject: Re: [bug report] ext4 misses final i_size meta sync under O_DIRECT |
 O_SYNC semantics after iomap DIO conversion

On Fri 22-09-23 17:33:59, Ritesh Harjani wrote:
> Jan Kara <jack@...e.cz> writes:
> 
> > On Wed 20-09-23 17:08:19, Ritesh Harjani wrote:
> >> Jan Kara <jack@...e.cz> writes:
> >> 
> >> > Hello!
> >> >
> >> > On Tue 19-09-23 14:00:04, Gao Xiang wrote:
> >> >> Our consumer reports a behavior change between pre-iomap and iomap
> >> >> direct io conversion:
> >> >> 
> >> >> If the system crashes after an appending write to a file open with
> >> >> O_DIRECT | O_SYNC flag set, file i_size won't be updated even if
> >> >> O_SYNC was marked before.
> >> >> 
> >> >> It can be reproduced by a test program in the attachment with
> >> >> gcc -o repro repro.c && ./repro testfile && echo c > /proc/sysrq-trigger
> >> >> 
> >> >> After some analysis, we found that before iomap direct I/O conversion,
> >> >> the timing was roughly (taking Linux 3.10 codebase as an example):
> >> >> 
> >> >> 	..
> >> >> 	- ext4_file_dio_write
> >> >> 	  - __generic_file_aio_write
> >> >> 	      ..
> >> >> 	    - ext4_direct_IO  # generic_file_direct_write
> >> >> 	      - ext4_ext_direct_IO
> >> >> 	        - ext4_ind_direct_IO  # final_size > inode->i_size
> >> >> 	          - ..
> >> >> 	          - ret = blockdev_direct_IO()
> >> >> 	          - i_size_write(inode, end) # orphan && ret > 0 &&
> >> >> 	                                   # end > inode->i_size
> >> >> 	          - ext4_mark_inode_dirty()
> >> >> 	          - ...
> >> >> 	  - generic_write_sync  # handling O_SYNC
> >> >> 
> >> >> So the dirty inode meta will be committed into journal immediately
> >> >> if O_SYNC is set.  However, After commit 569342dc2485 ("ext4: move
> >> >> inode extension/truncate code out from ->iomap_end() callback"),
> >> >> the new behavior seems as below:
> >> >> 
> >> >> 	..
> >> >> 	- ext4_dio_write_iter
> >> >> 	  - ext4_dio_write_checks  # extend = 1
> >> >> 	  - iomap_dio_rw
> >> >> 	      - __iomap_dio_rw
> >> >> 	      - iomap_dio_complete
> >> >> 	        - generic_write_sync
> >> >> 	  - ext4_handle_inode_extension  # extend = 1
> >> 
> >> Yes, since ext4_handle_inode_extension() will handle inode i_disksize
> >> update and mark the inode dirty, generic_write_sync() call should happen
> >> after that.
> >> 
> >> That also means then we don't have any generic FS testcase which can
> >> validate this behaviour. 
> >
> > Yeah.
> >
> 
> Ok. Let me then first send a fstest in response to this integrity
> problem with directio and o_sync.

Thanks for working on the testcase! I have written a fix in the meantime
but so far it causes weird inconsistencies in fsstress ENOSPC hitters tests
so I'm still debugging that.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ