lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 24 Aug 2023 20:56:14 +0800
From:   Baokun Li <libaokun1@...wei.com>
To:     Jan Kara <jack@...e.cz>
CC:     <linux-ext4@...r.kernel.org>, <tytso@....edu>,
        <adilger.kernel@...ger.ca>, <darrick.wong@...cle.com>,
        <yi.zhang@...wei.com>, <yangerkun@...wei.com>,
        <yukuai3@...wei.com>, Baokun Li <libaokun1@...wei.com>
Subject: Re: [PATCH] e2fsck: delay quotas loading in release_orphan_inodes()

On 2023/8/24 18:08, Jan Kara wrote:
> On Thu 24-08-23 10:27:46, Baokun Li wrote:
>> Hello, Jan!
>>
>> On 2023/8/24 1:05, Jan Kara wrote:
>>> On Thu 17-08-23 16:18:28, Baokun Li wrote:
>>>> After 7d79b40b ("e2fsck: adjust quota counters when clearing orphaned
>>>> inodes"), we load all the quotas before we process the orphaned inodes,
>>>> and when we load the quotas, we check the checsum of the bbitmap for each
>>>> group. If one of the bbitmap checksums is wrong, the following error will
>>>> be reported:
>>>>
>>>> “Error initializing quota context in support library:
>>>>    Block bitmap checksum does not match bitmap”
>>>>
>>>> But loading quotas comes before checking the current superblock for the
>>>> EXT2_ERROR_FS flag, which makes it impossible to use e2fsck to repair any
>>>> image that contains orphan inodes and has the wrong bbitmap checksum.
>>>> So delaying quota loading until after the EXT2_ERROR_FS judgment avoids
>>>> the above problem.
>>>>
>>>> Signed-off-by: Baokun Li <libaokun1@...wei.com>
>>> This certainly looks better but I wonder if there still isn't a problem if
>>> the bitmap checksums are wrong but EXT2_ERROR_FS is not set. Shouldn't we
>>> rather move the initialization of the quota files after the call to
>>> e2fsck_read_bitmaps()?
>>>
>>> 								Honza
>> When the bitmap checksums are wrong but EXT2_ERROR_FS is not set, we must
>> have lost some data (error flag or group descriptor or bitmap), so there
>> is something wrong with the kernel at this time, so I don't think we
>> should fix the image directly, but rather let the user realize that
>> something is wrong with the filesystem logic.
> I agree it means there is a problem somewhere (the storage, the kernel, or
> similar). But just ignoring bitmap checksums in release_orphan_inodes() is
> exactly how e2fsck behaves on filesystems without quota feature so I see no
> reason for quota feature to change that because the inconsistency has
> nothing to do with quotas...
>
>> Moreover, if we don't care how this happened, but just want to fix the
>> image, we only need to run "e2fsck -a" twice. After merging in the
>> current patch, we always empty the orphan list before loading the quotas,
>> and EXT2_ERROR_FS is set when loading the quotas fails, so this will be
>> fixed the second time you run e2fsck. It will not happen that every
>> e2fsck will fail like it did before.
> I see, you're right so it isn't as bad as I originally thought but still my
> argument above holds - IMO e2fsck should treat wrong bitmap checksums the
> same way with and without the quota feature.
>
> 								Honza
The original flow that went wrong here is as follows:
e2fsck
  e2fsck_run_ext3_journal
  check_super_block
   release_orphan_inodes
    e2fsck_read_all_quotas
     quota_read_all_dquots
      quota_file_open
       ext2fs_read_bitmaps
        ext2fs_rw_bitmaps
         read_bitmaps_range
          read_bitmaps_range_start
           ext2fs_block_bitmap_csum_verify
            !!! error
  e2fsck_run

Yes, the inconsistency has nothing to do with quota, but quota is loaded 
here to
keep track of space changes during the normal processing of orphan list. 
If quota
was not loaded, we would not have read and check bitmaps until Pass5, and we
had already done a lot of checking and tweaking of inodes, blocks, and 
dirs before
Pass5, and the bitmaps inconsistency may have been fixed during that time.

So even though the inconsistency has nothing to do with the quota, it is 
the loading
of the quota that reveals the inconsistency and causes the program to 
exit. It is also
unnecessary to read the bitmaps ahead of time alone, since the bitmaps 
inconsistency
could be caused by orphan inodes in the orphan list not being handled.

So in my opinion, there has to be a distinction between the two cases of 
enabling and
disabling quota. Unless we don't read the quotas on disk just log the 
quota changes in
memory and then update the quota in memory to disk when all the checking 
and fixing
is done (like the quota handling before and after e2fsck_run()).

Thanks!
-- 
With Best Regards,
Baokun Li
.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ