linux-ext4 - Re: [syzbot] [ext4?] WARNING in mb_cache

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Sat, 4 May 2024 10:00:54 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Jan Kara <jack@...e.cz>
CC: <tytso@....edu>, syzbot
	<syzbot+dd43bd0f7474512edc47@...kaller.appspotmail.com>,
	<adilger.kernel@...ger.ca>, <linux-ext4@...r.kernel.org>,
	<linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<llvm@...ts.linux.dev>, <nathan@...nel.org>, <ndesaulniers@...gle.com>,
	<ritesh.list@...il.com>, <syzkaller-bugs@...glegroups.com>,
	<trix@...hat.com>, yangerkun <yangerkun@...wei.com>
Subject: Re: [syzbot] [ext4?] WARNING in mb_cache_destroy

On 2024/5/3 22:09, Jan Kara wrote:
> On Fri 03-05-24 19:38:21, Baokun Li wrote:
>> On 2024/5/3 18:23, Jan Kara wrote:
>>> Hi!
>>>
>>> On Fri 03-05-24 17:51:07, Baokun Li wrote:
>>>> On 2024/5/2 18:33, Jan Kara wrote:
>>>>> On Tue 30-04-24 08:04:03, syzbot wrote:
>>>>>> syzbot has bisected this issue to:
>>>>>>
>>>>>> commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3
>>>>>> Author: Baokun Li <libaokun1@...wei.com>
>>>>>> Date:   Thu Jun 16 02:13:56 2022 +0000
>>>>>>
>>>>>>        ext4: fix use-after-free in ext4_xattr_set_entry
>>>>> So I'm not sure the bisect is correct since the change is looking harmless.
>>>> Yes, the root cause of the problem has nothing to do with this patch,
>>>> and please see the detailed analysis below.
>>>>> But it is sufficiently related that there indeed may be some relationship.
>>>>> Anyway, the kernel log has:
>>>>>
>>>>> [   44.932900][ T1063] EXT4-fs warning (device loop0): ext4_evict_inode:297: xattr delete (err -12)
>>>>> [   44.943316][ T1063] EXT4-fs (loop0): unmounting filesystem.
>>>>> [   44.949531][ T1063] ------------[ cut here ]------------
>>>>> [   44.955050][ T1063] WARNING: CPU: 0 PID: 1063 at fs/mbcache.c:409 mb_cache_destroy+0xda/0x110
>>>>>
>>>>> So ext4_xattr_delete_inode() called when removing inode has failed with
>>>>> ENOMEM and later mb_cache_destroy() was eventually complaining about having
>>>>> mbcache entry with increased refcount. So likely some error cleanup path is
>>>>> forgetting to drop mbcache entry reference somewhere but at this point I
>>>>> cannot find where. We'll likely need to play with the reproducer to debug
>>>>> that. Baokun, any chance for looking into this?
>>>>>
>>>>> 								Honza
>>>> As you guessed, when -ENOMEM is returned in ext4_sb_bread(),
>>>> the reference count of ce is not properly released, as follows.
>>>>
>>>> ext4_create
>>>>    __ext4_new_inode
>>>>     security_inode_init_security
>>>>      ext4_initxattrs
>>>>       ext4_xattr_set_handle
>>>>        ext4_xattr_block_find
>>>>        ext4_xattr_block_set
>>>>         ext4_xattr_block_cache_find
>>>>           ce = mb_cache_entry_find_first
>>>>               __entry_find
>>>>               atomic_inc_not_zero(&entry->e_refcnt)
>>>>           bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO);
>>>>           if (PTR_ERR(bh) == -ENOMEM)
>>>>               return NULL;
>>>>
>>>> Before merging into commit 67d7d8ad99be("ext4: fix use-after-free
>>>> in ext4_xattr_set_entry"), it will not return early in
>>>> ext4_xattr_ibody_find(),
>>>> so it tries to find it in iboy, fails the check in xattr_check_inode() and
>>>> returns without executing ext4_xattr_block_find(). Thus it will bisect
>>>> the patch, but actually has nothing to do with it.
>>>>
>>>> ext4_xattr_ibody_get
>>>>    xattr_check_inode
>>>>     __xattr_check_inode
>>>>      check_xattrs
>>>>       if (end - (void *)header < sizeof(*header) + sizeof(u32))
>>>>         "in-inode xattr block too small"
>>>>
>>>> Here's the patch in testing, I'll send it out officially after it is tested.
>>>> (PS:  I'm not sure if propagating the ext4_xattr_block_cache_find() errors
>>>> would be better.)
>>> Great! Thanks for debugging this! Some comments to your fix below:
>>>
>>>> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
>>>> index b67a176bfcf9..5c9e751915fd 100644
>>>> --- a/fs/ext4/xattr.c
>>>> +++ b/fs/ext4/xattr.c
>>>> @@ -3113,11 +3113,10 @@ ext4_xattr_block_cache_find(struct inode *inode,
>>>>
>>>>            bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO);
>>>>            if (IS_ERR(bh)) {
>>>> -            if (PTR_ERR(bh) == -ENOMEM)
>>>> -                return NULL;
>>>> +            if (PTR_ERR(bh) != -ENOMEM)
>>>> +                EXT4_ERROR_INODE(inode, "block %lu read error",
>>>> +                         (unsigned long)ce->e_value);
>>>>                bh = NULL;
>>>> -            EXT4_ERROR_INODE(inode, "block %lu read error",
>>>> -                     (unsigned long)ce->e_value);
>>>>            } else if (ext4_xattr_cmp(header, BHDR(bh)) == 0) {
>>>>                *pce = ce;
>>>>                return bh;
>>> So if we get the ENOMEM error, continuing the iteration seems to be
>>> pointless as we'll likely get it for the following entries as well. I think
>>> the original behavior of aborting the iteration in case of ENOMEM is
>>> actually better. We just have to do mb_cache_entry_put(ea_block_cache, ce)
>>> before returning...
>>>
>>> 								Honza
>> Returning NULL here would normally attempt to allocate a new
>> xattr_block in ext4_xattr_block_set(), and when ext4_sb_bread() fails,
>> allocating the new block and inserting it would most likely fail as well,
>> so my initial thought was to propagate the error from ext4_sb_bread()
>> to also make ext4_xattr_block_set() fail when ext4_sb_bread() fails.
> Yes, this would be probably even better solution.
Okay.
>
>> But I noticed that before Ted added the special handling for -ENOMEM,
>> EXT4_ERROR_INODE was called to set the ERROR_FS flag no matter
>> what error ext4_sb_bread() returned, and after we can distinguish
>> between -EIO and -ENOMEM, we don't have to set the ERROR_FS flag
>> in the case of -ENOMEM. So there's this conservative fix now.
>>
>> In short, in my personal opinion, for -EIO and -ENOMEM, they should
>> be the same except whether or not the ERROR_FS flag is set.
>> Otherwise, I think adding mb_cache_entry_put() directly is the easiest
>> and most straightforward fix.  Honza, do you have any other thoughts?
> Yeah. I'd go for adding mb_cache_entry_put() now as a quick fix and then
> work on propagating the error from ext4_xattr_block_cache_find() as a
> cleaner solution...
>
> 								Honza
>
Ok, thank you very much for the suggestion!
I'll send the quick fix out right away.

Cheers,
Baokun