linux-ext4 - Re: [RFC 1/1] ext4: Fix race in ext4_mb_discard_group

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Thu, 9 Apr 2020 15:37:19 +0200
From:   Jan Kara <jack@...e.cz>
To:     Ritesh Harjani <riteshh@...ux.ibm.com>
Cc:     linux-ext4@...r.kernel.org, tytso@....edu, jack@...e.cz,
        adilger.kernel@...ger.ca, linux-fsdevel@...r.kernel.org,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
        sandeen@...deen.net
Subject: Re: [RFC 1/1] ext4: Fix race in
 ext4_mb_discard_group_preallocations()

Hello Ritesh!

On Wed 08-04-20 22:24:10, Ritesh Harjani wrote:
> @@ -3908,16 +3919,13 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
>  
>  	mb_debug(1, "discard preallocation for group %u\n", group);
>  
> -	if (list_empty(&grp->bb_prealloc_list))
> -		return 0;
> -

OK, so ext4_mb_discard_preallocations() is now going to lock every group
when we try to discard preallocations. That's likely going to increase lock
contention on the group locks if we are running out of free blocks when
there are multiple processes trying to allocate blocks. I guess we don't
care about the performace of this case too deeply but I'm not sure if the
cost won't be too big - probably we should check how much the CPU usage
with multiple allocating process trying to find free blocks grows...

>  	bitmap_bh = ext4_read_block_bitmap(sb, group);
>  	if (IS_ERR(bitmap_bh)) {
>  		err = PTR_ERR(bitmap_bh);
>  		ext4_set_errno(sb, -err);
>  		ext4_error(sb, "Error %d reading block bitmap for %u",
>  			   err, group);
> -		return 0;
> +		goto out_dbg;
>  	}
>  
>  	err = ext4_mb_load_buddy(sb, group, &e4b);
> @@ -3925,7 +3933,7 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
>  		ext4_warning(sb, "Error %d loading buddy information for %u",
>  			     err, group);
>  		put_bh(bitmap_bh);
> -		return 0;
> +		goto out_dbg;
>  	}
>  
>  	if (needed == 0)
> @@ -3967,9 +3975,15 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
>  		goto repeat;
>  	}
>  
> -	/* found anything to free? */
> +	/*
> +	 * If this list is empty, then return the grp->bb_free. As someone
> +	 * else may have freed the PAs and updated grp->bb_free.
> +	 */
>  	if (list_empty(&list)) {
>  		BUG_ON(free != 0);
> +		mb_debug(1, "Someone may have freed PA for this group %u, grp->bb_free %d\n",
> +			 group, grp->bb_free);
> +		free = grp->bb_free;
>  		goto out;
>  	}

OK, but this still doesn't reliably fix the problem, does it? Because
bb_free can be still zero and another process just has some extents to free
in its local 'list' (e.g. because it has decided it doesn't have enough
extents, some were busy and it decided to cond_resched()), so bb_free will
increase from 0 only once these extents are freed.

Honestly, I don't understand why ext4_mb_discard_group_preallocations()
bothers with the local 'list'. Why doesn't it simply free the preallocation
right away? And that would also basically fix your problem (well, it would
still theoretically exist because there's still freeing of one extent
potentially pending but I'm not sure if that will still be a practical
issue).

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR