lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 12 Sep 2023 15:02:09 +0800
From:   Kemeng Shi <shikemeng@...weicloud.com>
To:     Ritesh Harjani <ritesh.list@...il.com>, tytso@....edu,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 05/11] ext4: Separate block bitmap and buddy bitmap
 freeing in ext4_mb_clear_bb()



on 9/4/2023 11:00 AM, Kemeng Shi wrote:
> 
> 
> on 9/1/2023 5:34 PM, Ritesh Harjani wrote:
>> Kemeng Shi <shikemeng@...weicloud.com> writes:
>>
>>> This patch separates block bitmap and buddy bitmap freeing in order to
>>> udpate block bitmap with ext4_mb_mark_context in following patch.
>> ^^^ update.
>>
>>>
>>> Separated freeing is safe with concurrent allocation as long as:
>>> 1. Firstly allocate block in buddy bitmap, and then in block bitmap.
>>> 2. Firstly free block in block bitmap, and then buddy bitmap.
>>> Then freed block will only be available to allocation when both buddy
>>> bitmap and block bitmap are updated by freeing.
>>> Allocation obeys rule 1 already, just do sperated freeing with rule 2.
>>
>> So we also don't need ext4_load_buddy_gfp() before freeing on-disk
>> bitmap right. Continue below...
>>
>>>
>>> Separated freeing has no race with generate_buddy as:
>>> Once ext4_mb_load_buddy_gfp is executed successfully, the update-to-date
>>> buddy page can be found in sbi->s_buddy_cache and no more buddy
>>> initialization of the buddy page will be executed concurrently until
>>> buddy page is unloaded. As we always do free in "load buddy, free,
>>> unload buddy" sequence, separated freeing has no race with generate_buddy.
>>>
>>> Signed-off-by: Kemeng Shi <shikemeng@...weicloud.com>
>>> ---
>>>  fs/ext4/mballoc.c | 18 ++++++++----------
>>>  1 file changed, 8 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>>> index e650eac22237..01ad36a1cc96 100644
>>> --- a/fs/ext4/mballoc.c
>>> +++ b/fs/ext4/mballoc.c
>>> @@ -6519,6 +6519,14 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
>>>  	if (err)
>>>  		goto error_return;
>>
>>
>> Let me add the a piece of code before the added changes and continue...
>>
>> 	err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
>> 				     GFP_NOFS|__GFP_NOFAIL);
>> 	if (err)
>> 		goto error_return;
>>>  
>>> +	ext4_lock_group(sb, block_group);
>>> +	mb_clear_bits(bitmap_bh->b_data, bit, count_clusters);
>>> +	ret = ext4_free_group_clusters(sb, gdp) + count_clusters;
>>> +	ext4_free_group_clusters_set(sb, gdp, ret);
>>> +	ext4_block_bitmap_csum_set(sb, gdp, bitmap_bh);
>>> +	ext4_group_desc_csum_set(sb, block_group, gdp);
>>> +	ext4_unlock_group(sb, block_group);
>>> +
>>
>> ...Is it required for ext4_mb_load_buddy_gfp() to be called before
>> freeing on-disk bitmap blocks? Can you explain why please?
>> At least it is not very clear to me that why do we need
>> ext4_mb_load_buddy_gfp() before clearing of bitmap blocks. If it's not
>> needed then I think we should separate out bitmap freeing logic and
>> buddy freeing logic even further.
> Yes, ext4_mb_load_buddy_gfp is no needed for clearing of bitmap, put
> it before bit clearing for catching error and aborting mark early
> to reduce functional change.
Hi Ritesh, sorry for bother. I'm going to push a new version and I perfer
to call ext4_mb_load_buddy_gfp early to catch potential error in advance.
Just wonder is this good to you. Like to hear any advice. Thanks!
>>
>> Thoughts?
>>
>>>  	/*
>>>  	 * We need to make sure we don't reuse the freed block until after the
>>>  	 * transaction is committed. We make an exception if the inode is to be
>>> @@ -6541,13 +6549,8 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
>>>  		new_entry->efd_tid = handle->h_transaction->t_tid;
>>>  
>>>  		ext4_lock_group(sb, block_group);
>>> -		mb_clear_bits(bitmap_bh->b_data, bit, count_clusters);
>>>  		ext4_mb_free_metadata(handle, &e4b, new_entry);
>>>  	} else {
>>> -		/* need to update group_info->bb_free and bitmap
>>> -		 * with group lock held. generate_buddy look at
>>> -		 * them with group lock_held
>>> -		 */
>>>  		if (test_opt(sb, DISCARD)) {
>>>  			err = ext4_issue_discard(sb, block_group, bit,
>>>  						 count_clusters, NULL);
>>> @@ -6560,14 +6563,9 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
>>>  			EXT4_MB_GRP_CLEAR_TRIMMED(e4b.bd_info);
>>>  
>>>  		ext4_lock_group(sb, block_group);
>>> -		mb_clear_bits(bitmap_bh->b_data, bit, count_clusters);
>>>  		mb_free_blocks(inode, &e4b, bit, count_clusters);
>>>  	}
>>>  
>>> -	ret = ext4_free_group_clusters(sb, gdp) + count_clusters;
>>> -	ext4_free_group_clusters_set(sb, gdp, ret);
>>> -	ext4_block_bitmap_csum_set(sb, gdp, bitmap_bh);
>>> -	ext4_group_desc_csum_set(sb, block_group, gdp);
>>>  	ext4_unlock_group(sb, block_group);
>>>  
>>>  	if (sbi->s_log_groups_per_flex) {
>>
>>
>> Adding piece of code here...
>>
>> 	if (sbi->s_log_groups_per_flex) {
>> 		ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
>> 		atomic64_add(count_clusters,
>> 			     &sbi_array_rcu_deref(sbi, s_flex_groups,
>> 						  flex_group)->free_clusters);
>> 	}
>>
>> <...>
>>
>> 	/* We dirtied the bitmap block */
>> 	BUFFER_TRACE(bitmap_bh, "dirtied bitmap block");
>> 	err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
>> 	/* And the group descriptor block */
>> 	BUFFER_TRACE(gd_bh, "dirtied group descriptor block");
>> 	ret = ext4_handle_dirty_metadata(handle, NULL, gd_bh);
>> 	if (!err)
>> 		err = ret;
>>
>> I was thinking even this can go around bitmap freeing logic above. So
>> the next patch becomes very clear that all of the bitmap freeing logic
>> is just simply moved into ext4_mb_mark_context() function.
>>
>> -ritesh
>>
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ