linux-ext4 - Re: [PATCH] jbd2: Avoid long hold times of j_state

Date:   Thu, 8 Nov 2018 13:30:31 +0100
From:   Jan Kara <jack@...e.cz>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH] jbd2: Avoid long hold times of j_state_lock while
 committing a transaction

On Tue 06-11-18 18:21:19, Jan Kara wrote:
> On Tue 06-11-18 11:47:59, Theodore Y. Ts'o wrote:
> > On Tue, Nov 06, 2018 at 11:22:30AM +0100, Jan Kara wrote:
> > > So the buffer is on BJ_Shadow list while the assertion in
> > > jbd2_journal_dirty_metadata() expects it to be in BJ_Metadata list. This is
> > > really weird as we have also checked that jh->b_transaction ==
> > > handle->h_transaction so the transaction couldn't have passed to commit
> > > phase... Oh, I see, the code in start_this_handle() got racy with the
> > > removal of j_state_lock protection from journal_commit_transaction() so now
> > > transaction can start even though there are handles outstanding! I'll think
> > > about the best solution for this. Thanks for the report!
> > 
> > Thanks for the analysis!  I finished the bisection last night and it
> > was too late for me to dive into how this was going on.  I should have
> > realized this before I had suggested the approach in the patch.
> > 
> > The original complaint which Adrian made was about the long hold times
> > of j_state_lock at the beginning of the commit.  What he didn't
> > mention was what the other "high priority tasks" were blocked on, but
> > they were almost certainly blocked in start_this_handle().  And that's
> > fundamental; what we are trying to do at the beginning of the commit
> > process is wait for the outstanding handles to close, and so we can't
> > let new handles start.
> 
> As Adrian mentioned, the problem is really with j_state_lock hold times,
> not with waiting for outstanding handles as such (because that happens with
> j_state_lock dropped). And the holding of j_state_lock while checking
> for outstanding handles is not a real source of latency so we can keep
> that. We just have to introduce a new transaction state so that once we have
> checked there are no outstanding handles and are going to drop
> j_state_lock, we switch to this new state to prevent new reserved handles
> from joining the transaction. I'll send a patch tomorrow...

OK, it took a bit longer to go through the dioread_nolock test run, but
everything seems to work fine for me now. I've sent v2 of the patch.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
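
The state-based scheme described in the quoted paragraph above (check for
outstanding handles under j_state_lock, then switch to a new state before
dropping the lock) can be sketched as a small user-space model. This is only
an illustration under stated assumptions, not the v2 patch: the names
tx_state, TX_SWITCH, try_start_handle(), stop_handle() and
commit_try_switch() are hypothetical and are not jbd2 identifiers; the lock
itself is not modeled, so every function is assumed to be called with the
state lock held; and the rule that a reserved handle may still join a locked
transaction is an assumption inferred from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for illustration; not the kernel's definitions. */
enum tx_state {
	TX_RUNNING,	/* accepting new handles */
	TX_LOCKED,	/* commit requested, waiting for handles to finish */
	TX_SWITCH	/* no handles left; nothing may join any more */
};

struct tx {
	enum tx_state state;
	int updates;	/* number of outstanding handles */
};

/*
 * Model of starting a handle. Assumed to run with the state lock held.
 * Returns false when the caller would have to wait and retry.
 */
static bool try_start_handle(struct tx *t, bool reserved)
{
	if (t->state == TX_SWITCH)
		return false;		/* the new state blocks everyone */
	if (t->state == TX_LOCKED && !reserved)
		return false;		/* ordinary handles wait for the commit */
	t->updates++;
	return true;
}

/* Model of closing a handle. */
static void stop_handle(struct tx *t)
{
	assert(t->updates > 0);
	t->updates--;
}

/*
 * Model of the start of a commit. Returns false while handles are still
 * outstanding (real code would drop the lock, sleep, and check again).
 * Once no handles remain, the state changes to TX_SWITCH before the lock
 * would be dropped, so the long-running commit work can proceed unlocked
 * without a late handle slipping in.
 */
static bool commit_try_switch(struct tx *t)
{
	t->state = TX_LOCKED;
	if (t->updates > 0)
		return false;
	t->state = TX_SWITCH;
	return true;
}
```

In this model, a transaction with one open handle stays in TX_LOCKED when
commit_try_switch() is called, and a reserved handle can still join it. After
all handles are stopped, a second commit_try_switch() call moves it to
TX_SWITCH, and from then on try_start_handle() fails for reserved and
ordinary handles alike.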