Date: Fri, 08 Aug 2008 12:28:29 +0900
From: Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>
To: Chris Mason <chris.mason@...cle.com>
Cc: Mingming Cao <cmm@...ibm.com>, Jan Kara <jack@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] jbd jbd2: fix dio write returning EIO when try_to_release_page fails

At 19:21 08/08/07, Chris Mason wrote:
>On Thu, 2008-08-07 at 12:15 +0900, Hisashi Hifumi wrote:
>> >/*
>> > * This is like invalidate_complete_page(), except it ignores the page's
>> > * refcount. We do this because invalidate_inode_pages2() needs stronger
>> > * invalidation guarantees, and cannot afford to leave pages behind because
>> > * shrink_page_list() has a temp ref on them, or because they're transiently
>> > * sitting in the lru_cache_add() pagevecs.
>> > */
>> >
>> >
>> >I am wondering why we need stronger invalidation guarantees for DIO ->
>> >invalidate_inode_pages_range(), which forces the page to be removed from
>> >the page cache? In case the bh is busy due to ext3 writeout,
>> >journal_try_to_free_buffers() could return a different error number (EBUSY)
>> >to try_to_release_page() (instead of EIO). In that case, could we just
>> >leave the page in the cache, clear the page's uptodate flag (to force a
>> >later buffered read to read from disk) and then have
>> >invalidate_complete_page2() return successfully? Any issue with this way?
>>
>> My idea is that journal_try_to_free_buffers returns EBUSY if it fails due to
>> bh busy, and dio write falls back to buffered write. This is easy to fix.
>>
>>
>
>What about the invalidates done after the DIO has already run
>non-buffered?

A dio write falls back to buffered I/O when writing to a hole on ext3, I think.
I want to apply the same mechanism to fix this issue. When try_to_release_page
fails on a page because a bh is busy, the dio write does a buffered write, then
sync_page_range and wait_on_page_writeback, and finally invalidates the page
cache to preserve dio semantics. Even if the page invalidation carried out after
wait_on_page_writeback fails, there is no inconsistency between the disk and the
page cache, because the cached page has already been written back.
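
To make the proposed flow concrete, here is a rough userspace sketch in C. It is
only a model of the idea, not kernel code and not the actual patch: struct
sim_page, sim_try_to_release_page, sim_dio_write and the SIM_PATH_* values are
all hypothetical names invented for illustration, and the sync_page_range /
wait_on_page_writeback step is reduced to clearing a dirty flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page-cache page (not the kernel's struct page). */
struct sim_page {
    bool bh_busy;  /* a buffer head is still in use, e.g. by ext3 writeout */
    bool dirty;    /* page contents differ from what is on disk */
    bool cached;   /* page is present in the page cache */
};

/* Which path the simulated write ended up taking. */
enum sim_write_path { SIM_PATH_DIRECT, SIM_PATH_BUFFERED };

/*
 * Models the proposed behaviour: a busy bh is reported as -EBUSY
 * rather than -EIO. On success the page leaves the cache.
 */
static int sim_try_to_release_page(struct sim_page *page)
{
    assert(page != NULL);
    if (page->bh_busy)
        return -EBUSY;
    page->cached = false;
    return 0;
}

/*
 * Models a dio write that falls back to a buffered write when the
 * page cannot be released because a bh is busy.
 */
static int sim_dio_write(struct sim_page *page, enum sim_write_path *path)
{
    int ret = sim_try_to_release_page(page);

    if (ret == 0) {
        /* Page dropped from the cache: the write can go direct. */
        *path = SIM_PATH_DIRECT;
        return 0;
    }
    if (ret != -EBUSY)
        return ret;  /* any other failure is still an error */

    /* Fallback: buffered write dirties the cached page. */
    *path = SIM_PATH_BUFFERED;
    page->cached = true;
    page->dirty = true;

    /* Stand-in for sync_page_range + wait_on_page_writeback. */
    page->dirty = false;

    /*
     * Best-effort invalidation. If it fails, the page stays cached,
     * but it is clean and matches the disk, so the result is ignored.
     */
    (void)sim_try_to_release_page(page);
    return 0;
}
```

In this model, a page with bh_busy set makes sim_dio_write return 0 via the
buffered path and leaves a clean page in the cache, which is the "no
inconsistency" case described above.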