Date:   Mon, 23 Oct 2023 14:15:01 +0200
From:   Jan Kara <jack@...e.cz>
To:     Andy Shevchenko <andy.shevchenko@...il.com>
Cc:     Kees Cook <kees@...nel.org>, Jan Kara <jack@...e.cz>,
        Baokun Li <libaokun1@...wei.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Nathan Chancellor <nathan@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Ferry Toth <ftoth@...londelft.nl>,
        linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [GIT PULL] ext2, quota, and udf fixes for 6.6-rc1

On Mon 23-10-23 14:45:05, Andy Shevchenko wrote:
> On Sat, Oct 21, 2023 at 04:36:19PM -0700, Kees Cook wrote:
> > On October 20, 2023 1:36:36 PM PDT, andy.shevchenko@...il.com wrote:
> > >That said, if you or anyone has ideas how to debug further, I'm all ears!
> > 
> > I don't think this has been tried yet:
> > 
> > When I've had these kinds of hard-to-find glitches I've used manual
> > built-binary bisection. Assuming you have a source tree that works when built
> > with Clang and not with GCC:
> > - build the tree with Clang with, say, O=build-clang
> > - build the tree with GCC, O=build-gcc
> > - make a new tree for testing: cp -a build-clang build-test
> > - pick a suspect .o file (or files) to copy from build-gcc into build-test
> > - perform a relink: "make O=build-test" should DTRT since the copied-in .o
> > files should be newer than the .a and other targets
> > - test for failure, repeat
> > 
> > Once you've isolated it to (hopefully) a single .o file, then comes the
> > byte-by-byte analysis or something similar...
> > 
> > I hope that helps! These kinds of bugs are super frustrating.
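
The search loop behind the steps above could be scripted roughly as
follows. This is only a sketch under stated assumptions: bisect_objects()
and the fails() callback are hypothetical names, and fails() is assumed to
copy the given .o files from build-gcc into a fresh copy of build-clang,
relink, boot, and report whether the failure reproduced. It also assumes
that a single object file is responsible.

```python
from typing import Callable, List, Sequence


def bisect_objects(objects: Sequence[str],
                   fails: Callable[[List[str]], bool]) -> str:
    """Halve a list of suspect .o files until one culprit remains.

    `fails(subset)` is a hypothetical callback supplied by the caller: it
    should swap `subset` into the known-good build, relink, run the test,
    and return True if the failure reproduces.
    """
    candidates = list(objects)
    if not candidates or not fails(candidates):
        raise ValueError("failure does not reproduce with the full set")

    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        if fails(first):
            candidates = first
        elif fails(second):
            candidates = second
        else:
            # Neither half fails alone: the single-culprit assumption
            # does not hold and plain bisection cannot continue.
            raise RuntimeError("failure needs objects from both halves")
    return candidates[0]
```

If the failure depends on where code ends up rather than on the contents of
one object, the RuntimeError branch is the likely outcome and the result
proves little.
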
> 
> I'm sorry, but I can't see how this is not an error-prone approach.
> If it's a timing issue then an arbitrary object change may help, and that
> doesn't prove anything. Earlier I tried commenting out the error message, and
> it then worked with GCC as well. The difference is so little (according to
> Linus) that it may not even look suspicious. Maybe I am missing the point...

Given how reliably you can hit the problem with some kernels while you
cannot hit it with others (only slightly different, in code that doesn't
even get executed on your system), I suspect this is really more a code
placement issue than a timing issue. That is, if during the linking phase
of vmlinux some code ends up at a particular position, the kernel fails;
otherwise it boots fine. I'm not sure how to debug such a thing though.
Maybe playing with the linker and the order in which object files are
linked could reveal something, but I'm just guessing.
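
One cheap way to look at placement might be to compare the symbol tables of
a booting and a failing vmlinux. A minimal sketch, assuming both tables are
available as text in the usual "address type name" layout (as in System.map
or nm output); parse_symbols() and moved_symbols() are hypothetical helper
names, and symbols that share a name (e.g. static functions) are not told
apart:

```python
from typing import Dict, List, Tuple


def parse_symbols(text: str) -> Dict[str, int]:
    """Parse 'address type name' lines into a name -> address map."""
    symbols: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and entries without an address
        addr, _kind, name = parts
        symbols[name] = int(addr, 16)  # later duplicates overwrite earlier
    return symbols


def moved_symbols(good: str, bad: str) -> List[Tuple[str, int, int]]:
    """List symbols present in both tables whose addresses differ.

    Returns (name, good_address, bad_address) tuples ordered by the
    address in the good build.
    """
    g, b = parse_symbols(good), parse_symbols(bad)
    moved = [(name, g[name], b[name])
             for name in g.keys() & b.keys() if g[name] != b[name]]
    return sorted(moved, key=lambda entry: entry[1])
```

The first symbol that moves marks where the two layouts start to diverge,
which at least narrows down where to start looking.
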

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
