Date:   Sat, 28 May 2022 18:10:48 -0700
From:   Andi Kleen <ak@...ux.intel.com>
To:     Yu-Jen Chang <arthurchang09@...il.com>, jdike@...ux.intel.com
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        keescook@...omium.org, linux-kernel@...r.kernel.org,
        linux-hardening@...r.kernel.org, richard@....at,
        anton.ivanov@...bridgegreys.com, johannes@...solutions.net,
        linux-um@...ts.infradead.org, jserv@...s.ncku.edu.tw
Subject: Re: [PATCH 0/2] x86: Optimize memchr() for x86-64


On 5/28/2022 1:12 AM, Yu-Jen Chang wrote:
> *** BLURB HERE ***
> This patch series adds an optimized "memchr()" for x86-64 and
> USER-MODE LINUX (UML).
>   
> There is an assembly implementation for x86-32. However,
> for x86-64, there isn't any optimized version. We implement word-wise
> comparison so that 8 characters can be compared at the same time on
> an x86-64 CPU. The optimized "memchr()" is nearly 4x faster than the
> original implementation for long strings.
>
> We tested the optimized "memchr()" in UML and also recompiled the 5.18
> kernel with the optimized "memchr()". Both run correctly.
>
> In this patch we add a new file "string_64.c", which only contains
> "memchr()". We can add more optimized string functions in it in the
> future.
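
For reference, the word-wise comparison described in the quoted text can be sketched roughly as below. This is not the code from the patch series; it is a minimal user-space illustration under my own assumptions. The names word_has_byte() and memchr_wordwise(), the ONES/HIGHS constants, and the use of memcpy() for the 8-byte load are all hypothetical choices made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL   /* 0x01 in every byte */
#define HIGHS 0x8080808080808080ULL   /* 0x80 in every byte */

/* Nonzero iff at least one of the 8 bytes of w equals c. */
static int word_has_byte(uint64_t w, unsigned char c)
{
	uint64_t x = w ^ (ONES * c);   /* bytes equal to c become 0x00 */

	/* Classic "has zero byte" test: nonzero iff some byte of x is 0. */
	return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Hypothetical word-wise memchr: test 8 bytes per step, finish byte-wise. */
static void *memchr_wordwise(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	unsigned char ch = (unsigned char)c;

	while (n >= 8) {
		uint64_t w;

		memcpy(&w, p, 8);          /* load that is safe for any alignment */
		if (word_has_byte(w, ch))
			break;             /* a match lies within these 8 bytes */
		p += 8;
		n -= 8;
	}

	/* Locate the exact byte, or scan the tail of fewer than 8 bytes. */
	for (; n; p++, n--)
		if (*p == ch)
			return (void *)p;
	return NULL;
}
```

The word test only says whether a match exists in the current 8 bytes, so the sketch falls back to a byte loop to find its position. A tuned version would likely handle alignment and locate the byte with bit operations instead.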

Are there any workloads that care? From a quick grep I don't see any 
that look performance critical.

It would be good to describe what you optimized it for. For example,
optimizing for small input strings is quite different from optimizing
for large ones. I don't know which is more common in the kernel.

I assume you ran it through some existing test suites for memchr (like 
glibc etc.) for correctness testing?

(Bugs in optimized string functions are often subtle; it might also be
worth trying some randomized testing that compares against a known reference.)
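
One possible shape for such a randomized comparison, as a user-space sketch only: feed random buffers, offsets, lengths and search bytes to the candidate and to a trivial byte-by-byte reference, and count disagreements. Everything here is an assumption for illustration, including the names memchr_fn, memchr_ref(), fuzz_memchr() and the deliberately broken memchr_off_by_one(); none of it comes from the patch series or from an existing test suite.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*memchr_fn)(const void *, int, size_t);

/* Byte-by-byte reference, simple enough to trust by inspection. */
static void *memchr_ref(const void *s, int c, size_t n)
{
	const unsigned char *p = s;

	for (; n; p++, n--)
		if (*p == (unsigned char)c)
			return (void *)p;
	return NULL;
}

/* Hypothetical buggy candidate: never looks at the final byte. */
static void *memchr_off_by_one(const void *s, int c, size_t n)
{
	return n ? memchr_ref(s, c, n - 1) : NULL;
}

/*
 * Run `iters` random cases against `fn` and return the number of
 * mismatches. The random start offset varies the alignment of the
 * buffer, and the random length varies the size of the tail.
 */
static unsigned long fuzz_memchr(memchr_fn fn, unsigned int seed,
				 unsigned long iters)
{
	enum { BUF = 256 };
	unsigned char buf[BUF];
	unsigned long bad = 0;

	srand(seed);
	while (iters--) {
		size_t off = (size_t)rand() % 16;
		size_t len = (size_t)rand() % (BUF - off + 1);
		int c = rand() % 256;
		size_t i;

		for (i = 0; i < BUF; i++)
			buf[i] = (unsigned char)(rand() % 256);

		if (fn(buf + off, c, len) != memchr_ref(buf + off, c, len))
			bad++;
	}
	return bad;
}
```

Calling fuzz_memchr(memchr, 1, 100000) with the C library's memchr() as a stand-in candidate should report no mismatches, while passing memchr_off_by_one should report some. The function under test would be swapped in the same way.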

-Andi
