dev-util/Tensile: fix compilation of sci-libs/rocBLAS on gfx906

Clang-20 disallowed op_sel in some VOP3P dot instructions.
See: https://github.com/llvm/llvm-project/pull/100485

As ROCm maintains a fork of Clang, these changes did not reach official ROCm releases.
However, Gentoo uses upstream Clang-20, which includes these incompatible changes.
Luckily, in Tensile these op_sel modifiers do nothing. In general, they allow shuffling vector elements before multiplication, but with op_sel:[0,0] op_sel_hi:[1,1] shuffling is disabled, so the modifiers can be removed.
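
The effect of the substitution can be checked in isolation. The sketch below
applies the same sed expression as src_prepare to a single sample line; the
register operands (v0, v1, v2) are hypothetical and only stand in for whatever
Components/MAC_I8X4.py actually emits.

```shell
# Hypothetical assembly line; the register operands are made up for
# illustration and are not taken from Tensile.
line='v_dot4_i32_i8 v0, v1, v2, v0 op_sel:[0,0] op_sel_hi:[1,1]'

# Same sed expression as in src_prepare: drop the no-op modifiers.
fixed=$(printf '%s\n' "$line" | sed 's/ op_sel:\[0,0\] op_sel_hi:\[1,1\]//')

printf '%s\n' "$fixed"
```

Lines that do not contain exactly this modifier pair are left unchanged, so
the expression should not affect other instructions in the file.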

Closes: https://bugs.gentoo.org/949817
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
Part-of: https://github.com/gentoo/gentoo/pull/42887
Closes: https://github.com/gentoo/gentoo/pull/42887
Signed-off-by: Sam James <sam@gentoo.org>
Sv. Lockal 2025-07-05 14:20:35 +00:00 committed by Sam James
parent dc4e3e2842
commit 4539cbe9ff
GPG Key ID: 738409F520DF9190

@@ -81,10 +81,13 @@ src_prepare() {
 sed -e "s|os\.path\.dirname.*$|\"${EPREFIX}/usr/share/Tensile/Source\", end='')|" -i __init__.py || die
+# bug 949817: fix v_dot4_i32_i8 syntax for clang-20
+sed 's/ op_sel:\[0,0\] op_sel_hi:\[1,1\]//' -i Components/MAC_I8X4.py || die
 popd || die
 sed -e "/package_data/d" -e "/data_files/d" -i setup.py || die
-use client && PATCHES= cmake_src_prepare # do not apply patches again in cmake_src_prepare
+use client && PATCHES='' cmake_src_prepare # do not apply patches again in cmake_src_prepare
 }
 src_configure() {