==> Building on minun
==> Checking for remote environment...
==> Syncing package to remote host...
sending incremental file list
./
.SRCINFO
            670 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=3/5)
.nvchecker.toml
            128 100%  125.00kB/s    0:00:00 (xfr#2, to-chk=2/5)
PKGBUILD
          1,400  89%    1.34MB/s    0:00:00
          1,567 100%    1.49MB/s    0:00:00 (xfr#3, to-chk=1/5)
composable-kernel-6.4.1-1.log
            192 100%  187.50kB/s    0:00:00 (xfr#4, to-chk=0/5)

sent 1,460 bytes  received 125 bytes  1,056.67 bytes/sec
total size is 2,557  speedup is 1.61
==> Patching arch to riscv64...
==> Running pkgctl build --arch riscv64 --repo extra on remote host...
==> WARNING: unsupported architecture: riscv64
==> Building composable-kernel
  -> repo: extra
  -> arch: riscv64
  -> worker: felix-0
==> Building composable-kernel for [extra] (riscv64)
:: Synchronizing package databases...
 core downloading...
 extra downloading...
error: restricting filesystem access failed because landlock is not supported by the kernel!
:: Starting full system upgrade...
 there is nothing to do
==> Building in chroot for [extra] (riscv64)...
==> Synchronizing chroot copy [/var/lib/archbuild/extra-riscv64/root] -> [felix-0]...done
==> Making package: composable-kernel 6.4.1-1 (Sat Jun 21 21:12:51 2025)
==> Retrieving sources...
  -> Downloading composable-kernel-6.4.1.tar.gz...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 11576    0 11576    0     0  23529      0 --:--:-- --:--:-- --:--:-- 23529
100 4270k    0 4270k    0     0  4334k      0 --:--:-- --:--:-- --:--:-- 8639k
==> Validating source files with sha256sums...
    composable-kernel-6.4.1.tar.gz ... Passed
==> Making package: composable-kernel 6.4.1-1 (Sat Jun 21 19:13:05 2025)
==> Checking runtime dependencies...
==> Installing missing dependencies...
resolving dependencies...
looking for conflicting packages...
warning: dependency cycle detected:
warning: libglvnd will be installed before its mesa dependency

Package (34)                New Version      Net Change  Download Size

extra/comgr                 6.4.1-1          172.51 MiB
extra/default-cursors       3-1                0.00 MiB       0.00 MiB
extra/fmt                   11.2.0-1           0.67 MiB
extra/gflags                2.2.2-5            5.39 MiB
extra/google-glog           0.7.1-1            0.34 MiB
extra/hsa-rocr              6.4.1-1            3.53 MiB
extra/libdrm                2.4.125-1          1.21 MiB
core/libedit                20250104_3.1-1     0.25 MiB
extra/libglvnd              1.7.0-3            3.99 MiB
extra/libpciaccess          0.18.1-2           0.05 MiB
extra/libx11                1.8.12-1           9.73 MiB
extra/libxau                1.0.12-1           0.02 MiB
extra/libxcb                1.17.0-1           3.69 MiB
extra/libxdmcp              1.1.5-1            0.13 MiB
extra/libxext               1.3.6-1            0.29 MiB
extra/libxshmfence          1.3.3-1            0.01 MiB
extra/libxxf86vm            1.1.6-1            0.03 MiB
extra/llvm-libs             20.1.6-3         143.60 MiB
extra/lm_sensors            1:3.6.2-1          0.43 MiB
extra/mesa                  1:25.1.4-1        29.62 MiB
core/mpdecimal              4.0.1-1            0.31 MiB
extra/numactl               2.0.19-1           0.20 MiB
core/pciutils               3.13.0-2           0.34 MiB
core/python                 3.13.3-1         108.92 MiB
extra/rocm-device-libs      6.4.1-1            3.19 MiB
extra/rocm-llvm             6.4.1-1         8190.70 MiB
extra/rocminfo              6.4.1-1            0.06 MiB
extra/rocprofiler-register  6.4.1-1            0.28 MiB
extra/spirv-tools           1:1.4.313.0-1      6.44 MiB
extra/wayland               1.23.1-2           0.79 MiB
extra/xcb-proto             1.17.0-3           1.02 MiB
extra/xorgproto             2024.1-2           1.46 MiB
extra/hip-runtime-amd       6.4.1-1            8.93 MiB
extra/rocm-core             6.4.1-1            0.03 MiB

Total Download Size:      0.00 MiB
Total Installed Size:  8698.19 MiB

:: Proceed with installation? [Y/n]
:: Retrieving packages...
 default-cursors-3-1-any downloading...
error: restricting filesystem access failed because landlock is not supported by the kernel!
checking keyring...
checking package integrity...
loading package files...
checking for file conflicts...
:: Processing package changes...
installing rocm-core...
installing numactl...
installing libpciaccess...
installing libdrm...
Optional dependencies for libdrm
    cairo: needed for modetest tool
installing xcb-proto...
installing xorgproto...
installing libxdmcp...
installing libxau...
installing libxcb...
installing libx11...
installing libxext...
installing libglvnd...
installing libxshmfence...
installing libxxf86vm...
installing libedit...
installing llvm-libs...
installing lm_sensors...
Optional dependencies for lm_sensors
    rrdtool: for logging with sensord
    perl: for sensor detection and configuration convert [installed]
installing spirv-tools...
installing default-cursors...
Optional dependencies for default-cursors
    adwaita-cursors: default cursor theme
installing wayland...
installing mesa...
Optional dependencies for mesa
    opengl-man-pages: for the OpenGL API man pages
installing rocm-llvm...
installing rocm-device-libs...
installing comgr...
installing pciutils...
Optional dependencies for pciutils
    which: for update-pciids [installed]
    grep: for update-pciids [installed]
    curl: for update-pciids [installed]
installing mpdecimal...
installing python...
Optional dependencies for python
    python-setuptools: for building Python packages using tooling that is usually bundled with Python
    python-pip: for installing Python packages using tooling that is usually bundled with Python
    python-pipx: for installing Python software not packaged on Arch Linux
    sqlite: for a default database integration [installed]
    xz: for lzma [installed]
    tk: for tkinter
installing hsa-rocr...
installing rocminfo...
installing fmt...
installing gflags...
installing google-glog...
installing rocprofiler-register...
installing hip-runtime-amd...
Optional dependencies for hip-runtime-amd
    inetutils: Print hostname in hipconfig
:: Running post-transaction hooks...
(1/2) Reloading system manager configuration...
  Skipped: Current root is not booted.
(2/2) Arming ConditionNeedsUpdate...
==> Checking buildtime dependencies...
==> Installing missing dependencies...
resolving dependencies...
looking for conflicting packages...

Package (14)                New Version      Net Change  Download Size

extra/cppdap                1.58.0-2           1.48 MiB
extra/hicolor-icon-theme    0.18-1             0.05 MiB       0.01 MiB
extra/jsoncpp               1.9.6-3            3.16 MiB
extra/libuv                 1.51.0-1           0.60 MiB
extra/perl-error            0.17030-1          0.04 MiB
extra/perl-mailtools        2.22-1             0.10 MiB
extra/perl-timedate         2.33-7             0.08 MiB
extra/rhash                 1.4.4-1            0.31 MiB
extra/zlib-ng               2.2.4-1            0.21 MiB
extra/cmake                 4.0.3-1           76.42 MiB
extra/git                   2.50.0-1          28.58 MiB
extra/ninja                 1.12.1-2           0.31 MiB
extra/openmp                20.1.6-1           1.91 MiB       0.58 MiB
extra/rocm-cmake            6.4.1-1            0.12 MiB

Total Download Size:     0.60 MiB
Total Installed Size:  113.38 MiB

:: Proceed with installation? [Y/n]
:: Retrieving packages...
 openmp-20.1.6-1-riscv64 downloading...
 hicolor-icon-theme-0.18-1-any downloading...
error: restricting filesystem access failed because landlock is not supported by the kernel!
checking keyring...
checking package integrity...
loading package files...
checking for file conflicts...
:: Processing package changes...
installing perl-error...
installing perl-timedate...
installing perl-mailtools...
installing zlib-ng...
installing git...
Optional dependencies for git
    git-zsh-completion: upstream zsh completion
    tk: gitk and git gui
    openssh: ssh transport and crypto
    man: show help with `git command --help`
    perl-libwww: git svn
    perl-term-readkey: git svn and interactive.singlekey setting
    perl-io-socket-ssl: git send-email TLS support
    perl-authen-sasl: git send-email TLS support
    perl-mediawiki-api: git mediawiki support
    perl-datetime-format-iso8601: git mediawiki support
    perl-lwp-protocol-https: git mediawiki https support
    perl-cgi: gitweb (web interface) support
    python: git svn & git p4 [installed]
    subversion: git svn
    org.freedesktop.secrets: keyring credential helper
    libsecret: libsecret credential helper [installed]
installing cppdap...
installing hicolor-icon-theme...
installing jsoncpp...
Optional dependencies for jsoncpp
    jsoncpp-doc: documentation
installing libuv...
installing rhash...
installing cmake...
Optional dependencies for cmake
    make: for unix Makefile generator [installed]
    ninja: for ninja generator [pending]
    qt6-base: cmake-gui
installing ninja...
installing rocm-cmake...
installing openmp...
Optional dependencies for openmp
    cuda: offloading to NVIDIA GPUs
    hsa-rocr: offloading to AMD GPUs [installed]
:: Running post-transaction hooks...
(1/4) Creating system user accounts...
Creating group 'git' with GID 971.
Creating user 'git' (git daemon user) with UID 971 and GID 971.
(2/4) Reloading system manager configuration...
  Skipped: Current root is not booted.
(3/4) Arming ConditionNeedsUpdate...
(4/4) Checking for old perl modules...
==> Retrieving sources...
  -> Found composable-kernel-6.4.1.tar.gz
==> WARNING: Skipping all source file integrity checks.
==> Extracting sources...
  -> Extracting composable-kernel-6.4.1.tar.gz with bsdtar
==> Starting prepare()...
==> Starting build()...
-- The CXX compiler identification is Clang 19.0.0
-- The HIP compiler identification is Clang 19.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/bin/hipcc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting HIP compiler ABI info
-- Detecting HIP compiler ABI info - done
-- Check for working HIP compiler: /opt/rocm/lib/llvm/bin/clang++ - skipped
-- Detecting HIP compile features
-- Detecting HIP compile features - done
-- Found Python3: /usr/bin/python3.13 (found suitable version "3.13.3", minimum required is "3.8") found components: Interpreter
-- Found Git: /usr/bin/git (found version "2.50.0")
fatal: not a git repository (or any of the parent directories): .git
GPU_TARGETS=
GPU_ARCHS=
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
hip_version_flat=600443483
checking which targets are supported
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx942
-- Performing Test COMPILER_HAS_TARGET_ID_gfx942 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1101
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1200
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1200 - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1201
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1201 - Success
Building CK for the following targets: gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1200;gfx1201
Enabling XDL instances
Enabling FP8 gemms on native architectures
Enabling WMMA instances
-- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK
-- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK - Success
Adding the fno-offload-uniform-block compiler flag
-- Performing Test HAS_LSR_DROP_SOLUTION
-- Performing Test HAS_LSR_DROP_SOLUTION - Success
Adding the lsr-drop-solution=1 compiler flag
-- Performing Test HAS_ENABLE_POST_MISCHED
-- Performing Test HAS_ENABLE_POST_MISCHED - Success
Adding the enable-post-misched=0 compiler flag
-- Performing Test check-coerce
-- Performing Test check-coerce - Success
Adding the amdgpu-coerce-illegal-types=1
Adding -amdgpu-early-inline-all=true and -amdgpu-function-calls=false
CMAKE_CXX_COMPILER: /opt/rocm/bin/hipcc
CMAKE_HIP_COMPILER: /opt/rocm/bin/hipcc
OpenMP_CXX_LIB_NAMES: libomp;libgomp;libiomp5
OpenMP_gomp_LIBRARY:
OpenMP_pthread_LIBRARY:
OpenMP_CXX_FLAGS: -fopenmp=libomp -Wno-unused-command-line-argument
-- Build with HIP
-- Clang tidy found: 19.0.0git
-- Clang tidy checks: *,-abseil-*,-android-cloexec-fopen,-cert-msc30-c,-bugprone-exception-escape,-bugprone-macro-parentheses,-cert-env33-c,-cert-msc32-c,-cert-msc50-cpp,-cert-msc51-cpp,-cert-dcl37-c,-cert-dcl51-cpp,-clang-analyzer-alpha.core.CastToStruct,-clang-analyzer-optin.performance.Padding,-clang-diagnostic-deprecated-declarations,-clang-diagnostic-extern-c-compat,-clang-diagnostic-unused-command-line-argument,-cppcoreguidelines-avoid-c-arrays,-cppcoreguidelines-avoid-magic-numbers,-cppcoreguidelines-explicit-virtual-functions,-cppcoreguidelines-init-variables,-cppcoreguidelines-macro-usage,-cppcoreguidelines-non-private-member-variables-in-classes,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-type-member-init,-cppcoreguidelines-pro-type-reinterpret-cast,-cppcoreguidelines-pro-type-union-access,-cppcoreguidelines-pro-type-vararg,-cppcoreguidelines-special-member-functions,-fuchsia-*,-google-explicit-constructor,-google-readability-braces-around-statements,-google-readability-todo,-google-runtime-int,-google-runtime-references,-hicpp-vararg,-hicpp-braces-around-statements,-hicpp-explicit-conversions,-hicpp-named-parameter,-hicpp-no-array-decay,-hicpp-avoid-c-arrays,-hicpp-signed-bitwise,-hicpp-special-member-functions,-hicpp-uppercase-literal-suffix,-hicpp-use-auto,-hicpp-use-equals-default,-hicpp-use-override,-llvm-header-guard,-llvm-include-order,-llvmlibc-restrict-system-libc-headers,-llvmlibc-callee-namespace,-llvmlibc-implementation-in-namespace,-llvm-else-after-return,-llvm-qualified-auto,-misc-misplaced-const,-misc-non-private-member-variables-in-classes,-misc-no-recursion,-modernize-avoid-bind,-modernize-avoid-c-arrays,-modernize-pass-by-value,-modernize-use-auto,-modernize-use-default-member-init,-modernize-use-equals-default,-modernize-use-trailing-return-type,-modernize-use-transparent-functors,-performance-unnecessary-value-param,-readability-braces-around-statements,-readability-else-after-return,-readability-function-cognitive-complexity,-readability-isolate-declaration,-readability-magic-numbers,-readability-named-parameter,-readability-uppercase-literal-suffix,-readability-convert-member-functions-to-static,-readability-qualified-auto,-readability-redundant-string-init,-bugprone-narrowing-conversions,-cppcoreguidelines-narrowing-conversions,-altera-struct-pack-align,-cppcoreguidelines-prefer-member-initializer
CMAKE_CXX_FLAGS:
adding instance device_avg_pool2d_bwd_instance
add_instance_library device_avg_pool2d_bwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd
adding instance device_avg_pool3d_bwd_instance
add_instance_library device_avg_pool3d_bwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd
adding instance device_batched_gemm_instance
add_instance_library device_batched_gemm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm
adding instance device_batched_gemm_add_relu_gemm_add_instance
add_instance_library device_batched_gemm_add_relu_gemm_add_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add
adding instance device_batched_gemm_bias_permute_instance
add_instance_library device_batched_gemm_bias_permute_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_bias_permute
adding instance device_batched_gemm_gemm_instance
add_instance_library device_batched_gemm_gemm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_gemm
Found only dl instances, but DL_KERNELS is not set. Skipping.
skip_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d
adding instance device_batched_gemm_reduce_instance
add_instance_library device_batched_gemm_reduce_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_reduce
adding instance device_batched_gemm_softmax_gemm_instance
add_instance_library device_batched_gemm_softmax_gemm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm
adding instance device_batched_gemm_softmax_gemm_permute_instance
add_instance_library device_batched_gemm_softmax_gemm_permute_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute
adding instance device_batchnorm_instance
add_instance_library device_batchnorm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/batchnorm
instance should be built for all types!
adding instance device_column_to_image_instance
add_instance_library device_column_to_image_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/column_to_image
adding instance device_contraction_bilinear_instance
add_instance_library device_contraction_bilinear_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/contraction_bilinear
adding instance device_contraction_scale_instance
add_instance_library device_contraction_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/contraction_scale
adding instance device_conv1d_bwd_data_instance
add_instance_library device_conv1d_bwd_data_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv1d_bwd_data
adding instance device_conv2d_bwd_data_instance
removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp
removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp
removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp
add_instance_library device_conv2d_bwd_data_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv2d_bwd_data
adding instance device_conv2d_fwd_instance
add_instance_library device_conv2d_fwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv2d_fwd
adding instance device_conv2d_fwd_bias_relu_instance
add_instance_library device_conv2d_fwd_bias_relu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu
adding instance device_conv2d_fwd_bias_relu_add_instance
add_instance_library device_conv2d_fwd_bias_relu_add_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu_add
adding instance device_conv3d_bwd_data_instance
add_instance_library device_conv3d_bwd_data_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/conv3d_bwd_data
instance should be built for all types!
adding instance device_elementwise_instance
add_instance_library device_elementwise_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/elementwise
adding instance device_elementwise_normalization_instance
add_instance_library device_elementwise_normalization_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/elementwise_normalization
adding instance device_gemm_instance
removing dpp instance device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp
removing dpp instance device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp
removing dl instance device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp
removing dl instance device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp
removing dl instance device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp
removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp
removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp
add_instance_library device_gemm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm
adding instance device_gemm_ab_scale_instance
add_instance_library device_gemm_ab_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_ab_scale
adding instance device_gemm_add_instance
add_instance_library device_gemm_add_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add
adding instance device_gemm_add_add_fastgelu_instance
add_instance_library device_gemm_add_add_fastgelu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu
adding instance device_gemm_add_fastgelu_instance
add_instance_library device_gemm_add_fastgelu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_fastgelu
adding instance device_gemm_add_multiply_instance
add_instance_library device_gemm_add_multiply_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_multiply
adding instance device_gemm_add_relu_instance
add_instance_library device_gemm_add_relu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_relu
adding instance device_gemm_add_relu_add_layernorm_instance
add_instance_library device_gemm_add_relu_add_layernorm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm
adding instance device_gemm_add_silu_instance
add_instance_library device_gemm_add_silu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_add_silu
adding instance device_gemm_b_scale_instance
add_instance_library device_gemm_b_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_b_scale
adding instance device_gemm_bias_add_reduce_instance
add_instance_library device_gemm_bias_add_reduce_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce
adding instance device_gemm_bilinear_instance
add_instance_library device_gemm_bilinear_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_bilinear
adding instance device_gemm_fastgelu_instance
add_instance_library device_gemm_fastgelu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_fastgelu
adding instance device_gemm_multi_abd_instance
add_instance_library device_gemm_multi_abd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_multi_abd
adding instance device_gemm_multiply_add_instance
add_instance_library device_gemm_multiply_add_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_multiply_add
adding instance device_gemm_multiply_multiply_instance
add_instance_library device_gemm_multiply_multiply_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_multiply_multiply
adding instance device_gemm_reduce_instance
add_instance_library device_gemm_reduce_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_reduce
adding instance device_gemm_splitk_instance
add_instance_library device_gemm_splitk_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_splitk
adding instance device_gemm_streamk_instance
add_instance_library device_gemm_streamk_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_streamk
adding instance device_gemm_universal_instance
add_instance_library device_gemm_universal_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_universal
adding instance device_gemm_universal_batched_instance
add_instance_library device_gemm_universal_batched_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_universal_batched
adding instance device_gemm_universal_reduce_instance
add_instance_library device_gemm_universal_reduce_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_universal_reduce
adding instance device_gemm_universal_streamk_instance
add_instance_library device_gemm_universal_streamk_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_universal_streamk
adding instance device_grouped_conv1d_bwd_weight_instance
add_instance_library device_grouped_conv1d_bwd_weight_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight
adding instance device_grouped_conv1d_fwd_instance
add_instance_library device_grouped_conv1d_fwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv1d_fwd
adding instance device_grouped_conv2d_bwd_data_instance
add_instance_library device_grouped_conv2d_bwd_data_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data
adding instance device_grouped_conv2d_bwd_weight_instance
add_instance_library device_grouped_conv2d_bwd_weight_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight
adding instance device_grouped_conv2d_fwd_instance
removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp
removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp
removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp
removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp
add_instance_library device_grouped_conv2d_fwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd
adding instance device_grouped_conv2d_fwd_dynamic_op_instance
add_instance_library device_grouped_conv2d_fwd_dynamic_op_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd_dynamic_op
adding instance device_grouped_conv3d_bwd_data_instance
add_instance_library device_grouped_conv3d_bwd_data_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data
adding instance device_grouped_conv3d_bwd_data_bilinear_instance
add_instance_library device_grouped_conv3d_bwd_data_bilinear_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data_bilinear
adding instance device_grouped_conv3d_bwd_data_scale_instance
add_instance_library device_grouped_conv3d_bwd_data_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data_scale
adding instance device_grouped_conv3d_bwd_weight_instance
add_instance_library device_grouped_conv3d_bwd_weight_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight
adding instance device_grouped_conv3d_bwd_weight_bilinear_instance
add_instance_library device_grouped_conv3d_bwd_weight_bilinear_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight_bilinear
adding instance device_grouped_conv3d_bwd_weight_scale_instance
add_instance_library device_grouped_conv3d_bwd_weight_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight_scale
adding instance device_grouped_conv3d_fwd_instance
add_instance_library device_grouped_conv3d_fwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd
adding instance device_grouped_conv3d_fwd_bilinear_instance
add_instance_library device_grouped_conv3d_fwd_bilinear_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_bilinear
adding instance device_grouped_conv3d_fwd_convinvscale_instance
add_instance_library device_grouped_conv3d_fwd_convinvscale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convinvscale
adding instance device_grouped_conv3d_fwd_convscale_instance
add_instance_library device_grouped_conv3d_fwd_convscale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale
adding instance device_grouped_conv3d_fwd_convscale_add_instance
add_instance_library device_grouped_conv3d_fwd_convscale_add_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale_add
adding instance device_grouped_conv3d_fwd_convscale_relu_instance
add_instance_library device_grouped_conv3d_fwd_convscale_relu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_convscale_relu
adding instance device_grouped_conv3d_fwd_dynamic_op_instance
add_instance_library device_grouped_conv3d_fwd_dynamic_op_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_dynamic_op
adding instance device_grouped_conv3d_fwd_scale_instance
add_instance_library device_grouped_conv3d_fwd_scale_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scale
adding instance device_grouped_conv3d_fwd_scaleadd_ab_instance
add_instance_library device_grouped_conv3d_fwd_scaleadd_ab_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scaleadd_ab
adding instance device_grouped_conv3d_fwd_scaleadd_scaleadd_relu_instance
add_instance_library device_grouped_conv3d_fwd_scaleadd_scaleadd_relu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd_scaleadd_scaleadd_relu
adding instance device_grouped_gemm_instance
add_instance_library device_grouped_gemm_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm
adding instance device_grouped_gemm_bias_instance
add_instance_library device_grouped_gemm_bias_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm_bias
adding instance device_grouped_gemm_fastgelu_instance
add_instance_library device_grouped_gemm_fastgelu_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm_fastgelu
adding instance device_grouped_gemm_fixed_nk_instance
add_instance_library device_grouped_gemm_fixed_nk_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk
adding instance device_grouped_gemm_fixed_nk_multi_abd_instance
add_instance_library device_grouped_gemm_fixed_nk_multi_abd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk_multi_abd
adding instance device_grouped_gemm_tile_loop_instance
add_instance_library device_grouped_gemm_tile_loop_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/grouped_gemm_tile_loop
instance should be built for all types!
adding instance device_image_to_column_instance
add_instance_library device_image_to_column_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/image_to_column
adding instance device_max_pool_bwd_instance
add_instance_library device_max_pool_bwd_instance
add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/max_pool_bwd
instance should be built for all types!
-- Found Python3: /usr/bin/python3.13 (found version "3.13.3") found components: Interpreter Development Development.Module Development.Embed adding instance device_mha_instance add_instance_library device_mha_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/mha adding instance device_normalization_bwd_data_instance add_instance_library device_normalization_bwd_data_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/normalization_bwd_data adding instance device_normalization_bwd_gamma_beta_instance add_instance_library device_normalization_bwd_gamma_beta_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta adding instance device_normalization_fwd_instance add_instance_library device_normalization_fwd_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/normalization_fwd adding instance device_permute_scale_instance add_instance_library device_permute_scale_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/permute_scale adding instance device_pool2d_fwd_instance add_instance_library device_pool2d_fwd_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/pool2d_fwd adding instance device_pool3d_fwd_instance add_instance_library device_pool3d_fwd_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/pool3d_fwd adding instance device_quantization_instance removing dl instance conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp removing dl instance 
conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp removing dl instance conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp removing dl instance conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp add_instance_library device_quantization_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/quantization adding instance device_reduce_instance add_instance_library device_reduce_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/reduce adding instance device_softmax_instance add_instance_library device_softmax_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/softmax instance should be built for all types! adding instance device_transpose_instance add_instance_library device_transpose_instance add_instance_directory /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/transpose Adding --offload-compress flag for ckProfiler -- Configuring done (389.7s) -- Generating done (182.1s) CMake Warning: Manually-specified variables were not used by the project: INSTANCES_ONLY -- Build files have been written to: /build/composable-kernel/src/build [1/4327] Generating mha kernel (cpp) files now ... 
[2/4327] Building CXX object library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o [3/4327] Building CXX object library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o [4/4327] Building CXX object library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o [5/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o [6/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o [7/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o [8/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o [9/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o [10/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o [11/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o [12/4327] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o [13/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gmk_gkn_gmn_instance.cpp.o [14/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gkm_gkn_gmn_instance.cpp.o [15/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gkm_gnk_gmn_instance.cpp.o [16/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gmk_gnk_gmn_instance.cpp.o [17/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gmk_gnk_gmn_instance.cpp.o [18/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o [19/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o [20/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o [21/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_gemm/CMakeFiles/device_batched_gemm_gemm_instance.dir/device_batched_gemm_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [22/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o [23/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gmk_gnk_gmn_instance.cpp.o [24/4327] Building CXX 
object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o [25/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gmk_gkn_gmn_instance.cpp.o [26/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_gemm/CMakeFiles/device_batched_gemm_gemm_instance.dir/device_batched_gemm_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp.o [27/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o [28/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gkm_gnk_gmn_instance.cpp.o [29/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gkm_gkn_gmn_instance.cpp.o [30/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gkm_gnk_gmn_instance.cpp.o [31/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add/CMakeFiles/device_batched_gemm_add_relu_gemm_add_instance.dir/device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [32/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o [33/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm_bias_permute/CMakeFiles/device_batched_gemm_bias_permute_instance.dir/device_batched_gemm_bias_permute_m2_n3_k1_xdl_c_shuffle_f16_f16_f16_f16_instance.cpp.o [34/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gmk_gkn_gmn_instance.cpp.o [35/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gkm_gkn_gmn_instance.cpp.o [36/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add/CMakeFiles/device_batched_gemm_add_relu_gemm_add_instance.dir/device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp.o [37/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o [38/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o [39/4327] Building CXX object library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o [40/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o [41/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o [42/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o [43/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o [44/4327] Building CXX object library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o [45/4327] Building CXX object library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o [46/4327] Building CXX object library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o [47/4327] Building CXX object library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o [48/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_kknn_instance.cpp.o [49/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o [50/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o [51/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_kknn_instance.cpp.o [52/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o [53/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o [54/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_knnn_instance.cpp.o [55/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_mknn_instance.cpp.o [56/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_mnnn_instance.cpp.o [57/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_knnn_instance.cpp.o [58/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_mknn_instance.cpp.o [59/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_kknn_instance.cpp.o [60/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_mnnn_instance.cpp.o [61/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [62/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_kknn_instance.cpp.o [63/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_knnn_instance.cpp.o [64/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp.o [65/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_knnn_instance.cpp.o [66/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_mnnn_instance.cpp.o [67/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_kknn_instance.cpp.o [68/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_mknn_instance.cpp.o [69/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm/CMakeFiles/device_batched_gemm_softmax_gemm_instance.dir/device_batched_gemm_softmax_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [70/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_mknn_instance.cpp.o [71/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_knnn_instance.cpp.o [72/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_mknn_instance.cpp.o [73/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_mnnn_instance.cpp.o [74/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o [75/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_mnnn_instance.cpp.o [76/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_kknn_instance.cpp.o [77/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_kknn_instance.cpp.o [78/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_knnn_instance.cpp.o [79/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_mnnn_instance.cpp.o [80/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_mknn_instance.cpp.o [81/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_knnn_instance.cpp.o [82/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_mnnn_instance.cpp.o [83/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/2D/device_contraction_bilinear_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_mknn_instance.cpp.o [84/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_kknn_instance.cpp.o [85/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_kknn_instance.cpp.o [86/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_knnn_instance.cpp.o [87/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_kknn_instance.cpp.o [88/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [89/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_mnnn_instance.cpp.o [90/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_knnn_instance.cpp.o [91/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_mknn_instance.cpp.o [92/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_mnnn_instance.cpp.o [93/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_mknn_instance.cpp.o [94/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_kknn_instance.cpp.o [95/4327] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp.o [96/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_knnn_instance.cpp.o [97/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_kknn_instance.cpp.o [98/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o [99/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_mnnn_instance.cpp.o [100/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_knnn_instance.cpp.o [101/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o [102/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_f64_compute_f32_mknn_instance.cpp.o [103/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_mknn_instance.cpp.o [104/4327] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o [105/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_f16_mnnn_instance.cpp.o [106/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_knnn_instance.cpp.o [107/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_mnnn_instance.cpp.o [108/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_f32_compute_bf16_mknn_instance.cpp.o [109/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_kknn_instance.cpp.o [110/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_kkn_instance.cpp.o [111/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_knnn_instance.cpp.o [112/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_knn_instance.cpp.o [113/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_kknn_instance.cpp.o [114/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_kkn_instance.cpp.o [115/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_mkn_instance.cpp.o [116/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_knn_instance.cpp.o [117/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_mnn_instance.cpp.o [118/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_mknn_instance.cpp.o [119/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_f16_compute_f32_mnnn_instance.cpp.o [120/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_compute_f32_kkn_instance.cpp.o [121/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_f16_kkn_instance.cpp.o [122/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_compute_f32_mnn_instance.cpp.o [123/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_compute_f32_knn_instance.cpp.o [124/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f64_f64_f64_compute_f32_mkn_instance.cpp.o [125/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_mkn_instance.cpp.o [126/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_bf16_kkn_instance.cpp.o [127/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_mnn_instance.cpp.o [128/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_knnn_instance.cpp.o [129/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_f16_knn_instance.cpp.o [130/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_f16_mnn_instance.cpp.o [131/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_mknn_instance.cpp.o [132/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_f16_mkn_instance.cpp.o [133/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_bf16_mnn_instance.cpp.o [134/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_bilinear/CMakeFiles/device_contraction_bilinear_instance.dir/6D/device_contraction_bilinear_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_bf16_compute_f32_mnnn_instance.cpp.o [135/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_bf16_knn_instance.cpp.o [136/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f32_f32_f32_compute_bf16_mkn_instance.cpp.o [137/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_compute_f32_kkn_instance.cpp.o [138/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_kkn_instance.cpp.o [139/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_compute_f32_knn_instance.cpp.o [140/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_compute_f32_mnn_instance.cpp.o [141/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_f16_f16_f16_compute_f32_mkn_instance.cpp.o [142/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_knn_instance.cpp.o [143/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_mkn_instance.cpp.o [144/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/2D/device_contraction_scale_m2_n2_k2_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_mnn_instance.cpp.o [145/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_kkn_instance.cpp.o [146/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_kkn_instance.cpp.o [147/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_mnn_instance.cpp.o [148/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_knn_instance.cpp.o [149/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_compute_f32_kkn_instance.cpp.o [150/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_mkn_instance.cpp.o [151/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_f16_kkn_instance.cpp.o [152/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_compute_f32_knn_instance.cpp.o [153/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_compute_f32_mnn_instance.cpp.o [154/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_bf16_kkn_instance.cpp.o [155/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f64_f64_f64_compute_f32_mkn_instance.cpp.o [156/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_knn_instance.cpp.o [157/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_mnn_instance.cpp.o [158/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_mkn_instance.cpp.o [159/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_f16_knn_instance.cpp.o [160/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_f16_mnn_instance.cpp.o [161/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_f16_mkn_instance.cpp.o [162/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_bf16_mnn_instance.cpp.o [163/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_bf16_knn_instance.cpp.o [164/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_compute_f32_kkn_instance.cpp.o [165/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f32_f32_f32_compute_bf16_mkn_instance.cpp.o [166/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_kkn_instance.cpp.o [167/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_compute_f32_mnn_instance.cpp.o [168/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_compute_f32_mkn_instance.cpp.o [169/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_f16_f16_f16_compute_f32_knn_instance.cpp.o [170/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_knn_instance.cpp.o [171/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_mkn_instance.cpp.o [172/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f64_f64_f64_mk_kn_mn_instance.cpp.o [173/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f64_f64_f64_km_kn_mn_instance.cpp.o [174/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f64_f64_f64_km_nk_mn_instance.cpp.o [175/4327] Building CXX object library/src/tensor_operation_instance/gpu/contraction_scale/CMakeFiles/device_contraction_scale_instance.dir/6D/device_contraction_scale_m6_n6_k6_xdl_c_shuffle_bf16_bf16_bf16_compute_f32_mnn_instance.cpp.o [176/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f64_f64_f64_mk_nk_mn_instance.cpp.o [177/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_km_kn_mn_instance.cpp.o [178/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_km_nk_mn_instance.cpp.o [179/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_mk_kn_mn_instance.cpp.o [180/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_lds_direct_load_f32_f32_f32_mk_nk_mn_instance.cpp.o [181/4327] Building CXX object library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o [182/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_f32_instance.cpp.o [183/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o [184/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f32_f32_f32_mk_kn_mn_instance.cpp.o [185/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f32_f32_f32_km_kn_mn_instance.cpp.o [186/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_f16_instance.cpp.o [187/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f32_f32_f32_mk_nk_mn_instance.cpp.o [188/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f32_f32_f32_mk_nk_mn_instance.cpp.o [189/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_bf16_instance.cpp.o [190/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f32_f32_f32_mk_kn_mn_instance.cpp.o [191/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f32_f32_f32_km_nk_mn_instance.cpp.o [192/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o [193/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk_f32_instance.cpp.o [194/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f32_f32_f32_km_kn_mn_instance.cpp.o [195/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o [196/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd/CMakeFiles/device_conv2d_fwd_instance.dir/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f32_instance.cpp.o [197/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o [198/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_lds_direct_load_f16_f16_f16_mk_nk_mn_instance.cpp.o [199/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o [200/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f32_f32_f32_km_nk_mn_instance.cpp.o [201/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o [202/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o [203/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd/CMakeFiles/device_conv2d_fwd_instance.dir/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f16_instance.cpp.o [204/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd/CMakeFiles/device_conv2d_fwd_instance.dir/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_bf16_instance.cpp.o [205/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o [206/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o [207/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o [208/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o [209/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o [210/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o [211/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o [212/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o [213/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o [214/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd/CMakeFiles/device_conv2d_fwd_instance.dir/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_int8_instance.cpp.o [215/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o [216/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk_bf16_instance.cpp.o [217/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk_f16_instance.cpp.o [218/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o [219/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv3d_bwd_data/CMakeFiles/device_conv3d_bwd_data_instance.dir/device_conv3d_bwd_data_xdl_ndhwc_kzyxc_ndhwk_f32_instance.cpp.o [220/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_2_stage_f16_f16_f16_mk_nk_mn_instance.cpp.o [221/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o [222/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv3d_bwd_data/CMakeFiles/device_conv3d_bwd_data_instance.dir/device_conv3d_bwd_data_xdl_ndhwc_kzyxc_ndhwk_f16_instance.cpp.o [223/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o [224/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f16_f16_f16_km_kn_mn_instance.cpp.o [225/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o [226/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f16_f16_f16_km_nk_mn_instance.cpp.o [227/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o [228/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o [229/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv3d_bwd_data/CMakeFiles/device_conv3d_bwd_data_instance.dir/device_conv3d_bwd_data_xdl_ndhwc_kzyxc_ndhwk_bf16_instance.cpp.o [230/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o [231/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o [232/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk_int8_instance.cpp.o [233/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o [234/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o [235/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o [236/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o [237/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd/CMakeFiles/device_conv2d_fwd_instance.dir/device_conv2d_fwd_xdl_c_shuffle_nhwc_kyxc_nhwk_f16_instance.cpp.o [238/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv3d_bwd_data/CMakeFiles/device_conv3d_bwd_data_instance.dir/device_conv3d_bwd_data_xdl_ndhwc_kzyxc_ndhwk_int8_instance.cpp.o [239/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o [240/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o [241/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu/CMakeFiles/device_conv2d_fwd_bias_relu_instance.dir/device_conv2d_fwd_xdl_c_shuffle_bias_relu_nhwc_kyxc_nhwk_f16_instance.cpp.o [242/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o [243/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o [244/4327] Building CXX object library/src/tensor_operation_instance/gpu/conv2d_fwd_bias_relu_add/CMakeFiles/device_conv2d_fwd_bias_relu_add_instance.dir/device_conv2d_fwd_xdl_c_shuffle_bias_relu_add_nhwc_kyxc_nhwk_f16_instance.cpp.o [245/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o [246/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o [247/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_bf16_bf16_bf16_km_kn_mn_instance.cpp.o [248/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o [249/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o [250/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_bf16_bf16_bf16_km_nk_mn_instance.cpp.o [251/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o [252/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f16_f16_f16_mk_nk_mn_instance.cpp.o [253/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add/CMakeFiles/device_gemm_add_instance.dir/device_gemm_add_xdl_c_shuffle_f16_i8_f16_f16_mk_kn_mn_mn_instance.cpp.o [254/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add/CMakeFiles/device_gemm_add_instance.dir/device_gemm_add_xdl_c_shuffle_bf16_i8_bf16_bf16_mk_kn_mn_mn_instance.cpp.o [255/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_comp_default_instance.cpp.o [256/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_comp_mnpadding_instance.cpp.o [257/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_comp_kpadding_instance.cpp.o [258/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_comp_mnkpadding_instance.cpp.o [259/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_bf16_i8_bf16_bf16_mk_kn_mn_mn_instance.cpp.o [260/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_f16_f16_f16_mk_kn_mn_instance.cpp.o [261/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o [262/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_default_instance.cpp.o [263/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_padded_instance.cpp.o [264/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_default_instance.cpp.o [265/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_km_kn_mn_instance.cpp.o [266/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_km_nk_mn_instance.cpp.o [267/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_padded_instance.cpp.o [268/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu/CMakeFiles/device_gemm_add_add_fastgelu_instance.dir/device_gemm_add_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_nk_mn_mn_mn_instance.cpp.o [269/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu/CMakeFiles/device_gemm_add_add_fastgelu_instance.dir/device_gemm_add_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_f16_km_kn_mn_mn_mn_instance.cpp.o [270/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_f16_i8_f16_f16_mk_kn_mn_mn_instance.cpp.o [271/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_nk_mn_instance.cpp.o [272/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_km_kn_mn_mn_instance.cpp.o [273/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu/CMakeFiles/device_gemm_add_add_fastgelu_instance.dir/device_gemm_add_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_f16_km_nk_mn_mn_mn_instance.cpp.o [274/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_km_nk_mn_mn_instance.cpp.o [275/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_add_fastgelu/CMakeFiles/device_gemm_add_add_fastgelu_instance.dir/device_gemm_add_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_kn_mn_mn_mn_instance.cpp.o [276/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_relu/CMakeFiles/device_gemm_add_relu_instance.dir/device_gemm_add_relu_xdl_c_shuffle_f16_i8_f16_f16_mk_kn_mn_mn_instance.cpp.o [277/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_mk_nk_mn_mn_instance.cpp.o [278/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o [279/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_relu/CMakeFiles/device_gemm_add_relu_instance.dir/device_gemm_add_relu_xdl_c_shuffle_bf16_i8_bf16_bf16_mk_kn_mn_mn_instance.cpp.o [280/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_default_instance.cpp.o [281/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o [282/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_padded_instance.cpp.o [283/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_silu/CMakeFiles/device_gemm_add_silu_instance.dir/device_gemm_add_silu_xdl_c_shuffle_f16_i8_f16_f16_mk_kn_mn_mn_instance.cpp.o [284/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_fastgelu/CMakeFiles/device_gemm_add_fastgelu_instance.dir/device_gemm_add_fastgelu_xdl_c_shuffle_f16_f16_f16_f16_mk_kn_mn_mn_instance.cpp.o [285/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_silu/CMakeFiles/device_gemm_add_silu_instance.dir/device_gemm_add_silu_xdl_c_shuffle_bf16_i8_bf16_bf16_mk_kn_mn_mn_instance.cpp.o [286/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o [287/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o [288/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_mem_v1_default_instance.cpp.o [289/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_multiply/CMakeFiles/device_gemm_add_multiply_instance.dir/device_gemm_add_multiply_xdl_c_shuffle_f16_f16_f16_f16_f16_km_kn_mn_mn_mn_instance.cpp.o [290/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_mem_v1_kpadding_instance.cpp.o [291/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_add_multiply/CMakeFiles/device_gemm_add_multiply_instance.dir/device_gemm_add_multiply_xdl_c_shuffle_f16_f16_f16_f16_f16_km_nk_mn_mn_mn_instance.cpp.o [292/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_multiply/CMakeFiles/device_gemm_add_multiply_instance.dir/device_gemm_add_multiply_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_kn_mn_mn_mn_instance.cpp.o [293/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_ab_scale/CMakeFiles/device_gemm_ab_scale_instance.dir/device_gemm_ab_scale_xdl_f8_f8_bf16/device_gemm_ab_scale_xdl_f8_f8_bf16_mk_nk_mn_128_128_128_mem_v1_mnkpadding_instance.cpp.o [294/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce/CMakeFiles/device_gemm_bias_add_reduce_instance.dir/device_gemm_bias_add_mean_squaremean_xdl_cshuffle_f16_f16_f16_f32_f32_mk_nk_mn_instance.cpp.o [295/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce/CMakeFiles/device_gemm_bias_add_reduce_instance.dir/device_gemm_bias_add_mean_squaremean_xdl_cshuffle_f16_f16_f16_f32_f32_mk_kn_mn_instance.cpp.o [296/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce/CMakeFiles/device_gemm_bias_add_reduce_instance.dir/device_gemm_bias_add_mean_squaremean_xdl_cshuffle_f16_f16_f16_f32_f32_km_kn_mn_instance.cpp.o [297/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bias_add_reduce/CMakeFiles/device_gemm_bias_add_reduce_instance.dir/device_gemm_bias_add_mean_squaremean_xdl_cshuffle_f16_f16_f16_f32_f32_km_nk_mn_instance.cpp.o [298/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_multiply/CMakeFiles/device_gemm_add_multiply_instance.dir/device_gemm_add_multiply_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_nk_mn_mn_mn_instance.cpp.o [299/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_km_kn_mn_mn_instance.cpp.o [300/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_km_nk_mn_mn_instance.cpp.o [301/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_mk_kn_mn_mn_instance.cpp.o [302/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o [303/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o [304/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o [305/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o [306/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o [307/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_xdl_c_shuffle_f16_f16_f16_f16_mk_nk_mn_mn_instance.cpp.o [308/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm/CMakeFiles/device_gemm_add_relu_add_layernorm_instance.dir/device_gemm_add_relu_add_xdl_c_shuffle_layernorm_f16_mk_nk_mn_mn_mn_instance.cpp.o [309/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_fastgelu/CMakeFiles/device_gemm_fastgelu_instance.dir/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_mk_nk_mn_instance.cpp.o [310/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_fastgelu/CMakeFiles/device_gemm_fastgelu_instance.dir/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_km_kn_mn_instance.cpp.o [311/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o [312/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o [313/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o [314/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o [315/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_fastgelu/CMakeFiles/device_gemm_fastgelu_instance.dir/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_mk_kn_mn_instance.cpp.o [316/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_fastgelu/CMakeFiles/device_gemm_fastgelu_instance.dir/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_km_nk_mn_instance.cpp.o [317/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o [318/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o [319/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm/CMakeFiles/device_gemm_add_relu_add_layernorm_instance.dir/device_gemm_add_relu_add_xdl_c_shuffle_layernorm_f16_mk_kn_mn_mn_mn_instance.cpp.o [320/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm/CMakeFiles/device_gemm_add_relu_add_layernorm_instance.dir/device_gemm_add_relu_add_xdl_c_shuffle_layernorm_f16_km_kn_mn_mn_mn_instance.cpp.o [321/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm/CMakeFiles/device_gemm_add_relu_add_layernorm_instance.dir/device_gemm_add_relu_add_xdl_c_shuffle_layernorm_f16_km_nk_mn_mn_mn_instance.cpp.o [322/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_add/CMakeFiles/device_gemm_multiply_add_instance.dir/device_gemm_multiply_add_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_kn_mn_mn_mn_instance.cpp.o [323/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_bias_gelu_bf16_i8_bf16_mk_nk_mn_v1_instance.cpp.o [324/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o [325/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_add/CMakeFiles/device_gemm_multiply_add_instance.dir/device_gemm_multiply_add_xdl_c_shuffle_f16_f16_f16_f16_f16_mk_nk_mn_mn_mn_instance.cpp.o [326/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o [327/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_multiply_add/CMakeFiles/device_gemm_multiply_add_instance.dir/device_gemm_multiply_add_xdl_c_shuffle_f16_f8_f32_f32_f16_mk_kn_mn_mn_mn_instance.cpp.o [328/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_xdl_f16_i4_f16/device_gemm_b_scale_xdl_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [329/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_add/CMakeFiles/device_gemm_multiply_add_instance.dir/device_gemm_multiply_add_xdl_c_shuffle_f16_f8_f32_f32_f16_mk_nk_mn_mn_mn_instance.cpp.o [330/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f32_f32_f32_mk_kn_mn_instance.cpp.o [331/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_reduce/CMakeFiles/device_gemm_reduce_instance.dir/device_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_mk_nk_mn_instance.cpp.o [332/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f32_f32_f32_mk_nk_mn_instance.cpp.o [333/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f32_f32_f32_km_kn_mn_instance.cpp.o [334/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_reduce/CMakeFiles/device_gemm_reduce_instance.dir/device_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_mk_kn_mn_instance.cpp.o [335/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_reduce/CMakeFiles/device_gemm_reduce_instance.dir/device_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_km_kn_mn_instance.cpp.o [336/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_reduce/CMakeFiles/device_gemm_reduce_instance.dir/device_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_km_nk_mn_instance.cpp.o [337/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f32_f32_f32_km_nk_mn_instance.cpp.o [338/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [339/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_gelu_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [340/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_bias_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [341/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_bias_gelu_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [342/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_multiply_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [343/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_multiply_bias_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [344/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_multiply_gelu_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [345/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_multi_abd/CMakeFiles/device_gemm_multi_abd_instance.dir/device_gemm_xdl_multi_abd_multiply_bias_gelu_bf16_i8_bf16_mk_kn_mn_v1_instance.cpp.o [346/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v1_instance.cpp.o [347/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o [348/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v1_instance.cpp.o [349/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v2_instance.cpp.o [350/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [351/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v1_interwave_instance.cpp.o [352/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [353/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [354/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_km_nk_mn_instance.cpp.o [355/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_km_kn_mn_instance.cpp.o [356/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v1_interwave_instance.cpp.o [357/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v2_instance.cpp.o [358/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_lds_direct_load_f16_f16_f16_mk_nk_mn_instance.cpp.o [359/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_km_kn_mn_instance.cpp.o [360/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_km_nk_mn_instance.cpp.o [361/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_kn_mn_irregular_instance.cpp.o [362/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v1_irregular_instance.cpp.o [363/4327] 
Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_mk_nk_mn_instance.cpp.o [364/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v1_irregular_instance.cpp.o [365/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_comp_default_instance.cpp.o [366/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v1_interwave_irregular_instance.cpp.o [367/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_i8_i8_bf16/device_gemm_multiply_multiply_xdl_i8_i8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o [368/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_kn_mn_v2_irregular_instance.cpp.o [369/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_km_nk_mn_instance.cpp.o [370/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_km_kn_mn_instance.cpp.o [371/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v1_interwave_irregular_instance.cpp.o [372/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_mk_kn_mn_v1_interwave_instance.cpp.o [373/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_nk_mn_kpb128_instance.cpp.o [374/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_mk_nk_mn_v2_irregular_instance.cpp.o [375/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_mk_kn_mn_v1_instance.cpp.o [376/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_fp8_f16_f16_mk_kn_mn_v2_instance.cpp.o [377/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_comp_fp8_km_kn_mn_instance.cpp.o [378/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_streamk/CMakeFiles/device_gemm_streamk_instance.dir/device_gemm_xdl_streamk_f16_f16_f16_mk_kn_mn_instance.cpp.o [379/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_comp_fp8_km_nk_mn_instance.cpp.o [380/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_kn_mn_v2_instance.cpp.o [381/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_kn_mn_v1_interwave_instance.cpp.o [382/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_kn_mn_v1_instance.cpp.o [383/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_nk_mn_v1_interwave_instance.cpp.o [384/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_comp_fp8_mk_nk_mn_instance.cpp.o [385/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_nk_mn_v1_instance.cpp.o [386/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_f16_f16_comp_fp8_mk_kn_mn_instance.cpp.o [387/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_splitk/CMakeFiles/device_gemm_splitk_instance.dir/device_gemm_xdl_splitk_f16_fp8_f16_mk_nk_mn_v2_instance.cpp.o [388/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o [389/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [390/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [391/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [392/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o [393/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [394/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [395/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [396/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [397/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [398/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [399/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [400/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v1_default_instance.cpp.o [401/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [402/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [403/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o [404/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [405/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [406/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o [407/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [408/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [409/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [410/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [411/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [412/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [413/4327] Building CXX 
object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o [414/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [415/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [416/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o [417/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [418/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_default_instance.cpp.o [419/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [420/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o 
[421/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [422/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [423/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [424/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [425/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [426/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o [427/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o [428/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o [429/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [430/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v1_default_instance.cpp.o [431/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [432/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [433/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [434/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v1_kpadding_instance.cpp.o [435/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o [436/4327] Building CXX 
object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v2_default_instance.cpp.o [437/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [438/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v1_mnkpadding_instance.cpp.o [439/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v2_kpadding_instance.cpp.o [440/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o [441/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o [442/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_mem_v2_mnkpadding_instance.cpp.o [443/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o [444/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [445/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o [446/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [447/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [448/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v1_default_instance.cpp.o [449/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v2_default_instance.cpp.o [450/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v1_kpadding_instance.cpp.o [451/4327] Building 
CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v1_mkpadding_instance.cpp.o [452/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [453/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [454/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o [455/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [456/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [457/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [458/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o [459/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [460/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v2_kpadding_instance.cpp.o [461/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_mem_v2_mkpadding_instance.cpp.o [462/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [463/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [464/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [465/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [466/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [467/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [468/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [469/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [470/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [471/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_f8_f16/device_gemm_xdl_universal_f16_f8_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [472/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [473/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [474/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [475/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f16_f16/device_gemm_xdl_universal_f8_f16_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [476/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f16_i4_f16/device_gemm_xdl_universal_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [477/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_i4_bf16/device_gemm_xdl_universal_bf16_i4_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [478/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v1_default_instance.cpp.o [479/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [480/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o [481/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_comp_mpadding_instance.cpp.o [482/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v1_nkpadding_instance.cpp.o [483/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v2_default_instance.cpp.o [484/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [485/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o [486/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_km_nk_mn_comp_mkpadding_instance.cpp.o [487/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_mem_v2_nkpadding_instance.cpp.o [488/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o [489/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o [490/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_kn_mn_comp_nkpadding_instance.cpp.o [491/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_f8_f8_bf16/device_batched_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o [492/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_f8_f8_bf16/device_batched_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o [493/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o [494/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o [495/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [496/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [497/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o [498/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_xdl_universal_f8_f8_bf16/device_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [499/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_f8_f8_bf16/device_batched_gemm_xdl_universal_f8_f8_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [500/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v1_default_instance.cpp.o [501/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v2_default_instance.cpp.o [502/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_comp_default_instance.cpp.o [503/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o [504/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o [505/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [506/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_mem_v2_default_instance.cpp.o [507/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [508/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_i8_bf16/device_gemm_xdl_universal_bf16_i8_bf16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [509/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_batched/CMakeFiles/device_gemm_universal_batched_instance.dir/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o [510/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o [511/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o [512/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o [513/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [514/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o [515/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [516/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [517/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [518/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_default_instance.cpp.o [519/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [520/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_bf16_bf16_bf16/device_gemm_xdl_universal_bf16_bf16_bf16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [521/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [522/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [523/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [524/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [525/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o [526/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [527/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_reduce/CMakeFiles/device_gemm_universal_reduce_instance.dir/device_gemm_xdl_universal_f16_f16_f16/device_gemm_xdl_universal_f16_f16_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [528/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [529/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [530/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [531/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [532/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o [533/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [534/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [535/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [536/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o [537/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [538/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [539/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o [540/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [541/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o [542/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [543/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [544/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [545/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o [546/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o [547/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [548/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o [549/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [550/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o [551/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [552/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v1_mnkpadding_instance.cpp.o [553/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [554/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o [555/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v2_mnkpadding_instance.cpp.o [556/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f8_f16/device_gemm_xdl_universal_streamk_f16_f8_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [557/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v1_default_instance.cpp.o [558/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [559/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o [560/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o [561/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [562/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [563/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [564/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [565/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [566/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v1_kpadding_instance.cpp.o [567/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v2_default_instance.cpp.o [568/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o FAILED: library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o /opt/rocm/bin/hipcc -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_FNUZ_FP8 -DCK_USE_GFX94 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/include -I/build/composable-kernel/src/composable_kernel-rocm-6.4.1/include -I/build/composable-kernel/src/build/include -O3 -DNDEBUG -std=c++17 -fPIC -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion 
-Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics --offload-compress -x hip --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -c /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp clang++: error: unable to execute command: Killed clang++: error: clang frontend command failed due to signal (use -v to see invocation) clang version 19.0.0git (/startdir/rocm-llvm c87081df219c42dc27c5b6d86c0525bc7d01f727) Target: riscv64-unknown-linux-gnu Thread model: posix InstalledDir: /opt/rocm/lib/llvm/bin Build config: +assertions clang++: note: diagnostic msg: Error generating preprocessed source(s). 
failed to execute:/opt/rocm/lib/llvm/bin/clang++ --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_FNUZ_FP8 -DCK_USE_GFX94 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/include -I/build/composable-kernel/src/composable_kernel-rocm-6.4.1/include -I/build/composable-kernel/src/build/include -O3 -DNDEBUG -std=c++17 -fPIC -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics --offload-compress -x hip -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o "library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp.o" -c /build/composable-kernel/src/composable_kernel-rocm-6.4.1/library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_default_instance.cpp [569/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v1_mnkpadding_instance.cpp.o [570/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [571/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v2_kpadding_instance.cpp.o [572/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v1_default_instance.cpp.o [573/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_kn_mn_mem_v2_mnkpadding_instance.cpp.o [574/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [575/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v1_kpadding_instance.cpp.o [576/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [577/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v1_mnkpadding_instance.cpp.o [578/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v1_mnkpadding_instance.cpp.o [579/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v2_default_instance.cpp.o [580/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v2_mnkpadding_instance.cpp.o [581/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f8_f16_f16/device_gemm_xdl_universal_streamk_f8_f16_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [582/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v2_kpadding_instance.cpp.o [583/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_mem_v2_mnkpadding_instance.cpp.o [584/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o [585/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [586/4327] Building CXX object 
library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o [587/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_f16_f16_f16/device_gemm_xdl_universal_streamk_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o [588/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v1_kpadding_instance.cpp.o [589/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v1_default_instance.cpp.o [590/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v2_default_instance.cpp.o [591/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v2_kpadding_instance.cpp.o [592/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v1_mnkpadding_instance.cpp.o [593/4327] Building CXX 
object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o [594/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o [595/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_mem_v2_mnkpadding_instance.cpp.o [596/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o [597/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o [598/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_universal_streamk/CMakeFiles/device_gemm_universal_streamk_instance.dir/device_gemm_xdl_universal_streamk_bf16_bf16_bf16/device_gemm_xdl_universal_streamk_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o [599/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o [600/4327] Building CXX 
object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_f16/device_gemm_multiply_multiply_xdl_f8_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o [601/4327] Building CXX object library/src/tensor_operation_instance/gpu/gemm_multiply_multiply/CMakeFiles/device_gemm_multiply_multiply_instance.dir/device_gemm_multiply_multiply_xdl_f8_f8_bf16/device_gemm_multiply_multiply_xdl_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o
ninja: build stopped: subcommand failed.
==> ERROR: A failure occurred in build().  Aborting...
==> ERROR: Build failed, check /var/lib/archbuild/extra-riscv64/felix-0/build
receiving incremental file list
composable-kernel-6.4.1-1-riscv64-build.log
composable-kernel-6.4.1-1-riscv64-prepare.log
sent 62 bytes received 11,430 bytes 7,661.33 bytes/sec
total size is 176,296 speedup is 15.34