Linker error when compiling CTranslate2 with "-march=corei7-avx" (CXX_FLAGS)

Hi,
I just tried to compile a recent version of CTranslate2 (commit 3f53d02) under Ubuntu 18.04 (GCC 7.5.0) with CUDA 10.1, using the following cmake command (adapted from Dockerfile.ubuntu-gpu):

cmake -DCMAKE_PREFIX_PATH="/usr/local/dnnl;/usr/local/mkl-dnn" -DWITH_CUDA=ON -DWITH_DNNL=ON -DOPENMP_RUNTIME=COMP -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=corei7-avx" -DCUDA_NVCC_FLAGS="-Xfatbin -compress-all" -DCUDA_ARCH_LIST="Common" …

… and then …

VERBOSE=1 make -j $(nproc)

… leading to the following linking error :

[…]
[ 96%] Linking CXX shared library libctranslate2.so
[…]
CMakeFiles/ctranslate2.dir/kernels_avx.cc.o: In function `void ctranslate2::cpu::exp<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)':
kernels_avx.cc:(.text+0x310): multiple definition of `void ctranslate2::cpu::exp<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)'
CMakeFiles/ctranslate2.dir/src/cpu/kernels.cc.o:kernels.cc:(.text+0x310): first defined here
CMakeFiles/ctranslate2.dir/kernels_avx.cc.o: In function `void ctranslate2::cpu::log<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)':
kernels_avx.cc:(.text+0x370): multiple definition of `void ctranslate2::cpu::log<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)'
CMakeFiles/ctranslate2.dir/src/cpu/kernels.cc.o:kernels.cc:(.text+0x370): first defined here
CMakeFiles/ctranslate2.dir/kernels_avx.cc.o: In function `void ctranslate2::cpu::sin<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)':
kernels_avx.cc:(.text+0x3d0): multiple definition of `void ctranslate2::cpu::sin<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)'
CMakeFiles/ctranslate2.dir/src/cpu/kernels.cc.o:kernels.cc:(.text+0x3d0): first defined here
CMakeFiles/ctranslate2.dir/kernels_avx.cc.o: In function `void ctranslate2::cpu::cos<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)':
kernels_avx.cc:(.text+0x430): multiple definition of `void ctranslate2::cpu::cos<(ctranslate2::cpu::CpuIsa)1>(float const*, float*, long)'
CMakeFiles/ctranslate2.dir/src/cpu/kernels.cc.o:kernels.cc:(.text+0x430): first defined here
CMakeFiles/ctranslate2.dir/kernels_avx.cc.o: In function `void ctranslate2::cpu::softmax<(ctranslate2::cpu::CpuIsa)1>(float const*, int const*, float*, long, long, long, bool, float)':
kernels_avx.cc:(.text+0xfe0): multiple definition of `void ctranslate2::cpu::softmax<(ctranslate2::cpu::CpuIsa)1>(float const*, int const*, float*, long, long, long, bool, float)'
CMakeFiles/ctranslate2.dir/src/cpu/kernels.cc.o:kernels.cc:(.text+0xfe0): first defined here
collect2: error: ld returned 1 exit status

Compiling with the default flag “-march=x86-64” succeeds without a problem, but “-march=corei7-avx” and “-march=corei7-avx2” both fail.

Any idea of how to fix that error?
Thanks in advance.

Kind regards,
Martin

Hi,

When targeting an architecture with -march, the internal CPU architecture dispatching should be disabled with -DENABLE_CPU_DISPATCH=OFF. This is (briefly) covered in the description of this build option.

However, note that there is little benefit in targeting a fixed architecture because CTranslate2 already dispatches to the appropriate one at runtime.

Ahh, did not realize that this was already mentioned somewhere in the docs, sorry.

Another question that bothers me: I’m dynamically linking libctranslate2.so to my application and would like to catch exceptions thrown in this lib. Compiling CTranslate2 with the flag “-fnon-call-exceptions” should do the job, but the build fails with the following compiler error:

/usr/local/CTranslate2-dev/include/ctranslate2/half_float/half.hpp(1067): warning: non-POD class type passed through ellipsis
          detected during instantiation of "unsigned int half_float::detail::float2half<R,T>(T) [with R=std::round_to_nearest, T=long double]"
(2279): here

/usr/local/CTranslate2-dev/third_party/thrust/thrust/system/cuda/detail/extrema.h: In function ‘ItemsIt thrust::cuda_cub::__extrema::element(thrust::cuda_cub::execution_policy&, ItemsIt, ItemsIt, BinaryPred) [with ArgFunctor = thrust::cuda_cub::__extrema::arg_max_f; Derived = thrust::detail::execute_with_allocator<ctranslate2::cuda::ThrustAllocator, thrust::cuda_cub::execute_on_stream_base>; ItemsIt = const float*; BinaryPred = ctranslate2::cuda::maximum]’:
/usr/local/CTranslate2-dev/third_party/thrust/thrust/system/cuda/detail/extrema.h:380:1: internal compiler error: Segmentation fault
 element(execution_policy &policy,
 ^ ~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
-- Removing /usr/local/CTranslate2-build/CMakeFiles/ctranslate2.dir/src/primitives/./ctranslate2_generated_cuda.cu.o
/usr/local/cmake-3.18.2-Linux-x86_64/bin/cmake -E rm -f /usr/local/CTranslate2-build/CMakeFiles/ctranslate2.dir/src/primitives/./ctranslate2_generated_cuda.cu.o
CMake Error at ctranslate2_generated_cuda.cu.o.Release.cmake:280 (message):
  Error generating file
  /usr/local/CTranslate2-build/CMakeFiles/ctranslate2.dir/src/primitives/./ctranslate2_generated_cuda.cu.o

CMakeFiles/ctranslate2.dir/build.make:82: recipe for target ‘CMakeFiles/ctranslate2.dir/src/primitives/ctranslate2_generated_cuda.cu.o’ failed
make[2]: *** [CMakeFiles/ctranslate2.dir/src/primitives/ctranslate2_generated_cuda.cu.o] Error 1
make[2]: Leaving directory ‘/usr/local/CTranslate2-build’
CMakeFiles/Makefile2:114: recipe for target ‘CMakeFiles/ctranslate2.dir/all’ failed
make[1]: *** [CMakeFiles/ctranslate2.dir/all] Error 2
make[1]: Leaving directory ‘/usr/local/CTranslate2-build’
Makefile:148: recipe for target ‘all’ failed
make: *** [all] Error 2
make: Leaving directory ‘/usr/local/CTranslate2-build’

Is there something that I can do to make this work?

Kind regards,
Martin

Not sure this is a CTranslate2 issue. For example, the tests are linked to libctranslate2.so and are able to catch exceptions.

How are you compiling your application and linking to CTranslate2? Do you have a standalone code example where catching the exception does not work?

I just tried catching a std::runtime_error in a small sample application, thrown after I renamed a file in the model directory: it does indeed work fine without “-fnon-call-exceptions”.
But then I remembered from earlier work with CTranslate (OpenNMT-lua) that the real problem was not exceptions but how to get notified about, and handle, signals in my application, e.g. a SIGSEGV raised inside the lib. With CTranslate it was quite likely to run into such a scenario, e.g. with accidentally malformed source features or other rare, unpredictable instabilities.
Although such things are rather unlikely to happen with CTranslate2 thanks to its improved exception handling, it would have been nice to have the additional safety of being able to handle signals.
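For readers following along, the exception case described above can be sketched roughly as follows. This is only an illustration: load_model_file is a hypothetical stand-in, not a CTranslate2 function, and it merely mimics a library call that throws std::runtime_error when a model file is missing. No special compiler flag is assumed.

```cpp
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a library call that opens a model file.
// NOT part of the CTranslate2 API; it only mimics "throw if the file is missing".
void load_model_file(const std::string& path) {
  std::ifstream file(path);
  if (!file.is_open())
    throw std::runtime_error("Failed to open model file: " + path);
}

// Returns true if loading succeeded, false if the exception was caught.
bool try_load(const std::string& path) {
  try {
    load_model_file(path);
    return true;
  } catch (const std::exception& e) {
    // std::runtime_error derives from std::exception, so it lands here.
    std::cerr << "caught: " << e.what() << '\n';
    return false;
  }
}
```

Calling try_load with a path that does not exist takes the catch branch and returns false, which matches the behaviour reported above for an ordinary C++ exception crossing a shared-library boundary.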

Yes, we expect CTranslate2 to be more robust than CTranslate.

I’m not very familiar with signal handling, but I think you can do it in your application even though it is happening in a shared library, no?
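As a rough sketch of that idea: on POSIX systems signal dispositions are process-wide, so a handler installed by the application with sigaction also runs when the signal originates in code from a shared library. The names record_signal, install_handler and g_last_signal below are made up for illustration. The handler only records the signal number; returning normally from a handler after a real memory fault is undefined behaviour, so this shows notification only, not recovery.

```cpp
#include <csignal>
#include <signal.h>  // sigaction, sigemptyset (POSIX)

// Written from the handler. volatile sig_atomic_t is the object type that
// portable code may safely write from a signal handler.
volatile std::sig_atomic_t g_last_signal = 0;

// Hypothetical handler: it only records which signal arrived.
extern "C" void record_signal(int signum) {
  g_last_signal = signum;
}

// Installs record_signal for the given signal; returns true on success.
// The disposition applies to the whole process, shared libraries included.
bool install_handler(int signum) {
  struct sigaction action {};
  action.sa_handler = record_signal;
  sigemptyset(&action.sa_mask);
  action.sa_flags = 0;
  return sigaction(signum, &action, nullptr) == 0;
}
```

A quick way to try it is to call install_handler(SIGSEGV) and then std::raise(SIGSEGV): because the signal is generated by raise rather than by a real fault, the handler may return and g_last_signal then holds SIGSEGV.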

In any case, if you encounter a segmentation fault I think it’s better to report it as a bug that we will fix rather than try working around it in your application.

I am not a specialist in signal handling either, but I found that one approach is “Throwing an exception from within a signal handler”, which worked pretty well with CTranslate.
But I agree that it’s far better to file a bug report. Promised, next time :innocent:

Thanks for your time!

Best regards
Martin