As explained in the first link you posted, OpenMP is used for intra_threads, which configures the level of parallelism within each translation. If you instead run translations in parallel by increasing inter_threads, OpenMP is not required.
Asynchronous translations can be implemented in C++ with the TranslatorPool class. Here intra_threads corresponds to num_threads_per_translator and inter_threads corresponds to num_translators.
The class usage should be pretty straightforward (see the method translate_batch_async).
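To illustrate the general pattern behind an asynchronous pool, here is a minimal sketch that uses only the C++ standard library. It is not the CTranslate2 API: ToyPool, translate_async, fake_translate and translate_all are hypothetical names made up for this example, and the "translation" just reverses the string. The num_workers argument plays the role of inter_threads (num_translators); intra_threads is not modeled at all.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a translator pool. NOT the CTranslate2 API.
class ToyPool {
public:
  // num_workers plays the role of inter_threads (num_translators).
  explicit ToyPool(std::size_t num_workers) {
    assert(num_workers > 0);
    for (std::size_t i = 0; i < num_workers; ++i)
      workers_.emplace_back([this] { run(); });
  }

  ~ToyPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto& worker : workers_) worker.join();
  }

  // Queues the job and returns immediately; the result is
  // retrieved later by calling get() on the returned future.
  std::future<std::string> translate_async(std::string source) {
    std::packaged_task<std::string()> task(
        [src = std::move(source)] { return fake_translate(src); });
    std::future<std::string> result = task.get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push(std::move(task));
    }
    cv_.notify_one();
    return result;
  }

private:
  // Placeholder for real work: reverses the input string.
  static std::string fake_translate(const std::string& s) {
    return std::string(s.rbegin(), s.rend());
  }

  // Worker loop: pop a job, run it, repeat until stopped and drained.
  void run() {
    for (;;) {
      std::packaged_task<std::string()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // stop requested and queue is empty
        task = std::move(jobs_.front());
        jobs_.pop();
      }
      task();  // runs outside the lock so workers execute in parallel
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::packaged_task<std::string()>> jobs_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

// Submits every input first, then collects the results in input order.
std::vector<std::string> translate_all(ToyPool& pool,
                                       const std::vector<std::string>& inputs) {
  std::vector<std::future<std::string>> futures;
  for (const auto& input : inputs)
    futures.push_back(pool.translate_async(input));
  std::vector<std::string> outputs;
  for (auto& f : futures) outputs.push_back(f.get());
  return outputs;
}
```

The point of the sketch is that submitting all jobs before calling get() lets the workers overlap. For the real class, check the TranslatorPool header of your CTranslate2 version for the exact constructor and translate_batch_async signatures.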