Hello,
with the TensorFlow 1 version, I used to build the vocabulary with the character tokenizer like this:
onmt-build-vocab --size 48000 --save_vocab vocab_char.src train.src --tokenizer CharacterTokenizer
but now I get the following error:
Traceback (most recent call last):
  File "/usr/local/bin/onmt-build-vocab", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/opennmt/bin/build_vocab.py", line 73, in main
    tokenizer = tokenizers.make_tokenizer(args.tokenizer_config)
  File "/usr/local/lib/python3.6/dist-packages/opennmt/tokenizers/tokenizer.py", line 297, in make_tokenizer
    raise ValueError("Invalid tokenization configuration: %s" % str(config))
ValueError: Invalid tokenization configuration: CharacterTokenizer
It seems I am missing something. Could you help me with this issue, please?