Failing conversion of Small100 (SMALL100Tokenizer does not exist or is not currently imported)

Hello!
This issue is vaguely similar to this one, but I’m getting a different error message. I suspect I’m overlooking something very simple, so sorry in advance.

I’ve fine-tuned Small100 for a specific translation task and updated transformers and CTranslate2 to their latest versions. Still, when running:

ct2-transformers-converter --model "\Small100" --output_dir "\CT-Small100"

I’m getting the following exception:

Traceback (most recent call last):
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\Lib\", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\Lib\", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\Scripts\ct2-transformers-converter.exe\", line 7, in <module>
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\ctranslate2\converters\", line 1610, in main
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\ctranslate2\converters\", line 57, in convert_from_args
    return self.convert(
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\ctranslate2\converters\", line 96, in convert
    model_spec = self._load()
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\ctranslate2\converters\", line 121, in _load
    tokenizer = self.load_tokenizer(
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\ctranslate2\converters\", line 143, in load_tokenizer
    return tokenizer_class.from_pretrained(model_name_or_path, **kwargs)
  File "C:\Users\Cadenza\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\", line 699, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class SMALL100Tokenizer does not exist or is not currently imported.

Sorry again if it’s very simple, but any help would be appreciated!

Thanks in advance,


Are all tokenizer related configurations and files included in the fine-tuned model directory?
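One quick way to check this is sketched below. It only looks for files by name in the model folder. The default file names are an assumption on my part (they are what an M2M100-style tokenizer usually saves), and `missing_tokenizer_files` is a hypothetical helper, not part of CTranslate2 or transformers, so adjust the list to whatever the base model ships with.

```python
from pathlib import Path

# Assumed file names: typical output of an M2M100-style tokenizer.
# Replace them with the files found next to your base model.
EXPECTED_TOKENIZER_FILES = (
    "tokenizer_config.json",
    "special_tokens_map.json",
    "vocab.json",
    "sentencepiece.bpe.model",
)


def missing_tokenizer_files(model_dir, expected=EXPECTED_TOKENIZER_FILES):
    """Return the expected file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in expected if not (folder / name).is_file()]
```

Calling `missing_tokenizer_files` on the fine-tuned model folder returns an empty list when every expected file is there.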

Hi!
Thanks for your answer!

As far as I can tell, yes. The files in this folder are:


Am I missing something?

The base model also has a file named

Indeed, and I have it! I’m just unsure how to pass it to the conversion tool.
Could you help me with this?

The CTranslate2 converter simply loads the tokenizer with transformers.AutoTokenizer.from_pretrained.
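For context, AutoTokenizer takes the class name from the tokenizer_class field of tokenizer_config.json, and the ValueError above is raised when that name cannot be resolved to a class it knows. A minimal sketch to see what your model folder declares follows. `declared_tokenizer` is a hypothetical helper name, and my reading of the auto_map field (as the place where custom tokenizer code shipped with a model is declared) is an assumption to verify against the Transformers documentation.

```python
import json
from pathlib import Path


def declared_tokenizer(model_dir):
    """Return (tokenizer_class, auto_map) as declared in tokenizer_config.json."""
    config_path = Path(model_dir) / "tokenizer_config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Either entry is None when the field is absent from the file.
    return config.get("tokenizer_class"), config.get("auto_map")
```

If the first value is SMALL100Tokenizer, that is the name AutoTokenizer is failing to find among the classes it has available.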

But I don’t know how to register a custom tokenizer with AutoTokenizer. I suggest that you ask this question on the Transformers forum:

Alright, thank you very much for these directions!