Hello!
I have a question about the -segment_numbers option.
If I set this option when tokenizing, should I be able to see its effect in my output file?
This is my command:
th tools/tokenize.lua -case_feature true -segment_case true -segment_numbers true -joiner_annotate true < input_test_en.txt > test.tok
and the output looks like this:
the│C convention│L in│L 1912│N led│L to│L a│L split│L republican│C party│C ■.│N
I expected 1912 to be segmented into single digits, like 1 9 1 2, but the token is unchanged in the output.
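To make the expected result concrete, here is a rough Python sketch of the splitting I had in mind. It is only an illustration and not the tokenizer's own code: the function name segment_digits is made up, and I am assuming that with -joiner_annotate the joiner mark ■ would be attached to every digit except the last one.

```python
def segment_digits(token, joiner="■"):
    """Split an all-digit token into single digits (hypothetical helper).

    The joiner is appended to every digit except the last, so the pieces
    could be glued back together later. Other tokens are returned as-is.
    """
    # Only plain ASCII digit strings of length >= 2 need splitting.
    if len(token) < 2 or not (token.isascii() and token.isdigit()):
        return token
    digits = list(token)
    return " ".join([d + joiner for d in digits[:-1]] + [digits[-1]])


if __name__ == "__main__":
    words = "the convention in 1912 led to a split".split()
    print(" ".join(segment_digits(w) for w in words))
```

So for the sentence above I would have expected something like 1■ 9■ 1■ 2 in place of the single token 1912 (case features left out of the sketch for brevity).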
Please help me.
Thank you.