I wonder what the progress on multi-GPU support is. (1) Even though the current version cannot gain any speed improvement, I still want to use it to enable a larger batch size. Can I run train.py in a bug-free mode? I tried once and it returned the error: AttributeError: 'DataParallel' object has no attribute 'generator'. (2) Can I train the model on multiple GPUs with the Lua version, then load and fine-tune it with the Python version? Thanks!
There is a pending PR.
If you're in a hurry to test, you can give it a try.
Cheers.
Hi Vincent,
Thanks for your work. I gave it a try, but it seems to have the same bug as this issue.
Yes, coverage_attn does not work in multi-GPU mode right now.
Hope someone will look into it.
It is not only coverage_attn; copy_attn also raised the same error.
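
For anyone hitting the AttributeError from the first post: torch.nn.DataParallel keeps the wrapped network in its .module attribute and does not forward custom attributes such as generator, so code that reads model.generator fails once the model is wrapped. Below is a minimal sketch of the usual workaround, not the fix from the pending PR. Model, Wrapper, and unwrap are hypothetical names used only for illustration, and Wrapper is a plain-Python stand-in for DataParallel so the snippet runs without PyTorch or a GPU.

```python
class Model:
    """Hypothetical stand-in for a seq2seq model that owns a `generator`."""

    def __init__(self):
        self.generator = "generator-layer"


class Wrapper:
    """Stand-in that mimics DataParallel: the real model lives in `.module`
    and custom attributes are NOT forwarded to it."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    """Return the underlying model whether or not it has been wrapped.

    Assumption: an unwrapped model has no attribute called `module`.
    """
    return getattr(model, "module", model)


model = Model()
wrapped = Wrapper(model)

# Reading the attribute on the wrapper reproduces the kind of error reported.
try:
    wrapped.generator
except AttributeError as err:
    print("AttributeError:", err)

# Going through unwrap() works for both the wrapped and the plain model.
print(unwrap(wrapped).generator)
print(unwrap(model).generator)
```

With real PyTorch, checking isinstance(model, torch.nn.DataParallel) before reading model.module is a stricter test than the getattr shortcut above. This only addresses attribute access; it does not by itself make coverage_attn or copy_attn work across GPUs.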