Training speed drops significantly with alignment

Hello dear community! I have noticed that training speed drops significantly when alignment (using fast_align) is enabled on a server with 4 RTX A4500 20GB GPUs. I’ve tried both modes, fp32 and fp16 (mixed precision). Is this expected behavior?

And why is the behavior different on a server with 1 RTX 6000 24GB GPU? Using alignment there doesn’t affect training speed.
Many thanks in advance.
