Has anyone tried serving a translation model on an FPGA? FPGAs seem to be cheaper than GPUs. Could this be a future direction?
FPGAs require very specific engineering work. It is usually not worth the effort, especially when you consider solutions like CTranslate2, which offer competitive performance on a basic CPU.
So if we ignore the FPGA-specific work for now, the following should be correct:
cost ($): CPU = FPGA < GPU
latency: CPU > GPU = FPGA
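
An ordering like this is easier to trust once it is measured on your own model and hardware. Below is a minimal sketch of how one might do that, using only the Python standard library. The names `median_latency_s` and `cost_per_million` are made up for illustration, `translate` stands for whatever callable wraps your backend (CPU, GPU, or FPGA), and the hourly price and throughput are values you would have to supply yourself. Nothing here is measured data.

```python
import statistics
import time


def median_latency_s(translate, sentences, warmup=3):
    """Median wall-clock seconds per call of translate(sentence).

    `translate` is a hypothetical callable wrapping one backend.
    """
    # Run a few untimed calls first so one-off setup cost is excluded.
    for sentence in sentences[:warmup]:
        translate(sentence)

    timings = []
    for sentence in sentences:
        start = time.perf_counter()
        translate(sentence)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def cost_per_million(hourly_price_usd, sentences_per_second):
    """USD needed to translate one million sentences.

    Assumes the machine is billed by the hour and sustains the
    given throughput for the whole run.
    """
    seconds_needed = 1_000_000 / sentences_per_second
    return hourly_price_usd * seconds_needed / 3600
```

Running both functions once per backend with the same sentences gives numbers you can compare directly, instead of relying on a general ranking. Note that cost depends on throughput as well as on hardware price, so a more expensive device can still come out cheaper per sentence.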