Serving with FPGA

WangYongzhao · November 19, 2020, 3:38am

Has anyone tried serving a translation model on an FPGA? FPGAs seem to be cheaper than GPUs. Could this be a future direction?

guillaumekln · November 19, 2020, 3:29pm

FPGAs require very specific work. Usually it is not worth the effort, especially when you consider solutions like CTranslate2, which have competitive performance on a basic CPU.

WangYongzhao · November 23, 2020, 10:02am

So if we ignore the FPGA-specific work for now, the following should be correct:
cost ($$): CPU = FPGA < GPU
latency: CPU > GPU = FPGA
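
The latency ordering above is easier to check by measuring than by assuming. Below is a minimal sketch of a timing helper, using only the Python standard library. The names `measure_latency` and `fake_translate` are hypothetical and not from any library; `fake_translate` is only a placeholder for whatever call actually runs the model on the CPU, GPU, or FPGA backend being compared.

```python
import statistics
import time


def measure_latency(translate_fn, inputs, warmup=3):
    """Return (median, p95) per-request latency in seconds for translate_fn.

    translate_fn is any callable taking one input; inputs must be non-empty.
    """
    # Warm-up calls so one-time setup cost does not skew the samples.
    for text in inputs[:warmup]:
        translate_fn(text)

    samples = []
    for text in inputs:
        start = time.perf_counter()
        translate_fn(text)
        samples.append(time.perf_counter() - start)

    samples.sort()
    # Simple nearest-rank style index for the 95th percentile.
    p95 = samples[min(len(samples) - 1, int(0.95 * len(samples)))]
    return statistics.median(samples), p95


def fake_translate(text):
    # Placeholder (assumption): stands in for a real backend call.
    return text[::-1]


if __name__ == "__main__":
    median, p95 = measure_latency(fake_translate, ["hello world"] * 100)
    print(f"median={median:.6f}s p95={p95:.6f}s")
```

Running the same helper against each backend with the same inputs gives numbers that can be compared directly. Cost would still need to be worked out separately from hardware prices and throughput.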