Does CNNEncoder use a kernel of (width, 1)?

Does CNNEncoder use a kernel of (width, 1) instead of (width, d)? In the paper (ConvSeq2Seq), W is described as 2d × K × d. As I understand it, K is the width and d is the height. The convolution is in effect two-dimensional, but the kernel only slides along one dimension. I find it very confusing that the OpenNMT source code uses a kernel of (width, 1).
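
To make the comparison concrete, here is a small pure-Python sketch of the two formulations. It is not the OpenNMT code. It assumes the embedding dimension d is laid out as the input-channel axis, so a kernel with spatial size (width, 1) still carries out × d × width weights, because a convolution sums over input channels. The function names `conv_channels` and `conv_full_height` are made up for illustration, and padding, weight normalization, and the gating step are left out.

```python
import random


def conv_channels(x, w):
    """Spatial kernel (width, 1); the d embedding dims act as input channels.

    x: [d][length]       channel-major input
    w: [out][d][width]   (out, in_channels, width); the size-1 axis is dropped
    """
    d, length = len(x), len(x[0])
    width = len(w[0][0])
    return [
        [
            sum(w_o[c][j] * x[c][t + j] for c in range(d) for j in range(width))
            for t in range(length - width + 1)
        ]
        for w_o in w
    ]


def conv_full_height(img, k):
    """Spatial kernel (width, d) over a single-channel (length, d) image.

    img: [length][d]
    k:   [out][width][d]
    The kernel is as tall as the image, so it can only slide along length.
    """
    length, d = len(img), len(img[0])
    width = len(k[0])
    return [
        [
            sum(k_o[j][c] * img[t + j][c] for j in range(width) for c in range(d))
            for t in range(length - width + 1)
        ]
        for k_o in k
    ]


random.seed(0)
d, width, length = 4, 3, 7   # toy sizes, chosen only for this example
out_channels = 2 * d         # the 2d output size from the question

x = [[random.uniform(-1, 1) for _ in range(length)] for _ in range(d)]
w = [[[random.uniform(-1, 1) for _ in range(width)] for _ in range(d)]
     for _ in range(out_channels)]

# Same numbers, rearranged: transpose the input and the two kernel axes.
img = [[x[c][t] for c in range(d)] for t in range(length)]
k = [[[w[o][c][j] for c in range(d)] for j in range(width)]
     for o in range(out_channels)]

a = conv_channels(x, w)
b = conv_full_height(img, k)
max_diff = max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
num_weights = out_channels * d * width  # 2d * d * K weights in both layouts
```

Under that assumption the two layouts hold the same 2d × K × d weights and produce the same outputs up to floating-point rounding. The only difference is whether d is counted as kernel height or as input channels. Whether CNNEncoder actually feeds d in as the channel axis is something to confirm against the source.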