How to set the gradient clipping value

I’ve tried setting max_grad_norm to several different values. According to the documentation:

-max_grad_norm (default: 5)
Clip the gradients L2-norm to this value. Set to 0 to disable.
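
As I read it, this is clipping by the overall L2 norm: when the norm of all gradients taken together exceeds the threshold, every gradient is rescaled by the same factor. Here is a minimal pure-Python sketch of that idea. The function name `clip_by_global_norm` and the list-of-lists layout are my own assumptions for illustration, and the toolkit's actual implementation may differ.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so that their joint L2 norm is at most max_norm.

    grads: list of lists of floats, one inner list per parameter (hypothetical layout).
    max_norm: clipping threshold; 0 disables clipping, as the documented option does.
    Returns (clipped_grads, original_norm).
    """
    # L2 norm over every gradient entry of every parameter.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))

    # Nothing to do when clipping is disabled or the norm is already small enough.
    if max_norm <= 0 or total_norm <= max_norm:
        return [list(vec) for vec in grads], total_norm

    # Shrink all entries by the same factor so the direction is preserved.
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads], total_norm
```

So the value only matters on steps where the gradient norm is larger than it, which is why I would like to know what norms actually occur during training.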

However, I’m picking these values by trial and error, without any real reasoning behind them.

Would it be a good strategy to record the gradient norm at every training step and compute something like the average, then choose the clipping value from that? I also wonder whether there is any way to access the gradients during the training phase.
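
To make the idea concrete, this is roughly what I have in mind, assuming the gradient values can somehow be read out at each step (which is exactly the part I don't know how to do). The helper names `global_grad_norm` and `summarize_norms` are hypothetical, and using a percentile rather than the plain average is only my guess at a more robust summary.

```python
import math
import statistics


def global_grad_norm(grads):
    """L2 norm over all gradient entries (grads: list of lists of floats)."""
    return math.sqrt(sum(g * g for vec in grads for g in vec))


def summarize_norms(norms):
    """Summarize per-step gradient norms collected during training."""
    ordered = sorted(norms)
    # Nearest-rank 90th percentile: the smallest value with at least 90% of steps at or below it.
    idx = max(0, math.ceil(0.9 * len(ordered)) - 1)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": ordered[idx],
        "max": ordered[-1],
    }
```

For example, if the recorded norms were 1 through 9 plus a single spike of 100, the mean would be pulled up to 14.5 while the median stays at 5.5, so I suspect the average alone could be a misleading basis for the threshold.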