What is the use case for this? I believe it would also be quite tricky to implement.
I think we can always get back to a fixed number of features by using a dummy label. For example, with the case feature, the
N label marks tokens to which case does not apply.
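
To make the dummy-label idea concrete, here is a minimal sketch of how it could look. It is not taken from any existing implementation: the feature names, the `pad_features` helper, and every label other than N are hypothetical and only serve the illustration.

```python
# Hypothetical schema: every token must end up with exactly these features.
FEATURE_NAMES = ["case", "pos"]
DUMMY_LABEL = "N"  # placeholder meaning "this feature does not apply"


def pad_features(token_features):
    """Return one label per name in FEATURE_NAMES, filling gaps with DUMMY_LABEL."""
    return [token_features.get(name, DUMMY_LABEL) for name in FEATURE_NAMES]


# Made-up input: the comma has no case, so that feature is simply missing.
tokens = [
    ("Hello", {"case": "C", "pos": "INTJ"}),
    (",", {"pos": "PUNCT"}),
]

padded = [(word, pad_features(feats)) for word, feats in tokens]

for word, feats in padded:
    print(word, feats)
```

After padding, every token carries the same number of features, so the rest of the pipeline can keep assuming a fixed-width feature set.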