Let’s assume the model predicts the following for an input sequence.
The O
creation O
date O
is O
27 B-DATE
Aug I-DATE
2020 I-DATE
and O
update O
date O
is O
01-09-2020 B-DATE
How do you pick the best candidate for the creation date from the logit values?
Can you clarify what you want to achieve? What do you mean by candidate here?
I am trying to extract the best date candidate, but there are multiple date candidates here: 27 Aug 2020 and 01-09-2020. What is the best way to aggregate confidence scores across the tokens of the candidate 27 Aug 2020? Should I take the average confidence score of the split tokens (27, Aug, 2020) and compare it against the score of 01-09-2020 to decide which is the best candidate for the creation date?
A token classification model will not help you with that. It was not trained to answer this kind of question.
Maybe you need a syntactic parser to tell which date is associated with “The creation date”?
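On the narrower question of turning per-token scores into one score per date span, here is a minimal sketch. It assumes you already have one row of logits per token and a label list in the order `["O", "B-DATE", "I-DATE"]`; the logits below are made up for illustration, and the helper names (`group_spans`, `span_scores`) are hypothetical, not part of any library. It reports both the mean and the minimum token probability for each span.

```python
import math

def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def group_spans(tags):
    """Group token indices into entity spans using B-/I- prefixes."""
    spans, current = [], []
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = [i]
        elif tag.startswith("I-") and current:
            current.append(i)
        else:
            if current:
                spans.append(current)
            current = []
    if current:
        spans.append(current)
    return spans

def span_scores(tokens, logits, labels):
    """Return text, mean and min predicted-label probability per span."""
    probs = [softmax(row) for row in logits]
    pred_ids = [max(range(len(p)), key=p.__getitem__) for p in probs]
    tags = [labels[i] for i in pred_ids]
    results = []
    for idxs in group_spans(tags):
        confs = [probs[i][pred_ids[i]] for i in idxs]
        results.append({
            "text": " ".join(tokens[i] for i in idxs),
            "mean": sum(confs) / len(confs),
            "min": min(confs),
        })
    return results

# Assumed label order and invented logits, for illustration only.
labels = ["O", "B-DATE", "I-DATE"]
tokens = ["The", "creation", "date", "is", "27", "Aug", "2020",
          "and", "update", "date", "is", "01-09-2020"]
o_row = [4.0, 0.0, 0.0]
logits = [o_row, o_row, o_row, o_row,
          [0.5, 3.0, 0.2], [0.3, 0.4, 2.0], [0.1, 0.2, 3.5],
          o_row, o_row, o_row, o_row,
          [1.0, 2.0, 0.1]]

results = span_scores(tokens, logits, labels)
for r in results:
    print(r["text"], round(r["mean"], 3), round(r["min"], 3))
```

Note that these scores only say how confident the model is that each span is a date. They carry no information about which date is the creation date, so comparing them does not answer that question; the span still has to be linked to "creation date" by some other means, such as the parser suggested above.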