Text Summarization Question?

After 3 weeks of training TensorFlow's textsum I didn't get any good results with my NVIDIA GTX 1070. Luckily I found OpenNMT, really nice job! I tested text summarization and it gives good results with the pretrained model.
I'm now planning to rebuild the model with a larger vocabulary and start training with pretrained word embeddings, and maybe I will go further by implementing new ideas. My questions are:

  • How long did it take you to achieve the results from the text summarization pretrained model?
  • When I checked the preprocessing documentation, it requires valid_src and valid_tgt. Does that mean the model does the validation by itself while training (unlike the TensorFlow model)? If so, will it stop automatically once it achieves a good avg_loss?
  • As far as I can see, you didn't implement the copy process for the unk tokens, but you do have the phrase_table. Logically speaking, this table is used to replace an unk with a specific token in the result. But what if I check whether a token is OOV, and then add it to the phrase_table mapped to the same token? Example:

If the word “china” is OOV (unk), then I append this entry to the phrase_table:

Isn't that a workaround for the copy process?
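The workaround described above can be sketched as a small helper. This is a hypothetical script, not part of OpenNMT; it assumes the phrase table uses the `src|||tgt` line format described in the OpenNMT translation documentation, and the function name `extend_phrase_table` is my own.

```python
def extend_phrase_table(tokens, vocab, table_path):
    """Append an identity mapping (token|||token) for every OOV token,
    so that unk replacement copies the token through unchanged.

    tokens: iterable of source-side tokens to check
    vocab: set of in-vocabulary tokens
    table_path: path to the phrase table file (appended to)
    """
    with open(table_path, "a", encoding="utf-8") as table:
        for token in tokens:
            if token not in vocab:
                # e.g. "china" is OOV -> append the line "china|||china"
                table.write(token + "|||" + token + "\n")
```

This only makes the replacement an identity copy; it does not give the model a learned copy distribution, which is the part the real copy mechanism adds.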

Thank you!


I can answer the last two points:

Yes. After iterating through the whole dataset, the training reports the validation perplexity.

By default, the training stops after 13 epochs (-end_epoch option), but you can also configure the training to stop when the learning rate falls below a threshold. The learning rate decay depends on the validation score. See the documentation for more details.

Did you check the -replace_unk option?

@guillaumekln thank you for the answers. I checked -replace_unk now; I had read this, which is why I asked about Implement Copy Mechanism.

About the first question: I started training yesterday, and as far as I can predict it will take about 5 days to finish the 13 epochs. I will share the results with you later.

You are correct, the copy mechanism as described in https://arxiv.org/pdf/1603.06393.pdf is not implemented. -replace_unk is a simple attention-based unknown-word replacement.
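The idea behind attention-based replacement can be sketched as follows. This is a minimal illustration of the general technique, not OpenNMT's actual code: when the decoder emits an unknown token, copy the source token that received the most attention at that step, optionally mapping it through a phrase table. All names here (`replace_unk`, the `<unk>` literal, the dict-based phrase table) are assumptions for the sketch.

```python
def replace_unk(target_tokens, source_tokens, attention, phrase_table=None):
    """Replace each "<unk>" in the output with the most-attended source token.

    attention[i][j] = attention weight on source token j when
    producing target token i (one row per target token).
    phrase_table: optional dict mapping source tokens to replacements.
    """
    phrase_table = phrase_table or {}
    output = []
    for i, tok in enumerate(target_tokens):
        if tok == "<unk>":
            # index of the source token with the highest attention weight
            j = max(range(len(source_tokens)), key=lambda k: attention[i][k])
            src = source_tokens[j]
            # use the phrase table if it has an entry, else copy verbatim
            output.append(phrase_table.get(src, src))
        else:
            output.append(tok)
    return output
```

With an identity phrase table (as in the workaround above), this degenerates to plain copying, which is why the two behave similarly for proper nouns; the learned copy mechanism of the paper differs in that the copy decision is part of the model's output distribution.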