OpenNMT-py -phrase_table Translation Option

I have implemented the -phrase_table translation option in OpenNMT-py, and today it was merged into the repository. The -phrase_table option was already documented, carried over from the Lua version, but it had not been implemented in the PyTorch version.

Description

If the -phrase_table translation argument is provided together with -replace_unk, the translator looks up the source token identified for each unknown target word and outputs the corresponding target token from the table. If -phrase_table is not provided, or the identified source token does not exist in the table, the source token is copied instead. I tested this with both translate.py and server.py (with conf.json).

The default behaviour of the -replace_unk option is to replace each unknown word in the output with the source word that has the highest attention weight. When -phrase_table is added as well, the translator first looks in the phrase table file for a translation of that source word. Only if no valid replacement is found is the source token copied.

The phrase table file should contain one entry per line, a single source word (token) and its translation, in the format:

source|||target
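
The lookup-then-copy behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the code that was merged into OpenNMT-py: the function names load_phrase_table and replace_unknowns are hypothetical, the unknown token is assumed to be "<unk>", and the attention is assumed to be a plain list of per-target-token weight lists over the source tokens.

```python
from typing import Dict, List, Optional, Sequence

UNK = "<unk>"  # assumed spelling of the unknown-word token


def load_phrase_table(path: str, sep: str = "|||") -> Dict[str, str]:
    """Read 'source|||target' lines into a dict, skipping malformed lines."""
    table: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if sep not in line:
                continue  # not a 'source|||target' entry
            src, tgt = line.split(sep, 1)
            src, tgt = src.strip(), tgt.strip()
            if src and tgt:
                table[src] = tgt
    return table


def replace_unknowns(
    pred_tokens: Sequence[str],
    src_tokens: Sequence[str],
    attention: Sequence[Sequence[float]],
    phrase_table: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Replace each unknown token using the most-attended source token.

    attention[i][j] is the weight of source token j for target token i.
    If the source token is in the phrase table, its translation is used;
    otherwise the source token itself is copied.
    """
    table = phrase_table or {}
    output: List[str] = []
    for i, token in enumerate(pred_tokens):
        if token != UNK:
            output.append(token)
            continue
        weights = attention[i]
        best = max(range(len(src_tokens)), key=lambda j: weights[j])
        src_token = src_tokens[best]
        output.append(table.get(src_token, src_token))
    return output
```

Calling replace_unknowns without a phrase table reproduces the plain -replace_unk behaviour (copy the source token), while passing the dict returned by load_phrase_table adds the lookup step.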

Example with translate.py

python3 OpenNMT-py/translate.py -model available_models/my.model_step_100000.pt -src source.txt -output prep.txt -replace_unk -phrase_table phrase-table.txt

Example with server.py

python3 OpenNMT-py/server.py --ip "0.0.0.0" --port 5000 --url_root "/translator" --config available_models/conf.json
curl -i -X POST -H "Content-Type: application/json" -d '[{"src": "this is a test for model 100", "id": 100}]' http://127.0.0.1:5000/translator/translate

… where conf.json is:

{
    "models_root": "/home/available_models",
    "models": [
        {
            "id": 100,
            "model": "my.model_step_100000.pt",
            "timeout": 600,
            "on_timeout": "to_cpu",
            "load": true,
            "opt": {
                "beam_size": 1,
                "replace_unk": true,
                "phrase_table": "/home/available_models/phrase-table.txt"
            }
        }
    ]
}
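
If you prefer to call the server from Python instead of curl, the same POST request can be built with the standard library. This is a minimal sketch under the assumptions of the example above (server on 127.0.0.1:5000, url_root "/translator", model id 100). The helper names build_translate_request and translate are hypothetical, and the response is simply parsed as JSON without assuming its structure.

```python
import json
import urllib.request


def build_translate_request(
    text: str,
    model_id: int,
    base_url: str = "http://127.0.0.1:5000/translator",
) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = json.dumps([{"src": text, "id": model_id}]).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/translate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text: str, model_id: int):
    """Send the request and return the decoded JSON response.

    Requires server.py to be running with the conf.json shown above.
    """
    request = build_translate_request(text, model_id)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, translate("this is a test for model 100", 100) sends the same payload as the curl command.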

You can find more details about the options at:
http://opennmt.net/OpenNMT-py/options/translate.html?highlight=phrase%20table

You can also refer to this page – from the Lua version, but the concept is the same:
http://opennmt.net/OpenNMT/translation/unknowns/

If you have tried the new OpenNMT-py -phrase_table option and have feedback, please let me know.



2 Replies to “OpenNMT-Py -phrase-table Translation Option”

  1. This is a really useful option when using a model that gives <unk> at some locations in the translated target. While working on a PoC for German-English using OpenNMT, I have faced a great many issues. This one had been pending for a long time, and it is now resolved.

    Going forward, if more articles on NLP with OpenNMT are published, it would be very helpful for beginners like me. In fact, it would be enlightening.

    GOD BLESS YOU.

    1. Hi Kishor! Thanks for your comment! Glad it was helpful.

