Yasmin Moslem

Machine Translation Researcher.


Running TensorBoard with OpenNMT

TensorBoard is a tool that provides useful visualizations of how the training of a deep learning model is progressing. It allows you to track and visualize metrics such as accuracy and perplexity. You can use TensorBoard with diverse deep learning frameworks such as TensorFlow and PyTorch. In this tutorial, you will learn how to activate TensorBoard in OpenNMT-tf and OpenNMT-py in different environments.
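As a quick preview of what activation looks like, the fragment below enables TensorBoard logging in an OpenNMT-py training config (the option names `tensorboard` and `tensorboard_log_dir` come from OpenNMT-py's training options; the log directory is just a placeholder):

```yaml
# Excerpt of an OpenNMT-py training config (YAML).
# Writes training metrics as TensorBoard event files.
tensorboard: true
tensorboard_log_dir: runs/onmt   # directory for the event files
```

You can then launch `tensorboard --logdir runs/onmt` and open the reported URL in a browser. OpenNMT-tf writes TensorBoard summaries to its model directory by default, so pointing `tensorboard --logdir` at that directory is usually enough there.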

Bash Commands for NLP Engineers

As using Bash commands is inevitable when you work on NLP and MT tasks, I thought it would be useful to list the majority of the commands I have learnt to use on a daily basis, thanks to practice, searching, and the helpful colleagues I have met over the years. Obviously, this is not an exhaustive list; however, I hope it includes most of the one-line Bash commands you would need. Please note that the majority of these commands have been tested mainly on Linux.
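To give a taste of the kind of one-liners the post collects, here are a few corpus-handling commands (the file names are made up for the demo; tested with GNU coreutils on Linux):

```shell
# Create a tiny parallel corpus for the demo (hypothetical file names).
printf 'Hello world\nGood morning\nSee you\n' > sample.en
printf 'Hallo Welt\nGuten Morgen\nBis bald\n' > sample.de

# Count the sentences in a corpus file.
wc -l < sample.en

# Merge source and target into one tab-separated file ...
paste sample.en sample.de > sample.ende

# ... and split it back into its first column.
cut -f1 sample.ende > back.en

# Remove duplicate lines while preserving the original order.
printf 'a\nb\na\n' | awk '!seen[$0]++'
```

Each of these replaces what would otherwise be a small script, which is exactly why they are worth memorising.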

Pre-trained Neural Machine Translation (NMT) Models

So I have published the latest versions of my in-domain OpenNMT-py models. These are in-domain Neural Machine Translation (NMT) models, which means they are trained and tested only on specialised data, and they can perform better than generic models for the specified “domain”. In other words, in-domain models can observe the relevant terminology and generate translations that are much more in line with the specialised context.

WER Score for Machine Translation

Word Error Rate (WER) computes the minimum Edit Distance between the human-generated reference sentence and the machine-predicted sentence, normalised by the length of the reference. In other tutorials, I explained how to use Python to compute BLEU and Edit Distance; in this tutorial, I am going to explain how to calculate the WER score.
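As a minimal sketch of the idea (not the tutorial's own code), WER is the word-level Levenshtein distance divided by the reference length, which a small dynamic-programming table computes directly:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        d[i][0] = i  # delete i reference words
    for j in range(1, len(hyp) + 1):
        d[0][j] = j  # insert j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[-1][-1] / len(ref)
```

For example, `wer("the cat sat", "the dog sat")` gives 1/3: one substitution over a three-word reference.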

Computing BLEU Score for Machine Translation

In this tutorial, I am going to explain how I compute the BLEU score for the Machine Translation output using Python.
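In practice one would usually reach for a library such as sacrebleu or NLTK; still, a from-scratch sketch (not the tutorial's code) makes the metric's two ingredients concrete: clipped n-gram precisions and a brevity penalty.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference: str, hypothesis: str, max_n: int = 4) -> float:
    """Sentence-level BLEU with clipped n-gram precisions and brevity penalty."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngram_counts(hyp, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum((hyp_ngrams & ngram_counts(ref, n)).values())
        total = sum(hyp_ngrams.values())
        if overlap == 0 or total == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0, and any divergence from the reference lowers one or more n-gram precisions and hence the score.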

Domain Adaptation Techniques for Low-Resource Scenarios

Let’s imagine this scenario. You have a new Machine Translation project, and you feel excited. However, you have realized that your training corpus is too small. If you use such a limited corpus, your machine translation model will be very poor, with many out-of-vocabulary words and maybe unidiomatic translations.

Domain Adaptation Experiment in Neural Machine Translation

Domain Adaptation is useful for specialising existing generic Machine Translation models, mainly when the specialised corpus is too small to train a separate model from scratch. Furthermore, Domain Adaptation techniques can be handy for low-resource languages that share vocabulary and structure with related high-resource languages in the same family.
