I’m a Data Science PhD student at New York University and a member of the ML² Group at CILVR, co-advised by Sam Bowman and Kyunghyun Cho. I’m broadly interested in deep learning, natural language understanding, and information retrieval. These days, I mainly work on applying transfer learning and multi-task learning methods to NLP problems, and on analyzing these methods to understand why and when they work or fail.

Previously, I interned at Facebook AI Research and Grammarly.

Prior to NYU, I developed information retrieval systems at the Institute of High Performance Computing in Singapore. Before that, I earned my bachelor’s degree in Computer Science at Nanyang Technological University.


* equal contribution

  • Online Hyperparameter Tuning for Multi-Task Learning
    Phu Mon Htut*, Owen Marschall*, Samuel R. Bowman, Douwe Kiela, Edward Grefenstette, Cristina Savin, Kyunghyun Cho.
    The Workshop on Continual Learning, ICML. 2020.

  • Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?
    Yada Pruksachatkun*, Jason Phang*, Haokun Liu*, Phu Mon Htut*, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman.
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL). 2020.

  • English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too
    Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman.
    Preprint. 2020.

  • Do Attention Heads in BERT Track Syntactic Dependencies?
    Phu Mon Htut*, Jason Phang*, Shikha Bordia*, and Samuel R. Bowman.
    Natural Language, Dialog and Speech (NDS) Symposium, The New York Academy of Sciences. 2019. (Extended abstract)
    [Paper] [Poster]

  • Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs.
    Alex Warstadt*, Yu Cao*, Ioana Grosu*, Wei Peng*, Hagen Blix*, Yining Nie*, Anna Alsop*, Shikha Bordia*, Haokun Liu*, Alicia Parrish*, Sheng-Fu Wang*, Jason Phang*, Anhad Mohananey*, Phu Mon Htut*, Paloma Jeretic* and Samuel R. Bowman.
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2019.

  • The Unbearable Weight of Generating Artificial Errors for Grammatical Error Correction.
    Phu Mon Htut, Joel Tetreault.
    The Workshop on Innovative Use of NLP for Building Educational Applications (BEA), ACL. 2019.

  • Inducing Constituency Trees through Neural Machine Translation.
    Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman.
    Preprint. 2019.

  • Generalized Inner Loop Meta-Learning.
    Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala.
    Preprint. 2019.

  • Grammar Induction with Neural Language Models: An Unusual Replication.
    Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman.
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018.
    Proceedings of the Workshop on the Analysis and Interpretation of Neural Networks for NLP (Blackbox-NLP). 2018. (Extended abstract)
    [Paper] [arXiv] [Code/Output-Parses]

  • Training a Ranking Function for Open-Domain Question Answering.
    Phu Mon Htut, Samuel R. Bowman, Kyunghyun Cho.
    Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL): Student Research Workshop. 2018.
    [Paper] [arXiv] [Poster]


  • DS-GA 1012: Natural Language Understanding and Computational Semantics (Spring-2020)
  • DS-GA 1011: Natural Language Processing with Representation Learning (Fall-2018)



  • You can refer to me as she/they.

  • I’m originally from Yangon, Myanmar (Burma).

  • You can call me “Phu” (the “h” is silent in both my first and last names). Fun fact: I don’t have a family name.