Resume
I am a final-year Ph.D. candidate supported by the Alibaba Talent Program, jointly advised by Prof. Soujanya Poria of the Singapore University of Technology and Design and Dr. Lidong Bing of Alibaba DAMO Academy. My goal is to build artificially intelligent systems that can process and reason over diverse sources and modalities of information to aid humans in practical tasks. Concretely, I am excited to work on applications involving large language models, reasoning, multimodality, retrieval-augmented generation, and information extraction. My research in these areas has accumulated more than 900 citations.
Education
Singapore University of Technology and Design
- Ph.D. in Information Systems Technology and Design, Jan 2025 (expected)
- B.Sc. in Information Systems Technology and Design, Sep 2020
Publications
- Yew Ken Chia, Guizhen Chen, Weiwen Xu, Luu Anh Tuan, Soujanya Poria, Lidong Bing, “Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths”, In EMNLP Findings 2024.
- Yew Ken Chia, Hui Chen, Wei Han, Guizhen Chen, Sharifah Mahani Aljunied, Soujanya Poria, Lidong Bing, “Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction”, In EMNLP 2024 SiCon Workshop.
- Yew Ken Chia, Vernon Toh Yan Han, Deepanway Ghosal, Lidong Bing, Soujanya Poria, “PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns”, In ACL Findings 2024.
- Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Zhiqiang Hu, Chenhui Shen, Yew Ken Chia, Xingxuan Li, Jianyu Wang, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing, “SeaLLMs - Large Language Models for Southeast Asia”, In ACL 2024.
- Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria, “InstructEval: Towards Holistic Evaluation of Instruction-Tuned Large Language Models”, In EACL 2024 Scale-LLM Workshop (Best Paper Award; 525 stars on GitHub).
- Xingxuan Li, Ruochen Zhao, Yew Ken Chia*, Bosheng Ding, Shafiq Joty, Soujanya Poria, Lidong Bing, “Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources”, In ICLR 2024.
- Wenxuan Zhang, Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing, “M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models”, In NeurIPS 2023 Datasets and Benchmarks.
- Bosheng Ding, Chengwei Qin, Linlin Liu, Yew Ken Chia, Boyang Li, Shafiq Joty, Lidong Bing, “Is GPT-3 a Good Data Annotator?”, In ACL 2023.
- Yew Ken Chia, Lidong Bing, Sharifah Mahani Aljunied, Luo Si, Soujanya Poria, “A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach”, In EMNLP 2022.
- Yew Ken Chia, Lidong Bing, Soujanya Poria, Luo Si, “RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction”, In ACL Findings 2022.
- Lu Xu, Yew Ken Chia, Lidong Bing, “Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction”, In ACL 2021.
- Yew Ken Chia, Sam Witteveen, Martin Andrews, “Transformer to CNN: Label-scarce distillation for efficient text classification”, In NeurIPS 2018 CDNNRIA Workshop.
Preprints
- Yew Ken Chia, Liying Cheng, Hou Pong Chan, Chaoqun Liu, Maojia Song, Mahani Aljunied, Soujanya Poria, Lidong Bing, “M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework”, arXiv preprint, 2024.
- Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Deli Zhao, Lidong Bing, “Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions”, arXiv preprint, 2024.
- Yew Ken Chia, Qi Sun, Lidong Bing, Soujanya Poria, “Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models”, arXiv preprint, 2024.
Work experience
- Jan 2021 - Jan 2025: NLP Researcher at Alibaba DAMO Academy
- SeaLLMs: Constructed reasoning-based instruction data and a contrastive training pipeline for Southeast Asian large language models, improving commonsense-reasoning performance by up to 11.6% over the base models and surpassing GPT-3.5-Turbo on multiple tasks (a sketch of such a contrastive objective follows this entry).
- M3Exam: Conducted multimodal evaluation for Southeast Asian languages. The benchmark was featured in the OpenAI GPT-4o release for multilingual and multimodal capabilities.
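As context for the SeaLLMs entry above, here is a minimal sketch of a pairwise contrastive objective over preferred and rejected model responses. The function name, margin loss, and hyperparameters are illustrative assumptions for exposition, not the published training pipeline.

```python
# Minimal sketch (assumed, not the published SeaLLMs pipeline): a margin-based
# contrastive objective pushing the model to score a preferred
# (reasoning-consistent) response above a rejected one.
import torch
import torch.nn.functional as F

def contrastive_instruction_loss(logp_chosen: torch.Tensor,
                                 logp_rejected: torch.Tensor,
                                 margin: float = 1.0) -> torch.Tensor:
    # logp_* are per-example sequence log-probabilities under the model.
    return F.relu(margin - (logp_chosen - logp_rejected)).mean()

# Toy usage: random scores stand in for real model log-probabilities.
logp_chosen = torch.randn(8, requires_grad=True)
logp_rejected = torch.randn(8)
contrastive_instruction_loss(logp_chosen, logp_rejected).backward()
```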
- Oct 2018 - Dec 2020: Deep Learning Researcher at Red Dragon AI
- Label-scarce Distillation for Efficient Text Classification: Developed a novel framework for low-resource NLP via transfer learning and model distillation, achieving a 0.9% accuracy improvement over the state-of-the-art baseline with a 300x inference speedup and a 39x smaller model (see the distillation sketch after this entry).
- Language Model Assisted Explanation Regeneration: Developed a novel retrieval framework for efficient multi-hop ranking with language models, achieving a 7.5% improvement over previous baselines.
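The distillation result above follows the standard teacher-student recipe; below is a minimal Hinton-style distillation loss in PyTorch. The temperature, mixing weight, and tensor shapes are illustrative assumptions, not the paper's exact settings.

```python
# Minimal sketch of teacher-student knowledge distillation (assumed Hinton-style
# recipe; temperature and mixing weight are illustrative, not the paper's values).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence to the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a 4-example batch with 3 classes.
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
distillation_loss(student_logits, teacher_logits, labels).backward()
```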
- May 2018 - Sep 2018: Machine Learning Intern at Handshakes
- Implemented language models based on recurrent and convolutional architectures for sentiment analysis of customer reviews, achieving a 2.2% accuracy improvement over the previous baseline (a sketch of such a classifier follows).
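For the sentiment-analysis work above, here is a minimal sketch of a convolutional text classifier in PyTorch. The vocabulary size, embedding width, and filter configuration are illustrative assumptions, not the system actually deployed.

```python
# Minimal sketch (illustrative hyperparameters) of a CNN sentiment classifier.
import torch
import torch.nn as nn

class CNNSentimentClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128,
                 num_filters=100, kernel_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # Max-pool each convolutional feature map over the time dimension.
        pooled = [conv(x).relu().max(dim=-1).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=-1))      # (batch, num_classes)

# Toy usage: two padded sequences of 50 token ids.
logits = CNNSentimentClassifier()(torch.randint(0, 10_000, (2, 50)))
```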
Skills
- Programming Languages: Python, LaTeX, HTML, JavaScript
- Frameworks: PyTorch, TensorFlow
- Communication: English, Chinese
Awards
- Alibaba Talent Program Ph.D. Scholarship, 2021 - 2025
- 2nd/100, IMDA Code XtremeApps Hackathon, 2019
- 3rd/5000, EY NextWave Data Science Challenge, 2019
- Top 10/200, Google Fake News Must Die Hackathon, 2017