Xiaoman Pan


潘小满
email: xiaomanpan [at] tencent [dot] com

He / him / his

My vita.
My publications.

I am currently a researcher at Tencent AI Lab, Bellevue, WA.

I completed my Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign, supervised by Prof. Heng Ji.

My research interests lie in the fields of Machine Learning and Natural Language Processing, with a recent emphasis on aligning large language models.


Tools

AMR Reader
Wikimedia dumps processors


Publications

[33] Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models.
Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen.
On ArXiv.
[ paper ]

[32] Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment.
Rui Yang*, Xiaoman Pan*, Feng Luo*, Shuang Qiu*, Han Zhong, Dong Yu, Jianshu Chen.
On ArXiv.
[ paper ]

[31] Chain-of-note: Enhancing robustness in retrieval-augmented language models.
Wenhao Yu, Hongming Zhang, Xiaoman Pan, Kaixin Ma, Hongwei Wang, Dong Yu.
On ArXiv.
[ paper ]

[30] Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention.
Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu.
On ArXiv.
[ paper ]

[29] From language modeling to instruction following: Understanding the behavior shift in LLMs after instruction tuning.
Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu.
On ArXiv.
[ paper ]

[28] Laser: LLM agent with state-space exploration for web navigation.
Kaixin Ma, Hongming Zhang, Hongwei Wang, Xiaoman Pan, Dong Yu.
On ArXiv.
[ paper ]

[27] Skills-in-context prompting: Unlocking compositionality in large language models.
Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen.
On ArXiv.
[ paper ]

[26] Mint: Boosting generalization in mathematical reasoning via multi-view fine-tuning.
Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng, Xiangliang Zhang, Dong Yu.
On ArXiv.
[ paper ]

[25] Thrust: Adaptively Propels Large Language Models with External Knowledge.
Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen.
Advances in Neural Information Processing Systems.
[ paper ]

[24] OpenFact: Factuality Enhanced Open Knowledge Extraction.
Linfeng Song, Ante Wang, Xiaoman Pan, Hongming Zhang, Dian Yu, Lifeng Jin, Haitao Mi, Jinsong Su, Yue Zhang, Dong Yu.
Transactions of the Association for Computational Linguistics.
[ paper ]

[23] PIVOINE: Instruction Tuning for Open-world Entity Profiling.
Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen.
In Findings of the Association for Computational Linguistics: EMNLP 2023.
[ paper ]

[22] How do Words Contribute to Sentence Semantics? Revisiting Sentence Embeddings with a Perturbation Method.
Wenlin Yao, Lifeng Jin, Hongming Zhang, Xiaoman Pan, Kaiqiang Song, Dian Yu, Dong Yu, Jianshu Chen.
Proc. The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023).
[ paper ]

[21] Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models.
Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen.
Proc. International Conference on Learning Representations (ICLR), 2023.
[ paper ]

[20] OASum: Large-Scale Open Domain Aspect-based Summarization.
Xianjun Yang, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Xiaoman Pan, Linda Petzold, Dong Yu.
In Findings of the Association for Computational Linguistics: ACL 2023.
[ paper | GitHub ]

[19] Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks.
Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, Heng Ji.
In Findings of the Association for Computational Linguistics: ACL 2023.
[ paper | GitHub ]

[18] ZeroKBC: A Comprehensive Benchmark for Zero-Shot Knowledge Base Completion.
Pei Chen, Wenlin Yao, Hongming Zhang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen.
Proc. The 22nd IEEE International Conference on Data Mining (ICDM).
[ paper | GitHub ]

[17] C-MORE: Pretraining to answer open-domain questions by consulting millions of references.
Xiang Yue, Xiaoman Pan, Wenlin Yao, Dian Yu, Dong Yu, Jianshu Chen.
Proc. The 60th Annual Meeting of the Association for Computational Linguistics (ACL2022).
[ paper | GitHub ]

[16] Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories.
Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu.
Proc. The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP2021).
[ paper | GitHub ]

[15] RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System.
Haoyang Wen, Ying Lin, Tuan M. Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi R. Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan W. Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang and Heng Ji.
Proc. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT2021) Demo Track.
[ paper | system ]

[14] GAIA: A Fine-grained Multimedia Knowledge Extraction System.
Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski and Marjorie Freedman.
Proc. The 58th Annual Meeting of the Association for Computational Linguistics (ACL2020) Demo Track (Best Demo Paper).
[ paper | system ]

[13] Improving Question Answering with External Knowledge.
Xiaoman Pan*, Kai Sun*, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie and Dong Yu.
Proc. EMNLP2019 Workshop on Machine Reading for Question Answering.
[ paper ]

[12] Cross-lingual Joint Entity and Word Embedding to Improve Entity Linking and Parallel Sentence Mining.
Xiaoman Pan, Thamme Gowda, Heng Ji, Jonathan May and Scott Miller.
Proc. EMNLP2019 Workshop on Deep Learning for Low-Resource Natural Language Processing.
[ paper ]

[11] Describing a Knowledge Base.
Qingyun Wang, Xiaoman Pan, Lifu Huang, Boliang Zhang, Zhiying Jiang, Heng Ji and Kevin Knight.
Proc. The 11th International Conference on Natural Language Generation.
[ paper ]

[10] ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System.
Boliang Zhang, Ying Lin, Xiaoman Pan, Di Lu, Jonathan May, Kevin Knight and Heng Ji.
Proc. The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT2018) Demo Track.
[ paper | demo ]

[9] Cross-lingual Name Tagging and Linking for 282 Languages.
Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight and Heng Ji.
Proc. The 55th Annual Meeting of the Association for Computational Linguistics (ACL2017).
[ paper | resources ]

[8] Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems.
Lifu Huang, Jonathan May, Xiaoman Pan, Heng Ji, Xiang Ren, Jiawei Han, Lin Zhao and James Hendler.
Big Data, Mar 2017, 5(1): 19-31.
[ paper ]

[7] Bitext Name Tagging for Cross-lingual Entity Annotation Projection.
Dongxu Zhang, Boliang Zhang, Xiaoman Pan, Xiaocheng Feng, Heng Ji, Weiran Xu.
Proc. The 26th International Conference on Computational Linguistics (COLING 2016).
[ paper ]

[6] The Gun Violence Database: A new task and data set for NLP.
Ellie Pavlick, Heng Ji, Xiaoman Pan, Chris Callison-Burch.
Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP 2016).
[ paper ]

[5] Leveraging Entity Linking and Related Language Projection to Improve Name Transliteration.
Ying Lin, Xiaoman Pan, Aliya Deri, Heng Ji, Kevin Knight.
Proc. ACL2016 Workshop on Named Entities.
[ paper | system ]

[4] A Multi-media Approach to Cross-lingual Entity Knowledge Transfer.
Di Lu, Xiaoman Pan, Nima Pourdamghani, Shih-Fu Chang, Heng Ji, Kevin Knight.
Proc. The 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
[ paper ]

[3] CAMR at SemEval-2016 Task 8: An Extended Transition-based AMR Parser.
Chuan Wang, Sameer S Pradhan, Xiaoman Pan, Heng Ji, Nianwen Xue.
Proc. NAACL-HLT 2016 Workshop on Semantic Evaluation (SemEval-2016).
[ paper | GitHub ]

[2] Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning.
Boliang Zhang, Xiaoman Pan, Tianlu Wang, Ashish Vaswani, Heng Ji, Kevin Knight, Daniel Marcu.
Proc. The 2016 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL-HLT 2016).
[ paper ]

[1] Unsupervised Entity Linking with Abstract Meaning Representation.
Xiaoman Pan, Taylor Cassidy, Ulf Hermjakob, Heng Ji, Kevin Knight.
Proc. The 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015).
[ paper | demo | GitHub ]

System Descriptions
A Baseline Fine-Grained Entity Extraction System for TAC-KBP2019.
Ying Lin, Xiaoman Pan, Manling Li and Heng Ji.
Proc. Text Analysis Conference (TAC2019).
[ paper ]

GAIA at SM-KBP 2019 - A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System.
Manling Li, Ying Lin, Ananya Subburathinam, Spencer Whitehead, Xiaoman Pan, Di Lu, Qingyun Wang, Tongtao Zhang, Lifu Huang, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Bo Wu, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Jennifer Chen, Eric Berquist, Kexuan Sun, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Pedro Szekely, T.K. Satish Kumar, Arka Sadhu, Ram Nevatia, Miguel Rodriguez, Yifan Wang, Yang Bai, Ali Sadeghian, Daisy Zhe Wang.
Proc. Text Analysis Conference (TAC2019).
[ paper ]

ELISA System Description for LoReHLT 2019.
Ying Lin, Xiaoman Pan, Di Lu, Lifu Huang, Tongtao Zhang, Heng Ji, et al.
Proc. LoReHLT2019.
[ paper ]

GAIA - A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System.
Tongtao Zhang, Ananya Subburathinam, Ge Shi, Lifu Huang, Di Lu, Xiaoman Pan, Manling Li, Boliang Zhang, Qingyun Wang, Spencer Whitehead, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Ruiqi Zhong, Steven Shao, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Dongyu Li, Xin Huang, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Mayank Kejriwal, Ram Nevatia, Pedro Szekely, Ali Sadeghian and Daisy Zhe Wang.
Proc. Text Analysis Conference (TAC2018).
[ paper ]

Overview of TAC-KBP2017 13 Languages Entity Discovery and Linking.
Heng Ji, Xiaoman Pan, Boliang Zhang, Joel Nothman, James Mayfield, Paul McNamee and Cash Costello.
Proc. Text Analysis Conference (TAC2017).
[ paper ]

TinkerBell: Cross-lingual Cold-Start Knowledge Base Construction.
Mohamed Al-Badrashiny, Jason Bolton, Arun Tejasvi Chaganty, Kevin Clark, Craig Harman, Lifu Huang, Matthew Lamm, Jinhao Lei, Di Lu, Xiaoman Pan, Ashwin Paranjape, Ellie Pavlick, Haoruo Peng, Peng Qi, Pushpendre Rastogi, Abigail See, Kai Sun, Max Thomas, Chen-Tse Tsai, Hao Wu, Boliang Zhang, Chris Callison-Burch, Claire Cardie, Heng Ji, Christopher Manning, Smaranda Muresan, Owen C. Rambow, Dan Roth, Mark Sammons, Benjamin Van Durme.
Proc. Text Analysis Conference (TAC2017).
[ paper ]

RPI BLENDER TAC-KBP2017 13 Languages EDL System.
Boliang Zhang, Xiaoman Pan, Ying Lin, Tongtao Zhang, Kevin Blissett, Samia Kazemi, Spencer Whitehead, Lifu Huang and Heng Ji.
Proc. Text Analysis Conference (TAC2017).
[ paper ]

ELISA System Description for LoReHLT 2017.
Leon Cheung, Thamme Gowda, Ulf Hermjakob, Nelson Liu, Jonathan May, Alexandra Mayn, Nima Pourdamghani, Michael Pust, Kevin Knight, Nikolaos Malandrakis, Pavlos Papadopoulos, Anil Ramakrishna, Karan Singla, Victor Martinez, Colin Vaz, Dogan Can, Shrikanth Narayanan, Kenton Murray, Toan Nguyen, David Chiang, Xiaoman Pan, Boliang Zhang, Ying Lin, Di Lu, Lifu Huang, Kevin Blissett, Tongtao Zhang, Heng Ji, Ondrej Glembek, Murali Karthick Baskar, Santosh Kesiraju, Lukas Burget, Karel Benes, Igor Szoke, Karel Vesely, Jan Honza Cernocky, Camille Goudeseune, Mark Hasegawa-Johnson, Leda Sari, Wenda Chen and Angli Liu.
NIST LoReHLT 2017 Workshop.
[ paper ]

Team ELISA System for DARPA LORELEI Speech Evaluation 2016.
Pavlos Papadopoulos, Ruchir Travadi, Colin Vaz, Nikolaos Malandrakis, Ulf Hermjakob, Nima Pourdamghani, Michael Pust, Boliang Zhang, Xiaoman Pan, Di Lu, Ying Lin, Ondrej Glembek, Murali Karthick B, Martin Karafiat, Lukas Burget, Mark Hasegawa-Johnson, Heng Ji, Jonathan May, Kevin Knight and Shrikanth Narayanan.
Proc. Interspeech2017.
[ paper ]

ELISA System Description for LoReHLT 2016.
Ulf Hermjakob, Qiang Li, Jonathan May, Sebastian Mielke, Nima Pourdamghani, Michael Pust, Xing Shi, Kevin Knight, Daniel Marcu, Nikolaos Malandrakis, Anil Ramakrishna, Victor Martinez, Elisabeth Staruk, Tanner Sorensen, Dogan Can, Shrikanth Narayanan, Tomer Levinboim, Kenton Murray, David Chiang, Boliang Zhang, Xiaoman Pan, Di Lu, Lifu Huang, Xiaocheng Feng, Heng Ji.
NIST LoReHLT 2016 Workshop.
[ paper ]

RPI BLENDER TAC-KBP2016 System Description.
Dian Yu, Xiaoman Pan, Boliang Zhang, Lifu Huang, Di Lu, Spencer Whitehead, Heng Ji.
Proc. Text Analysis Conference (TAC 2016).
[ paper ]

RPI BLENDER TAC-KBP2015 System Description.
Yu Hong, Di Lu, Dian Yu, Xiaoman Pan, Xiaobin Wang, Yadong Chen, Heng Ji.
Proc. Text Analysis Conference (TAC 2015).
[ paper ]

RPI-Soochow KBP2014 System Description.
Yu Hong, Xiaobin Wang, Yadong Chen, Jian Wang, Tongtao Zhang, Jin Zheng, Dian Yu, Qi Li, Han Wang, Xiaoman Pan, Heng Ji.
Proc. Text Analysis Conference (TAC 2014).
[ paper ]