blnewman at cs dot washington dot edu
Github | Google Scholar | Blog
Hello! I'm Ben Newman, currently a PhD student at the Allen School at the University of Washington, advised by Yejin Choi. I'm interested in building computational systems that reliably process human language and studying the roles emerging language technologies can play in society more broadly.
I've worked on projects analyzing models' abilities to extrapolate, process syntax, and communicate. I've also thought about how language technologies can usefully augment human language abilities, both for scientific discovery (e.g. testing psycholinguistic hypotheses) and in political science (e.g. countering misinformation). I've been a course assistant for two of Stanford's NLP classes, CS124 and CS224N, and have co-taught courses in Introductory Linguistics and Computing Fundamentals to high schoolers at Stanford Splash. I was previously a Pre-doctoral Young Investigator at Semantic Scholar Research and a member of the Stanford NLP group.
A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents
Benjamin Newman, Luca Soldaini, Raymond Fok, Arman Cohan, Kyle Lo
EMNLP 2023
[pdf][tldr]
We propose a question-answering framework for decontextualizing snippets from scientific documents and investigate the performance of state-of-the-art LLMs under this framework.
Conducted during a Pre-doctoral Young Investigator position at Semantic Scholar
Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication
Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
CHI 2023
[pdf][tldr]
LLMs best help legislative staff members as email writing assistants when they provide message-level suggestions rather than sentence-level ones.
P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts
Benjamin Newman, Prafulla Kumar Choubey, Nazneen Rajani
ICLR 2022 (Poster)
[pdf][tldr]
Querying pretrained NLP models with semantically equivalent prompts (e.g. "[MASK] is Canada's capital" and "Canada, which has capital city [MASK]") does not necessarily lead to the same predictions. Can low-parameter fine-tuning methods increase model consistency? Yes! And they don't need as many annotations as other methods.
Conducted during an internship at Salesforce Research
Refining Targeted Syntactic Evaluation
Benjamin Newman, Kai-Siang Ang, Julia Gong, John Hewitt
NAACL 2021
[pdf] [code] [cite] [blog] [tldr]
How should we evaluate the syntactic understanding of NLP systems? We argue for evaluating models' likely behavior and systematicity, and build off of minimal pair evaluation to address these goals. We find models are better at conjugating verbs they deem likely.
The EOS Decision and Length Extrapolation
Benjamin Newman, John Hewitt, Percy Liang, and Christopher Manning
Blackbox NLP@EMNLP 2020 (Outstanding Paper)
Why do sequence models struggle to extrapolate? For many reasons, but one is the decision to train models with end-of-sequence (EOS) tokens at the end of training examples. We investigate and visualize the effect this decision has on neural models' extrapolative abilities.
Communication-based Evaluation for Natural Language Generation
Benjamin Newman, Reuben Cohn-Gordon, and Christopher Potts
Society for Computation in Linguistics@LSA 2020
Do n-gram overlap metrics like BLEU capture whether models are successful communicators? Not really, so we introduce a new method for evaluating communicative effectiveness based on the Rational Speech Acts framework.
Conducted during CS224U and a summer internship at the Center for the Study of Language and Information (CSLI).
Holistic Evaluation of Language Models
Liang et al., 2022
Disinformation Evaluation: Led the design, implementation, and analysis of the disinformation scenarios and metrics, including the human evaluation of disinformation generations.
[pdf]
On the Opportunities and Risks of Foundation Models
Bommasani et al., 2021
Misuse Section: Antoine Bosselut*, Shelby Grossman*, and Benjamin Newman [pdf]
Unsupervised Recovery of Tree Metrics from Learned Representations
Representations from pretrained language models likely incorporate syntax, but can we access it without training supervised probes? [pdf]
CS229: Machine Learning. Final Project (2019).
English-Chinese Name Machine Transliteration Using Search and Neural Networks
What's your name in Chinese? Translating a name is different from translating an article because name translations are based on phonetics and lack large corpora. We explore two approaches. [pdf] [code]
CS221: Artificial Intelligence: Principles and Techniques: Final Project with Julia Gong (2018).
Using POMDPs to Learn Language in a Spatial Reference Game
How can you teach computational agents to follow directions without defining what each instruction means? POMDPs! [pdf] [code]
CS238: Decision Making under Uncertainty: Final Project with Suvir Mirchandani and Levi Lian (2018).
Swear Analysis
What can we learn about people's use of swear words by looking at their word2vec and GloVe embeddings? [pdf]
Linguist 130A: Semantics and Pragmatics: Final Project with Julia Gong (2018).
Zero Width Space Encrypter
Hiding secret messages in HTML zero-width space characters. Demo here!