A Theoretical & Empirical Analysis of Transformer Language Model Behavior

Roberts, Jesse Taylor Noah

dc.creator	Roberts, Jesse Taylor Noah
dc.date.accessioned	2024-08-15T18:19:49Z
dc.date.available	2024-08-15T18:19:49Z
dc.date.created	2024-08
dc.date.issued	2024-05-16
dc.date.submitted	August 2024
dc.identifier.uri	http://hdl.handle.net/1803/19164
dc.description.abstract	This dissertation presents empirical and theoretical work aimed at improving the understanding of transformer-based Large Language Model (LLM) behaviors, with the empirical behaviors compared to established human behaviors. The dissertation introduces PopulationLM, a method that applies systematic perturbations to generate model populations, making it possible to characterize robust LLM cognitive behaviors. Using PopulationLM, the study replicates experiments on typicality and structural priming, demonstrating typicality effects in LLMs and the absence of structural priming in the tested models. The dissertation also examines human-like strategic behaviors in LLMs, identifying models capable of value-based preference (VBP) and analyzing their responses in scenarios such as the prisoner's dilemma (PD) and the traveler's dilemma (TD). Findings reveal that robust, VBP-capable LLMs may not exhibit certainty towards weakly dominated strategies, and that they align with human sensitivities to stake size (PD) and penalty size (TD). Moreover, the dissertation advocates for reproducible research, cautioning against reliance on closed-source models, which, like important but privately held fossils, lack long-term reproducibility. The theoretical contributions assert the Turing completeness of decoder-only transformers while also identifying limitations on certain tasks, prompting exploration of alternative architectures for advancing artificial general intelligence.
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Decoder-only language models
dc.subject	LLM
dc.subject	PopulationLM
dc.subject	Cognitive Science
dc.subject	Machine Learning
dc.subject	Artificial Intelligence
dc.subject	Transformer
dc.subject	Turing Complete
dc.subject	Language Model Behavior
dc.subject	Language Model
dc.subject	Game Theory
dc.subject	Prisoner's Dilemma
dc.subject	Traveler's Dilemma
dc.title	A Theoretical & Empirical Analysis of Transformer Language Model Behavior
dc.type	Thesis
dc.date.updated	2024-08-15T18:19:49Z
dc.type.material	text
thesis.degree.name	PhD
thesis.degree.level	Doctoral
thesis.degree.discipline	Computer Science
thesis.degree.grantor	Vanderbilt University Graduate School
dc.creator.orcid	0000-0002-6210-0678
dc.contributor.committeeChair	Fisher, Doug

Files in this item

Name: ROBERTS-DISSERTATION-2024.pdf
Size: 1.820Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses and Dissertations
Electronic theses and dissertations of masters and doctoral students submitted to the Graduate School.
