CaroHolt/GeoTemp

Around the World in 24 Hours: Probing LLM Knowledge of Time and Place

arXiv preprint

Reasoning over time and space is essential for understanding our world. However, the abilities of language models in this area are largely unexplored, as previous work has tested logical reasoning over time and space either in isolation or only in simple or artificial environments. In this paper, we present the first evaluation of the ability of language models to jointly reason over time and space. To enable our analysis, we create GEOTEMP, a dataset of 320k prompts covering 289 cities in 217 countries and 37 time zones. Using GEOTEMP, we evaluate eight open chat models from three different model families on different combinations of temporal and geographic knowledge. We find that most models perform well on reasoning tasks involving only temporal knowledge and that overall performance improves with scale. However, performance remains constrained in tasks that require connecting temporal and geographical information. We do not find clear correlations of performance with specific geographic regions. Instead, we find a significant performance increase for location names with low model perplexity, suggesting their repeated occurrence during model training. We further demonstrate that model performance is heavily influenced by prompt formulation: a direct injection of geographical knowledge leads to performance gains, whereas, surprisingly, techniques like chain-of-thought prompting decrease performance on simpler tasks.

This is joint work by Carolin Holtermann, Paul Röttger, and Anne Lauscher. For further details, see our paper.

Getting Started

We conducted all our experiments with Python 3.10. Before getting started, make sure you install the requirements listed in the requirements.txt file.

pip install -r requirements.txt
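
As an optional sanity check after installation, the following minimal sketch compares the running interpreter against Python 3.10 and reports any packages from requirements.txt that are not installed. It is not part of the repository; the helper names (requirement_names, missing_packages, main) are hypothetical, and the parsing only handles simple requirement lines.

```python
import sys
from importlib import metadata
from pathlib import Path

# Python version stated in this README; everything else here is illustrative.
EXPECTED_PYTHON = (3, 10)


def requirement_names(text: str) -> list[str]:
    """Extract bare package names from the text of a requirements file."""
    names = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # Cut the line at the first version specifier, extra, or marker.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";", " ", "@"):
            line = line.split(sep, 1)[0]
        names.append(line.strip())
    return names


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def main(path: str = "requirements.txt") -> None:
    if sys.version_info[:2] != EXPECTED_PYTHON:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Note: running Python {found}; the experiments used Python 3.10.")
    req_file = Path(path)
    if not req_file.exists():
        print(f"{path} not found; run this from the repository root.")
        return
    missing = missing_packages(requirement_names(req_file.read_text()))
    print("Missing packages:", ", ".join(missing) if missing else "none")


if __name__ == "__main__":
    main()
```

Run the script from the repository root so that it can find requirements.txt. It only checks that each package is present, not that the installed version satisfies the pinned specifier.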

Repository Description

This repository contains all code and data needed to reproduce the experiments and results reported in our paper. All data files can be found in the data folder and all relevant code files in the code folder; each folder has its own readme file. Because the dataset is too large to publish on GitHub directly, we provide the files via LFS storage.

Contact Information

If you have any questions, feel free to reach out: carolin.holtermannl@uni-hamburg.de


License

This project is licensed under the CC-BY-4.0 License; see the LICENSE.md file for details.
