New Preprint: MuseRAG: Idea Originality Scoring At Scale

We’ve just released our new preprint: MuseRAG: Idea Originality Scoring At Scale.

Abstract: An objective, face-valid way to assess the originality of creative ideas is to measure how rare each idea is within a population, an approach long used in creativity research but difficult to automate at scale. Tabulating response frequencies via manual bucketing of idea rephrasings is labor-intensive, error-prone, and brittle under large corpora. We introduce a fully automated, psychometrically validated pipeline for frequency-based originality scoring. Our method, MuseRAG, combines large language models (LLMs) with an externally orchestrated retrieval-augmented generation (RAG) framework. Given a new idea, the system retrieves semantically similar prior idea buckets and zero-shot prompts the LLM to judge whether the new idea belongs to an existing bucket or forms a new one. The resulting buckets enable computation of frequency-based originality metrics. Across five datasets (N = 1143, n_ideas = 16294), MuseRAG matches human annotators in idea clustering structure and resolution (AMI = 0.59) and in participant-level scoring (r = 0.89), while exhibiting strong convergent and external validity. Our work enables intent-sensitive, human-aligned originality scoring at scale to aid creativity research.

Preprint: https://arxiv.org/pdf/2505.16232
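
For readers who want a feel for the retrieve-then-judge loop described in the abstract, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: token-overlap (Jaccard) similarity stands in for semantic retrieval, a simple threshold rule (stub_judge) stands in for the zero-shot LLM judgment, and the rarity score (one minus the bucket's share of all ideas) is just one possible frequency-based metric. All function names and parameters are hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercased whitespace tokens; a crude stand-in for a semantic embedding."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity in [0, 1] between two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(idea, buckets, k=3):
    """Return the indices of the k buckets whose label is most similar to the idea."""
    ranked = sorted(
        range(len(buckets)),
        key=lambda i: jaccard(idea, buckets[i]["label"]),
        reverse=True,
    )
    return ranked[:k]


def stub_judge(idea, candidates, threshold=0.5):
    """Hypothetical placeholder for the LLM call.

    Returns the index of the first candidate bucket judged to match,
    or None if the idea should start a new bucket.
    """
    for idx, label in candidates:
        if jaccard(idea, label) >= threshold:
            return idx
    return None


def assign_buckets(ideas, judge=stub_judge, k=3):
    """Process ideas in order, joining an existing bucket or opening a new one."""
    buckets = []      # each bucket: {"label": str, "count": int}
    assignment = []   # bucket index chosen for each idea
    for idea in ideas:
        candidate_ids = retrieve(idea, buckets, k)
        choice = judge(idea, [(i, buckets[i]["label"]) for i in candidate_ids])
        if choice is None:
            buckets.append({"label": idea, "count": 0})
            choice = len(buckets) - 1
        buckets[choice]["count"] += 1
        assignment.append(choice)
    return assignment, buckets


def originality(assignment):
    """Assumed rarity score: 1 minus the share of ideas in the same bucket."""
    counts = Counter(assignment)
    total = len(assignment)
    return [1 - counts[b] / total for b in assignment]


ideas = [
    "use a brick as a doorstop",
    "use a brick as a doorstop",
    "grind the brick into pigment",
]
assignment, buckets = assign_buckets(ideas)  # assignment -> [0, 0, 1]
scores = originality(assignment)             # the rarer third idea scores highest
```

In a real pipeline, retrieve would query an embedding index and the judge would be an LLM prompt; the surrounding loop and the frequency tabulation would keep roughly this shape.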
