StoryScore

StoryScore is a Python library for evaluating generated narrative summaries of scientific papers.

It computes a weighted score based on:

BERTScore similarity
Lexical recall
Title alignment
Repetition penalty
Hallucination penalty (PERSON/ORG NER overlap)

Installation

pip install storyscore

For local development:

pip install -e .
python -m spacy download en_core_web_sm

Quick Start

from storyscore import compute_story_score

payload = {
    "outline": [{"title": "Introduction"}],
    "sections": [{"title": "Introduction", "narrative": "This paper studies ..."}],
    "persona": "researcher",
    "paper_title": "Example Paper",
    "paper_markdown": "The paper studies ..."
}

result = compute_story_score(payload)
print(result["storyscore"])

Use A Subset Of Metrics

You can score with only selected metrics by passing enabled_metrics. Weights are automatically renormalized over the selected subset, so the final storyscore stays in [0, 1].

from storyscore import compute_story_score

payload = {
    "outline": [{"title": "Introduction"}],
    "sections": [{"title": "Introduction", "narrative": "This paper studies ..."}],
    "paper_markdown": "The paper studies ..."
}

result = compute_story_score(payload, enabled_metrics=["bertscore", "noloop"])
print(result["weights"])  # {'bertscore': 0.8, 'noloop': 0.2}
print(result["storyscore"])
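
The renormalization described above can be sketched in a few lines. This is an illustration only, not the library's actual implementation: the default weights below are hypothetical values chosen so that the bertscore/noloop subset reproduces the 0.8 / 0.2 split shown in the example, and the name title_alignment and the helpers renormalize and weighted_score are assumptions made for this sketch.

```python
# Hypothetical default weights (assumed for illustration; the real
# defaults used by StoryScore are not documented in this README).
DEFAULT_WEIGHTS = {
    "bertscore": 0.4,
    "lexical_recall": 0.2,
    "title_alignment": 0.2,  # assumed metric name
    "noloop": 0.1,
    "nohallucination": 0.1,
}


def renormalize(weights, enabled_metrics):
    """Keep only the enabled metrics and rescale their weights to sum to 1."""
    unknown = [m for m in enabled_metrics if m not in weights]
    if unknown:
        raise KeyError(f"unknown metrics: {unknown}")
    total = sum(weights[m] for m in enabled_metrics)
    if total <= 0:
        raise ValueError("enabled metrics must have a positive total weight")
    return {m: weights[m] / total for m in enabled_metrics}


def weighted_score(scores, weights):
    """Weighted sum of per-metric scores; stays in [0, 1] if every score does."""
    return sum(weights[m] * scores[m] for m in weights)


subset = renormalize(DEFAULT_WEIGHTS, ["bertscore", "noloop"])
example = weighted_score({"bertscore": 0.9, "noloop": 1.0}, subset)
```

Because the rescaled weights sum to 1 and each metric score is assumed to lie in [0, 1], the combined value cannot leave that interval, whichever subset is selected.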

For CLI usage, include enabled_metrics in the input JSON payload.

You can also import individual metrics directly:

from storyscore import nohallucination, noloop, lexical_recall

CLI Usage

You can also run StoryScore as a CLI by piping a JSON payload to stdin:

cat payload.json | storyscore
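
A payload file for the command above could be produced with a short script such as the following. It is a sketch that reuses the payload fields from the earlier examples; the helper name write_payload is an assumption, and the format of the CLI output is not documented here, so the script only prepares the input.

```python
import json
from pathlib import Path


def write_payload(path):
    """Write an example StoryScore payload as JSON and return the dict."""
    payload = {
        "outline": [{"title": "Introduction"}],
        "sections": [
            {"title": "Introduction", "narrative": "This paper studies ..."}
        ],
        "paper_title": "Example Paper",
        "paper_markdown": "The paper studies ...",
        # Optional: restrict scoring to a subset of metrics.
        "enabled_metrics": ["bertscore", "noloop"],
    }
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload


if __name__ == "__main__":
    write_payload("payload.json")
```

After running the script, the resulting payload.json can be piped to storyscore as shown above.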

Build And Publish (PyPI)

python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*
python -m twine upload dist/*

Notes

The spaCy model en_core_web_sm is required for hallucination scoring.
If the model is missing, install it with:

python -m spacy download en_core_web_sm
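
To check for the model before scoring, one option is to test whether its package is importable, since spaCy models installed through the download command are regular Python packages. The helper name spacy_model_installed is an assumption made for this sketch; it uses only the standard library and does not load the model.

```python
import importlib.util


def spacy_model_installed(name="en_core_web_sm"):
    """Return True if a package with this name is importable here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if not spacy_model_installed():
        print("Missing model; run: python -m spacy download en_core_web_sm")
```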

Credits

The main contributor to StoryScore is Alex Argese.

If you use this library, please cite the following paper:

Alex Argese, Pasquale Lisena, and Raphael Troncy. Hallucination or creativity: How to evaluate AI-generated scientific stories? In: Proceedings of the Text2Story’26 Workshop, CEUR-WS, March 29th, 2026, Delft, The Netherlands.

A BibTeX entry is provided in storyscore2026argese.bib in this repository.
