D2KLab/StoryScore

StoryScore

StoryScore is a Python library for evaluating generated narrative summaries of scientific papers.

It computes a weighted score based on:

  • BERTScore similarity
  • Lexical recall
  • Title alignment
  • Repetition penalty
  • Hallucination penalty (PERSON/ORG NER overlap)
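
As a rough illustration of the last item, the sketch below scores entity overlap between a narrative and its source paper. It is an assumption about how such an overlap could be computed, not StoryScore's actual implementation: the function name `entity_overlap_score` is hypothetical, and the entity lists are passed in directly rather than extracted with spaCy.

```python
def entity_overlap_score(narrative_entities, source_entities):
    """Return the fraction of narrative entities that also occur in the source.

    Hypothetical helper: 1.0 means every PERSON/ORG name in the narrative
    is grounded in the paper, 0.0 means none of them are.
    """
    # Normalise case and whitespace so "EURECOM " and "eurecom" match.
    narrative = {e.strip().lower() for e in narrative_entities if e.strip()}
    source = {e.strip().lower() for e in source_entities if e.strip()}

    # A narrative that names nobody cannot have invented anyone.
    if not narrative:
        return 1.0

    return len(narrative & source) / len(narrative)


# One of the two names in the narrative is absent from the source.
score = entity_overlap_score(["Alice Smith", "Acme Labs"], ["alice smith"])
print(score)
```

In practice the two entity lists would come from an NER pass restricted to PERSON and ORG labels.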

Installation

pip install storyscore

For local development:

pip install -e .
python -m spacy download en_core_web_sm

Quick Start

from storyscore import compute_story_score

payload = {
    "outline": [{"title": "Introduction"}],
    "sections": [{"title": "Introduction", "narrative": "This paper studies ..."}],
    "persona": "researcher",
    "paper_title": "Example Paper",
    "paper_markdown": "The paper studies ..."
}

result = compute_story_score(payload)
print(result["storyscore"])

Use A Subset Of Metrics

You can score with only selected metrics by passing enabled_metrics. Weights are automatically renormalized over the selected subset, so the final storyscore stays in [0, 1].

from storyscore import compute_story_score

payload = {
    "outline": [{"title": "Introduction"}],
    "sections": [{"title": "Introduction", "narrative": "This paper studies ..."}],
    "paper_markdown": "The paper studies ..."
}

result = compute_story_score(payload, enabled_metrics=["bertscore", "noloop"])
print(result["weights"])  # {'bertscore': 0.8, 'noloop': 0.2}
print(result["storyscore"])
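
The renormalization step can be pictured with the following sketch. The default weights are hypothetical, chosen only so that the `bertscore`/`noloop` subset reproduces the 0.8/0.2 split printed above; the key `title_alignment` and the helpers `renormalize` and `combine` are assumptions for illustration and are not part of the documented API.

```python
# Hypothetical default weights; the real values may differ.
HYPOTHETICAL_WEIGHTS = {
    "bertscore": 0.4,
    "lexical_recall": 0.2,
    "title_alignment": 0.2,  # assumed key name
    "noloop": 0.1,
    "nohallucination": 0.1,
}


def renormalize(weights, enabled):
    """Keep only the enabled metrics and rescale their weights to sum to 1."""
    selected = {name: weights[name] for name in enabled}
    total = sum(selected.values())
    if total <= 0:
        raise ValueError("enabled metrics must have a positive total weight")
    return {name: w / total for name, w in selected.items()}


def combine(scores, weights):
    """Weighted sum of per-metric scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


subset = renormalize(HYPOTHETICAL_WEIGHTS, ["bertscore", "noloop"])
print(subset)
print(combine({"bertscore": 0.9, "noloop": 1.0}, subset))
```

Because the rescaled weights sum to 1 and each metric is assumed to lie in [0, 1], the combined value cannot leave that interval.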

For CLI usage, include enabled_metrics in the input JSON payload.

You can also import individual metrics directly:

from storyscore import nohallucination, noloop, lexical_recall

CLI Usage

You can also run StoryScore as a CLI by piping a JSON payload to stdin:

cat payload.json | storyscore
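
The same pipe can be driven from Python. The sketch below builds a payload that includes `enabled_metrics` and sends it to the `storyscore` command on stdin. The helper names `build_payload` and `run_storyscore_cli` are hypothetical, and since the CLI's output format is not described here, the raw stdout is returned without parsing.

```python
import json
import subprocess


def build_payload(narrative, paper_markdown, enabled_metrics=None):
    """Assemble a minimal single-section payload (hypothetical helper)."""
    payload = {
        "outline": [{"title": "Introduction"}],
        "sections": [{"title": "Introduction", "narrative": narrative}],
        "paper_markdown": paper_markdown,
    }
    # For CLI usage, the metric subset travels inside the JSON payload.
    if enabled_metrics is not None:
        payload["enabled_metrics"] = list(enabled_metrics)
    return payload


def run_storyscore_cli(payload):
    """Pipe the payload to the storyscore CLI and return its raw stdout."""
    completed = subprocess.run(
        ["storyscore"],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


payload = build_payload(
    "This paper studies ...",
    "The paper studies ...",
    enabled_metrics=["bertscore", "noloop"],
)
print(json.dumps(payload, indent=2))
```

Calling `run_storyscore_cli(payload)` requires the package to be installed so that the `storyscore` command is on the PATH.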

Build And Publish (PyPI)

python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*
python -m twine upload dist/*

Notes

  • The spaCy model en_core_web_sm is required for hallucination scoring.
  • If the model is missing, install it with:
python -m spacy download en_core_web_sm
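
To detect a missing model before scoring, one option is to check whether the model package is importable. This is a minimal sketch that relies on spaCy models being installed as regular Python packages; the helper name `spacy_model_installed` is hypothetical.

```python
import importlib.util


def spacy_model_installed(name="en_core_web_sm"):
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(name) is not None


if not spacy_model_installed():
    print("Missing model, run: python -m spacy download en_core_web_sm")
```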

Credits

The main contributor of StoryScore is Alex Argese.

If you use this library, please cite the following paper:

Alex Argese, Pasquale Lisena, and Raphael Troncy. Hallucination or creativity: How to evaluate AI-generated scientific stories? In: Proceedings of the Text2Story’26 Workshop, CEUR-WS, March 29th, 2026, Delft, The Netherlands.
