[Synced Transcripts] Add timing fields to transcript entries #5321

Open: sztomek wants to merge 2 commits into main from feat/synced-trans-model-parser-timing

sztomek (Contributor) commented May 21, 2026

Description

This PR is part of the Radical Speed Month initiative.
It lays the groundwork for synced transcripts by threading timing data through the transcript pipeline. TranscriptEntry.Text gains startTimeMs/endTimeMs fields (defaulting to -1L so all existing callers are unaffected). All parsers (WebVTT, SRT, JSON) now extract timing from their source formats, and the joinSplitSentences() sanitizer accumulates timing spans when merging fragments.
Also adds the SYNCED_TRANSCRIPTS feature flag (Free tier, debug/prototype default, Firebase remote + dev toggle).
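To illustrate the shape of the change described above, here is a minimal sketch, not the PR's actual code. `Text` is a hypothetical stand-in for `TranscriptEntry.Text` (the real class may carry other members), and `mergeFragments` is a hypothetical helper that shows one plausible way a sanitizer such as `joinSplitSentences()` could accumulate a timing span. The merge rule (earliest known start, latest known end) is an assumption.

```kotlin
// Hypothetical stand-in for TranscriptEntry.Text; only the timing fields come from the PR text.
data class Text(
    val value: String,
    val startTimeMs: Long = -1L, // -1L marks "no timing available"
    val endTimeMs: Long = -1L,
)

// Assumed merge rule: the joined entry spans from the earliest known start
// to the latest known end; unknown (-1L) values are ignored.
fun mergeFragments(first: Text, second: Text): Text {
    val starts = listOf(first.startTimeMs, second.startTimeMs).filter { it >= 0 }
    val ends = listOf(first.endTimeMs, second.endTimeMs).filter { it >= 0 }
    return Text(
        value = "${first.value} ${second.value}",
        startTimeMs = starts.minOrNull() ?: -1L,
        endTimeMs = ends.maxOrNull() ?: -1L,
    )
}
```

Because both fields have defaults, a call such as `Text("Hello")` still compiles, which is why existing callers need no changes. Merging `Text("This is a", 1_000L, 2_500L)` with `Text("split sentence.", 2_500L, 4_000L)` would yield a single entry spanning 1000 ms to 4000 ms under the assumed rule.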

Testing Instructions

Just review the code please.

[Screenshot: SCR-20260521-nsyl]

Checklist

  • If this is a user-facing change, I have added an entry in CHANGELOG.md
  • Ensure the linter passes (./gradlew spotlessApply to automatically apply formatting/linting)
  • I have considered whether it makes sense to add tests for my changes
  • All strings that need to be localized are in modules/services/localization/src/main/res/values/strings.xml
  • Any jetpack compose components I added or changed are covered by compose previews
  • I have updated (or requested that someone edit) the spreadsheet to reflect any new or changed analytics.

@sztomek sztomek added this to the 8.13 milestone May 21, 2026
@sztomek sztomek requested a review from a team as a code owner May 21, 2026 13:32
@sztomek sztomek requested review from Copilot and removed request for a team May 21, 2026 13:32
@sztomek sztomek added the [Type] Feature label May 21, 2026
@sztomek sztomek requested a review from geekygecko May 21, 2026 13:32
Copilot AI (Contributor) left a comment

Pull request overview

This PR lays groundwork for synced transcripts by adding start/end timing (ms) to transcript text entries and threading that timing through subtitle parsers and transcript sanitization, gated behind a new feature flag.

Changes:

  • Added startTimeMs / endTimeMs fields (default -1L) to TranscriptEntry.Text.
  • Updated WebVTT/SRT parsing to attach cue timing to TranscriptEntry.Text, and JSON parsing to convert startTime/endTime into ms.
  • Updated transcript sanitization to accumulate timing spans when joining split sentences; added a new SYNCED_TRANSCRIPTS feature flag.
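As a rough illustration of the parser changes listed above, the following sketch converts cue timing into milliseconds. All function names here are hypothetical and are not taken from `TranscriptParser.kt`. It assumes WebVTT timestamps of the form `HH:MM:SS.mmm` or `MM:SS.mmm`, SRT timestamps of the form `HH:MM:SS,mmm`, and JSON `startTime`/`endTime` values expressed in seconds.

```kotlin
import kotlin.math.roundToLong

// Hypothetical helper: returns -1L when the timestamp cannot be parsed,
// matching the "no timing" default used by the model.
fun cueTimestampToMs(timestamp: String): Long {
    // SRT uses a comma before the milliseconds, WebVTT uses a dot.
    val parts = timestamp.trim().replace(',', '.').split(":")
    if (parts.size !in 2..3) return -1L
    val seconds = parts.last().toDoubleOrNull() ?: return -1L
    val minutes = parts[parts.size - 2].toLongOrNull() ?: return -1L
    val hours = if (parts.size == 3) (parts[0].toLongOrNull() ?: return -1L) else 0L
    return hours * 3_600_000L + minutes * 60_000L + (seconds * 1000.0).roundToLong()
}

// Hypothetical helper: parses a cue timing line such as
// "00:00:01.000 --> 00:00:04.500" into (startMs, endMs), or null if malformed.
fun parseCueTiming(line: String): Pair<Long, Long>? {
    val sides = line.split("-->")
    if (sides.size != 2) return null
    val start = cueTimestampToMs(sides[0])
    // WebVTT allows cue settings after the end timestamp; keep only the first token.
    val end = cueTimestampToMs(sides[1].trim().substringBefore(' '))
    return if (start >= 0 && end >= 0) start to end else null
}

// Assumption: JSON transcript segments carry times in (fractional) seconds.
fun secondsToMs(seconds: Double): Long = (seconds * 1000.0).roundToLong()
```

Returning `null` for a malformed timing line lets a caller fall back to the `-1L` defaults instead of dropping the text entry, which is one way to keep untimed transcripts working.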

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| modules/services/utils/src/main/java/au/com/shiftyjelly/pocketcasts/utils/featureflag/Feature.kt | Adds the SYNCED_TRANSCRIPTS feature flag definition. |
| modules/services/model/src/main/java/au/com/shiftyjelly/pocketcasts/models/to/TranscriptEntry.kt | Extends TranscriptEntry.Text with startTimeMs/endTimeMs (default -1L). |
| modules/services/repositories/src/main/java/au/com/shiftyjelly/pocketcasts/repositories/transcript/TranscriptParser.kt | Threads timing through subtitle parsing and JSON parsing into transcript entries. |
| modules/services/repositories/src/main/java/au/com/shiftyjelly/pocketcasts/repositories/transcript/TranscripSanitization.kt | Accumulates and propagates timing when joining sentence fragments. |
| modules/services/repositories/src/test/java/au/com/shiftyjelly/pocketcasts/repositories/transcript/WebVttParserTest.kt | Updates expected entries to include parsed WebVTT timings. |
| modules/services/repositories/src/test/java/au/com/shiftyjelly/pocketcasts/repositories/transcript/SrtParserTest.kt | Updates expected entries to include parsed SRT timings. |

@sztomek sztomek force-pushed the feat/synced-trans-model-parser-timing branch from cd17406 to 95bd8f9 May 25, 2026 14:00
dangermattic (Collaborator) commented

1 Warning
⚠️ This PR is assigned to the milestone 8.13. The due date for this milestone has already passed.
Please assign it to a milestone with a later deadline or check whether the release for this milestone has already been finished.

Generated by 🚫 Danger
