@joelpurra

- May add the putio api language code.
  - Ignored if the language code is not a valid language.
  - Only included if the filename does not already include the same language code, as detectable by Kodi.
  - Appends the language code to the file's _name_, before the file's extension (putdotio#22).
- Handles arbitrary subtitle extensions, including:
  - `.srt`/`application/x-subrip`: current default.
  - `.vtt`/`text/vtt`: available in the putio api.

Based on original code in putdotio#23 by @berkanteber.

Closes putdotio#22.

See

- putdotio#22
- https://app.swaggerhub.com/apis-docs/putio/putio/2.8.13#/files/get_files__id__subtitles__key_
- https://en.wikipedia.org/wiki/SubRip
- https://en.wikipedia.org/wiki/WebVTT
- putdotio#23
- putdotio@8efb911
@joelpurra
Author

joelpurra commented Jul 10, 2025

@berkanteber: sorry, code got stuck in "testing" since last year.

As mentioned, it's based on your code so you can see what I meant in my code review. (To be honest, I've forgotten most of the discussion from then.) Hope you find the code agreeable. Let me know if you would prefer a standalone pull request.

Also tested never including any language code in the filename, but as you said, Kodi's language detection isn't always working. Ended up checking whether the filename already includes the language code (the way it would be detected by Kodi), and appending it if not.

@berkanteber
Member

berkanteber commented Jul 22, 2025

Hey, sorry for the late response. Just looked at what we discussed earlier. Changed my mind on a couple of things, but not all.

  • Unknown languages: These are most probably folder subtitles (very few cases for other 2). I now think it's acceptable if we don't add und to these. If they are in the proper format, they'll work OK; in edge cases users can rename from put.io.
  • Known languages: I think we shouldn't use any extra logic here and always append.
    • I don't know what happens if filename is foo.eng.HI.srt and language_code is eng. If we don't append again, do we have English or Hindi?
    • We send the full name or 3-letter codes, which misses 2-letter codes. If the filename is foo.en.srt and language_code is eng, we still append it "unnecessarily". Converting these codes is also not an option unless Kodi has some built-in conversion table.
    • It's not desirable to always think about "what happens if..." in such cases.
    • Kodi will always strip language from name, so there won't be duplicates in name. There may be in filesystem, but I think the important thing is how it is in Kodi. Other than easy fixes like .srt always at the end, we shouldn't think about it.
  • Extensions: Doesn't matter much since all are .srt anyway. (All our subtitle formats are srt; we can specify to download as webvtt, in which case we'll convert on the fly. Since we don't specify here, everything returns .srt. We could also explicitly request srt.) But I agree, not hardcoding .srt looks better.

@joelpurra
Author

  • Known languages: I think we shouldn't use any extra logic here and always append.

Deliberately choosing to duplicate the language identifier makes little sense:

- Video.eng.srt
+ Video.eng.eng.srt

https://github.com/putdotio/putio-kodi/pull/24/files#diff-a6b15b003d6e376a2b0b6757c2c12a4b3b160bc424a981d638475a176c430b36R217

-if subtitle_lang != None and subtitle_lang.lower() not in re.split(r"[. -]", subtitle_name.lower()):
+if subtitle_lang != None:
     subtitle_parts.append(subtitle_lang)

Guess this is the change you suggest. I'd say the logic is simple enough to keep, but sure, if you insist.
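
For comparison, here is a minimal sketch of how the two conditions in the diff above behave on a few names. The helper names `append_if_missing` and `always_append` are hypothetical and only wrap the two conditions; they are not functions from the pull request.

```python
import re


def append_if_missing(subtitle_name, subtitle_lang):
    """Current check: append only if the code is not already a delimited part."""
    subtitle_parts = [subtitle_name]
    if subtitle_lang is not None and subtitle_lang.lower() not in re.split(
        r"[. -]", subtitle_name.lower()
    ):
        subtitle_parts.append(subtitle_lang)
    return ".".join(subtitle_parts)


def always_append(subtitle_name, subtitle_lang):
    """Suggested change: append whenever a language code is known."""
    subtitle_parts = [subtitle_name]
    if subtitle_lang is not None:
        subtitle_parts.append(subtitle_lang)
    return ".".join(subtitle_parts)


for name in ("Video", "Video.eng", "foo.en"):
    print(name, append_if_missing(name, "eng"), always_append(name, "eng"))
```

Both variants still append `eng` to `foo.en`, since neither maps 2-letter codes to 3-letter codes.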

@berkanteber
Member

Checked a sample from DB.

Out of 108,678 subtitles:

  • 8,983 matches for language code (2-letter) in name parts (we don't return these from API)
  • 7,377 matches for language code (3-letter) in name parts
  • 2,993 matches for language name in name parts

So, duplicates would be < 10%.

I don't know what happens if filename is foo.eng.HI.srt and language_code is eng. If we don't append again, do we have English or Hindi?

Checked this and Kodi takes the last one, so foo.eng.HI.srt would be identified incorrectly. We would need to handle all these special cases or only take the last part. If we only take the last part, it would be 7,849 (3-letter or full) matches, making it 7.2%.

Let's have it only for checking the last part then.
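
A rough sketch of what checking only the last part could look like, assuming the same separators (dot, space, dash) as the existing condition. The function name `needs_language_suffix` is hypothetical, and a name without separators is treated as its own last part here, which may not be the desired behaviour.

```python
import re


def needs_language_suffix(subtitle_name, subtitle_lang):
    """Return True if the language code should be appended to the name."""
    if subtitle_lang is None:
        return False
    # Only the last delimited part is compared, since Kodi takes the last one.
    last_part = re.split(r"[. -]", subtitle_name.lower())[-1]
    return last_part != subtitle_lang.lower()


print(needs_language_suffix("Video.eng", "eng"))
print(needs_language_suffix("foo.eng.HI", "eng"))
```

With this check, `foo.eng.HI` gets `eng` appended again, so the last part seen by Kodi is the language code from the API.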

# NOTE: split subtitle name/extension while allowing multiple dots within the name -- both as word/part separator and as ellipsis.
(subtitle_name, subtitle_ext) = os.path.splitext(subtitle['name'])
subtitle_ext = subtitle_ext.lstrip('.') if subtitle_ext != '' else None
subtitle_parts = subtitle_name.split('.')
Member

We can do:

subtitle_parts = [subtitle_name]
if ...:
   subtitle_parts.append(lang_code)
subtitle_parts.append(subtitle_ext)
subtitle_fullname = '.'.join(subtitle_parts)
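
Filled out as a runnable sketch, this could look roughly like the following. The elided condition is left as a hypothetical `should_append` parameter rather than guessed, and the function name `build_subtitle_fullname` is also an assumption. One difference from the snippet above: the extension is only appended when present, since joining a `None` extension would raise a `TypeError`.

```python
import os


def build_subtitle_fullname(name, lang_code, should_append):
    """Rebuild a subtitle filename, optionally inserting a language code."""
    # splitext only splits off the final extension, so dots within the
    # name (separators or an ellipsis) stay in subtitle_name.
    subtitle_name, subtitle_ext = os.path.splitext(name)
    subtitle_ext = subtitle_ext.lstrip(".") if subtitle_ext != "" else None

    subtitle_parts = [subtitle_name]
    if should_append and lang_code is not None:
        subtitle_parts.append(lang_code)
    if subtitle_ext is not None:
        subtitle_parts.append(subtitle_ext)
    return ".".join(subtitle_parts)


print(build_subtitle_fullname("Video.srt", "eng", True))
print(build_subtitle_fullname("My.Show...S01.vtt", "eng", True))
```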

# NOTE: append known subtitle language to subtitle name, if it is not already a (delimited) part of it
# NOTE: separators (dot, space, dash) found on kodi's wiki.
# https://kodi.wiki/view/Subtitles
if subtitle_lang != None and subtitle_lang.lower() not in re.split(r"[. -]", subtitle_name.lower()):
Member

Let's check only the last part (if it has parts).

Also, we can check subtitle['language'].

Does also returning the 2-letter code make sense? If we do, what should it be called (without touching the others) in your opinion?
