Document Speech-to-Text integration in Glific#613

Merged
mahajantejas merged 9 commits into main from add-speech-to-text-new-webhook
May 13, 2026
Conversation

Collaborator

@mahajantejas mahajantejas commented May 8, 2026

Added documentation for Speech-to-Text capabilities in Glific, detailing integration steps, default behavior, and provider options.

Summary by CodeRabbit

  • Documentation
    • Added a guide for Glific's Speech-to-Text and Text-to-Speech capabilities
    • Explains transcription of voice notes in Indian languages with optional translation
    • Provides step-by-step instructions to build voice transcription flows and display results
    • Compares transcription providers (e.g., Gemini vs. ElevenLabs) and offers a decision guide

@mahajantejas mahajantejas requested a review from shijithkjayan May 8, 2026 11:11
Contributor

coderabbitai Bot commented May 8, 2026

Warning

Rate limit exceeded

@mahajantejas has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 57 minutes and 59 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d4775dfc-7f91-4843-bac5-16ae7b21d928

📥 Commits

Reviewing files that changed from the base of the PR and between 5fce28c and 0383f44.

📒 Files selected for processing (1)
  • docs/5. Integrations/Speech to text capabilities in Glific.md
📝 Walkthrough

Adds a new documentation page describing how to use Speech-to-Text in Glific flows, including flow setup, default Gemini behavior (transcription with optional translation), switching to ElevenLabs via provider/model, and a provider comparison/decision guide.

Changes

Speech-to-Text Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Overview and Introduction: `docs/5. Integrations/Speech to text capabilities in Glific.md` | New documentation page with metadata and high-level overview of STT transcription and optional translation. |
| Implementation Guide: `docs/5. Integrations/Speech to text capabilities in Glific.md` | Step-by-step instructions: Send message → Wait for audio response → Call Webhook (`speech_to_text`) and reference webhook results to show transcription. |
| Default Behavior and Parameters: `docs/5. Integrations/Speech to text capabilities in Glific.md` | Documents default Gemini `gemini-2.5-pro` transcription behavior, webhook response shape, and `output_language` parameter to enable translation. |
| Provider Options and Decision Guide: `docs/5. Integrations/Speech to text capabilities in Glific.md` | Explains ElevenLabs alternative via `provider`/`model`, notes lack of translation and language coverage differences, and provides a Gemini vs ElevenLabs decision guide. |
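
As a rough illustration of the parameters summarized above, the sketch below assembles a request body for the `speech_to_text` webhook function. Only the `output_language` parameter name comes from the documentation under review. The `speech` key, the `build_stt_body` helper, the sample URL, and the `"english"` value format are hypothetical placeholders, not Glific's confirmed contract.

```python
from typing import Optional


def build_stt_body(audio_url: str, output_language: Optional[str] = None) -> dict:
    """Assemble a body for the speech_to_text webhook function (illustrative only).

    ASSUMPTION: the "speech" key name is hypothetical; check the Glific docs
    for the real field that carries the voice note.
    """
    if not audio_url:
        raise ValueError("audio_url must not be empty")
    body = {"speech": audio_url}
    # Without output_language the default provider only transcribes;
    # setting it asks for the transcript to be translated as well.
    if output_language is not None:
        body["output_language"] = output_language
    return body


default_body = build_stt_body("https://example.org/voice-note.ogg")
translated_body = build_stt_body(
    "https://example.org/voice-note.ogg", output_language="english"
)
print(default_body)
print(translated_body)
```

Leaving `output_language` out of the body entirely, rather than sending an empty value, keeps the default transcription-only case distinct from the translation case.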

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • glific/docs#603: Adds integration docs using Call Webhook + predefined function patterns (speech integrations).
  • glific/docs#537: Related voice/STT-TTS integration documentation updates under docs/5/Integrations.
  • glific/docs#601: Prior edits to the same STT/TTS documentation page.

Suggested reviewers

  • mdshamoon
  • SangeetaMishr

Poem

🐰 I listen to whispers and turn them to text,
From Hindi to English, I do what comes next.
Gemini or Eleven, I hop and I choose,
Documenting flows so devs never lose.
Hop on—transcribe—and let messages cruise.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Document Speech-to-Text integration in Glific' directly and clearly summarizes the main change: adding documentation for the Speech-to-Text feature integration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



github-actions Bot commented May 8, 2026

@github-actions github-actions Bot temporarily deployed to pull request May 8, 2026 11:12 Inactive
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 9

🧹 Nitpick comments (1)
docs/5. Integrations/Speech to text capabilities in Glific.md (1)

32-33: 💤 Low value

Fix awkward line break.

The period is placed on a separate line, which creates awkward formatting.

🔧 Proposed formatting fix
-- Give the webhook result name - you can use any name. In the screenshot example, it's named `result`
-.
+- Give the webhook result name - you can use any name. In the screenshot example, it's named `result`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/5. Integrations/Speech to text capabilities in Glific.md` around lines
32 - 33, The sentence "Give the webhook result name - you can use any name. In
the screenshot example, it’s named `result`." has the period on its own line;
fix the awkward line break by joining the trailing period to the previous line
so the sentence reads as a single line. Locate the text in the "Speech to text
capabilities in Glific.md" doc (the sentence containing "Give the webhook result
name" and the inline code `result`) and remove the extra newline so the period
follows the closing backtick.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/5. Integrations/Speech to text capabilities in Glific.md`:
- Line 66: In the sentence fragment "Apart from Gemini which is the default
speech engine used by Glific. Eleven Labs can be used as a provider. This can be
configured as the provider to use for transcription purposes be explicitly
providing additional parameters of `provider` and `model`" replace the period
after "Glific" with a comma and change "be explicitly providing" to "by
explicitly providing" so the line reads as one grammatically correct sentence
referring to Eleven Labs and the `provider` and `model` parameters.
- Around line 71-75: The fenced code block containing the JSON snippet with
"provider":"elevenlabs" and "model":"scribe_v2" needs a language identifier and
tidy closing: add "json" immediately after the opening triple backticks
(```json) and remove the extra blank line before the closing triple backticks so
the block is a proper JSON fenced code block.
- Line 12: The sentence in the docs has capitalization and phrasing errors;
update the phrase "Google’s gemini models as default" to "Google’s Gemini models
by default" so the product name is capitalized and the grammar reads
correctly—locate the sentence beginning "This integration in Glific enables..."
and replace "gemini" with "Gemini" and "as default" with "by default".
- Line 28: Update the model reference in the sentence that describes the
speech_to_text function to use the complete model name "gemini-2.5-pro" to match
the usage later in the doc; locate the description of the predefined function
`speech_to_text` and replace "Gemini 2.5 model" with "gemini-2.5-pro" so the
documentation is consistent with the `gemini-2.5-pro` reference on line 52.
- Line 57: The docs contain a critical typo: the parameter name
"output_langauge" should be corrected to "output_language"; update every
occurrence in the document (including prose, examples, YAML/JSON/code blocks and
any parameter tables) to use output_language, and verify related examples and
sample payloads (if any) still validate against the API contract after the
change so consumers copy the correct parameter name.
- Line 11: The document title string "# Speech-to-Text and Text-to-Speech
Capabilities in Glific" incorrectly references TTS; change the title to reflect
only Speech-to-Text (e.g., "# Speech-to-Text Capabilities in Glific" or similar)
by updating the top-level header text in the file so it no longer mentions
Text-to-Speech.
- Line 81: The heading "#### Translation Support" is an h4 placed directly after
an h2, skipping h3; update this and any adjacent h4 headings in the same section
(the "Translation Support" heading and the other h4 on the same subsection
around line 86) to h3 so heading levels increment by one (change #### to ### for
those headings).
- Line 19: Several step headings use h3 (###) immediately after the document h1,
violating heading hierarchy; update each step heading to h2 (##) so levels
increment by one. Specifically, change the headings with the texts "Step 1:
Create a Send message node directing users to send their responses as audio
messages, based on their preference.", and the other step headings matching the
step texts on lines referenced in the review (the remaining step headings at the
same level) from ### to ##; ensure all step headings use ## and that any
subheadings beneath them remain at ### or deeper as appropriate.
- Around line 51-52: Update the sentence under the heading "Default behaviour
for speech to text webhook calls" to hyphenate "speech-to-text" when used as a
compound adjective (e.g., "speech-to-text webhook") and normalize the model name
capitalization for consistency by replacing `gemini-2.5-pro` with a consistently
capitalized form such as `Gemini-2.5-Pro`; edit the sentence to read accordingly
so both the compound adjective and model name are fixed.
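
The two structural findings in the list above (heading levels that skip a step, and fenced code blocks without a language) can be checked mechanically. The sketch below is a simplified stand-in for those two rules, written for illustration; it is not markdownlint's implementation, and the `lint_markdown` helper and sample text are hypothetical.

```python
import re
from typing import List

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def lint_markdown(text: str) -> List[str]:
    """Flag heading-level jumps (MD001-style) and unlabeled fences (MD040-style)."""
    problems = []
    previous_level = 0
    in_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # An opening fence with nothing after the backticks has no language.
            if not in_fence and stripped == FENCE:
                problems.append(f"line {number}: fenced code block has no language (MD040)")
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # ignore '#' characters inside code blocks
        match = re.match(r"^(#{1,6})\s+\S", line)
        if match:
            level = len(match.group(1))
            if previous_level and level > previous_level + 1:
                problems.append(
                    f"line {number}: heading jumps from h{previous_level} to h{level} (MD001)"
                )
            previous_level = level
    return problems


sample = "\n".join([
    "# Title",
    "### Step 1",
    "## Main differences",
    "#### Translation Support",
    FENCE,
    '"provider":"elevenlabs",',
    FENCE,
])
for problem in lint_markdown(sample):
    print(problem)
```

On the sample, the check reports the h1 to h3 jump, the h2 to h4 jump, and the unlabeled fence, which mirrors the kinds of issues raised in this review.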

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 38669631-05bd-4dc9-a499-e5d5c914217b

📥 Commits

Reviewing files that changed from the base of the PR and between 469d0b3 and ae5b494.

📒 Files selected for processing (1)
  • docs/5. Integrations/Speech to text capabilities in Glific.md

Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
Comment on lines +71 to +75
```
"provider":"elevenlabs",
"model":"scribe_v2"

```
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Specify language for code block.

The fenced code block should have a language identifier for proper syntax highlighting. Since this appears to be JSON content, add json after the opening fence.

🔧 Proposed fix
-```
+```json
 "provider":"elevenlabs",
 "model":"scribe_v2"
-

Note: Also removed the extra blank line before the closing fence.

As per static analysis tool markdownlint-cli2: Fenced code blocks should have a language specified (MD040).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```suggestion

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 71-71: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/5. Integrations/Speech to text capabilities in Glific.md` around lines
71 - 75, The fenced code block containing the JSON snippet with
"provider":"elevenlabs" and "model":"scribe_v2" needs a language identifier and
tidy closing: add "json" immediately after the opening triple backticks
(```json) and remove the extra blank line before the closing triple backticks so
the block is a proper JSON fenced code block.

Eleven labs integration does not support translation to any specified output language.

## Main differences between Gemini and Eleven Labs:
#### Translation Support
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix heading level increment.

This heading uses #### (h4) directly after ## (h2) on line 80, skipping h3. Heading levels should increment by one level at a time.

🔧 Proposed heading fix

Change lines 81 and 86 from h4 to h3:

-#### Translation Support
+### Translation Support

And:

-#### Language Support & Flexibility
+### Language Support & Flexibility

As per static analysis tool markdownlint-cli2: Heading levels should only increment by one level at a time (MD001).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-#### Translation Support
+### Translation Support
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 81-81: Heading levels should only increment by one level at a time
Expected: h3; Actual: h4

(MD001, heading-increment)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/5. Integrations/Speech to text capabilities in Glific.md` at line 81,
The heading "#### Translation Support" is an h4 placed directly after an h2,
skipping h3; update this and any adjacent h4 headings in the same section (the
"Translation Support" heading and the other h4 on the same subsection around
line 86) to h3 so heading levels increment by one (change #### to ### for those
headings).

mahajantejas and others added 6 commits May 8, 2026 17:53
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@github-actions github-actions Bot temporarily deployed to pull request May 8, 2026 12:26 Inactive
Comment thread docs/5. Integrations/Speech to text capabilities in Glific.md Outdated
<img width="676" height="477" alt="Screenshot 2026-05-08 at 4 26 18 PM" src="https://github.com/user-attachments/assets/6e35b738-182c-4de2-99a7-1c1017d72dd7" />


By specifying the `output_language` parameter, this configuration ensures that all voice notes are transcribed and the final text is available consistently in the same language, no matter the language of the incoming voice note.
Member

Shouldn't this say that voice notes are transcribed and translated, so the final text is consistently available in the specified language?

Collaborator Author

yes, good catch, thanks

```
<img width="678" height="484" alt="Screenshot 2026-05-08 at 4 26 45 PM" src="https://github.com/user-attachments/assets/9f35331c-7f39-44ad-9471-8042281acfa8" />

Eleven labs integration does not support translation to any specified output language.
Member

I think we should have this in bold to get attention

Collaborator Author

done
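
To make the limitation discussed in this thread concrete, here is a minimal sketch of selecting ElevenLabs through the `provider` and `model` parameters while guarding against `output_language`. The parameter names and the `elevenlabs` / `scribe_v2` values come from the snippet under review. The `with_provider` helper, the `speech` key, and the choice to raise an error are assumptions made for this example; the source does not say how Glific itself handles that combination.

```python
# Providers that, per the documentation under review, cannot translate output.
PROVIDERS_WITHOUT_TRANSLATION = {"elevenlabs"}


def with_provider(body: dict, provider: str, model: str) -> dict:
    """Return a copy of `body` that selects an explicit provider and model.

    ASSUMPTION: rejecting `output_language` here is a local pre-check chosen
    for this sketch, not documented Glific behavior.
    """
    if provider in PROVIDERS_WITHOUT_TRANSLATION and "output_language" in body:
        raise ValueError(f"{provider} does not support output_language")
    return {**body, "provider": provider, "model": model}


# "speech" is a hypothetical key standing in for the real audio field.
base = {"speech": "https://example.org/voice-note.ogg"}
eleven = with_provider(base, "elevenlabs", "scribe_v2")
print(eleven)

try:
    with_provider({**base, "output_language": "english"}, "elevenlabs", "scribe_v2")
except ValueError as error:
    print(f"rejected: {error}")
```

Failing early in a pre-check like this is one way to surface the limitation before a flow is published; silently dropping `output_language` would be an alternative design.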

Member

@shijithkjayan shijithkjayan left a comment

LGTM apart from the minor comments. Please merge after addressing them.

Co-authored-by: Shijith Karumathil <shijithjayan@gmail.com>
@github-actions github-actions Bot temporarily deployed to pull request May 13, 2026 07:54 Inactive
Clarified language specification in voice note transcription and emphasized Eleven Labs integration limitation.
@mahajantejas mahajantejas merged commit 15f36b3 into main May 13, 2026
7 checks passed
@mahajantejas mahajantejas deleted the add-speech-to-text-new-webhook branch May 13, 2026 07:57