Skip to content

Conversation

@bdraco
Copy link
Member

@bdraco bdraco commented Nov 24, 2025

Proposed change

Fix elkm1 integration failing to clean up connection when setup is cancelled or times out, which caused multiple competing Elk instances and connection instability.

Changes:

  • Refactor sync waiting logic into ElkSyncWaiter class to avoid nonlocal variables
  • Add LoginFailed exception for explicit login failure handling
  • Use futures with call_later for timeout handling instead of asyncio.timeout to avoid CancelledError propagation issues from elkm1-lib's internal task cancellation
  • Ensure elk.disconnect() is called when any step fails (timeout, login failure, cancellation)
  • Clean up handlers with remove_handler in finally block to prevent leaks

Type of change

  • Bugfix (non-breaking change which fixes an issue)

Additional information

Checklist

  • I understand the code I am submitting and can explain how it works.
  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.
  • Any generated code has been carefully reviewed for correctness and compliance with project standards.

@home-assistant
Copy link

Hey there @gwww, mind taking a look at this pull request as it has been labeled with an integration (elkm1) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of elkm1 can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign elkm1 Removes the current integration label and assignees on the pull request, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

@bdraco bdraco marked this pull request as ready for review November 24, 2025 22:32
Copilot AI review requested due to automatic review settings November 24, 2025 22:32
Copilot finished reviewing on behalf of bdraco November 24, 2025 22:36
@bdraco bdraco marked this pull request as draft November 24, 2025 22:40
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the elkm1 integration's connection synchronization logic to fix cleanup issues on setup failure. The main change converts the async_wait_for_elk_to_sync function into an ElkSyncWaiter class that uses futures with call_later for timeout handling instead of asyncio.timeout.

Key changes:

  • Refactored synchronization logic from function to ElkSyncWaiter class with futures-based timeout handling
  • Replaced asyncio.timeout context manager with call_later to avoid CancelledError propagation
  • Moved disconnect logic from callbacks to finally blocks (with issues noted below)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
homeassistant/components/elkm1/init.py Refactored async_wait_for_elk_to_sync function into ElkSyncWaiter class; updated async_setup_entry to use new class
homeassistant/components/elkm1/config_flow.py Updated import and usage to use new ElkSyncWaiter class
Comments suppressed due to low confidence (1)

homeassistant/components/elkm1/init.py:306

  • Missing cleanup in async_setup_entry for all failure paths. The elk connection established at line 262 is not properly cleaned up when setup fails. There are several failure paths without cleanup:
  1. Login failure (line 283): When ElkSyncWaiter.async_wait() returns False (login failed), no disconnect is called either by the waiter or by this function, leaving the connection open.

  2. Exceptions after successful sync (lines 287-304): If an exception occurs after sync succeeds (e.g., when accessing elk.panel.temperature_units at line 287), the elk connection is not disconnected.

  3. Timeout (line 285): This case is handled correctly by ElkSyncWaiter disconnecting in its finally block before raising TimeoutError.

This incomplete cleanup causes the "multiple competing Elk instances" issue mentioned in the PR description.

Recommendation: Add try-finally block to ensure cleanup on all failure paths:

elk = Elk({...})
elk.connect()

try:
    # ... keypad setup ...
    
    if not await ElkSyncWaiter(elk, LOGIN_TIMEOUT, SYNC_TIMEOUT).async_wait():
        return False
    
    # ... rest of setup (lines 287-305) ...
    entry.runtime_data = ELKM1Data(elk=elk, ...)
    await hass.config_entries.async_forward_entry_setups(entry, PLATFORMS)
    return True
except Exception:
    # Clean up on any failure
    elk.disconnect()
    raise

Note: This assumes ElkSyncWaiter will handle its own cleanup for timeout cases, which it currently does at line 385.

    elk.connect()

    def _keypad_changed(keypad: Element, changeset: dict[str, Any]) -> None:
        if (keypress := changeset.get("last_keypress")) is None:
            return

        hass.bus.async_fire(
            EVENT_ELKM1_KEYPAD_KEY_PRESSED,
            {
                ATTR_KEYPAD_NAME: keypad.name,
                ATTR_KEYPAD_ID: keypad.index + 1,
                ATTR_KEY_NAME: keypress[0],
                ATTR_KEY: keypress[1],
            },
        )

    for keypad in elk.keypads:
        keypad.add_callback(_keypad_changed)

    try:
        if not await ElkSyncWaiter(elk, LOGIN_TIMEOUT, SYNC_TIMEOUT).async_wait():
            return False
    except TimeoutError as exc:
        raise ConfigEntryNotReady(f"Timed out connecting to {conf[CONF_HOST]}") from exc

    elk_temp_unit = elk.panel.temperature_units
    if elk_temp_unit == "C":
        temperature_unit = UnitOfTemperature.CELSIUS
    else:
        temperature_unit = UnitOfTemperature.FAHRENHEIT
    config["temperature_unit"] = temperature_unit
    prefix: str = conf[CONF_PREFIX]
    auto_configure: bool = conf[CONF_AUTO_CONFIGURE]
    entry.runtime_data = ELKM1Data(
        elk=elk,
        prefix=prefix,
        mac=entry.unique_id,
        auto_configure=auto_configure,
        config=config,
        keypads={},
    )

    await hass.config_entries.async_forward_entry_setups(entry, PLATFORMS)

    return True

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Copilot finished reviewing on behalf of bdraco November 24, 2025 22:50
Comment on lines +284 to +285
_LOGGER.error("ElkM1 login failed for %s", conf[CONF_HOST])
return False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the future we could raise ConfigEntryAuthFailed but this integration doesn't support reauth right now

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.


You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

@gwww
Copy link
Contributor

gwww commented Nov 24, 2025

Would you be able to include a bump to the elkm1_lib in just a few moments. I have cleaned up a log message.

Copilot finished reviewing on behalf of bdraco November 24, 2025 23:05
@bdraco
Copy link
Member Author

bdraco commented Nov 24, 2025

Would you be able to include a bump to the elkm1_lib in just a few moments. I have cleaned up a log message.

We need to do that in another PR (lib bumps only allow code changes in the same PR to handle breaking changes), however if you open it today, should be no problem getting it into beta on Wednesday

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

@bdraco bdraco marked this pull request as ready for review November 24, 2025 23:12
@bdraco bdraco added this to the 2025.12.0b0 milestone Nov 24, 2025
@gwww
Copy link
Contributor

gwww commented Nov 24, 2025

The bump PR is there now.

Big ❤️ for the PR.

@balloob balloob merged commit 878881b into dev Nov 25, 2025
40 of 42 checks passed
@balloob balloob deleted the elkm1_sync_fix branch November 25, 2025 04:12
@bdraco
Copy link
Member Author

bdraco commented Nov 25, 2025

thanks

@github-actions github-actions bot locked and limited conversation to collaborators Nov 26, 2025
@frenck frenck removed this from the 2025.12.0b0 milestone Nov 26, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ELK M1 Integration won't stay connected to alarm panel

5 participants