Skip to content

Conversation

@seanlaw
Copy link
Contributor

@seanlaw seanlaw commented Dec 31, 2025

Fixed #930 and #929

Pull Request Checklist

Below is a simple checklist but please do not hesitate to ask for assistance!

  • Fork, clone, and checkout the newest version of the code
  • Create a new branch
  • Make necessary code changes
  • Install black (i.e., python -m pip install black or conda install -c conda-forge black)
  • Install flake8 (i.e., python -m pip install flake8 or conda install -c conda-forge flake8)
  • Install pytest-cov (i.e., python -m pip install pytest-cov or conda install -c conda-forge pytest-cov)
  • Run black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./ in the root stumpy directory
  • Run flake8 --extend-exclude=.venv ./ in the root stumpy directory
  • Run ./setup.sh dev && ./test.sh in the root stumpy directory
  • Reference a Github issue (and create one if one doesn't already exist)

@gitnotebooks
Copy link

gitnotebooks bot commented Dec 31, 2025

Review these changes at https://app.gitnotebooks.com/stumpy-dev/stumpy/pull/1110

@seanlaw
Copy link
Contributor Author

seanlaw commented Dec 31, 2025

@NimaSarajpoor Would you mind providing a review at your earliest convenience?

return


def check_self_join(ignore_trivial):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this function name. Note that there is already another function called core.check_ignore_trivial, which

Check inputs and verify the appropriateness for self-joins vs AB-joins and provides relevant warnings.

The difference is that core.check_self_join only applies to self-joins (i.e., it only uses ignore_trivial) while core.check_ignore_trivial applies to T_A, T_B, and ignore_trivial.

Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanlaw

I don't love this function name.

The function name actually makes sense. I think the underlying reason that resulted in this new function is that we first need to know if it is really self-join and then MODIFY ignore_trivial. Note that the following if-block in core.check_ignore_trivial does NOT modify ignore_trivial (the other if-block in that function DOES MODIFY though).

if ignore_trivial is False and are_arrays_equal(T_A, T_B):  # pragma: no cover
    msg = "Arrays T_A, T_B are equal, which implies a self-join. "
    msg += "Try setting `ignore_trivial = True`."
    warnings.warn(msg)

The reason we do not modify ignore_trivial here is because there might be a case where two arrays T_A and T_B have the same values but they are related to AB-join (not self-join). Right? (Probably a very rare case in real-world problems) So, we cannot just modify ignore_trivial here. Therefore, core.check_self_join is actually good.

Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanlaw
I've added a few comments for your consideration.

T_B : numpy.ndarray
The time series or sequence that will be used to annotate T_A. For every
subsequence in T_A, its nearest neighbor in T_B will be recorded. Default is
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the last sentence?

Default is None which corresponds to a self-join.

This is true if we look at it from stump's perspective. From this function's standpoint, I feel that we should not provide such statement. Are we allowing the function to accept None for T_B? If yes, then we can modify this function to do what the new function core.check_self_join does since we can identify if we are dealing with self-join or not. (see the response I've provided for your comment in the new function)

return


def check_self_join(ignore_trivial):
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanlaw

I don't love this function name.

The function name actually makes sense. I think the underlying reason that resulted in this new function is that we first need to know if it is really self-join and then MODIFY ignore_trivial. Note that the following if-block in core.check_ignore_trivial does NOT modify ignore_trivial (the other if-block in that function DOES MODIFY though).

if ignore_trivial is False and are_arrays_equal(T_A, T_B):  # pragma: no cover
    msg = "Arrays T_A, T_B are equal, which implies a self-join. "
    msg += "Try setting `ignore_trivial = True`."
    warnings.warn(msg)

The reason we do not modify ignore_trivial here is because there might be a case where two arrays T_A and T_B have the same values but they are related to AB-join (not self-join). Right? (Probably a very rare case in real-world problems) So, we cannot just modify ignore_trivial here. Therefore, core.check_self_join is actually good.

"""
if not ignore_trivial:
msg = "`ignore_trivial` cannot be `False` for a self-join and "
msg += "has been automatically overridden and set to `True`."
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the message says:

"has been automatically overridden and set to True."

I wanted to suggest that we should return the (corrected) ignore_trivial in this function, and remove ignore_trivial = True in the caller. However, the downside is that readers might not be able to quickly figure out the value of ignore_trivial when they read:

if T_B is None:
    ignore_trivial = core.check_self_join(ignore_trivial)
    T_B = T_A
    T_B_subseq_isconstant = T_A_subseq_isconstant

and now it seems that ignore_trivial can also take the value "False", but that is not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the trouble is that the function itself never actually sets the value of ignore_trivial so there's a bit of a disconnect as to "when" the warning occurs and the value of ignore_trivial is actually set.

Copy link
Contributor Author

@seanlaw seanlaw Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, instead of saying:

has been automatically overridden and set to True."

we can say

For a self-join, ignore_trivial cannot be False and has therefore been has been disregarded."

This wording "should" (??) imply that the default value of ignore_trivial is used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve ignore_trivial docstring

2 participants