-
Notifications
You must be signed in to change notification settings - Fork 339
Fixed #930 Improve ignore_trivial Docstring and Warnings
#1110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fixed #930 Improve ignore_trivial Docstring and Warnings
#1110
Conversation
|
Review these changes at https://app.gitnotebooks.com/stumpy-dev/stumpy/pull/1110 |
|
@NimaSarajpoor Would you mind providing a review at your earliest convenience? |
| return | ||
|
|
||
|
|
||
| def check_self_join(ignore_trivial): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this function name. Note that there is already another function called core.check_ignore_trivial, which
Check inputs and verify the appropriateness for self-joins vs AB-joins and provides relevant warnings.
The difference is that core.check_self_join only applies to self-joins (i.e., it only uses ignore_trivial) while core.check_ignore_trivial applies to T_A, T_B, and ignore_trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this function name.
The function name actually makes sense. I think the underlying reason that resulted in this new function is that we first need to know if it is really self-join and then MODIFY ignore_trivial. Note that the following if-block in core.check_ignore_trivial does NOT modify ignore_trivial (the other if-block in that function DOES MODIFY though).
if ignore_trivial is False and are_arrays_equal(T_A, T_B): # pragma: no cover
msg = "Arrays T_A, T_B are equal, which implies a self-join. "
msg += "Try setting `ignore_trivial = True`."
warnings.warn(msg)
The reason we do not modify ignore_trivial here is because there might be a case where two arrays T_A and T_B have the same values but they are related to AB-join (not self-join). Right? (Probably a very rare case in real-world problems) So, we cannot just modify ignore_trivial here. Therefore, core.check_self_join is actually good.
NimaSarajpoor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@seanlaw
I've added a few comments for your consideration.
| T_B : numpy.ndarray | ||
| The time series or sequence that will be used to annotate T_A. For every | ||
| subsequence in T_A, its nearest neighbor in T_B will be recorded. Default is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove the last sentence?
Default is
Nonewhich corresponds to a self-join.
This is true if we look at it from stump's perspective. From this function's standpoint, I feel that we should not provide such statement. Are we allowing the function to accept None for T_B? If yes, then we can modify this function to do what the new function core.check_self_join does since we can identify if we are dealing with self-join or not. (see the response I've provided for your comment in the new function)
| return | ||
|
|
||
|
|
||
| def check_self_join(ignore_trivial): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this function name.
The function name actually makes sense. I think the underlying reason that resulted in this new function is that we first need to know if it is really self-join and then MODIFY ignore_trivial. Note that the following if-block in core.check_ignore_trivial does NOT modify ignore_trivial (the other if-block in that function DOES MODIFY though).
if ignore_trivial is False and are_arrays_equal(T_A, T_B): # pragma: no cover
msg = "Arrays T_A, T_B are equal, which implies a self-join. "
msg += "Try setting `ignore_trivial = True`."
warnings.warn(msg)
The reason we do not modify ignore_trivial here is because there might be a case where two arrays T_A and T_B have the same values but they are related to AB-join (not self-join). Right? (Probably a very rare case in real-world problems) So, we cannot just modify ignore_trivial here. Therefore, core.check_self_join is actually good.
| """ | ||
| if not ignore_trivial: | ||
| msg = "`ignore_trivial` cannot be `False` for a self-join and " | ||
| msg += "has been automatically overridden and set to `True`." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the message says:
"has been automatically overridden and set to
True."
I wanted to suggest that we should return the (corrected) ignore_trivial in this function, and remove ignore_trivial = True in the caller. However, the downside is that readers might not be able to quickly figure out the value of ignore_trivial when they read:
if T_B is None:
ignore_trivial = core.check_self_join(ignore_trivial)
T_B = T_A
T_B_subseq_isconstant = T_A_subseq_isconstant
and now it seems that ignore_trivial can also take the value "False", but that is not correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the trouble is that the function itself never actually sets the value of ignore_trivial so there's a bit of a disconnect as to "when" the warning occurs and the value of ignore_trivial is actually set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, instead of saying:
has been automatically overridden and set to True."
we can say
For a self-join,
ignore_trivialcannot beFalseand has therefore been has been disregarded."
This wording "should" (??) imply that the default value of ignore_trivial is used?
Fixed #930 and #929
Pull Request Checklist
Below is a simple checklist but please do not hesitate to ask for assistance!
black(i.e.,python -m pip install blackorconda install -c conda-forge black)flake8(i.e.,python -m pip install flake8orconda install -c conda-forge flake8)pytest-cov(i.e.,python -m pip install pytest-covorconda install -c conda-forge pytest-cov)black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./in the root stumpy directoryflake8 --extend-exclude=.venv ./in the root stumpy directory./setup.sh dev && ./test.shin the root stumpy directory