-
-
Notifications
You must be signed in to change notification settings - Fork 472
[16.0][FIX] account_statement_import_sheet_file: fix header_lines_skip_count behavior and fix tests accordingly #819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 16.0
Are you sure you want to change the base?
Conversation
|
Hi @alexey-pelykh, |
7b065b2 to
471e92f
Compare
471e92f to
a255262
Compare
|
I don't manage to fix pre-commit... |
account_statement_import_sheet_file/tests/test_account_statement_import_sheet_file.py
Outdated
Show resolved
Hide resolved
a255262 to
3bd22b3
Compare
account_statement_import_sheet_file/models/account_statement_import_sheet_parser.py
Outdated
Show resolved
Hide resolved
| values.append(cell_value) | ||
| else: | ||
| if index >= footer_line: | ||
| if mapping.no_header and index < data_first_index_line: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this part
If we start the index at first line of data, we have to ignore all the header lines. Then, in the case without header, as long as we are before the data_first_index_line, we continue
| rows = csv_or_xlsx | ||
|
|
||
| lines = [] | ||
| for index, row in enumerate(rows, label_line): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not to start with header_skip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not to start with
header_skip?
I did a wrong answer before, sorry for the wrong notification.
The reason why I choose not to start with header_skip is that if we have header lines to skip, I had a problem with the index and row and it removed the first lines of data.
So I prefered to start from the beginning and add conditions for different use cases
Prior to this commit, there was some confusion between indexes and line numbers. We’ve renamed several variables to improve clarity. As a result, we updated the existing tests and added new ones to cover additional use cases.
3bd22b3 to
80a266c
Compare
mondot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello,
I've just ran into this same bug. Thanks for proposing this PR 🙂
| label_line = mapping.header_lines_skip_count | ||
| footer_line = numrows - mapping.footer_lines_skip_count | ||
| header_skip = mapping.header_lines_skip_count | ||
| data_first_index_line = header_skip + (0 if mapping.no_header else 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on what I see about your changes in the existing tests, I think there might be some confusion about the use of the mapping.no_header parameter.
My first thought when reviewing your changes was that the parameter mapping.no_header creates a lot of complications and that your code could be much simpler if you ignored it.
From what I can see in your changes to the existing tests, I believe I’m correct in saying that the no_header param is only meant to serve as a marker indicating: "will I find integers or strings in the colum params ?".
Therefore, when someone uses the header_lines_skip_count and footer_lines_skip_count parameters, they should count the header line (if it exists). That makes your code much more readable, and the changes in this PR as well.
The code would look like
data_first_index_line = mapping.header_lines_skip_count
data_last_line = numrows - mapping.footer_lines_skip_count
data_line_count = data_last_line - data_first_index_line
[...]
if index < data_first_index_line:
continue
if index >= data_first_index_line + data_line_count:
continueThis way, you'd only need to add your new tests without changing the existing ones.
I hope this is clear. Please feel free to let me know if it’s not, or if I’ve misunderstood something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose an alternative in this PR : https://github.com/OCA/bank-statement-import/pull/834/files
|
Hello @cvinh, I hope you're well :) |
|
Yes I'll be able to work on it tomorrow |
Great news :) |
I'll arrive at end of the day |
|
As discussed yesterday with @mondot the approach here is to better understand the 3 parameters no_header , skip_header_line and skip_footer_line |
Hello, I hope everyone is well. Best regards |
Prior to this commit, there was some confusion between indexes and line numbers. We’ve renamed several variables to improve clarity.
As a result, we updated the existing tests and added new ones to cover additional use cases.