-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
I2C klippy's side error handling #7013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
2df59c2 to
b8e2da9
Compare
|
I am currently running it with an i2c filament width sensor and so far it looks good. |
|
Thanks for working on this. Some random comments:
Cheers, |
That way, I could avoid duplicating shutdown strings to pass the message back. I totally could redo it with the use of enumerations. |
d262e3e to
ca9180b
Compare
|
Now it utilizes enum. (I could use try_lookup_command for migration, but it seems that simple enum checking should be enough) |
|
I am very glad to see that someone is solving this problem. file stm32\i2c.c |
|
I ran a bit over 1kg through the initial version of this just fine so far. I did switch to sw i2c because of the shutdown in Line 127 in d34d3b0
|
a3f3d8b to
d830157
Compare
|
I've spent some time thinking about how to properly do that and where to put the complexity. As I2C writes now returns the status, there is an inherent retry from To somehow get it under control and simplify things, it seems less ugly to me. Since i2c_write is now blocking, probably I assume that the I2C timeout is when something really wrong happens (hang of i2c hw) (or MCU too busy with timers). As the whole transfer now exists inside the one class, and there is an async response handler, if it is required. Thanks. |
d830157 to
a68760a
Compare
|
Thanks for working on this. In general it seems useful. I have a few comments. If I understand this PR correctly, it's mostly a "clean up" and it is not blocking other development work or features. Let me know if I got this wrong. If so, I wonder if we could make this change in 3 phases:
For phase 2, I'd guess it would be okay to default to automatically retrying For phase 3, I'd guess we'd want to wait a month or two after implementing phase 2. I'm open to a different timeline, but as a high-level observation, if we run into issues with phase 1&2, it would seem really tricky to fix them if we've already started phase 3. Also, I'm not sure if we need a lot of automatic retransmit support for i2c, as I'd guess most devices just don't care at all (they don't mind a missing report) or really don't want to see any errors at all on the line. I could be wrong though.
For what it is worth, using the low-level
It is for something really wrong at the mcu i2c hardware level. Some mcu hw have really bad i2c implementations. In particular, if the low-level software doesn't check for a timeout and the hardware doesn't respond as expected, then it leads to hard to debug watchdog reboots. The i2c timeout is not related to mcu being busy nor timer activity. -Kevin |
I'm not sure I understood what you suggested here or how it would mess with it. Callers who depend on The first phaseI added the #7020 with migration to synchronous i2c_write(). The second phaseThis PR. Optionally, there is an NACK retry. temp -> 0 - https://github.com/Klipper3d/klipper/blob/master/klippy/extras/ads1x1x.py#L302 Not sure which one is better. The 3rd phaseI'm unaware of any devices that would require that.
It just feels a little bit ugly there would be another level of
Yes, I agree there should be a check, like a timeout for that. This is rare from a probability point; this just seems like a possible failure path. Thanks. i2c write/read merged to i2c_transfer. Probably, I can propagate the exception from serialhdl instead of converting it to a string. |
d325ffd to
e6783a3
Compare
Yeah, I was saying we might want to refactor pca9533 and mcp4728 to use an explicit
Thanks. Looks good to me.
My suggestion is to move the retry logic in this PR to a third phase. That is, I'm suggesting this PR have a host def i2c_transfer(self, write, read_len=0, minclock=0, reqclock=0,
retry=True):
params = self.i2c_transfer_cmd.send([self.oid, write, read_len], minclock=minclock,
reqclock=reqclock, retry=retry)
if params["i2c_bus_status"] != "SUCCESS":
self.mcu.get_printer().shutdown("I2C mcu %s reports error %s" % (self.mcu.get_name(), params["i2c_bus_status"]))
return paramsThat is, if there's no rush, can we make this PR have less behavioural changes. It'll still change the mcu->host protocol and still introduce redundant i2c writes on mcu to host message loss, but wont fundamentally change the error handling. Let me know if that doesn't work well for rollout timing or if I'm missing something. Thanks again, |
4b15329 to
0ab378a
Compare
0ab378a to
610ff20
Compare
|
Thanks. Looks fine to me. How about we give a week or two to see that #7020 doesn't cause any surprises, and then look to commit this. -Kevin |
610ff20 to
f4ce92b
Compare
|
Thank you for your contribution to Klipper. Unfortunately, a reviewer has not assigned themselves to this GitHub Pull Request. All Pull Requests are reviewed before merging, and a reviewer will need to volunteer. Further information is available at: https://www.klipper3d.org/CONTRIBUTING.html There are some steps that you can take now:
Unfortunately, if a reviewer does not assign themselves to this GitHub Pull Request then it will be automatically closed. If this happens, then it is a good idea to move further discussion to the Klipper Discourse server. Reviewers can reach out on that forum to let you know if they are interested and when they are available. Best regards, PS: I'm just an automated script, not a human being. |
3d390a9 to
5b2bd8e
Compare
|
I added an optional write-only flag to the I2C class as PoC. So, technically, things like PCA9*, sx1509, displays & etc, can still use the |
5b2bd8e to
2467b0d
Compare
Thanks. Seems fine to me. If we go this route though, I think Cheers, |
2467b0d to
ecfb30f
Compare
e0e09b2 to
260aafc
Compare
260aafc to
51e6e47
Compare
Signed-off-by: Timofey Titovets <[email protected]>
If the bus is write only, with new i2c_transfer code it is possible to at least provide some feedback on error Signed-off-by: Timofey Titovets <[email protected]>
51e6e47 to
54e203a
Compare
Migration of the I2C read error to the host side. So, if the host requests the data, and the MCU is unable to read it, the host raises an exception and shuts down.
There is an optional retry for the reads, as they are currently retriable by semantics.
Because there is a retry with pause underneath, I'm not sure if it is really necessary to implement the same retry pause loop in the bus.py.
Something similar could be done to i2c_write. I'm a little unhappy that it would make i2c_write chains less effective; on the other hand, I'm not aware of the code, where they would be practically reasonable.
Generally, I would assume, I2C write is retriable, in the sense that it writes the final register state.
I'm a little unsure if we can still implement i2c_write chaining and if it is really required by any device/usage.
I do think it is possible to register an Async callback, which would raise the error inside the bus.py.
But it seems a little tricky, and there is a chance that the MCU would lose the error feedback in the communication.
But if we do always respond with the error/success feedback, I do not see a non-blocking way to do so in any meaningful manner, which would allow retrying the last write.
Anyway, I'm open to suggestions.
I think that it is okay for now to just raise an exception and stop the machine from the host side, instead of the MCU.
It is a bonus if we do implement auto-retry here for all other modules.
Thanks.
Depends on: #7012, #7020
Child of: #6689