-
Notifications
You must be signed in to change notification settings - Fork 1.7k
can: cache DBC parsing to speed up car interface tests #2888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing to opendbc! In order for us to review your PR as quickly as possible, check the following:
- Convert your PR to a draft unless it's ready to review
- Read the contributing docs
- Before marking as "ready for review", ensure:
- the goal is clearly stated in the description
- all the tests are passing
- include a route or your device' dongle ID if relevant
- Keep a class-level cache of parsed DBC files so each path is parsed once per process - Return early in CANParser.update when no frames arrive to avoid needless work - Cuts test_car_interfaces runtime significantly
853a460 to
02c0e1e
Compare
opendbc/can/dbc.py
Outdated
| def __new__(cls, name: str): | ||
| dbc_path = name | ||
| if not os.path.exists(dbc_path): | ||
| dbc_path = os.path.join(DBC_PATH, name + ".dbc") | ||
| candidate = os.path.join(DBC_PATH, name + ".dbc") | ||
| if os.path.exists(candidate): | ||
| dbc_path = candidate | ||
| else: | ||
| raise FileNotFoundError(f"DBC file not found: {name}") | ||
| dbc_path = os.path.abspath(dbc_path) | ||
|
|
||
| cached = cls._CACHE.get(dbc_path) | ||
| if cached is not None: | ||
| return cached | ||
|
|
||
| self = super().__new__(cls) | ||
| self._dbc_path = dbc_path | ||
| return self | ||
|
|
||
| def __init__(self, name: str): | ||
| if getattr(self, "_initialized", False): | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd avoid implementing the singleton pattern in this manner as it can have hidden side effects and makes reasoning about the logic a bit harder; you can read this article on why singletons are a bad idea: https://nedbatchelder.com/blog/202204/singleton_is_a_bad_idea.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a PR with a cleaner implementation, hope you don't mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make these changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to merge my PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KRRT7 I don't see your PR, where is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened it against their fork & branch logesh45#1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KRRT7 Thank you for your inputs. I refactored further to use module level factory to keep DBC a pure data class and make Cache independent and easier to test.
|
@maxime-desroches This is the PR I was talking about today. I cant run the main openpilot repo GitHub tests with these changes because this is a submodule. |
KRRT7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice :)
jyoung8607
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a meaningful improvement! I started looking at the same problem. I'm not sure this is the best or prettiest way to solve it, but it seems to be the simplest.
@adeebshihadeh This sidesteps a great deal of duplicate parsing work. Right now for every single car port we have to parse the full text of the same DBC file several times: two or three CANParsers, a CANDefine, and a CANPacker or two. That's a problem for ports with big DBCs, like Ford with a 12,700 line DBC. Then multiply all that for hypothesis.
Before, on my workstation, pretty consistent:
(opendbc) jyoung@jy-workstation-ubuntu:~/openpilot/opendbc_repo$ pytest -n1 opendbc/car/tests/test_car_interfaces.py -k test_car_interfaces[FORD_MAVERICK_MK1]
Cannot read termcap database;
using dumb terminal settings.
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.12.0, pytest-8.4.2, pluggy-1.6.0
Using --randomly-seed=2812254386
rootdir: /home/jyoung/openpilot/opendbc_repo
configfile: pyproject.toml
plugins: randomly-4.0.1, cov-7.0.0, xdist-3.7.1.dev24+g2b4372b, subtests-0.15.0, mock-3.15.1, hypothesis-6.47.5
initialized: 1/1 workerCannot read termcap database;
using dumb terminal settings.
1 worker [1 item]
. [100%]
================================================================================= slowest 10 durations =================================================================================
1.01s call opendbc/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces[FORD_MAVERICK_MK1]
(2 durations < 0.005s hidden. Use -vv to show these durations.)
================================================================================== 1 passed in 1.59s ===================================================================================
After, on my workstation, also pretty consistent:
(opendbc) jyoung@jy-workstation-ubuntu:~/openpilot/opendbc_repo$ pytest -n1 opendbc/car/tests/test_car_interfaces.py -k test_car_interfaces[FORD_MAVERICK_MK1]
Cannot read termcap database;
using dumb terminal settings.
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.12.0, pytest-8.4.2, pluggy-1.6.0
Using --randomly-seed=133558739
rootdir: /home/jyoung/openpilot/opendbc_repo
configfile: pyproject.toml
plugins: randomly-4.0.1, cov-7.0.0, xdist-3.7.1.dev24+g2b4372b, subtests-0.15.0, mock-3.15.1, hypothesis-6.47.5
initialized: 1/1 workerCannot read termcap database;
using dumb terminal settings.
1 worker [1 item]
. [100%]
================================================================================= slowest 10 durations =================================================================================
0.14s call opendbc/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces[FORD_MAVERICK_MK1]
(2 durations < 0.005s hidden. Use -vv to show these durations.)
================================================================================== 1 passed in 0.68s ===================================================================================
opendbc/can/parser.py
Outdated
| if not strings: | ||
| return set() | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? Is it necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change handles the case where strings could be None, preventing a potential TypeError when trying to check strings[0] on line 220. This makes the code more robust against unexpected inputs.
Current tests pass empty lists, other code paths might pass None at a later time, and this prevents runtime errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's type hint strings to make it clear then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this is a separate, unrelated change to this PR? you can open a new PR for it
|
You can just decorate with functools.cache, no? #2922 |
it's not at all used like a dataclass
|
you usually get better results if you first try to simplify, then optimize! Firstly, we weren't using any of the benefits of dataclass here, re-implementing But to make it super simple, just going to memoize it like this for now |
Performance
Cuts
test_car_interfaces.pyruntime from ~38.84s to ~15.98s (≈2.4× faster). The slowest test drops from ~6.52s to ~0.81s (8× faster), meeting the target of ≤0.2s avg and <1s max per car.Changes
Verification
Ran the test suite before and after the change:
Before:
After:
