Skip to content

Conversation

@logesh45
Copy link
Contributor

@logesh45 logesh45 commented Nov 7, 2025

Performance

Cuts test_car_interfaces.py runtime from ~38.84s to ~15.98s (≈2.4× faster). The slowest test drops from ~6.52s to ~0.81s (8× faster), meeting the target of ≤0.2s avg and <1s max per car.

Changes

  • DBC caching: Module-level cache with explicit get_dbc() factory function. Each DBC file is parsed only once per process, with subsequent calls reusing the cached instance.
  • CANParser fast path: Early-return in CANParser.update when called with an empty frame list, avoiding needless buffer clearing.

Verification

Ran the test suite before and after the change:

pytest --durations=10 selfdrive/car/tests/test_car_interfaces.py

Before:

image

After:
image

@github-actions github-actions bot added the can related to CAN tools, aka opendbc/can/ label Nov 7, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing to opendbc! In order for us to review your PR as quickly as possible, check the following:

  • Convert your PR to a draft unless it's ready to review
  • Read the contributing docs
  • Before marking as "ready for review", ensure:
    • the goal is clearly stated in the description
    • all the tests are passing
    • include a route or your device' dongle ID if relevant

@logesh45 logesh45 marked this pull request as draft November 7, 2025 16:03
- Keep a class-level cache of parsed DBC files so each path is parsed once per process
- Return early in CANParser.update when no frames arrive to avoid needless work
- Cuts test_car_interfaces runtime significantly
@logesh45 logesh45 force-pushed the perf/cache-dbc-parsing branch from 853a460 to 02c0e1e Compare November 7, 2025 16:24
@logesh45 logesh45 marked this pull request as ready for review November 7, 2025 16:26
Comment on lines 85 to 105
def __new__(cls, name: str):
dbc_path = name
if not os.path.exists(dbc_path):
dbc_path = os.path.join(DBC_PATH, name + ".dbc")
candidate = os.path.join(DBC_PATH, name + ".dbc")
if os.path.exists(candidate):
dbc_path = candidate
else:
raise FileNotFoundError(f"DBC file not found: {name}")
dbc_path = os.path.abspath(dbc_path)

cached = cls._CACHE.get(dbc_path)
if cached is not None:
return cached

self = super().__new__(cls)
self._dbc_path = dbc_path
return self

def __init__(self, name: str):
if getattr(self, "_initialized", False):
return
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid implementing the singleton pattern in this manner as it can have hidden side effects and makes reasoning about the logic a bit harder; you can read this article on why singletons are a bad idea: https://nedbatchelder.com/blog/202204/singleton_is_a_bad_idea.html

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a PR with a cleaner implementation, hope you don't mind

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make these changes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge my PR

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KRRT7 I don't see your PR, where is it?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened it against their fork & branch logesh45#1

Copy link
Contributor Author

@logesh45 logesh45 Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KRRT7 Thank you for your inputs. I refactored further to use module level factory to keep DBC a pure data class and make Cache independent and easier to test.

@logesh45 logesh45 closed this Nov 9, 2025
@logesh45 logesh45 deleted the perf/cache-dbc-parsing branch November 9, 2025 04:39
@logesh45 logesh45 restored the perf/cache-dbc-parsing branch November 9, 2025 04:41
@logesh45 logesh45 reopened this Nov 9, 2025
@logesh45
Copy link
Contributor Author

logesh45 commented Nov 9, 2025

@maxime-desroches This is the PR I was talking about today. I cant run the main openpilot repo GitHub tests with these changes because this is a submodule.

Copy link

@KRRT7 KRRT7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice :)

Copy link
Collaborator

@jyoung8607 jyoung8607 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a meaningful improvement! I started looking at the same problem. I'm not sure this is the best or prettiest way to solve it, but it seems to be the simplest.

@adeebshihadeh This sidesteps a great deal of duplicate parsing work. Right now for every single car port we have to parse the full text of the same DBC file several times: two or three CANParsers, a CANDefine, and a CANPacker or two. That's a problem for ports with big DBCs, like Ford with a 12,700 line DBC. Then multiply all that for hypothesis.

Before, on my workstation, pretty consistent:

(opendbc) jyoung@jy-workstation-ubuntu:~/openpilot/opendbc_repo$ pytest -n1 opendbc/car/tests/test_car_interfaces.py -k test_car_interfaces[FORD_MAVERICK_MK1]
Cannot read termcap database;
using dumb terminal settings.
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.12.0, pytest-8.4.2, pluggy-1.6.0
Using --randomly-seed=2812254386
rootdir: /home/jyoung/openpilot/opendbc_repo
configfile: pyproject.toml
plugins: randomly-4.0.1, cov-7.0.0, xdist-3.7.1.dev24+g2b4372b, subtests-0.15.0, mock-3.15.1, hypothesis-6.47.5
initialized: 1/1 workerCannot read termcap database;
using dumb terminal settings.
1 worker [1 item]      
.                                                                                                                                                                                [100%]
================================================================================= slowest 10 durations =================================================================================
1.01s call     opendbc/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces[FORD_MAVERICK_MK1]

(2 durations < 0.005s hidden.  Use -vv to show these durations.)
================================================================================== 1 passed in 1.59s ===================================================================================

After, on my workstation, also pretty consistent:

(opendbc) jyoung@jy-workstation-ubuntu:~/openpilot/opendbc_repo$ pytest -n1 opendbc/car/tests/test_car_interfaces.py -k test_car_interfaces[FORD_MAVERICK_MK1]
Cannot read termcap database;
using dumb terminal settings.
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.12.0, pytest-8.4.2, pluggy-1.6.0
Using --randomly-seed=133558739
rootdir: /home/jyoung/openpilot/opendbc_repo
configfile: pyproject.toml
plugins: randomly-4.0.1, cov-7.0.0, xdist-3.7.1.dev24+g2b4372b, subtests-0.15.0, mock-3.15.1, hypothesis-6.47.5
initialized: 1/1 workerCannot read termcap database;
using dumb terminal settings.
1 worker [1 item]      
.                                                                                                                                                                                [100%]
================================================================================= slowest 10 durations =================================================================================
0.14s call     opendbc/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces[FORD_MAVERICK_MK1]

(2 durations < 0.005s hidden.  Use -vv to show these durations.)
================================================================================== 1 passed in 0.68s ===================================================================================

Comment on lines 217 to 219
if not strings:
return set()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Is it necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change handles the case where strings could be None, preventing a potential TypeError when trying to check strings[0] on line 220. This makes the code more robust against unexpected inputs.

Current tests pass empty lists, other code paths might pass None at a later time, and this prevents runtime errors

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's type hint strings to make it clear then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this is a separate, unrelated change to this PR? you can open a new PR for it

@sshane sshane mentioned this pull request Nov 30, 2025
@sshane
Copy link
Contributor

sshane commented Nov 30, 2025

You can just decorate with functools.cache, no? #2922

@sshane
Copy link
Contributor

sshane commented Nov 30, 2025

you usually get better results if you first try to simplify, then optimize! Firstly, we weren't using any of the benefits of dataclass here, re-implementing __init__. I also think DBC.from_file which would have handled the memoization in 5 lines or so would have been a better pattern than making a function separate to the class. See how https://github.com/commaai/openpilot/blob/85a162dd4320f14cb8aba1bcd7ff427d9b753ce5/system/ui/lib/wifi_manager.py#L80-L93 works since it does some preprocessing before storing in the frozen dataclass

But to make it super simple, just going to memoize it like this for now

@sshane sshane merged commit 18c1f08 into commaai:master Nov 30, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can related to CAN tools, aka opendbc/can/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants