feat(llmobs): emitting reasoning tokens metric for openai integration #15478
base: main
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 247 ± 2 ms.
The average import time from base is: 251 ± 5 ms.
The import time difference between this PR and base is: -3.6 ± 0.2 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate xinyuan/add-resoning-tokens (4951688) with baseline main (9ab207c)

📈 Performance Regressions (3 suites)

📈 iastaspects - 118/118
✅ add_aspect
  Time: ✅ 0.403µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.1%
  Memory: ✅ 40.243MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ add_inplace_aspect
  Time: ✅ 0.409µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.7%
  Memory: ✅ 40.166MB (SLO: <41.500MB -3.2%) vs baseline: +4.5%
✅ add_inplace_noaspect
  Time: ✅ 0.314µs (SLO: <10.000µs 📉 -96.9%) vs baseline: -2.4%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +4.3%
✅ add_noaspect
  Time: ✅ 0.277µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -0.4%
  Memory: ✅ 40.104MB (SLO: <41.500MB -3.4%) vs baseline: +4.5%
✅ bytearray_aspect
  Time: ✅ 1.361µs (SLO: <10.000µs 📉 -86.4%) vs baseline: -0.3%
  Memory: ✅ 40.163MB (SLO: <41.500MB -3.2%) vs baseline: +4.9%
✅ bytearray_extend_aspect
  Time: ✅ 1.509µs (SLO: <10.000µs 📉 -84.9%) vs baseline: -0.3%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +4.7%
✅ bytearray_extend_noaspect
  Time: ✅ 0.614µs (SLO: <10.000µs 📉 -93.9%) vs baseline: +0.7%
  Memory: ✅ 40.146MB (SLO: <41.500MB -3.3%) vs baseline: +3.8%
✅ bytearray_noaspect
  Time: ✅ 0.484µs (SLO: <10.000µs 📉 -95.2%) vs baseline: +0.2%
  Memory: ✅ 40.185MB (SLO: <41.500MB -3.2%) vs baseline: +4.3%
✅ bytes_aspect
  Time: ✅ 1.291µs (SLO: <10.000µs 📉 -87.1%) vs baseline: +0.7%
  Memory: ✅ 39.967MB (SLO: <41.500MB -3.7%) vs baseline: +4.1%
✅ bytes_noaspect
  Time: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.8%
  Memory: ✅ 40.029MB (SLO: <41.500MB -3.5%) vs baseline: +3.6%
✅ bytesio_aspect
  Time: ✅ 1.362µs (SLO: <10.000µs 📉 -86.4%) vs baseline: +1.7%
  Memory: ✅ 40.124MB (SLO: <41.500MB -3.3%) vs baseline: +4.5%
✅ bytesio_noaspect
  Time: ✅ 0.499µs (SLO: <10.000µs 📉 -95.0%) vs baseline: -0.1%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +3.8%
✅ capitalize_aspect
  Time: ✅ 0.741µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +0.2%
  Memory: ✅ 40.203MB (SLO: <41.500MB -3.1%) vs baseline: +3.8%
✅ capitalize_noaspect
  Time: ✅ 0.438µs (SLO: <10.000µs 📉 -95.6%) vs baseline: ~same
  Memory: ✅ 40.186MB (SLO: <41.500MB -3.2%) vs baseline: +3.8%
✅ casefold_aspect
  Time: ✅ 0.737µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +0.3%
  Memory: ✅ 40.144MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ casefold_noaspect
  Time: ✅ 0.370µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.1%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +3.8%
✅ decode_aspect
  Time: ✅ 0.730µs (SLO: <10.000µs 📉 -92.7%) vs baseline: +0.6%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.1%
✅ decode_noaspect
  Time: ✅ 0.418µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.5%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +3.9%
✅ encode_aspect
  Time: ✅ 0.712µs (SLO: <10.000µs 📉 -92.9%) vs baseline: +0.6%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +3.8%
✅ encode_noaspect
  Time: ✅ 0.400µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -2.5%
  Memory: ✅ 40.029MB (SLO: <41.500MB -3.5%) vs baseline: +3.5%
✅ format_aspect
  Time: ✅ 3.439µs (SLO: <10.000µs 📉 -65.6%) vs baseline: +0.5%
  Memory: ✅ 40.127MB (SLO: <41.500MB -3.3%) vs baseline: +3.9%
✅ format_map_aspect
  Time: ✅ 3.561µs (SLO: <10.000µs 📉 -64.4%) vs baseline: -1.7%
  Memory: ✅ 40.144MB (SLO: <41.500MB -3.3%) vs baseline: +4.5%
✅ format_map_noaspect
  Time: ✅ 0.770µs (SLO: <10.000µs 📉 -92.3%) vs baseline: -0.2%
  Memory: ✅ 40.324MB (SLO: <41.500MB -2.8%) vs baseline: +4.3%
✅ format_noaspect
  Time: ✅ 0.596µs (SLO: <10.000µs 📉 -94.0%) vs baseline: -0.1%
  Memory: ✅ 40.029MB (SLO: <41.500MB -3.5%) vs baseline: +3.4%
✅ index_aspect
  Time: ✅ 0.363µs (SLO: <10.000µs 📉 -96.4%) vs baseline: +1.5%
  Memory: ✅ 40.144MB (SLO: <41.500MB -3.3%) vs baseline: +4.6%
✅ index_noaspect
  Time: ✅ 0.277µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -2.0%
  Memory: ✅ 40.226MB (SLO: <41.500MB -3.1%) vs baseline: +3.9%
✅ join_aspect
  Time: ✅ 1.342µs (SLO: <10.000µs 📉 -86.6%) vs baseline: +0.7%
  Memory: ✅ 40.185MB (SLO: <41.500MB -3.2%) vs baseline: +4.4%
✅ join_noaspect
  Time: ✅ 0.491µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.2%
  Memory: ✅ 40.203MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ ljust_aspect
  Time: ✅ 2.590µs (SLO: <20.000µs 📉 -87.1%) vs baseline: +1.6%
  Memory: ✅ 40.245MB (SLO: <41.500MB -3.0%) vs baseline: +3.7%
✅ ljust_noaspect
  Time: ✅ 0.407µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.2%
  Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +4.0%
✅ lower_aspect
  Time: ✅ 2.302µs (SLO: <10.000µs 📉 -77.0%) vs baseline: +2.5%
  Memory: ✅ 40.363MB (SLO: <41.500MB -2.7%) vs baseline: +4.5%
✅ lower_noaspect
  Time: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.4%
  Memory: ✅ 40.442MB (SLO: <41.500MB -2.5%) vs baseline: +4.6%
✅ lstrip_aspect
  Time: ✅ 2.277µs (SLO: <20.000µs 📉 -88.6%) vs baseline: +1.1%
  Memory: ✅ 40.186MB (SLO: <41.500MB -3.2%) vs baseline: +3.7%
✅ lstrip_noaspect
  Time: ✅ 0.382µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +0.5%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +4.2%
✅ modulo_aspect
  Time: ✅ 1.046µs (SLO: <10.000µs 📉 -89.5%) vs baseline: +4.4%
  Memory: ✅ 40.245MB (SLO: <41.500MB -3.0%) vs baseline: +4.0%
✅ modulo_aspect_for_bytearray_bytearray
  Time: ✅ 1.544µs (SLO: <10.000µs 📉 -84.6%) vs baseline: -0.9%
  Memory: ✅ 40.088MB (SLO: <41.500MB -3.4%) vs baseline: +3.8%
✅ modulo_aspect_for_bytes
  Time: ✅ 0.980µs (SLO: <10.000µs 📉 -90.2%) vs baseline: +0.6%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.1%
✅ modulo_aspect_for_bytes_bytearray
  Time: ✅ 1.269µs (SLO: <10.000µs 📉 -87.3%) vs baseline: +4.4%
  Memory: ✅ 40.187MB (SLO: <41.500MB -3.2%) vs baseline: +4.9%
✅ modulo_noaspect
  Time: ✅ 0.626µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.8%
  Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.1%
✅ replace_aspect
  Time: ✅ 4.860µs (SLO: <10.000µs 📉 -51.4%) vs baseline: -0.5%
  Memory: ✅ 40.205MB (SLO: <41.500MB -3.1%) vs baseline: +3.9%
✅ replace_noaspect
  Time: ✅ 0.464µs (SLO: <10.000µs 📉 -95.4%) vs baseline: +0.5%
  Memory: ✅ 40.186MB (SLO: <41.500MB -3.2%) vs baseline: +3.8%
✅ repr_aspect
  Time: ✅ 0.910µs (SLO: <10.000µs 📉 -90.9%) vs baseline: +0.7%
  Memory: ✅ 40.203MB (SLO: <41.500MB -3.1%) vs baseline: +4.5%
✅ repr_noaspect
  Time: ✅ 0.417µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.8%
  Memory: ✅ 40.104MB (SLO: <41.500MB -3.4%) vs baseline: +4.7%
✅ rstrip_aspect
  Time: ✅ 1.948µs (SLO: <20.000µs 📉 -90.3%) vs baseline: +1.2%
  Memory: ✅ 40.108MB (SLO: <41.500MB -3.4%) vs baseline: +4.0%
✅ rstrip_noaspect
  Time: ✅ 0.378µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -1.6%
  Memory: ✅ 40.127MB (SLO: <41.500MB -3.3%) vs baseline: +3.6%
✅ slice_aspect
  Time: ✅ 0.491µs (SLO: <10.000µs 📉 -95.1%) vs baseline: ~same
  Memory: ✅ 40.341MB (SLO: <41.500MB -2.8%) vs baseline: +5.1%
✅ slice_noaspect
  Time: ✅ 0.452µs (SLO: <10.000µs 📉 -95.5%) vs baseline: +1.7%
  Memory: ✅ 40.383MB (SLO: <41.500MB -2.7%) vs baseline: +4.5%
✅ stringio_aspect
  Time: ✅ 1.536µs (SLO: <10.000µs 📉 -84.6%) vs baseline: ~same
  Memory: ✅ 40.304MB (SLO: <41.500MB -2.9%) vs baseline: +5.0%
✅ stringio_noaspect
  Time: ✅ 0.719µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -0.5%
  Memory: ✅ 40.265MB (SLO: <41.500MB -3.0%) vs baseline: +4.1%
✅ strip_aspect
  Time: ✅ 2.255µs (SLO: <20.000µs 📉 -88.7%) vs baseline: +1.7%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +3.8%
✅ strip_noaspect
  Time: ✅ 0.384µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.6%
  Memory: ✅ 40.305MB (SLO: <41.500MB -2.9%) vs baseline: +4.4%
✅ swapcase_aspect
  Time: ✅ 2.780µs (SLO: <10.000µs 📉 -72.2%) vs baseline: 📈 +13.2%
  Memory: ✅ 40.225MB (SLO: <41.500MB -3.1%) vs baseline: +4.0%
✅ swapcase_noaspect
  Time: ✅ 0.533µs (SLO: <10.000µs 📉 -94.7%) vs baseline: -1.2%
  Memory: ✅ 40.245MB (SLO: <41.500MB -3.0%) vs baseline: +4.1%
✅ title_aspect
  Time: ✅ 2.428µs (SLO: <10.000µs 📉 -75.7%) vs baseline: +2.0%
  Memory: ✅ 40.147MB (SLO: <41.500MB -3.3%) vs baseline: +3.7%
✅ title_noaspect
  Time: ✅ 0.503µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.3%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.0%
✅ translate_aspect
  Time: ✅ 3.299µs (SLO: <10.000µs 📉 -67.0%) vs baseline: -0.2%
  Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +3.8%
✅ translate_noaspect
  Time: ✅ 1.040µs (SLO: <10.000µs 📉 -89.6%) vs baseline: +0.1%
  Memory: ✅ 40.402MB (SLO: <41.500MB -2.6%) vs baseline: +4.4%
✅ upper_aspect
  Time: ✅ 2.310µs (SLO: <10.000µs 📉 -76.9%) vs baseline: +2.3%
  Memory: ✅ 40.304MB (SLO: <41.500MB -2.9%) vs baseline: +4.1%
✅ upper_noaspect
  Time: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.4%
  Memory: ✅ 40.167MB (SLO: <41.500MB -3.2%) vs baseline: +3.8%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect
  Time: ✅ 5.161µs (SLO: <10.000µs 📉 -48.4%) vs baseline: 📈 +24.8%
  Memory: ✅ 40.246MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +4.7%
✅ ospathbasename_noaspect
  Time: ✅ 1.078µs (SLO: <10.000µs 📉 -89.2%) vs baseline: ~same
  Memory: ✅ 40.187MB (SLO: <41.000MB 🟡 -2.0%) vs baseline: +4.6%
✅ ospathjoin_aspect
  Time: ✅ 6.135µs (SLO: <10.000µs 📉 -38.6%) vs baseline: -1.1%
  Memory: ✅ 40.226MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.7%
✅ ospathjoin_noaspect
  Time: ✅ 2.283µs (SLO: <10.000µs 📉 -77.2%) vs baseline: +0.1%
  Memory: ✅ 40.187MB (SLO: <41.000MB 🟡 -2.0%) vs baseline: +4.6%
✅ ospathnormcase_aspect
  Time: ✅ 3.435µs (SLO: <10.000µs 📉 -65.7%) vs baseline: +0.7%
  Memory: ✅ 40.147MB (SLO: <41.000MB -2.1%) vs baseline: +4.7%
✅ ospathnormcase_noaspect
  Time: ✅ 0.566µs (SLO: <10.000µs 📉 -94.3%) vs baseline: -1.5%
  Memory: ✅ 40.246MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +4.8%
✅ ospathsplit_aspect
  Time: ✅ 4.754µs (SLO: <10.000µs 📉 -52.5%) vs baseline: -0.6%
  Memory: ✅ 40.246MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +4.9%
✅ ospathsplit_noaspect
  Time: ✅ 1.588µs (SLO: <10.000µs 📉 -84.1%) vs baseline: -0.6%
  Memory: ✅ 40.265MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +5.0%
✅ ospathsplitdrive_aspect
  Time: ✅ 3.644µs (SLO: <10.000µs 📉 -63.6%) vs baseline: ~same
  Memory: ✅ 40.206MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.7%
✅ ospathsplitdrive_noaspect
  Time: ✅ 0.701µs (SLO: <10.000µs 📉 -93.0%) vs baseline: -0.2%
  Memory: ✅ 40.049MB (SLO: <41.000MB -2.3%) vs baseline: +4.4%
✅ ospathsplitext_aspect
  Time: ✅ 4.496µs (SLO: <10.000µs 📉 -55.0%) vs baseline: -0.3%
  Memory: ✅ 40.226MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.6%
✅ ospathsplitext_noaspect
  Time: ✅ 1.385µs (SLO: <10.000µs 📉 -86.2%) vs baseline: +0.2%
  Memory: ✅ 40.364MB (SLO: <41.000MB 🟡 -1.6%) vs baseline: +5.0%

📈 telemetryaddmetric - 30/30
✅ 1-count-metric-1-times
  Time: ✅ 3.395µs (SLO: <20.000µs 📉 -83.0%) vs baseline: 📈 +15.6%
  Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +5.0%
✅ 1-count-metrics-100-times
  Time: ✅ 201.419µs (SLO: <220.000µs -8.4%) vs baseline: +0.4%
  Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +5.1%
✅ 1-distribution-metric-1-times
  Time: ✅ 3.316µs (SLO: <20.000µs 📉 -83.4%) vs baseline: +0.3%
  Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +5.0%
✅ 1-distribution-metrics-100-times
  Time: ✅ 217.674µs (SLO: <230.000µs -5.4%) vs baseline: -0.7%
  Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.7%
✅ 1-gauge-metric-1-times
  Time: ✅ 2.185µs (SLO: <20.000µs 📉 -89.1%) vs baseline: -0.2%
  Memory: ✅ 34.701MB (SLO: <35.500MB -2.2%) vs baseline: +4.7%
✅ 1-gauge-metrics-100-times
  Time: ✅ 136.267µs (SLO: <150.000µs -9.2%) vs baseline: -0.3%
  Memory: ✅ 34.741MB (SLO: <35.500MB -2.1%) vs baseline: +5.1%
✅ 1-rate-metric-1-times
  Time: ✅ 3.091µs (SLO: <20.000µs 📉 -84.5%) vs baseline: +0.1%
  Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +4.7%
✅ 1-rate-metrics-100-times
  Time: ✅ 214.611µs (SLO: <250.000µs 📉 -14.2%) vs baseline: -1.0%
  Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.9%
✅ 100-count-metrics-100-times
  Time: ✅ 20.406ms (SLO: <22.000ms -7.2%) vs baseline: +0.8%
  Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +5.0%
✅ 100-distribution-metrics-100-times
  Time: ✅ 2.266ms (SLO: <2.300ms 🟡 -1.5%) vs baseline: -0.2%
  Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +4.8%
✅ 100-gauge-metrics-100-times
  Time: ✅ 1.418ms (SLO: <1.550ms -8.5%) vs baseline: +1.2%
  Memory: ✅ 34.701MB (SLO: <35.500MB -2.2%) vs baseline: +4.4%
✅ 100-rate-metrics-100-times
  Time: ✅ 2.227ms (SLO: <2.550ms 📉 -12.7%) vs baseline: +0.6%
  Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +4.7%
✅ flush-1-metric
  Time: ✅ 4.616µs (SLO: <20.000µs 📉 -76.9%) vs baseline: ~same
  Memory: ✅ 35.075MB (SLO: <35.500MB 🟡 -1.2%) vs baseline: +4.6%
✅ flush-100-metrics
  Time: ✅ 173.593µs (SLO: <250.000µs 📉 -30.6%) vs baseline: -0.3%
  Memory: ✅ 35.232MB (SLO: <35.500MB 🟡 -0.8%) vs baseline: +5.1%
✅ flush-1000-metrics
  Time: ✅ 2.188ms (SLO: <2.500ms 📉 -12.5%) vs baseline: ~same
  Memory: ✅ 35.861MB (SLO: <36.500MB 🟡 -1.7%) vs baseline: +4.6%

🟡 Near SLO Breach (17 suites)
🟡 coreapiscenario - 10/10 (1 unstable)
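
The percentage shown next to each SLO limit in the report above appears to be the measured value's distance from that limit, relative to the limit. This is inferred from the published figures (for example 0.403µs against a 10.000µs limit giving -96.0%), not from the benchmark tooling itself, and the function name `slo_margin_percent` is hypothetical. A minimal sketch of that arithmetic:

```python
def slo_margin_percent(measured: float, slo_limit: float) -> float:
    """Distance of a measurement from its SLO limit, as a percentage of the limit.

    Negative values mean the measurement is under the limit (headroom);
    values close to zero mean the benchmark is near a breach.
    This formula is an assumption inferred from the report's numbers.
    """
    return round((measured - slo_limit) / slo_limit * 100, 1)


# Figures taken from the report: (measured, SLO limit), both in the same unit.
samples = {
    "add_aspect time": (0.403, 10.0),
    "add_aspect memory": (40.243, 41.5),
    "1-count-metrics-100-times time": (201.419, 220.0),
    "100-distribution-metrics-100-times time": (2.266, 2.3),
}

for name, (measured, limit) in samples.items():
    print(f"{name}: {slo_margin_percent(measured, limit)}%")
```

The "vs baseline" column is a separate comparison against the `main` run and is not derived from the SLO limit.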
Description
Emit reasoning tokens as a metric for the OpenAI integration, and remove them from metadata.
MLOB-4264
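
The change described above can be sketched as follows. This is an illustration only, not the code in this PR: it works on plain dictionaries shaped like an OpenAI `usage` payload, where reasoning tokens are reported under `completion_tokens_details.reasoning_tokens` (Chat Completions) or `output_tokens_details.reasoning_tokens` (Responses). The function names, the metric key `reasoning_output_tokens`, and the metadata key `reasoning_tokens` are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical key names, chosen for illustration only.
REASONING_METRIC_KEY = "reasoning_output_tokens"
REASONING_METADATA_KEY = "reasoning_tokens"


def extract_reasoning_tokens(usage: Dict[str, Any]) -> Optional[int]:
    """Return the reasoning token count from an OpenAI-style usage dict, if present."""
    for details_key in ("completion_tokens_details", "output_tokens_details"):
        details = usage.get(details_key) or {}
        value = details.get("reasoning_tokens")
        if value is not None:
            return int(value)
    return None


def build_metrics_and_metadata(
    usage: Dict[str, Any], metadata: Dict[str, Any]
) -> Tuple[Dict[str, int], Dict[str, Any]]:
    """Report reasoning tokens as a metric and drop them from metadata."""
    metrics: Dict[str, int] = {}
    # Map usage fields to metric names (metric names are assumptions).
    for metric_name, usage_key in (
        ("input_tokens", "prompt_tokens"),
        ("output_tokens", "completion_tokens"),
        ("total_tokens", "total_tokens"),
    ):
        if usage.get(usage_key) is not None:
            metrics[metric_name] = int(usage[usage_key])

    reasoning = extract_reasoning_tokens(usage)
    if reasoning is not None:
        metrics[REASONING_METRIC_KEY] = reasoning

    # Copy metadata without the reasoning token entry, leaving the input untouched.
    cleaned = {k: v for k, v in metadata.items() if k != REASONING_METADATA_KEY}
    return metrics, cleaned


usage = {
    "prompt_tokens": 12,
    "completion_tokens": 40,
    "total_tokens": 52,
    "completion_tokens_details": {"reasoning_tokens": 25},
}
metadata = {"temperature": 0.2, "reasoning_tokens": 25}

metrics, cleaned = build_metrics_and_metadata(usage, metadata)
print(metrics)
print(cleaned)
```

Keeping the count in metrics rather than metadata means it can be aggregated like the other token counts; responses without reasoning details simply omit the metric.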
Testing
Risks
Additional Notes