Labels: query-engine (Query Engine / Transform related tasks), query-engine-columnar (Columnar query engine which uses DataFusion to process OTAP Batches)
Description
In #1342 we implemented a version of our columnar query engine (which I'm still somewhat hacking on). I thought it would be good to have an issue to track future work on this including feature gaps, performance optimizations, future ideation/experimentation, etc.
(Note: the list below is still a work in progress and is not a comprehensive list of TODOs. More may be added in the future, and for some of these we can/will create dedicated issues.)
Feature gaps:
- Additional signal support
  - Metrics
  - Traces
- Attribute transformations:
  - Set attribute
    - from literals (`extend attributes["X"] = "Y"`)
    - from other expressions (things like `extend attributes["event"] = <some_field> / some_func(<some_field>) / <etc.>`)
  - Rename attributes (`project-rename`)
  - Drop attributes (`project-away`)
- Filtering:
  - Literal on the LHS of a binary expression not supported (`where "WARN" == severity_text` does not work)
  - Filter by body
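For the literal-first filter gap (`where "WARN" == severity_text`), one option might be to normalize comparisons before building the plan, so the rest of the planner only ever sees `<column> <op> <literal>`. The sketch below is only an illustration of that idea: `Expr`, `CmpOp`, `Comparison` and `normalize` are hypothetical names made up for this example, not the engine's actual types and not DataFusion APIs.

```rust
/// Hypothetical, simplified operand type (NOT the engine's real expression tree).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(String),
}

/// Hypothetical comparison operators.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CmpOp {
    Eq,
    NotEq,
    Lt,
    LtEq,
    Gt,
    GtEq,
}

impl CmpOp {
    /// Operator that keeps the comparison equivalent once the operands are swapped.
    pub fn flipped(self) -> CmpOp {
        match self {
            CmpOp::Eq => CmpOp::Eq,
            CmpOp::NotEq => CmpOp::NotEq,
            CmpOp::Lt => CmpOp::Gt,
            CmpOp::LtEq => CmpOp::GtEq,
            CmpOp::Gt => CmpOp::Lt,
            CmpOp::GtEq => CmpOp::LtEq,
        }
    }
}

/// Hypothetical binary comparison: `left op right`.
#[derive(Debug, Clone, PartialEq)]
pub struct Comparison {
    pub left: Expr,
    pub op: CmpOp,
    pub right: Expr,
}

/// Rewrites `<literal> <op> <column>` into `<column> <flipped op> <literal>`.
/// Any other operand combination is returned unchanged.
pub fn normalize(cmp: Comparison) -> Comparison {
    let literal_first = matches!(
        (&cmp.left, &cmp.right),
        (Expr::Literal(_), Expr::Column(_))
    );
    if literal_first {
        Comparison {
            left: cmp.right,
            op: cmp.op.flipped(),
            right: cmp.left,
        }
    } else {
        cmp
    }
}
```

Ordering operators have to be mirrored when the operands swap (`"a" < col` becomes `col > "a"`), which is why `flipped` is not a no-op for everything except `Eq` and `NotEq`.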
Plan Construction & Execution: (I will add issues to expand on these bullet points in the near future)
- Columnar query engine design top level API #1417
- Mechanism for `ExecutionPlan`s to access current batch for varying payload types #1409
- Custom `ExecutionPlan` impl for filtering by attributes #1410
- Perf: remove the `window(row_number())` on the root batch scan
- Optimize the post-filtering applied to child batches
- mechanism to perform a "full replan" when schema changes are not compatible with current plan
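To make the "full replan" item a bit more concrete, here is a rough sketch of what the check could look like: keep the schema the plan was built against, and rebuild only when an incoming batch is no longer compatible with it. Everything here is an assumption for illustration: `Schema` is a plain column-name-to-type-name map rather than an Arrow schema, `CachedPlan` stands in for whatever the engine actually caches, and the compatibility rule (every planned column still present with the same type, extra columns ignored) is just one possible definition.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a batch schema: column name -> type name.
pub type Schema = BTreeMap<String, String>;

/// Hypothetical cached plan. It only remembers the schema it was planned
/// against and how many times it has been (re)built.
pub struct CachedPlan {
    pub planned_for: Schema,
    pub builds: usize,
}

/// Assumed compatibility rule: every column the plan was built against is
/// still present with the same type. Extra incoming columns are ignored.
pub fn is_compatible(planned: &Schema, incoming: &Schema) -> bool {
    planned
        .iter()
        .all(|(name, ty)| incoming.get(name) == Some(ty))
}

impl CachedPlan {
    /// Builds the initial plan for `schema`.
    pub fn new(schema: Schema) -> Self {
        CachedPlan {
            planned_for: schema,
            builds: 1,
        }
    }

    /// Reuses the current plan when the incoming schema is compatible,
    /// otherwise performs a full replan. Returns `true` if a replan happened.
    pub fn ensure_for(&mut self, incoming: &Schema) -> bool {
        if is_compatible(&self.planned_for, incoming) {
            return false;
        }
        // A real implementation would rebuild the execution plan here.
        self.planned_for = incoming.clone();
        self.builds += 1;
        true
    }
}
```

With this rule, a batch that only gains a column keeps the existing plan, while a batch whose column changes type (for example a plain string column becoming dictionary-encoded) triggers a rebuild.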
Further Exploration
- supporting nested streams pipelines (iterating metrics datapoints)
- filtering by nested streams (filter metrics by datapoints)
- support fork (copy data and send down two pipelines)