Description
The query rewrite implementation is close to supporting joins in the existing code, but additional work is required:
- While `SpjNormalForm` already supports queries with joins, the `ViewMatchingRewriter` needs to be generalized to take into account queries with multiple tables.
- Due to joins potentially producing duplicates, we must check if the query being matched has the same set of tables as the materialized view being considered. This is explained in further detail in the paper.
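As a rough illustration of the table-set check described above, here is a minimal sketch using only the standard library. The function name `same_table_set` and the representation of tables as plain string names are assumptions made for this example, not part of the existing code, and the sketch assumes each table is referenced at most once (no self-joins).

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of the existing rewriter): returns true
/// when the query and the materialized view reference exactly the same
/// set of tables, regardless of the order in which they are listed.
/// Assumes each table appears at most once on each side.
fn same_table_set(query_tables: &[&str], view_tables: &[&str]) -> bool {
    let query: BTreeSet<&str> = query_tables.iter().copied().collect();
    let view: BTreeSet<&str> = view_tables.iter().copied().collect();
    query == view
}
```

With this sketch, a view over `customers` and `orders` would be a candidate for a query over `orders` and `customers`, but a view that additionally joins a third table would be rejected.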
It is possible to relax the same-set-of-tables requirement somewhat by implementing the duplication factor test in the paper (see sections 3.1.5 and 3.2). This would let us substitute materialized views whose tables are a superset of the original query's tables. However, it will require using DataFusion's constraints to check for the appropriate uniqueness and foreign key constraints; hence, this is not a strict generalization. Furthermore, foreign key constraints are not currently supported in DataFusion, so some work will be needed there.
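To make the relaxed condition concrete, the sketch below computes which tables a view has beyond those of the query. It is only the first step of such a check: each extra table would still have to pass the constraint-based test described in the paper, which is not implemented here. The name `extra_view_tables` and the string-based table representation are hypothetical.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: if the view's tables are a superset of the query's
/// tables, returns the extra tables that appear only in the view. Returns
/// `None` when the view is missing a table that the query needs.
///
/// Each returned table would still need to be validated against uniqueness
/// and foreign key constraints before the view could be substituted; that
/// validation is deliberately left out of this sketch.
fn extra_view_tables<'a>(
    query_tables: &[&'a str],
    view_tables: &[&'a str],
) -> Option<BTreeSet<&'a str>> {
    let query: BTreeSet<&'a str> = query_tables.iter().copied().collect();
    let view: BTreeSet<&'a str> = view_tables.iter().copied().collect();
    if !query.is_subset(&view) {
        return None;
    }
    Some(view.difference(&query).copied().collect())
}
```

An empty result corresponds to the strict same-set case, in which no constraint checks are needed.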
Future work may also add the filter tree proposed in the paper, but IMO this should be done separately as it is an optimization for large numbers of materialized views.
It's also worth noting that the query rewriting algorithm only really works for inner joins, as it relies on their equivalence to a cross join followed by select/project/filter. There is another paper that builds on prior work (including the previous paper) to extend the approach to outer joins by introducing join-disjunctive normal form. However, it is much more technical, e.g. it introduces the notion of minimum union and describes how to reduce these into normal unions.
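The inner-join equivalence mentioned above can be illustrated on small in-memory relations. This is a toy sketch, not DataFusion code: the `Row` type, the function names, and the choice of joining on the first tuple field are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Toy row type for this example: (join key, payload).
type Row<'a> = (u32, &'a str);

/// Inner equi-join on the key field, implemented as a hash join.
fn hash_inner_join<'a>(left: &[Row<'a>], right: &[Row<'a>]) -> Vec<(Row<'a>, Row<'a>)> {
    // Build an index from join key to the matching right-hand rows.
    let mut index: HashMap<u32, Vec<Row<'a>>> = HashMap::new();
    for r in right {
        index.entry(r.0).or_default().push(*r);
    }
    // Probe the index with each left-hand row, preserving input order.
    let mut out = Vec::new();
    for l in left {
        if let Some(matches) = index.get(&l.0) {
            for r in matches {
                out.push((*l, *r));
            }
        }
    }
    out
}

/// The same join expressed as a cross join followed by a filter.
fn cross_join_then_filter<'a>(left: &[Row<'a>], right: &[Row<'a>]) -> Vec<(Row<'a>, Row<'a>)> {
    let mut product = Vec::new();
    for l in left {
        for r in right {
            product.push((*l, *r));
        }
    }
    // Keep only the pairs whose join keys are equal.
    product.into_iter().filter(|(l, r)| l.0 == r.0).collect()
}
```

Both functions return the same pairs for any input. No such rewrite exists for an outer join, because unmatched rows padded with nulls do not appear anywhere in the cross product.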