-
Notifications
You must be signed in to change notification settings - Fork 3
Expand file tree
/
Copy path04-scheduling.html
More file actions
353 lines (329 loc) · 18.7 KB
/
04-scheduling.html
File metadata and controls
353 lines (329 loc) · 18.7 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title># Scheduling - egglog tutorial</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
line-height: 1.6;
color: #333;
max-width: 900px;
margin: 0 auto;
padding: 0 20px;
background-color: #fafafa;
}
.tutorial-nav {
background: #2c3e50;
color: white;
padding: 1rem 0;
margin: -20px -20px 2rem -20px;
border-bottom: 3px solid #3498db;
}
.nav-container {
max-width: 900px;
margin: 0 auto;
padding: 0 20px;
}
.tutorial-nav h1 {
margin: 0 0 0.5rem 0;
color: #ecf0f1;
font-size: 1.5rem;
}
.nav-links a {
color: #3498db;
text-decoration: none;
margin: 0 0.5rem;
padding: 0.3rem 0.6rem;
border-radius: 4px;
transition: background-color 0.2s;
}
.nav-links a:hover {
background-color: rgba(52, 152, 219, 0.2);
}
.nav-links .current {
background-color: #3498db;
color: white;
padding: 0.3rem 0.6rem;
border-radius: 4px;
margin: 0 0.5rem;
}
h1 {
color: #2c3e50;
border-bottom: 2px solid #3498db;
padding-bottom: 0.5rem;
}
p {
margin: 1rem 0;
text-align: justify;
}
code {
background: #f8f9fa;
border: 1px solid #e9ecef;
border-radius: 3px;
padding: 0.2rem 0.4rem;
font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
font-size: 0.9rem;
}
pre {
background: #f8f9fa;
border: 1px solid #e9ecef;
border-radius: 6px;
padding: 1rem;
overflow-x: auto;
margin: 1.5rem 0;
}
pre code {
background: none;
border: none;
padding: 0;
font-size: 0.9rem;
line-height: 1.4;
}
/* egglog syntax highlighting */
.keyword { color: #e74c3c; font-weight: bold; }
.language-egglog {
color: #2c3e50;
}
footer {
margin-top: 3rem;
padding: 2rem 0;
text-align: center;
border-top: 1px solid #e9ecef;
color: #6c757d;
font-size: 0.9rem;
}
@media (max-width: 600px) {
.nav-links {
font-size: 0.9rem;
}
.nav-links a, .nav-links .current {
margin: 0 0.2rem;
padding: 0.2rem 0.4rem;
}
}
</style>
</head>
<body>
<nav class="tutorial-nav">
<div class="nav-container">
<h1>Tutorial for egglog</h1>
<div class="nav-links">
<a href="01-basics.html">01 Basics</a> | <a href="02-datalog.html">02 Datalog</a> | <a href="03-analysis.html">03 Analysis</a> | <span class="current">04 Scheduling</span> | <a href="05-cost-model-and-extraction.html">05 Cost Model And Extraction</a> | <a href="06-case-study.html">06 Case Study</a>
</div>
</div>
</nav>
<main>
<h1># Scheduling</h1>
<p>In this lesson, we will learn how to use <code>run-schedule</code> to improve the performance of egglog.
We start by using the same language as the previous lesson.</p>
<pre><code class="language-egglog">(<span class="keyword">datatype</span> Expr
(Num BigRat)
(Var String)
(Add Expr Expr)
(Mul Expr Expr)
(Div Expr Expr))
(<span class="keyword">let</span> zero (bigrat (bigint 0) (bigint 1)))
(<span class="keyword">let</span> one (bigrat (bigint 1) (bigint 1)))
(<span class="keyword">let</span> two (bigrat (bigint 2) (bigint 1)))</code></pre>
<p><strong>Rulesets</strong></p>
<p>Different from lesson 3, we organize our rules into "rulesets"
A ruleset is exactly what it sounds like; a set of rules.
We can declare rulesets using the <code>ruleset</code> command.</p>
<pre><code class="language-egglog">(<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">ruleset</span> analysis)</code></pre>
<p>We can add rules to rulesets using the <code>:ruleset</code> annotation on <code>rule</code>s, <code>rewrite</code>s, and <code>birewrite</code>s.
Leaving off <code>:ruleset</code> causes the rule to be added to the default ruleset, which is what we've shown so far.
We can run rulesets using <code>(run ruleset iters)</code>, or <code>(run iters)</code> for running the default ruleset.</p>
<p>Here, we add <code>leq</code> rules to the <code>analysis</code> ruleset, because they don't add new <code>Expr</code> nodes to the e-graph.</p>
<pre><code class="language-egglog">(<span class="keyword">relation</span> leq (Expr Expr))
(<span class="keyword">rule</span> ((leq e1 e2) (leq e2 e3)) ((leq e1 e3)) :<span class="keyword">ruleset</span> analysis)
(<span class="keyword">rule</span> ((= e1 (Num n1)) (= e2 (Num n2)) (<= n1 n2)) ((leq e1 e2)) :<span class="keyword">ruleset</span> analysis)
(<span class="keyword">rule</span> ((= v (Var x))) ((leq v v)) :<span class="keyword">ruleset</span> analysis)
(<span class="keyword">rule</span> ((= e1 (Add e1a e1b)) (= e2 (Add e2a e2b)) (leq e1a e2a) (leq e1b e2b))
((leq e1 e2))
:<span class="keyword">ruleset</span> analysis)</code></pre>
<p>In contrast, the following axiomatic rules are doing optimizations, so we add them to the <code>optimizations</code> ruleset.</p>
<pre><code class="language-egglog">(<span class="keyword">birewrite</span> (Add x (Add y z)) (Add (Add x y) z) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">birewrite</span> (Mul x (Mul y z)) (Mul (Mul x y) z) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Add x y) (Add y x) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Mul x y) (Mul y x) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Mul x (Add y z)) (Add (Mul x y) (Mul x z)) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Add x (Num zero)) x :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Mul x (Num one)) x :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Add (Num a) (Num b)) (Num (+ a b)) :<span class="keyword">ruleset</span> optimizations)
(<span class="keyword">rewrite</span> (Mul (Num a) (Num b)) (Num (* a b)) :<span class="keyword">ruleset</span> optimizations)</code></pre>
<p>It is tedious and error-prone to manually annotate each rule with a ruleset,
so <code>egglog-experimental</code> provides a convenience syntax <code>with-ruleset</code>:</p>
<pre><code> (with-ruleset optimizations
(birewrite (Add x (Add y z)) (Add (Add x y) z))
...
)
</code></pre>
<p>From now on, we will use this syntax.</p>
<p>Here we add the rest of the rules from the last section, but tagged with the appropriate rulesets.</p>
<pre><code class="language-egglog">(<span class="keyword">function</span> upper-bound (Expr) BigRat :merge (min old new))
(<span class="keyword">function</span> lower-bound (Expr) BigRat :merge (max old new))
(<span class="keyword">with-ruleset</span> analysis
(<span class="keyword">rule</span> ((leq e (Num n))) ((<span class="keyword">set</span> (upper-bound e) n)))
(<span class="keyword">rule</span> ((leq (Num n) e)) ((<span class="keyword">set</span> (lower-bound e) n)))
(<span class="keyword">rule</span> ((= e (Add e1 e2)) (= (upper-bound e1) u1) (= (upper-bound e2) u2))
((<span class="keyword">set</span> (upper-bound e) (+ u1 u2))))
(<span class="keyword">rule</span> ((= e (Add e1 e2)) (= (lower-bound e1) l1) (= (lower-bound e2) l2))
((<span class="keyword">set</span> (lower-bound e) (+ l1 l2))))
(<span class="keyword">rule</span> ((= e (Mul e1 e2))
(= le1 (lower-bound e1)) (= le2 (lower-bound e2))
(= ue1 (upper-bound e1)) (= ue2 (upper-bound e2)))
((<span class="keyword">set</span> (lower-bound e)
(min (* le1 le2) (min (* le1 ue2) (min (* ue1 le2) (* ue1 ue2)))))
(<span class="keyword">set</span> (upper-bound e)
(max (* le1 le2) (max (* le1 ue2) (max (* ue1 le2) (* ue1 ue2))))))
)
)
(<span class="keyword">relation</span> non-zero (Expr))
(<span class="keyword">with-ruleset</span> analysis
(<span class="keyword">rule</span> ((< (upper-bound e) zero)) ((non-zero e)))
(<span class="keyword">rule</span> ((> (lower-bound e) zero)) ((non-zero e)))
)</code></pre>
<p>Finally, we have optimization rules that depend on the analysis rules we defined above.</p>
<pre><code class="language-egglog">(<span class="keyword">with-ruleset</span> optimizations
(<span class="keyword">rewrite</span> (Div x x) (Num one) :<span class="keyword">when</span> ((non-zero x)))
(<span class="keyword">rewrite</span> (Mul x (Div y x)) y :<span class="keyword">when</span> ((non-zero x)))
)</code></pre>
<p>Now consider the following program, which consists of a long sequence of additions <em>inside</em>
a cancelling division.</p>
<pre><code class="language-egglog">(<span class="keyword">push</span>)
; a + (b + (c + (d + (e + f))))
(<span class="keyword">let</span> addition-chain
(Add (Var "a") (Add (Var "b")
(Add (Var "c") (Add (Var "d")
(Add (Var "e") (Var "f")))))))
; 1 + 1 + 1 + 1 + 2
(<span class="keyword">let</span> nonzero-expr
(Add (Num one) (Add (Num one)
(Add (Num one) (Add (Num one) (Num two))))))
(<span class="keyword">let</span> expr (Mul nonzero-expr (Div addition-chain nonzero-expr)))</code></pre>
<p>We want the following check to pass after running the rules.</p>
<pre><code class="language-egglog">(<span class="keyword">fail</span> (<span class="keyword">check</span> (= expr addition-chain)))</code></pre>
<p>To make this check pass, we have to first discover that <code>nonzero-expr</code> is indeed non-zero,
which allows the rule from <code>(Mul x (Div y x))</code> to <code>y</code> to fire.
On the other hand, if we apply the optimization rules, we risk the exponential blowup from
the associative and commutative permutations of the <code>addition-chain</code>.</p>
<p>Therefore, if we try to run both rulesets directly, egglog will spend lots of effort reassociating and
commuting the terms in the <code>addition-chain</code>, even though the optimization that we actually
want to run only takes one iteration. However, that optimization requires knowing a fact
that takes multiple iterations to compute (propagating lower- and upper-bounds
through <code>nonzero-expr</code>). We can build a more efficient <em>schedule</em>.</p>
<pre><code class="language-egglog">(<span class="keyword">push</span>)</code></pre>
<p><strong>Schedules</strong></p>
<p>Our schedule starts by saturating the analysis rules, fully propagating the <code>non-zero</code> information <em>without</em>
adding any e-nodes to the e-graph.</p>
<pre><code class="language-egglog">(<span class="keyword">run-schedule</span> (saturate (<span class="keyword">run</span> analysis)))</code></pre>
<p>Then, just run one iteration of the <code>optimizations</code> ruleset.</p>
<pre><code class="language-egglog">(<span class="keyword">run</span> optimizations 1)</code></pre>
<p>Or equivalently,</p>
<pre><code> (run-schedule
(saturate (run analysis))
(run optimizations))
</code></pre>
<p>This makes our check pass</p>
<pre><code class="language-egglog">(<span class="keyword">check</span> (= expr addition-chain))
(<span class="keyword">pop</span>)</code></pre>
<p>While the above program is effective at optimizing that specific program, it would fail if
we had a slightly more complex program where we had to interleave the optimizations and analyses
to derive the optimal program.
For expressing more complex schedules like these, <code>egglog</code> supports a scheduling sub-language,
with primitives <code>repeat</code>, <code>seq</code>, <code>saturate</code>, and <code>run</code>.</p>
<p>The idea behind the following schedule is to always saturate analyses before running optimizations.
This combination is wrapped in a <code>repeat</code> block to give us control over how long to run egglog.
With <code>repeat 1</code> it is the same schedule as before, but now we can increase the iteration
count if we want to optimize harder with more time and space budget.</p>
<pre><code class="language-egglog">(<span class="keyword">run-schedule</span>
(<span class="keyword">repeat</span> 2
(saturate (<span class="keyword">run</span> analysis))
(<span class="keyword">run</span> optimizations)))
(<span class="keyword">pop</span>)</code></pre>
<p>Running more iterations does not help our above example per se,
but if we had started with a slightly more complex program to optimize...</p>
<pre><code class="language-egglog">; a + (b + (c + (d + (e + f))))
(<span class="keyword">let</span> addition-chain
(Add (Var "a") (Add (Var "b")
(Add (Var "c") (Add (Var "d")
(Add (Var "e") (Var "f")))))))
; x * 0
(<span class="keyword">let</span> x-times-zero (Mul (Var "x") (Num zero)))
(<span class="keyword">let</span> nonzero-expr
(Add (Num one) (Add (Num one)
(Add (Num one) (Add (Num one)
x-times-zero)))))
(<span class="keyword">let</span> expr (Mul nonzero-expr (Div addition-chain nonzero-expr)))</code></pre>
<p>For the purpose of this example, we add this rule</p>
<pre><code class="language-egglog">(<span class="keyword">rewrite</span> (Mul x (Num zero)) (Num zero) :<span class="keyword">ruleset</span> optimizations)</code></pre>
<p>To prove <code>expr</code> is equivalent to <code>addition-chian</code> by applying the cancellation law,
we need to prove <code>nonzero-expr</code> is nonzero, which requires proving
<code>x-times-zero</code>'s bound.
To show <code>x-time-zero</code>'s bound, we need to apply an optimization rule to rewrite
it to 0.
In other words, this requires running analyses in between two runs of optimization rules
(the cancellation law and <code>Mul</code>'s identity law)</p>
<p>Therefore, only running our schedule with one iteration (<code>repeat 1</code>) does not give us the optimal program.</p>
<pre><code class="language-egglog">(<span class="keyword">push</span>)
(<span class="keyword">run-schedule</span>
(<span class="keyword">repeat</span> 1
(saturate (<span class="keyword">run</span> analysis))
(<span class="keyword">run</span> optimizations)))
(<span class="keyword">extract</span> expr)
(<span class="keyword">pop</span>)</code></pre>
<p>Instead, we need to increase the iteration number.</p>
<pre><code class="language-egglog">(<span class="keyword">push</span>)
(<span class="keyword">run-schedule</span>
(<span class="keyword">repeat</span> 2
(saturate (<span class="keyword">run</span> analysis))
(<span class="keyword">run</span> optimizations)))
(<span class="keyword">extract</span> expr)
(<span class="keyword">pop</span>)</code></pre>
<p><strong>Using custom schedulers</strong></p>
<p>However, sometimes just having an iteration number does not give you enough control.
For example, for many rules, such as associativity and commutativity (AC), the size of the e-graph grows hyper-exponentially
with respect to the number of iterations. </p>
<p>Let's go back to this example, and run for five iterations.</p>
<pre><code class="language-egglog">(<span class="keyword">push</span>)
(<span class="keyword">run-schedule</span>
(<span class="keyword">repeat</span> 5
(saturate (<span class="keyword">run</span> analysis))
(<span class="keyword">run</span> optimizations)))
(<span class="keyword">print-size</span> Mul)
(<span class="keyword">pop</span>)</code></pre>
<p>At iteration 5, the <code>Mul</code> function has size 582. However, if we bump that to 6,
the size of the <code>Mul</code> function will increase to 13285! Therefore, the iteration number is too coarse
of a granularity for defining the search space.</p>
<p>To this end, egglog provides a scheduler mechanism. A scheduler can decide which matches are important and need to be applied,
while others can be delayed or skipped. To use scheduler, there are two operations: <code>let-scheduler</code> and <code>run-with</code>.
<code>(let-scheduler sched ..)</code> instantiates a scheduler <code>sched</code>, and <code>(run-with sched ruleset)</code> rules a ruleset with that scheduler.</p>
<p>Currently, <code>egglog-experimental</code> implements one scheduler, <code>back-off</code>. (We will implement our own scheduler
in Section 6.) The idea of <code>back-off</code> is that it will ban a rule from applying if that rule grows the
e-graph too fast. The decision to ban is based on a threshold, which is initially small and increases as rules are banned.
This scheduler works well when the ruleset contains explosive rules like AC.</p>
<p>In this example, the back-off scheduler can prevent the associativity rule
from dominating the equality saturation: when the the associativity rule (or any other rule) is fired too much,
the scheduler will automatically ban this rule for a few iterations, so that other rules can catch up.</p>
<pre><code class="language-egglog">(<span class="keyword">run-schedule</span>
(<span class="keyword">let-scheduler</span> bo (back-off))
(<span class="keyword">repeat</span> 10 (<span class="keyword">run-with</span> bo optimizations)))
(<span class="keyword">print-size</span> Mul)</code></pre>
<p>It is important that the scheduler <code>bo</code> is instantiated outside the <code>repeat</code> loop, since each scheduler carries some state that is updated
when run. For example, the following schedule has a very semantics than the schedule above.</p>
<pre><code> (run-schedule
(repeat 10
(let-scheduler bo (back-off))
(run-with bo optimizations)))
</code></pre>
<p>This schedule instantiates a (fresh) <code>back-off</code> scheduler for each <code>run-with</code>, so the ten iterations of rulesets are all run
with the initial configuration of the <code>back-off</code> scheduler, which has a very low threshold for banning rules.</p>
</main>
<footer>
<p>Generated from <code>04-scheduling.egg</code> | <a href="https://github.com/egraphs-good/egglog" target="_blank">The egglog project</a></p>
</footer>
</body>
</html>