Commit f6c6422

updated the refresh functions
1 parent 47b11aa commit f6c6422

3 files changed, +95 -6 lines changed

01_data.qmd

Lines changed: 1 addition & 1 deletion
@@ -119,6 +119,6 @@ d2 |>
 us_gas <- d2
 saveRDS(us_gas, file = "./data/us_gas.RDS")
 
-us_gas_csv <- us_gas |> dplyr::select(area_name, process_name, date, description, value)
+us_gas_csv <- us_gas |> dplyr::select(area_name, process, process_name, date, description, value)
 write.csv(us_gas_csv, "./data/us_gas.csv", row.names = FALSE)
 ```

02_features_engineering.html

Lines changed: 68 additions & 5 deletions
@@ -7,7 +7,7 @@
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 <meta name="author" content="Rami Krispin">
-<meta name="dcterms.date" content="2024-10-13">
+<meta name="dcterms.date" content="2024-10-15">
 
 <title>Features Engineering</title>
 <style>
@@ -98,7 +98,7 @@ <h1 class="title">Features Engineering</h1>
 <div>
 <div class="quarto-title-meta-heading">Published</div>
 <div class="quarto-title-meta-contents">
-<p class="date">October 13, 2024</p>
+<p class="date">October 15, 2024</p>
 </div>
 </div>
 
@@ -397,11 +397,74 @@ <h2 class="anchored" data-anchor-id="features-engineering">Features Engineering<
 6 0.1429389 Alabama VRS -2.950452 -0.1183790 0.8885529</code></pre>
 </div>
 </div>
+<p>Scale the features table:</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a>features_scale <span class="ot">&lt;-</span> <span class="fu">cbind</span>(<span class="fu">scale</span>(features[, <span class="dv">1</span><span class="sc">:</span><span class="dv">25</span>]), features[, <span class="fu">c</span>(<span class="st">"area_name"</span>, <span class="st">"process"</span>)])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>Calculate the K-means and merge it back to features table:</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1" aria-hidden="true" tabindex="-1"></a>km2 <span class="ot">&lt;-</span> <span class="fu">kmeans</span>(features_scale[, <span class="dv">1</span><span class="sc">:</span><span class="dv">25</span>], <span class="at">centers =</span> <span class="dv">2</span>, <span class="at">nstart =</span> <span class="dv">25</span>)</span>
+<span id="cb29-2"><a href="#cb29-2" aria-hidden="true" tabindex="-1"></a>km3 <span class="ot">&lt;-</span> <span class="fu">kmeans</span>(features_scale[, <span class="dv">1</span><span class="sc">:</span><span class="dv">25</span>], <span class="at">centers =</span> <span class="dv">3</span>, <span class="at">nstart =</span> <span class="dv">25</span>)</span>
+<span id="cb29-3"><a href="#cb29-3" aria-hidden="true" tabindex="-1"></a>km4 <span class="ot">&lt;-</span> <span class="fu">kmeans</span>(features_scale[, <span class="dv">1</span><span class="sc">:</span><span class="dv">25</span>], <span class="at">centers =</span> <span class="dv">4</span>, <span class="at">nstart =</span> <span class="dv">25</span>)</span>
+<span id="cb29-4"><a href="#cb29-4" aria-hidden="true" tabindex="-1"></a>km5 <span class="ot">&lt;-</span> <span class="fu">kmeans</span>(features_scale[, <span class="dv">1</span><span class="sc">:</span><span class="dv">25</span>], <span class="at">centers =</span> <span class="dv">5</span>, <span class="at">nstart =</span> <span class="dv">25</span>)</span>
+<span id="cb29-5"><a href="#cb29-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb29-6"><a href="#cb29-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb29-7"><a href="#cb29-7" aria-hidden="true" tabindex="-1"></a>features<span class="sc">$</span>cluster2 <span class="ot">&lt;-</span> km2[<span class="dv">1</span>]<span class="sc">$</span>cluster</span>
+<span id="cb29-8"><a href="#cb29-8" aria-hidden="true" tabindex="-1"></a>features<span class="sc">$</span>cluster3 <span class="ot">&lt;-</span> km3[<span class="dv">1</span>]<span class="sc">$</span>cluster</span>
+<span id="cb29-9"><a href="#cb29-9" aria-hidden="true" tabindex="-1"></a>features<span class="sc">$</span>cluster4 <span class="ot">&lt;-</span> km4[<span class="dv">1</span>]<span class="sc">$</span>cluster</span>
+<span id="cb29-10"><a href="#cb29-10" aria-hidden="true" tabindex="-1"></a>features<span class="sc">$</span>cluster5 <span class="ot">&lt;-</span> km5[<span class="dv">1</span>]<span class="sc">$</span>cluster</span>
+<span id="cb29-11"><a href="#cb29-11" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb29-12"><a href="#cb29-12" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span>(features)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
419+
<pre><code> trend spike linearity curvature e_acf1 e_acf10 entropy
420+
1 0.003057269 9.712750e-06 -0.9752748 -0.1708456 0.7523232 2.4133825 0.5646704
421+
2 0.948030396 6.726470e-07 -7.2015516 -3.0015766 0.4296742 0.4436997 0.4504870
422+
3 0.800618791 1.125195e-06 14.2253302 -3.1554623 0.5867871 0.8027259 0.3008994
423+
4 0.836803825 6.086026e-07 14.8085737 -1.5848186 0.3569556 1.0992692 0.2368032
424+
5 0.850064046 6.811054e-07 13.7873300 4.5584124 0.5766747 1.0069159 0.1810873
425+
6 0.055090260 9.327703e-06 -4.6036276 -0.2804776 0.7735770 2.5141998 0.5991920
426+
x_acf1 x_acf10 diff1_acf1 diff1_acf10 diff2_acf1 diff2_acf10 arch_stat
427+
1 0.7530430 2.389684 0.29888389 0.57436812 -0.3554251 0.1948968 0.3704131
428+
2 0.9643920 7.019090 -0.18144404 0.07357371 -0.6279015 0.5012790 0.8498117
429+
3 0.9053211 5.520940 0.02418422 0.32177527 -0.4643220 0.3928694 0.7123761
430+
4 0.8877402 6.144797 -0.07201543 0.69852439 -0.4692256 0.4904081 0.5880839
431+
5 0.9334901 6.614953 -0.30579509 0.33282521 -0.6917163 0.8276553 0.6707924
432+
6 0.7845562 2.122313 0.45377070 0.79226036 -0.1257789 0.1624230 0.5199401
433+
embed2_incircle_1 embed2_incircle_2 ac_9 firstmin_ac trev_num
434+
1 0 0 -0.07868176 6 -7.229022e+07
435+
2 0 0 0.72951504 74 8.126437e+00
436+
3 0 0 0.68210089 4 -9.869577e+09
437+
4 0 0 0.67427730 3 -9.078644e+10
438+
5 0 0 0.77306245 6 -1.963808e+08
439+
6 0 0 -0.04204759 6 8.863228e+08
440+
motiftwo_entro3 walker_propcross nonlinearity x_pacf5 diff1x_pacf5
441+
1 1.5666745 0.1807512 0.9068600 0.9851775 0.20617291
442+
2 0.7702653 0.1264368 0.1082076 0.9606749 0.04383207
443+
3 1.2237333 0.2624113 0.1502996 0.9320975 0.15499032
444+
4 1.2732961 0.3120567 0.9465369 1.1507212 0.43904278
445+
5 1.2211373 0.2021277 0.5132332 1.0126882 0.16944682
446+
6 1.5166308 0.1666667 0.4290233 1.0753667 0.33320892
447+
diff2x_pacf5 area_name process PC1 PC2 PC3 cluster2
448+
1 0.2335997 Alabama VCS -2.130900 -1.3036161 0.9384032 2
449+
2 0.6195178 Alabama VDV 4.626793 1.4217552 1.0713550 1
450+
3 0.5246779 Alabama VEU 2.536517 0.1059892 -1.3959387 1
451+
4 0.7932931 Alabama VGT 2.799819 0.3110900 -2.1326149 1
452+
5 0.6569552 Alabama VIN 4.075354 0.3044582 -1.6657347 1
453+
6 0.1429389 Alabama VRS -2.950452 -0.1183790 0.8885529 2
454+
cluster3 cluster4 cluster5
455+
1 1 4 4
456+
2 2 2 1
457+
3 3 1 2
458+
4 3 1 2
459+
5 2 1 2
460+
6 1 3 3</code></pre>
461+
</div>
462+
</div>
 <p>Save the features table:</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a><span class="fu">saveRDS</span>(features, <span class="at">file =</span> <span class="st">"./data/features.RDS"</span>)</span>
-<span id="cb28-2"><a href="#cb28-2" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb28-3"><a href="#cb28-3" aria-hidden="true" tabindex="-1"></a><span class="fu">write.csv</span>(features, <span class="st">"./data/features.csv"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb31"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb31-1"><a href="#cb31-1" aria-hidden="true" tabindex="-1"></a><span class="fu">saveRDS</span>(features, <span class="at">file =</span> <span class="st">"./data/features.RDS"</span>)</span>
+<span id="cb31-2"><a href="#cb31-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb31-3"><a href="#cb31-3" aria-hidden="true" tabindex="-1"></a><span class="fu">write.csv</span>(features, <span class="st">"./data/features.csv"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 
02_features_engineering.qmd

Lines changed: 26 additions & 0 deletions
@@ -150,6 +150,32 @@ head(features)
 ```
 
 
+
+Scale the features table:
+
+```{r}
+features_scale <- cbind(scale(features[, 1:25]), features[, c("area_name", "process")])
+
+```
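Note on the scaling step: `scale()` centers each column and divides it by its standard deviation, so a column with zero variance turns into `NaN`, and `kmeans()` does not accept missing values. The sketch below is only an illustration on a made-up table; `toy`, `flat`, `keep`, and `scaled_ok` are hypothetical names, not objects from this commit, and the real `features` table may not contain any constant column at all.

```r
# Hypothetical stand-in for the numeric feature columns (not the real data).
set.seed(1)
toy <- data.frame(
  trend = runif(10),
  spike = runif(10) * 1e-5,
  flat  = rep(0, 10)  # zero-variance column
)

# scale() centers each column and divides by its standard deviation;
# the zero-variance column becomes NaN (0 / 0).
scaled <- scale(toy)
anyNA(scaled)

# One possible guard: keep only columns with positive variance, then scale.
keep <- vapply(toy, function(x) stats::sd(x) > 0, logical(1))
scaled_ok <- scale(toy[, keep, drop = FALSE])
```

Whether such a guard is needed here depends on the full data; the `head(features)` output alone does not settle it.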
+Calculate the K-means and merge it back to features table:
+```{r}
+km2 <- kmeans(features_scale[, 1:25], centers = 2, nstart = 25)
+km3 <- kmeans(features_scale[, 1:25], centers = 3, nstart = 25)
+km4 <- kmeans(features_scale[, 1:25], centers = 4, nstart = 25)
+km5 <- kmeans(features_scale[, 1:25], centers = 5, nstart = 25)
+
+
+features$cluster2 <- km2[1]$cluster
+features$cluster3 <- km3[1]$cluster
+features$cluster4 <- km4[1]$cluster
+features$cluster5 <- km5[1]$cluster
+
+head(features)
+```
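Note on the K-means step: `kmeans()` picks random starting centers, so cluster labels can differ between refreshes unless a seed is set, and the assignments can be read directly as `km$cluster`. The sketch below shows one way to fit several values of `k` in a loop with a fixed seed and to compare the fits. It runs on simulated data; `x`, `ks`, `fits`, `clusters`, and `wss` are hypothetical names, and the seed value is an arbitrary assumption, none of it taken from this commit.

```r
# Hypothetical stand-in for the scaled feature matrix (not the real data).
set.seed(123)
x <- scale(rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
))

# Fit K-means for k = 2..5; a fixed seed makes the random starts repeatable.
ks <- 2:5
set.seed(123)
fits <- lapply(ks, function(k) kmeans(x, centers = k, nstart = 25))
names(fits) <- paste0("cluster", ks)

# km$cluster is the vector of assignments, one entry per row of x.
clusters <- as.data.frame(lapply(fits, function(km) km$cluster))

# Total within-cluster sum of squares per k, for an elbow-style comparison.
wss <- vapply(fits, function(km) km$tot.withinss, numeric(1))
```

The columns of `clusters` could then be bound onto a features table with `cbind()`. The label numbers themselves are arbitrary, so cluster 1 under `k = 2` has no fixed relation to cluster 1 under `k = 3`.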
+
+
+
+
 Save the features table:
 
 ```{r}
