Commit 966ca22

[doc] use standard Flink docker image in QuickStart (#1997)
1 parent 61c4ece commit 966ca22

File tree

2 files changed (+133 -25 lines)

website/docs/quickstart/flink.md

Lines changed: 85 additions & 18 deletions
@@ -33,8 +33,19 @@ mkdir fluss-quickstart-flink
 cd fluss-quickstart-flink
 ```
 
-2. Create a `docker-compose.yml` file with the following content:
+2. Create a `lib` directory and download the required jar files. You can adjust the Flink version as needed. Please make sure to download the compatible versions of [fluss-flink connector jar](/downloads) and [flink-connector-faker](https://github.com/knaufk/flink-faker/releases)
 
+```shell
+export FLINK_VERSION="1.20"
+```
+
+```shell
+mkdir lib
+wget -O lib/flink-faker-0.5.3.jar https://github.com/knaufk/flink-faker/releases/download/v0.5.3/flink-faker-0.5.3.jar
+wget -O "lib/fluss-flink-${FLINK_VERSION}-$FLUSS_DOCKER_VERSION$.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-flink-${FLINK_VERSION}/$FLUSS_DOCKER_VERSION$/fluss-flink-${FLINK_VERSION}-$FLUSS_DOCKER_VERSION$.jar"
+```
+
+3. Create a `docker-compose.yml` file with the following content:
 
 ```yaml
 services:
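The `$FLUSS_DOCKER_VERSION$` token in the added `wget` line appears to be a docs-site placeholder for the current Fluss release rather than a shell variable. As a minimal sketch only, assuming a hypothetical release `0.6.0` (check the downloads page for the real version), the resolved download step would look roughly like this:

```shell
# Illustrative only: replace 0.6.0 with the actual Fluss release listed on the downloads page.
export FLINK_VERSION="1.20"
mkdir -p lib
wget -O lib/flink-faker-0.5.3.jar https://github.com/knaufk/flink-faker/releases/download/v0.5.3/flink-faker-0.5.3.jar
wget -O "lib/fluss-flink-${FLINK_VERSION}-0.6.0.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-flink-${FLINK_VERSION}/0.6.0/fluss-flink-${FLINK_VERSION}-0.6.0.jar"
```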
@@ -69,36 +80,48 @@ services:
   #end
   #begin Flink cluster
   jobmanager:
-    image: apache/fluss-quickstart-flink:1.20-$FLUSS_DOCKER_VERSION$
+    image: flink:${FLINK_VERSION}
     ports:
       - "8083:8081"
-    command: jobmanager
     environment:
       - |
         FLINK_PROPERTIES=
         jobmanager.rpc.address: jobmanager
+    entrypoint: ["sh", "-c", "cp -v /tmp/lib/*.jar /opt/flink/lib && exec /docker-entrypoint.sh jobmanager"]
+    volumes:
+      - ./lib:/tmp/lib
   taskmanager:
-    image: apache/fluss-quickstart-flink:1.20-$FLUSS_DOCKER_VERSION$
+    image: flink:${FLINK_VERSION}
     depends_on:
       - jobmanager
-    command: taskmanager
     environment:
       - |
         FLINK_PROPERTIES=
         jobmanager.rpc.address: jobmanager
         taskmanager.numberOfTaskSlots: 10
         taskmanager.memory.process.size: 2048m
         taskmanager.memory.framework.off-heap.size: 256m
+    entrypoint: ["sh", "-c", "cp -v /tmp/lib/*.jar /opt/flink/lib && exec /docker-entrypoint.sh taskmanager"]
+    volumes:
+      - ./lib:/tmp/lib
+  sql-client:
+    image: flink:${FLINK_VERSION}
+    depends_on:
+      - jobmanager
+    environment:
+      - |
+        FLINK_PROPERTIES=
+        jobmanager.rpc.address: jobmanager
+        rest.address: jobmanager
+    entrypoint: ["sh", "-c", "cp -v /tmp/lib/*.jar /opt/flink/lib && exec /docker-entrypoint.sh bin/sql-client.sh"]
+    volumes:
+      - ./lib:/tmp/lib
   #end
 ```
 
 The Docker Compose environment consists of the following containers:
 - **Fluss Cluster:** a Fluss `CoordinatorServer`, a Fluss `TabletServer` and a `ZooKeeper` server.
-- **Flink Cluster**: a Flink `JobManager` and a Flink `TaskManager` container to execute queries.
-
-**Note:** The `apache/fluss-quickstart-flink` image is based on [flink:1.20.1-java17](https://hub.docker.com/layers/library/flink/1.20-java17/images/sha256:bf1af6406c4f4ad8faa46efe2b3d0a0bf811d1034849c42c1e3484712bc83505) and
-includes the [fluss-flink](engine-flink/getting-started.md) and
-[flink-connector-faker](https://flink-packages.org/packages/flink-faker) to simplify this guide.
+- **Flink Cluster**: a Flink `JobManager`, a Flink `TaskManager`, and a Flink SQL client container to execute queries.
 
 3. To start all containers, run:
 ```shell
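The overridden `entrypoint` in each Flink service copies the mounted jars from `/tmp/lib` into `/opt/flink/lib` before handing control back to the stock `/docker-entrypoint.sh`. A quick sanity check along these lines (illustrative, not part of the commit) can confirm the connector jars landed on the classpath once the containers are up:

```shell
# Illustrative check: after `docker compose up -d`, list the Flink classpath
# inside the jobmanager container and look for the copied connector jars.
docker compose exec jobmanager ls /opt/flink/lib | grep -iE 'fluss|faker'
```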
@@ -116,7 +139,6 @@ You can also visit http://localhost:8083/ to see if Flink is running normally.
 
 :::note
 - If you want to additionally use an observability stack, follow one of the provided quickstart guides [here](maintenance/observability/quickstart.md) and then continue with this guide.
-- If you want to run with your own Flink environment, remember to download the [fluss-flink connector jar](/downloads), [flink-connector-faker](https://github.com/knaufk/flink-faker/releases) and then put them to `FLINK_HOME/lib/`.
 - All the following commands involving `docker compose` should be executed in the created working directory that contains the `docker-compose.yml` file.
 :::
 
@@ -125,25 +147,70 @@ Congratulations, you are all set!
 ## Enter into SQL-Client
 First, use the following command to enter the Flink SQL CLI Container:
 ```shell
-docker compose exec jobmanager ./sql-client
+docker compose run sql-client
 ```
 
-**Note**:
-To simplify this guide, three temporary tables have been pre-created with `faker` connector to generate data.
-You can view their schemas by running the following commands:
+To simplify this guide, we will create three temporary tables with `faker` connector to generate data:
 
 ```sql title="Flink SQL"
-SHOW CREATE TABLE source_customer;
+CREATE TEMPORARY TABLE source_order (
+    `order_key` BIGINT,
+    `cust_key` INT,
+    `total_price` DECIMAL(15, 2),
+    `order_date` DATE,
+    `order_priority` STRING,
+    `clerk` STRING
+) WITH (
+    'connector' = 'faker',
+    'rows-per-second' = '10',
+    'number-of-rows' = '10000',
+    'fields.order_key.expression' = '#{number.numberBetween ''0'',''100000000''}',
+    'fields.cust_key.expression' = '#{number.numberBetween ''0'',''20''}',
+    'fields.total_price.expression' = '#{number.randomDouble ''3'',''1'',''1000''}',
+    'fields.order_date.expression' = '#{date.past ''100'' ''DAYS''}',
+    'fields.order_priority.expression' = '#{regexify ''(low|medium|high){1}''}',
+    'fields.clerk.expression' = '#{regexify ''(Clerk1|Clerk2|Clerk3|Clerk4){1}''}'
+);
 ```
 
 ```sql title="Flink SQL"
-SHOW CREATE TABLE source_order;
+CREATE TEMPORARY TABLE source_customer (
+    `cust_key` INT,
+    `name` STRING,
+    `phone` STRING,
+    `nation_key` INT NOT NULL,
+    `acctbal` DECIMAL(15, 2),
+    `mktsegment` STRING,
+    PRIMARY KEY (`cust_key`) NOT ENFORCED
+) WITH (
+    'connector' = 'faker',
+    'number-of-rows' = '200',
+    'fields.cust_key.expression' = '#{number.numberBetween ''0'',''20''}',
+    'fields.name.expression' = '#{funnyName.name}',
+    'fields.nation_key.expression' = '#{number.numberBetween ''1'',''5''}',
+    'fields.phone.expression' = '#{phoneNumber.cellPhone}',
+    'fields.acctbal.expression' = '#{number.randomDouble ''3'',''1'',''1000''}',
+    'fields.mktsegment.expression' = '#{regexify ''(AUTOMOBILE|BUILDING|FURNITURE|MACHINERY|HOUSEHOLD){1}''}'
+);
 ```
 
 ```sql title="Flink SQL"
-SHOW CREATE TABLE source_nation;
+CREATE TEMPORARY TABLE `source_nation` (
+    `nation_key` INT NOT NULL,
+    `name` STRING,
+    PRIMARY KEY (`nation_key`) NOT ENFORCED
+) WITH (
+    'connector' = 'faker',
+    'number-of-rows' = '100',
+    'fields.nation_key.expression' = '#{number.numberBetween ''1'',''5''}',
+    'fields.name.expression' = '#{regexify ''(CANADA|JORDAN|CHINA|UNITED|INDIA){1}''}'
+);
 ```
 
+```sql title="Flink SQL"
+-- drop records silently if a null value would have to be inserted into a NOT NULL column
+SET 'table.exec.sink.not-null-enforcer'='DROP';
+```
 
 ## Create Fluss Tables
 ### Create Fluss Catalog

website/docs/quickstart/lakehouse.md

Lines changed: 48 additions & 7 deletions
@@ -332,6 +332,10 @@ For further information how to store catalog configurations, see [Flink's Catalo
 :::
 
 ### Create Tables
+<Tabs groupId="lake-tabs">
+<TabItem value="paimon" label="Paimon" default>
+
+
 Running the following SQL to create Fluss tables to be used in this guide:
 ```sql title="Flink SQL"
 CREATE TABLE fluss_order (
@@ -366,6 +370,46 @@ CREATE TABLE fluss_nation (
 );
 ```
 
+</TabItem>
+
+<TabItem value="iceberg" label="Iceberg">
+
+
+Running the following SQL to create Fluss tables to be used in this guide:
+```sql title="Flink SQL"
+CREATE TABLE fluss_order (
+    `order_key` BIGINT,
+    `cust_key` INT NOT NULL,
+    `total_price` DECIMAL(15, 2),
+    `order_date` DATE,
+    `order_priority` STRING,
+    `clerk` STRING,
+    `ptime` AS PROCTIME()
+);
+```
+
+```sql title="Flink SQL"
+CREATE TABLE fluss_customer (
+    `cust_key` INT NOT NULL,
+    `name` STRING,
+    `phone` STRING,
+    `nation_key` INT NOT NULL,
+    `acctbal` DECIMAL(15, 2),
+    `mktsegment` STRING,
+    PRIMARY KEY (`cust_key`) NOT ENFORCED
+);
+```
+
+```sql title="Flink SQL"
+CREATE TABLE fluss_nation (
+    `nation_key` INT NOT NULL,
+    `name` STRING,
+    PRIMARY KEY (`nation_key`) NOT ENFORCED
+);
+```
+
+</TabItem>
+</Tabs>
 ## Streaming into Fluss
 
 First, run the following SQL to sync data from source tables to Fluss tables:
@@ -520,13 +564,10 @@ SELECT o.order_key,
        c.acctbal,
        c.mktsegment,
        n.name
-FROM (
-    SELECT *, PROCTIME() as ptime
-    FROM `default_catalog`.`default_database`.source_order
-) o
-LEFT JOIN fluss_customer FOR SYSTEM_TIME AS OF o.ptime AS c
+FROM fluss_order o
+LEFT JOIN fluss_customer FOR SYSTEM_TIME AS OF `o`.`ptime` AS `c`
     ON o.cust_key = c.cust_key
-LEFT JOIN fluss_nation FOR SYSTEM_TIME AS OF o.ptime AS n
+LEFT JOIN fluss_nation FOR SYSTEM_TIME AS OF `o`.`ptime` AS `n`
     ON c.nation_key = n.nation_key;
 ```
 
@@ -714,4 +755,4 @@ After finishing the tutorial, run `exit` to exit Flink SQL CLI Container and the
 ```shell
 docker compose down -v
 ```
-to stop all containers.
+to stop all containers.
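Since this revision of the quickstart downloads the connector jars into a local `lib/` directory instead of baking them into a custom image, a full cleanup can also remove that directory. This step is optional and illustrative, not part of the commit:

```shell
# Optional: remove the locally downloaded connector jars created in the setup step.
rm -rf lib
```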
