Add documentation for Zeppelin with Spark on Kubernetes #21

Open
echarles wants to merge 9 commits into apache-spark-on-k8s:master from datalayer-externals:zeppelin
@echarles
Member

First draft documentation to further discuss and prepare the WIP for Zeppelin with Spark on Kubernetes.

@erikerlandson
Member

Is the transparent background a potential problem? Presumably fine against a white background, but could that change?

@erikerlandson
Member

I can't pull the netlify link up - does anybody else have that issue?

@erikerlandson
Member

The doc looks good - I am wondering if we should include this while it is experimental. Or somehow tag this doc as experimental. @foxish what do you think?

@foxish
Member

foxish commented Nov 16, 2017

I like the idea of marking as experimental and getting it out. It would help us garner feedback. If someone can verify the working of the tutorial in its current state, we can go ahead.

@foxish
Member

foxish commented Nov 16, 2017

@echarles, would you be open to demo-ing this at next week's SIG meeting? It would help a lot of us understand where this effort is at.
cc/ @felixcheung

@felixcheung left a comment

I think it's cool to document this, although it could be hard to maintain if we reference ongoing PRs.

Comment thread src/jekyll/zeppelin.md Outdated
> At the time being, the needed code is not integrated in the `master` branches of `apache-zeppelin` nor the `apache-spark-on-k8s/spark` repositories.
> You are welcome to already ty it out and send any feedback and question.

Firs things firs, you have to choose the following modes in which you will run Zeppelin with Spark on Kubernetes:

First?

Member Author

fixed

Comment thread src/jekyll/zeppelin.md Outdated
For now, to be able to test these combinations, you need to build specific branches (see hereafter) or to use third-party Helm charts or Docker images. The needed branches and related PR are listed here:

1. Spark-k8s driven branch: In-cluster client mode [see pull request #456](https://github.com/apache-spark-on-k8s/spark/pull/456)
2. Apache Zeppeoin driven branch: Add support to run Spark interpreter on a Kubernetes cluster [see pull request #2637](https://github.com/apache/zeppelin/pull/2637)

Zeppelin?
What is a "driven branch"?

Member Author

I just wanted to point to where this branch resides. I have removed that to avoid confusion.

Comment thread src/jekyll/zeppelin.md Outdated

![In-Cluster with Spark-Client](/img/zeppelin_in-cluster_spark-client.png "In-Cluster with Spark-Client")

Build a new Zepplin based on [#456 In-cluster client mode](https://github.com/apache-spark-on-k8s/spark/pull/456).

Zeppelin, extra space

Member Author

fixed

Comment thread src/jekyll/zeppelin.md Outdated

![In-Cluster with Spark-Cluster](/img/zeppelin_in-cluster_spark-cluster.png "In-Cluster with Spark-Cluster")

Build a new Zepplin based on [#2637 Spark interpreter on a Kubernetes](https://github.com/apache/zeppelin/pull/2637).

Zeppelin

Member Author

fixed


this one doesn't seem to be updated...?

Member Author

done now.

Comment thread src/jekyll/zeppelin.md

Firs things firs, you have to choose the following modes in which you will run Zeppelin with Spark on Kubernetes:

+ The `Kubernetes modes`: Can be `in-cluster` (within a Pod) or `out-cluster` (from outside the Kubernetes cluster).

What is the proper terminology in the k8s world? Is "out-cluster" the right term?

Member Author

I had the same question. From the already used `in-cluster`, I deduced `out-cluster`. Happy to change to any more official terminology.

@echarles
Member Author

@felixcheung Thx a lot for your reviews (just pushed the fixes).
@erikerlandson I have pushed the 3 images with white backgrounds.

@foxish Happy to demo this during the next SIG meeting (22 Nov).

IMHO it is not bad to publish early docs if the needed steps are clear (no release, need to build branches...), to get as much early-adopter feedback as possible.

Comment thread src/jekyll/zeppelin.md

Build a new Spark and their associated docker images based on [#2637 Spark interpreter on a Kubernetes](https://github.com/apache/zeppelin/pull/2637).

Once done, any vanilla Apache Zeppelin deployed in a Kubernetes Pod (your can use a Helm chart for this) will work out-of-the box with the following interpreter settings:

Does this Helm chart do this (it would need a different image for a newer Zeppelin, though)?
https://github.com/kubernetes/charts/blob/master/stable/spark/templates/spark-zeppelin-deployment.yaml

Shall we link it?

Member Author

I have added a "How to test" section at the end and linked to the chart.

@erikerlandson
Member

I am still OK with documentation, as long as it's clearly marked experimental

@echarles
Member Author

@erikerlandson It is now documented as experimental at the beginning of the doc.

Comment thread src/jekyll/zeppelin.md
2. `in-cluster` with `spark-cluster` mode.
3. `out-cluster` with `spark-cluster` mode.

For now, to be able to test these combinations, you need to build specific branches (see hereafter) or to use third-party Helm charts or Docker images. The needed branches and related PR are listed here:
Member

As discussed in the meeting today, we want to ensure that these branches merge before we can publish documentation.

cc @felixcheung @erikerlandson @liyinan926 @mccheah

@echarles
Member Author

Should I close this one? It doesn't seem like it will be merged, and we will soon move to the Apache repo.
