Kyuubi: Is it possible to support the YARN SharedCacheClient for Spark on YARN? #7239

@liangrui198

Description

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the feature

The Kyuubi Spark engine jar (see the example under Motivation) is not shared by the NodeManager across applications, which results in a large number of file downloads and deletions and consumes significant disk I/O. YARN currently provides a SharedCache service, but Spark and Kyuubi have no integration with it. Can they be integrated?

Motivation

Example of a submitted Spark job:
sun.java.command=org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.kyuubi.engine.spark.SparkSQLEngine --jar file:/data/services/kyuubi_package-t10141512.56c0a726.r/apache-kyuubi-1.7.1-bin/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.7.1.jar ...
The jar in question is kyuubi-spark-sql-engine_2.12-1.7.1.jar.

Describe the solution

Implement support for the YARN SharedCache feature described here:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/SharedCache.html
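
For context, a minimal sketch of one piece such an integration would likely need. As I understand the YARN SharedCache documentation, cached resources are keyed by a checksum of their contents, SHA-256 by default, so a client would first have to compute that key for the engine jar. The class and method names below (SharedCacheKeySketch, sha256Hex) are hypothetical and exist in neither Kyuubi, Spark, nor Hadoop; the code uses only the JDK and does not call any YARN API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration only; not part of Kyuubi, Spark, or Hadoop. */
final class SharedCacheKeySketch {

    private SharedCacheKeySketch() {}

    /** Streams the file through SHA-256 and returns the lowercase hex digest. */
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            // Mask to 0..255 so negative bytes are rendered as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: SharedCacheKeySketch <path-to-jar>");
            return;
        }
        // Prints the checksum that could serve as the resource key.
        System.out.println(sha256Hex(Paths.get(args[0])));
    }
}
```

Because the key depends only on file contents, every engine launched from the same kyuubi-spark-sql-engine jar would compute the same key, which is what would let the NodeManager reuse one localized copy. Whether Spark's YARN client can pass such a key through is an open question for this issue.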

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • No. I cannot submit a PR at this time.
