github kubeflow/spark-operator v2.0.0-rc.0

pre-release, 2 months ago

This is the Spark Operator v2.0.0-rc.0 pre-release.

Breaking Changes

  • Use controller-runtime to reconstruct spark operator (#2072 by @ChenYi015)

Misc

What's Changed

Full Changelog: spark-operator-chart-1.4.3...v2.0.0-rc.0

More details

This pre-release is a major refactoring of the Spark operator and the Helm chart. It includes:

  • Use controller-runtime to reconstruct the Spark operator (#547). It will significantly improve the maintainability and performance of the Spark operator.

  • Support multiple namespaces (#507, #2052). For example, if you set spark.jobNamespaces to [default, spark-operator] (please make sure these spark job namespaces already exist before the installation), then the controller and the webhook server will only watch and handle SparkApplications in these spark job namespaces:

helm install spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator \
    --set 'spark.jobNamespaces={default,spark-operator}'

  • Support multiple instances. Several Spark operators can be deployed in the same namespace or in different namespaces. For example, install two Spark operators in the spark-operator namespace: one named spark-operator that handles the default namespace, and another named spark-operator-2 that handles the spark-operator namespace (please make sure these instances have different release names and handle different spark job namespaces so that they do not conflict with each other):

helm install spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator \
    --set 'spark.jobNamespaces={default}'

helm install spark-operator-2 spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator \
    --set 'spark.jobNamespaces={spark-operator}'

  • Leader election is enabled by default and cannot be disabled. It ensures that only one controller instance handles SparkApplications during the install/upgrade/rollback process.

  • Webhook server is enabled by default and cannot be disabled. It will be used to default/validate SparkApplications and mutate Spark pods.

  • Webhook secret will be populated and handled properly during the install/upgrade/rollback process. It will be created and updated by the controller. If the secret is empty, new certificates will be generated to populate it; otherwise, the controller will sync the certificates to local disk.

  • Change the default of webhook failurePolicy from Ignore to Fail. Change the default of webhook timeoutSeconds from 30 to 10. There are many issues related to the webhook, e.g. environment variables dropped or volumes not mounted. These issues can be solved by setting webhook.failurePolicy to Fail, so that the webhook server admits Spark pod creation only when there is no error.

  • Controller and webhook server are deployed in different k8s deployments and can be scaled independently. When deploying spark applications at a very large scale, the webhook server can be a performance bottleneck. This can be solved by increasing the replicas of webhook server:

helm install spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator \
    --set webhook.replicas=5
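
The namespace prerequisite and the new webhook defaults described above can be combined into one install. The following is a minimal sketch, not an official procedure: it assumes kubectl is configured for the target cluster, and it assumes the Helm key for the webhook timeout is webhook.timeoutSeconds (the notes name only webhook.failurePolicy explicitly). Both webhook values shown are the new defaults, set here only to make them visible.

```shell
# Create each Spark job namespace if it does not exist yet,
# since the chart expects them to be present before installation.
for ns in default spark-operator; do
    kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns"
done

# Install the operator and state the webhook settings explicitly.
# webhook.timeoutSeconds is an assumed key name; check the chart's values.yaml.
helm install spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator \
    --set 'spark.jobNamespaces={default,spark-operator}' \
    --set webhook.failurePolicy=Fail \
    --set webhook.timeoutSeconds=10
```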

Some Helm values have been renamed or changed:

  • Change imagePullSecrets to image.pullSecrets.
  • All controller configurations are prefixed with controller, e.g. controller.replicas and controller.workers.
  • All webhook configurations are prefixed with webhook e.g. webhook.replicas and webhook.failurePolicy.
  • All monitoring configurations are prefixed with prometheus, e.g. prometheus.metrics and prometheus.podMonitor.
  • The update strategy of controller/webhook deployment will be the rolling update, not recreate.
  • Change the default spark job namespace from [] to ["default"], so that the SparkApplications under the examples directory can be run directly without creating RBAC resources manually.
  • Service accounts are configured with controller.serviceAccount, webhook.serviceAccount and spark.serviceAccount respectively.
  • RBAC resources are configured with controller.rbac, webhook.rbac and spark.rbac respectively.
  • logLevel will be one of info, debug and error.
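
To illustrate the renamed keys, here is a sketch of a values file that uses the new prefixes. The file name values-v2.yaml and the secret name my-registry-secret are hypothetical, the replica and worker counts are arbitrary illustrative values, and the YAML nesting (including the name field under image.pullSecrets) is inferred from the dotted key names above, so it should be checked against the chart's own values.yaml.

```shell
# Write a values file using the renamed keys (file name is hypothetical).
cat > values-v2.yaml <<'EOF'
image:
  pullSecrets:                   # formerly the top-level imagePullSecrets
    - name: my-registry-secret   # hypothetical secret name
controller:
  replicas: 1
  workers: 10
webhook:
  replicas: 2
  failurePolicy: Fail
spark:
  jobNamespaces:
    - default
EOF

# Show the result for review before passing it to Helm.
cat values-v2.yaml
```

The file can then be passed to helm install or helm upgrade with --values values-v2.yaml in place of individual --set flags.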

If you want to try this new pre-release with Helm, do as follows:

helm repo add spark-operator https://kubeflow.github.io/spark-operator

helm repo update

helm install spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --create-namespace \
    --namespace spark-operator

Or upgrade from chart 1.4.6:

helm upgrade spark-operator spark-operator/spark-operator \
    --version 2.0.0-rc.0 \
    --namespace spark-operator
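
After an install or upgrade, it can be useful to confirm that the release is deployed and that the controller and the webhook server run as separate deployments. This is a minimal sketch that assumes the release name and namespace used in the commands above and a kubectl context pointing at the same cluster:

```shell
# Show the state of the Helm release.
helm status spark-operator --namespace spark-operator

# List the deployments created by the chart; the controller and the
# webhook server are expected to appear as separate entries.
kubectl get deployments --namespace spark-operator

# Check that the pods behind those deployments are running.
kubectl get pods --namespace spark-operator
```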
