0001 ---
0002 layout: global
0003 title: Running Spark on Mesos
0004 license: |
0005 Licensed to the Apache Software Foundation (ASF) under one or more
0006 contributor license agreements. See the NOTICE file distributed with
0007 this work for additional information regarding copyright ownership.
0008 The ASF licenses this file to You under the Apache License, Version 2.0
0009 (the "License"); you may not use this file except in compliance with
0010 the License. You may obtain a copy of the License at
0011
0012 http://www.apache.org/licenses/LICENSE-2.0
0013
0014 Unless required by applicable law or agreed to in writing, software
0015 distributed under the License is distributed on an "AS IS" BASIS,
0016 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
0017 See the License for the specific language governing permissions and
0018 limitations under the License.
0019 ---
0020 * This will become a table of contents (this text will be scraped).
0021 {:toc}
0022
0023 Spark can run on hardware clusters managed by [Apache Mesos](http://mesos.apache.org/).
0024
0025 The advantages of deploying Spark with Mesos include:
0026
0027 - dynamic partitioning between Spark and other
0028 [frameworks](https://mesos.apache.org/documentation/latest/frameworks/)
0029 - scalable partitioning between multiple instances of Spark
0030
0031 # Security
0032
0033 Security in Spark is OFF by default. This could mean you are vulnerable to attack by default.
0034 Please see [Spark Security](security.html) and the specific security sections in this doc before running Spark.
0035
0036 # How it Works
0037
0038 In a standalone cluster deployment, the cluster manager in the below diagram is a Spark master
0039 instance. When using Mesos, the Mesos master replaces the Spark master as the cluster manager.
0040
0041 <p style="text-align: center;">
0042 <img src="img/cluster-overview.png" title="Spark cluster components" alt="Spark cluster components" />
0043 </p>
0044
0045 Now when a driver creates a job and starts issuing tasks for scheduling, Mesos determines what
0046 machines handle what tasks. Because it takes into account other frameworks when scheduling these
0047 many short-lived tasks, multiple frameworks can coexist on the same cluster without resorting to a
0048 static partitioning of resources.
0049
0050 To get started, follow the steps below to install Mesos and deploy Spark jobs via Mesos.
0051
0052
0053 # Installing Mesos
0054
0055 Spark {{site.SPARK_VERSION}} is designed for use with Mesos {{site.MESOS_VERSION}} or newer and does not
0056 require any special patches of Mesos. File and environment-based secrets support requires Mesos 1.3.0 or
0057 newer.
0058
0059 If you already have a Mesos cluster running, you can skip this Mesos installation step.
0060
0061 Otherwise, installing Mesos for Spark is no different than installing Mesos for use by other
0062 frameworks. You can install Mesos either from source or using prebuilt packages.
0063
0064 ## From Source
0065
0066 To install Apache Mesos from source, follow these steps:
0067
0068 1. Download a Mesos release from a
0069 [mirror](http://www.apache.org/dyn/closer.lua/mesos/{{site.MESOS_VERSION}}/)
0070 2. Follow the Mesos [Getting Started](http://mesos.apache.org/getting-started) page for compiling and
0071 installing Mesos
0072
0073 **Note:** If you want to run Mesos without installing it into the default paths on your system
0074 (e.g., if you lack administrative privileges to install it), pass the
0075 `--prefix` option to `configure` to tell it where to install. For example, pass
0076 `--prefix=/home/me/mesos`. By default the prefix is `/usr/local`.
0077
0078 ## Third-Party Packages
0079
0080 The Apache Mesos project only publishes source releases, not binary packages. But other
0081 third party projects publish binary releases that may be helpful in setting Mesos up.
0082
0083 One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
0084
1. Download the Mesos installation package from the [downloads page](https://open.mesosphere.com/downloads/mesos/)
0086 2. Follow their instructions for installation and configuration
0087
0088 The Mesosphere installation documents suggest setting up ZooKeeper to handle Mesos master failover,
0089 but Mesos can be run without ZooKeeper using a single master as well.
0090
0091 ## Verification
0092
To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master web UI at port
`:5050` and confirm that all expected machines are present in the slaves tab.
0095
0096
0097 # Connecting Spark to Mesos
0098
0099 To use Mesos from Spark, you need a Spark binary package available in a place accessible by Mesos, and
0100 a Spark driver program configured to connect to Mesos.
0101
Alternatively, you can also install Spark in the same location on all the Mesos slaves, and configure
`spark.mesos.executor.home` (defaults to `SPARK_HOME`) to point to that location.
0104
0105 ## Authenticating to Mesos
0106
0107 When Mesos Framework authentication is enabled it is necessary to provide a principal and secret by which to authenticate Spark to Mesos. Each Spark job will register with Mesos as a separate framework.
0108
0109 Depending on your deployment environment you may wish to create a single set of framework credentials that are shared across all users or create framework credentials for each user. Creating and managing framework credentials should be done following the Mesos [Authentication documentation](http://mesos.apache.org/documentation/latest/authentication/).
0110
Framework credentials may be specified in a variety of ways depending on your deployment environment and security requirements. The simplest way is to specify the `spark.mesos.principal` and `spark.mesos.secret` values directly in your Spark configuration. Alternatively you may specify these values indirectly by instead setting `spark.mesos.principal.file` and `spark.mesos.secret.file`; these settings point to files containing the principal and secret. These files must be plaintext files in UTF-8 encoding. Combined with appropriate file ownership and mode/ACLs this provides a more secure way to specify these credentials.
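For the file-based approach, a minimal sketch (the paths, principal, and secret values below are illustrative, not prescribed by Spark):

```shell
# Store the framework principal and secret in files readable only by the
# submitting user. Paths and values here are illustrative.
mkdir -p "$HOME/.spark-mesos"
printf '%s' 'sparkuser' > "$HOME/.spark-mesos/principal"
printf '%s' 's3cret'    > "$HOME/.spark-mesos/secret"
chmod 600 "$HOME/.spark-mesos/principal" "$HOME/.spark-mesos/secret"

# Then point Spark at the files, e.g. in conf/spark-defaults.conf:
#   spark.mesos.principal.file  /home/me/.spark-mesos/principal
#   spark.mesos.secret.file     /home/me/.spark-mesos/secret
```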
0112
Additionally, if you prefer to use environment variables you can specify all of the above via environment variables instead; the environment variable names are simply the configuration settings uppercased, with `.` replaced by `_`, e.g. `SPARK_MESOS_PRINCIPAL`.
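The mapping is mechanical; as a sketch, the environment variable name can be derived from any of these configuration keys like so:

```shell
# Uppercase the configuration key and replace '.' with '_' to get the
# corresponding environment variable name.
cfg_key="spark.mesos.principal.file"
env_name=$(printf '%s' "$cfg_key" | tr 'a-z.' 'A-Z_')
echo "$env_name"   # SPARK_MESOS_PRINCIPAL_FILE
```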
0114
0115 ### Credential Specification Preference Order
0116
Please note that if you specify credentials in multiple ways then the following preference order applies. Spark will use the first valid value found and ignore any subsequent values:
0118
0119 - `spark.mesos.principal` configuration setting
0120 - `SPARK_MESOS_PRINCIPAL` environment variable
0121 - `spark.mesos.principal.file` configuration setting
0122 - `SPARK_MESOS_PRINCIPAL_FILE` environment variable
0123
0124 An equivalent order applies for the secret. Essentially we prefer the configuration to be specified directly rather than indirectly by files, and we prefer that configuration settings are used over environment variables.
0125
### Deploying to a Mesos Cluster Running on Secure Sockets
0127
If you want to deploy a Spark application into a Mesos cluster that is running in secure mode, there are some environment variables that need to be set.
0129
- `LIBPROCESS_SSL_ENABLED=true` enables SSL communication
- `LIBPROCESS_SSL_VERIFY_CERT=false` disables verification of the SSL certificate (set it to `true` to require verification)
- `LIBPROCESS_SSL_KEY_FILE=pathToKeyFile.key` the path to the key file
- `LIBPROCESS_SSL_CERT_FILE=pathToCRTFile.crt` the certificate file to be used
0134
All options can be found in the [Mesos SSL documentation](http://mesos.apache.org/documentation/latest/ssl/).
0136
Submission then happens as described in client mode or cluster mode below.
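Put together, a client-side environment for a secure cluster might look like the following sketch (the key and certificate paths are placeholders for your own files):

```shell
# Enable SSL for libprocess communication; file paths are placeholders.
export LIBPROCESS_SSL_ENABLED=true
export LIBPROCESS_SSL_VERIFY_CERT=false
export LIBPROCESS_SSL_KEY_FILE=/etc/ssl/private/spark-mesos.key
export LIBPROCESS_SSL_CERT_FILE=/etc/ssl/certs/spark-mesos.crt

# Submission then proceeds as usual, e.g.:
#   ./bin/spark-shell --master mesos://host:5050
```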
0138
0139 ## Uploading Spark Package
0140
0141 When Mesos runs a task on a Mesos slave for the first time, that slave must have a Spark binary
0142 package for running the Spark Mesos executor backend.
0143 The Spark package can be hosted at any Hadoop-accessible URI, including HTTP via `http://`,
0144 [Amazon Simple Storage Service](http://aws.amazon.com/s3) via `s3n://`, or HDFS via `hdfs://`.
0145
0146 To use a precompiled package:
0147
0148 1. Download a Spark binary package from the Spark [download page](https://spark.apache.org/downloads.html)
0149 2. Upload to hdfs/http/s3
0150
0151 To host on HDFS, use the Hadoop fs put command: `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz
0152 /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
0153
0154
0155 Or if you are using a custom-compiled version of Spark, you will need to create a package using
0156 the `dev/make-distribution.sh` script included in a Spark source tarball/checkout.
0157
0158 1. Download and build Spark using the instructions [here](index.html)
0159 2. Create a binary package using `./dev/make-distribution.sh --tgz`.
0160 3. Upload archive to http/s3/hdfs
0161
0162
0163 ## Using a Mesos Master URL
0164
0165 The Master URLs for Mesos are in the form `mesos://host:5050` for a single-master Mesos
0166 cluster, or `mesos://zk://host1:2181,host2:2181,host3:2181/mesos` for a multi-master Mesos cluster using ZooKeeper.
0167
0168 ## Client Mode
0169
0170 In client mode, a Spark Mesos framework is launched directly on the client machine and waits for the driver output.
0171
0172 The driver needs some configuration in `spark-env.sh` to interact properly with Mesos:
0173
0174 1. In `spark-env.sh` set some environment variables:
0175 * `export MESOS_NATIVE_JAVA_LIBRARY=<path to libmesos.so>`. This path is typically
0176 `<prefix>/lib/libmesos.so` where the prefix is `/usr/local` by default. See Mesos installation
0177 instructions above. On Mac OS X, the library is called `libmesos.dylib` instead of
0178 `libmesos.so`.
0179 * `export SPARK_EXECUTOR_URI=<URL of spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
0180 2. Also set `spark.executor.uri` to `<URL of spark-{{site.SPARK_VERSION}}.tar.gz>`.
0181
0182 Now when starting a Spark application against the cluster, pass a `mesos://`
0183 URL as the master when creating a `SparkContext`. For example:
0184
0185 {% highlight scala %}
0186 val conf = new SparkConf()
0187 .setMaster("mesos://HOST:5050")
0188 .setAppName("My app")
0189 .set("spark.executor.uri", "<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>")
0190 val sc = new SparkContext(conf)
0191 {% endhighlight %}
0192
0193 (You can also use [`spark-submit`](submitting-applications.html) and configure `spark.executor.uri`
0194 in the [conf/spark-defaults.conf](configuration.html#loading-default-configurations) file.)
0195
0196 When running a shell, the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
0197 it does not need to be redundantly passed in as a system property.
0198
0199 {% highlight bash %}
0200 ./bin/spark-shell --master mesos://host:5050
0201 {% endhighlight %}
0202
## Cluster Mode
0204
0205 Spark on Mesos also supports cluster mode, where the driver is launched in the cluster and the client
0206 can find the results of the driver from the Mesos Web UI.
0207
To use cluster mode, you must start the `MesosClusterDispatcher` in your cluster via the `sbin/start-mesos-dispatcher.sh` script,
passing in the Mesos master URL (e.g., `mesos://host:5050`). This starts the `MesosClusterDispatcher` as a daemon running on the host.
Note that the `MesosClusterDispatcher` does not support authentication. You should ensure that all network access to it is
protected (port 7077 by default).
0212
By setting the Mesos proxy config property `--conf spark.mesos.proxy.baseURL=http://localhost:5050` when launching the dispatcher (requires Mesos version >= 1.4), the Mesos sandbox URI for each driver is added to the Mesos dispatcher UI.
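As a sketch, starting the dispatcher with the proxy base URL might look like the following (the hostname is illustrative):

```bash
# Start the dispatcher as a daemon against the Mesos master, adding
# sandbox URI links to its UI. The hostname is illustrative.
./sbin/start-mesos-dispatcher.sh \
  --master mesos://mesos-master.example.com:5050 \
  --conf spark.mesos.proxy.baseURL=http://mesos-master.example.com:5050
```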
0214
If you would like to run the `MesosClusterDispatcher` with Marathon, you need to run the `MesosClusterDispatcher` in the foreground (i.e., `./bin/spark-class org.apache.spark.deploy.mesos.MesosClusterDispatcher`). Note that the `MesosClusterDispatcher` does not yet support multiple instances for HA.
0216
The `MesosClusterDispatcher` also supports writing recovery state into ZooKeeper. This allows the `MesosClusterDispatcher` to recover all submitted and running containers on relaunch. In order to enable this recovery mode, you can set `SPARK_DAEMON_JAVA_OPTS` in `spark-env.sh` by configuring `spark.deploy.recoveryMode` and the related `spark.deploy.zookeeper.*` configurations.
0218 For more information about these configurations please refer to the configurations [doc](configuration.html#deploy).
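For example, recovery via ZooKeeper could be configured with a sketch like the following in `conf/spark-env.sh` (the ZooKeeper connection string is illustrative):

```shell
# Enable ZooKeeper-backed recovery for the MesosClusterDispatcher.
# The ZooKeeper connection string is illustrative.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
-Dspark.deploy.zookeeper.url=zk1.example.com:2181,zk2.example.com:2181"
```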
0219
You can also specify any additional jars required by the `MesosClusterDispatcher` in the classpath by setting the environment variable `SPARK_DAEMON_CLASSPATH` in `spark-env.sh`.
0221
From the client, you can submit a job to the Mesos cluster by running `spark-submit` and setting the master URL
to the URL of the `MesosClusterDispatcher` (e.g., `mesos://dispatcher:7077`). You can view driver statuses on the
Spark cluster Web UI.
0225
0226 For example:
0227 {% highlight bash %}
0228 ./bin/spark-submit \
0229 --class org.apache.spark.examples.SparkPi \
0230 --master mesos://207.184.161.138:7077 \
0231 --deploy-mode cluster \
0232 --supervise \
0233 --executor-memory 20G \
0234 --total-executor-cores 100 \
0235 http://path/to/examples.jar \
0236 1000
0237 {% endhighlight %}
0238
0239
Note that jars or python files that are passed to `spark-submit` should be URIs reachable by Mesos slaves, as the Spark driver doesn't automatically upload local jars.
0241
0242 # Mesos Run Modes
0243
0244 Spark can run over Mesos in two modes: "coarse-grained" (default) and
0245 "fine-grained" (deprecated).
0246
0247 ## Coarse-Grained
0248
0249 In "coarse-grained" mode, each Spark executor runs as a single Mesos
0250 task. Spark executors are sized according to the following
0251 configuration variables:
0252
0253 * Executor memory: `spark.executor.memory`
0254 * Executor cores: `spark.executor.cores`
0255 * Number of executors: `spark.cores.max`/`spark.executor.cores`
0256
0257 Please see the [Spark Configuration](configuration.html) page for
0258 details and default values.
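As a worked example of the resulting executor count (the values are illustrative):

```shell
# With spark.cores.max=24 and spark.executor.cores=4, coarse-grained mode
# launches 24 / 4 = 6 executors. Values here are illustrative.
cores_max=24
executor_cores=4
num_executors=$((cores_max / executor_cores))
echo "$num_executors"   # 6
```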
0259
0260 Executors are brought up eagerly when the application starts, until
0261 `spark.cores.max` is reached. If you don't set `spark.cores.max`, the
0262 Spark application will consume all resources offered to it by Mesos,
0263 so we, of course, urge you to set this variable in any sort of
0264 multi-tenant cluster, including one which runs multiple concurrent
0265 Spark applications.
0266
0267 The scheduler will start executors round-robin on the offers Mesos
0268 gives it, but there are no spread guarantees, as Mesos does not
0269 provide such guarantees on the offer stream.
0270
In this mode Spark executors will honor port allocation if it is
provided by the user. Specifically, if the user defines
`spark.blockManager.port` in the Spark configuration,
the Mesos scheduler will check the available offers for a valid port
range containing the port numbers. If no such range is available it will
not launch any task. If the user imposes no restriction on port numbers,
ephemeral ports are used as usual. This port-honouring implementation
implies one task per host if the user defines a port. In the future,
network isolation shall be supported.
0280
0281 The benefit of coarse-grained mode is much lower startup overhead, but
0282 at the cost of reserving Mesos resources for the complete duration of
0283 the application. To configure your job to dynamically adjust to its
0284 resource requirements, look into
0285 [Dynamic Allocation](#dynamic-resource-allocation-with-mesos).
0286
0287 ## Fine-Grained (deprecated)
0288
0289 **NOTE:** Fine-grained mode is deprecated as of Spark 2.0.0. Consider
0290 using [Dynamic Allocation](#dynamic-resource-allocation-with-mesos)
0291 for some of the benefits. For a full explanation see
[SPARK-11857](https://issues.apache.org/jira/browse/SPARK-11857).
0293
0294 In "fine-grained" mode, each Spark task inside the Spark executor runs
0295 as a separate Mesos task. This allows multiple instances of Spark (and
0296 other frameworks) to share cores at a very fine granularity, where
0297 each application gets more or fewer cores as it ramps up and down, but
0298 it comes with an additional overhead in launching each task. This mode
0299 may be inappropriate for low-latency requirements like interactive
0300 queries or serving web requests.
0301
Note that while Spark tasks in fine-grained mode will relinquish cores as
0303 they terminate, they will not relinquish memory, as the JVM does not
0304 give memory back to the Operating System. Neither will executors
0305 terminate when they're idle.
0306
0307 To run in fine-grained mode, set the `spark.mesos.coarse` property to false in your
0308 [SparkConf](configuration.html#spark-properties):
0309
0310 {% highlight scala %}
0311 conf.set("spark.mesos.coarse", "false")
0312 {% endhighlight %}
0313
0314 You may also make use of `spark.mesos.constraints` to set
0315 attribute-based constraints on Mesos resource offers. By default, all
0316 resource offers will be accepted.
0317
0318 {% highlight scala %}
0319 conf.set("spark.mesos.constraints", "os:centos7;us-east-1:false")
0320 {% endhighlight %}
0321
For example, suppose `spark.mesos.constraints` is set to `os:centos7;us-east-1:false`; then the resource offers will
be checked to see whether they meet both of these constraints, and only then will they be accepted to start new executors.
0324
To constrain where driver tasks are run, use `spark.mesos.driver.constraints`.
0326
0327 # Mesos Docker Support
0328
0329 Spark can make use of a Mesos Docker containerizer by setting the property `spark.mesos.executor.docker.image`
0330 in your [SparkConf](configuration.html#spark-properties).
0331
0332 The Docker image used must have an appropriate version of Spark already part of the image, or you can
0333 have Mesos download Spark via the usual methods.
0334
0335 Requires Mesos version 0.20.1 or later.
0336
0337 Note that by default Mesos agents will not pull the image if it already exists on the agent. If you use mutable image
0338 tags you can set `spark.mesos.executor.docker.forcePullImage` to `true` in order to force the agent to always pull the
0339 image before running the executor. Force pulling images is only available in Mesos version 0.22 and above.
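A minimal sketch of the relevant properties for `conf/spark-defaults.conf` (the image name is a placeholder; use an image from your own registry):

```
spark.mesos.executor.docker.image           example.com/spark:latest
spark.mesos.executor.docker.forcePullImage  true
```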
0340
0341 # Running Alongside Hadoop
0342
0343 You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a
0344 separate service on the machines. To access Hadoop data from Spark, a full `hdfs://` URL is required
0345 (typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode web
0346 UI).
0347
0348 In addition, it is possible to also run Hadoop MapReduce on Mesos for better resource isolation and
0349 sharing between the two. In this case, Mesos will act as a unified scheduler that assigns cores to
0350 either Hadoop or Spark, as opposed to having them share resources via the Linux scheduler on each
0351 node. Please refer to [Hadoop on Mesos](https://github.com/mesos/hadoop).
0352
0353 In either case, HDFS runs separately from Hadoop MapReduce, without being scheduled through Mesos.
0354
0355 # Dynamic Resource Allocation with Mesos
0356
0357 Mesos supports dynamic allocation only with coarse-grained mode, which can resize the number of
0358 executors based on statistics of the application. For general information,
0359 see [Dynamic Resource Allocation](job-scheduling.html#dynamic-resource-allocation).
0360
The External Shuffle Service to use is the Mesos Shuffle Service. It provides shuffle data cleanup functionality
on top of the Shuffle Service, since Mesos doesn't yet support notifying other frameworks of a framework's
termination. To launch it, run `$SPARK_HOME/sbin/start-mesos-shuffle-service.sh` on all slave nodes, with `spark.shuffle.service.enabled` set to `true`.
0364
0365 This can also be achieved through Marathon, using a unique host constraint, and the following command: `./bin/spark-class org.apache.spark.deploy.mesos.MesosExternalShuffleService`.
0366
0367 # Configuration
0368
0369 See the [configuration page](configuration.html) for information on Spark configurations. The following configs are specific for Spark on Mesos.
0370
0371 #### Spark Properties
0372
0373 <table class="table">
0374 <tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
0375 <tr>
0376 <td><code>spark.mesos.coarse</code></td>
0377 <td>true</td>
0378 <td>
0379 If set to <code>true</code>, runs over Mesos clusters in "coarse-grained" sharing mode, where Spark acquires one long-lived Mesos task on each machine.
0380 If set to <code>false</code>, runs over Mesos cluster in "fine-grained" sharing mode, where one Mesos task is created per Spark task.
0381 Detailed information in <a href="running-on-mesos.html#mesos-run-modes">'Mesos Run Modes'</a>.
0382 </td>
0383 <td>0.6.0</td>
0384 </tr>
0385 <tr>
0386 <td><code>spark.mesos.extra.cores</code></td>
0387 <td><code>0</code></td>
0388 <td>
0389 Set the extra number of cores for an executor to advertise. This
0390 does not result in more cores allocated. It instead means that an
0391 executor will "pretend" it has more cores, so that the driver will
0392 send it more tasks. Use this to increase parallelism. This
0393 setting is only used for Mesos coarse-grained mode.
0394 </td>
0395 <td>0.6.0</td>
0396 </tr>
0397 <tr>
0398 <td><code>spark.mesos.mesosExecutor.cores</code></td>
0399 <td><code>1.0</code></td>
0400 <td>
0401 (Fine-grained mode only) Number of cores to give each Mesos executor. This does not
0402 include the cores used to run the Spark tasks. In other words, even if no Spark task
0403 is being run, each Mesos executor will occupy the number of cores configured here.
0404 The value can be a floating point number.
0405 </td>
0406 <td>1.4.0</td>
0407 </tr>
0408 <tr>
0409 <td><code>spark.mesos.executor.docker.image</code></td>
0410 <td>(none)</td>
0411 <td>
0412 Set the name of the docker image that the Spark executors will run in. The selected
0413 image must have Spark installed, as well as a compatible version of the Mesos library.
0414 The installed path of Spark in the image can be specified with <code>spark.mesos.executor.home</code>;
0415 the installed path of the Mesos library can be specified with <code>spark.executorEnv.MESOS_NATIVE_JAVA_LIBRARY</code>.
0416 </td>
0417 <td>1.4.0</td>
0418 </tr>
0419 <tr>
0420 <td><code>spark.mesos.executor.docker.forcePullImage</code></td>
0421 <td>false</td>
0422 <td>
0423 Force Mesos agents to pull the image specified in <code>spark.mesos.executor.docker.image</code>.
0424 By default Mesos agents will not pull images they already have cached.
0425 </td>
0426 <td>2.1.0</td>
0427 </tr>
0428 <tr>
0429 <td><code>spark.mesos.executor.docker.parameters</code></td>
0430 <td>(none)</td>
0431 <td>
0432 Set the list of custom parameters which will be passed into the <code>docker run</code> command when launching the Spark executor on Mesos using the docker containerizer. The format of this property is a comma-separated list of
0433 key/value pairs. Example:
0434
0435 <pre>key1=val1,key2=val2,key3=val3</pre>
0436 </td>
0437 <td>2.2.0</td>
0438 </tr>
0439 <tr>
0440 <td><code>spark.mesos.executor.docker.volumes</code></td>
0441 <td>(none)</td>
0442 <td>
0443 Set the list of volumes which will be mounted into the Docker image, which was set using
0444 <code>spark.mesos.executor.docker.image</code>. The format of this property is a comma-separated list of
0445 mappings following the form passed to <code>docker run -v</code>. That is they take the form:
0446
0447 <pre>[host_path:]container_path[:ro|:rw]</pre>
0448 </td>
0449 <td>1.4.0</td>
0450 </tr>
0451 <tr>
0452 <td><code>spark.mesos.task.labels</code></td>
0453 <td>(none)</td>
0454 <td>
0455 Set the Mesos labels to add to each task. Labels are free-form key-value pairs.
0456 Key-value pairs should be separated by a colon, and commas used to
0457 list more than one. If your label includes a colon or comma, you
0458 can escape it with a backslash. Ex. key:value,key2:a\:b.
0459 </td>
0460 <td>2.2.0</td>
0461 </tr>
0462 <tr>
0463 <td><code>spark.mesos.executor.home</code></td>
0464 <td>driver side <code>SPARK_HOME</code></td>
0465 <td>
0466 Set the directory in which Spark is installed on the executors in Mesos. By default, the
0467 executors will simply use the driver's Spark home directory, which may not be visible to
0468 them. Note that this is only relevant if a Spark binary package is not specified through
0469 <code>spark.executor.uri</code>.
0470 </td>
0471 <td>1.1.1</td>
0472 </tr>
0473 <tr>
0474 <td><code>spark.mesos.executor.memoryOverhead</code></td>
0475 <td>executor memory * 0.10, with minimum of 384</td>
0476 <td>
The amount of additional memory, specified in MiB, to be allocated per executor. By default,
the overhead will be the larger of either 384 or 10% of <code>spark.executor.memory</code>. If set,
the final overhead will be this value.
0480 </td>
0481 <td>1.1.1</td>
0482 </tr>
0483 <tr>
0484 <td><code>spark.mesos.uris</code></td>
0485 <td>(none)</td>
0486 <td>
0487 A comma-separated list of URIs to be downloaded to the sandbox
0488 when driver or executor is launched by Mesos. This applies to
0489 both coarse-grained and fine-grained mode.
0490 </td>
0491 <td>1.5.0</td>
0492 </tr>
0493 <tr>
0494 <td><code>spark.mesos.principal</code></td>
0495 <td>(none)</td>
0496 <td>
Set the principal with which the Spark framework will authenticate with Mesos. You can also specify this via the environment variable <code>SPARK_MESOS_PRINCIPAL</code>.
0498 </td>
0499 <td>1.5.0</td>
0500 </tr>
0501 <tr>
0502 <td><code>spark.mesos.principal.file</code></td>
0503 <td>(none)</td>
0504 <td>
Set the file containing the principal with which the Spark framework will authenticate with Mesos. Allows specifying the principal indirectly in more security conscious deployments. The file must be readable by the user launching the job and be UTF-8 encoded plaintext. You can also specify this via the environment variable <code>SPARK_MESOS_PRINCIPAL_FILE</code>.
0506 </td>
0507 <td>2.4.0</td>
0508 </tr>
0509 <tr>
0510 <td><code>spark.mesos.secret</code></td>
0511 <td>(none)</td>
0512 <td>
Set the secret with which the Spark framework will authenticate with Mesos. Used, for example, when
authenticating with the registry. You can also specify this via the environment variable <code>SPARK_MESOS_SECRET</code>.
0515 </td>
0516 <td>1.5.0</td>
0517 </tr>
0518 <tr>
0519 <td><code>spark.mesos.secret.file</code></td>
0520 <td>(none)</td>
0521 <td>
Set the file containing the secret with which the Spark framework will authenticate with Mesos. Used, for example, when
authenticating with the registry. Allows for specifying the secret indirectly in more security conscious deployments. The file must be readable by the user launching the job and be UTF-8 encoded plaintext. You can also specify this via the environment variable <code>SPARK_MESOS_SECRET_FILE</code>.
0524 </td>
0525 <td>2.4.0</td>
0526 </tr>
0527 <tr>
0528 <td><code>spark.mesos.role</code></td>
0529 <td><code>*</code></td>
0530 <td>
0531 Set the role of this Spark framework for Mesos. Roles are used in Mesos for reservations
0532 and resource weight sharing.
0533 </td>
0534 <td>1.5.0</td>
0535 </tr>
0536 <tr>
0537 <td><code>spark.mesos.constraints</code></td>
0538 <td>(none)</td>
0539 <td>
0540 Attribute-based constraints on mesos resource offers. By default, all resource offers will be accepted. This setting
0541 applies only to executors. Refer to <a href="http://mesos.apache.org/documentation/attributes-resources/">Mesos
0542 Attributes & Resources</a> for more information on attributes.
0543 <ul>
0544 <li>Scalar constraints are matched with "less than equal" semantics i.e. value in the constraint must be less than or equal to the value in the resource offer.</li>
0545 <li>Range constraints are matched with "contains" semantics i.e. value in the constraint must be within the resource offer's value.</li>
0546 <li>Set constraints are matched with "subset of" semantics i.e. value in the constraint must be a subset of the resource offer's value.</li>
0547 <li>Text constraints are matched with "equality" semantics i.e. value in the constraint must be exactly equal to the resource offer's value.</li>
0548 <li>In case there is no value present as a part of the constraint any offer with the corresponding attribute will be accepted (without value check).</li>
0549 </ul>
0550 </td>
0551 <td>1.5.0</td>
0552 </tr>
0553 <tr>
0554 <td><code>spark.mesos.driver.constraints</code></td>
0555 <td>(none)</td>
0556 <td>
0557 Same as <code>spark.mesos.constraints</code> except applied to drivers when launched through the dispatcher. By default,
0558 all offers with sufficient resources will be accepted.
0559 </td>
0560 <td>2.2.1</td>
0561 </tr>
0562 <tr>
0563 <td><code>spark.mesos.containerizer</code></td>
0564 <td><code>docker</code></td>
0565 <td>
This only affects docker containers, and must be one of "docker"
or "mesos". Mesos supports two types of
containerizers for docker: the "docker" containerizer, and the preferred
"mesos" containerizer. Read more in the
<a href="http://mesos.apache.org/documentation/latest/container-image/">Mesos container image documentation</a>.
0570 </td>
0571 <td>2.1.0</td>
0572 </tr>
0573 <tr>
0574 <td><code>spark.mesos.driver.webui.url</code></td>
0575 <td><code>(none)</code></td>
0576 <td>
0577 Set the Spark Mesos driver webui_url for interacting with the framework.
0578 If unset it will point to Spark's internal web UI.
0579 </td>
0580 <td>2.0.0</td>
0581 </tr>
0582 <tr>
0583 <td><code>spark.mesos.driver.labels</code></td>
0584 <td><code>(none)</code></td>
0585 <td>
0586 Mesos labels to add to the driver. See <code>spark.mesos.task.labels</code>
0587 for formatting information.
0588 </td>
0589 <td>2.3.0</td>
0590 </tr>
0591 <tr>
0592 <td>
0593 <code>spark.mesos.driver.secret.values</code>,
0594 <code>spark.mesos.driver.secret.names</code>,
0595 <code>spark.mesos.executor.secret.values</code>,
<code>spark.mesos.executor.secret.names</code>
0597 </td>
0598 <td><code>(none)</code></td>
0599 <td>
0600 <p>
0601 A secret is specified by its contents and destination. These properties
0602 specify a secret's contents. To specify a secret's destination, see the cell below.
0603 </p>
0604 <p>
0605 You can specify a secret's contents either (1) by value or (2) by reference.
0606 </p>
0607 <p>
0608 (1) To specify a secret by value, set the
0609 <code>spark.mesos.[driver|executor].secret.values</code>
0610 property, to make the secret available in the driver or executors.
0611 For example, to make a secret password "guessme" available to the driver process, set:
0612
0613 <pre>spark.mesos.driver.secret.values=guessme</pre>
0614 </p>
0615 <p>
0616 (2) To specify a secret that has been placed in a secret store
0617 by reference, specify its name within the secret store
0618 by setting the <code>spark.mesos.[driver|executor].secret.names</code>
0619 property. For example, to make a secret password named "password" in a secret store
0620 available to the driver process, set:
0621
0622 <pre>spark.mesos.driver.secret.names=password</pre>
0623 </p>
0624 <p>
0625 Note: To use a secret store, make sure one has been integrated with Mesos via a custom
0626 <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver
0627 module</a>.
0628 </p>
0629 <p>
0630 To specify multiple secrets, provide a comma-separated list:
0631
0632 <pre>spark.mesos.driver.secret.values=guessme,passwd123</pre>
0633
0634 or
0635
0636 <pre>spark.mesos.driver.secret.names=password1,password2</pre>
0637 </p>
0638 </td>
0639 <td>2.3.0</td>
0640 </tr>
0641 <tr>
0642 <td>
0643 <code>spark.mesos.driver.secret.envkeys</code>,
0644 <code>spark.mesos.driver.secret.filenames</code>,
0645 <code>spark.mesos.executor.secret.envkeys</code>,
0646 <code>spark.mesos.executor.secret.filenames</code>
0647 </td>
0648 <td><code>(none)</code></td>
0649 <td>
0650 <p>
0651 A secret is specified by its contents and destination. These properties
0652 specify a secret's destination. To specify a secret's contents, see the cell above.
0653 </p>
0654 <p>
0655 You can specify a secret's destination in the driver or
0656 executors as either (1) an environment variable or (2) as a file.
0657 </p>
0658 <p>
0659 (1) To make an environment-based secret, set the
0660 <code>spark.mesos.[driver|executor].secret.envkeys</code> property.
0661 The secret will appear as an environment variable with the
0662 given name in the driver or executors. For example, to make a secret password available
0663 to the driver process as $PASSWORD, set:
0664
0665 <pre>spark.mesos.driver.secret.envkeys=PASSWORD</pre>
0666 </p>
0667 <p>
0668 (2) To make a file-based secret, set the
0669 <code>spark.mesos.[driver|executor].secret.filenames</code> property.
0670 The secret will appear in the contents of a file with the given file name in
0671 the driver or executors. For example, to make a secret password available in a
0672 file named "pwdfile" in the driver process, set:
0673
0674 <pre>spark.mesos.driver.secret.filenames=pwdfile</pre>
0675 </p>
0676 <p>
0677 Paths are relative to the container's work directory. Absolute paths must
0678 already exist. Note: File-based secrets require a custom
0679 <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver
0680 module</a>.
0681 </p>
0682 <p>
0683 To specify env vars or file names corresponding to multiple secrets,
0684 provide a comma-separated list:
0685
0686 <pre>spark.mesos.driver.secret.envkeys=PASSWORD1,PASSWORD2</pre>
0687
0688 or
0689
0690 <pre>spark.mesos.driver.secret.filenames=pwdfile1,pwdfile2</pre>
0691 </p>
0692 </td>
0693 <td>2.3.0</td>
0694 </tr>
0695 <tr>
0696 <td><code>spark.mesos.driverEnv.[EnvironmentVariableName]</code></td>
0697 <td><code>(none)</code></td>
0698 <td>
0699 This only affects drivers submitted in cluster mode. Adds the
0700 environment variable specified by <code>EnvironmentVariableName</code> to the
0701 driver process. The user can specify multiple of these properties to set
0702 multiple environment variables.
0703 </td>
0704 <td>2.1.0</td>
0705 </tr>
0706 <tr>
0707 <td><code>spark.mesos.dispatcher.webui.url</code></td>
0708 <td><code>(none)</code></td>
0709 <td>
0710 Set the Spark Mesos dispatcher webui_url for interacting with the framework.
0711 If unset it will point to Spark's internal web UI.
0712 </td>
0713 <td>2.0.0</td>
0714 </tr>
0715 <tr>
0716 <td><code>spark.mesos.dispatcher.driverDefault.[PropertyName]</code></td>
0717 <td><code>(none)</code></td>
0718 <td>
0719 Set default properties for drivers submitted through the
0720 dispatcher. For example,
0721 <code>spark.mesos.dispatcher.driverDefault.spark.executor.memory=32g</code>
0722 results in the executors for all drivers submitted in cluster mode
0723 running in 32g containers.
0724 </td>
0725 <td>2.1.0</td>
0726 </tr>
0727 <tr>
0728 <td><code>spark.mesos.dispatcher.historyServer.url</code></td>
0729 <td><code>(none)</code></td>
0730 <td>
0731 Set the URL of the <a href="monitoring.html#viewing-after-the-fact">history
0732 server</a>. The dispatcher will then link each driver to its entry
0733 in the history server.
0734 </td>
0735 <td>2.1.0</td>
0736 </tr>
0737 <tr>
0738 <td><code>spark.mesos.gpus.max</code></td>
0739 <td><code>0</code></td>
0740 <td>
0741 Set the maximum number of GPU resources to acquire for this job. Note that executors will still launch when no GPU resources are found,
0742 since this configuration is only an upper limit, not a guaranteed amount.
0743 </td>
0744 <td>2.1.0</td>
0745 </tr>
0746 <tr>
0747 <td><code>spark.mesos.network.name</code></td>
0748 <td><code>(none)</code></td>
0749 <td>
0750 Attach containers to the given named network. If this job is
0751 launched in cluster mode, also launch the driver in the given named
0752 network. See
0753 <a href="http://mesos.apache.org/documentation/latest/cni/">the Mesos CNI docs</a>
0754 for more details.
0755 </td>
0756 <td>2.1.0</td>
0757 </tr>
0758 <tr>
0759 <td><code>spark.mesos.network.labels</code></td>
0760 <td><code>(none)</code></td>
0761 <td>
0762 Pass network labels to CNI plugins. This is a comma-separated list
0763 of key-value pairs, where each key-value pair has the format key:value.
0764 Example:
0765
0766 <pre>key1:val1,key2:val2</pre>
0767 See
0768 <a href="http://mesos.apache.org/documentation/latest/cni/#mesos-meta-data-to-cni-plugins">the Mesos CNI docs</a>
0769 for more details.
0770 </td>
0771 <td>2.3.0</td>
0772 </tr>
0773 <tr>
0774 <td><code>spark.mesos.fetcherCache.enable</code></td>
0775 <td><code>false</code></td>
0776 <td>
0777 If set to <code>true</code>, all URIs (example: <code>spark.executor.uri</code>,
0778 <code>spark.mesos.uris</code>) will be cached by the <a
0779 href="http://mesos.apache.org/documentation/latest/fetcher/">Mesos
0780 Fetcher Cache</a>.
0781 </td>
0782 <td>2.1.0</td>
0783 </tr>
0784 <tr>
0785 <td><code>spark.mesos.driver.failoverTimeout</code></td>
0786 <td><code>0.0</code></td>
0787 <td>
0788 The amount of time (in seconds) that the master will wait for the
0789 driver to reconnect, after being temporarily disconnected, before
0790 it tears down the driver framework by killing all its
0791 executors. The default value is zero, meaning no timeout: if the
0792 driver disconnects, the master immediately tears down the framework.
0793 </td>
0794 <td>2.3.0</td>
0795 </tr>
0796 <tr>
0797 <td><code>spark.mesos.rejectOfferDuration</code></td>
0798 <td><code>120s</code></td>
0799 <td>
0800 Time to consider unused resources refused; serves as a fallback for
0801 <code>spark.mesos.rejectOfferDurationForUnmetConstraints</code> and
0802 <code>spark.mesos.rejectOfferDurationForReachedMaxCores</code>.
0803 </td>
0804 <td>2.2.0</td>
0805 </tr>
0806 <tr>
0807 <td><code>spark.mesos.rejectOfferDurationForUnmetConstraints</code></td>
0808 <td><code>spark.mesos.rejectOfferDuration</code></td>
0809 <td>
0810 Time to consider unused resources refused with unmet constraints
0811 </td>
0812 <td>1.6.0</td>
0813 </tr>
0814 <tr>
0815 <td><code>spark.mesos.rejectOfferDurationForReachedMaxCores</code></td>
0816 <td><code>spark.mesos.rejectOfferDuration</code></td>
0817 <td>
0818 Time to consider unused resources refused when the maximum number of cores
0819 <code>spark.cores.max</code> is reached
0820 </td>
0821 <td>2.0.0</td>
0822 </tr>
0823 <tr>
0824 <td><code>spark.mesos.appJar.local.resolution.mode</code></td>
0825 <td><code>host</code></td>
0826 <td>
0827 Provides support for the <code>local:///</code> scheme to reference the application jar in cluster mode.
0828 If the user uses a local resource (<code>local:///path/to/jar</code>) and this option is not set, it defaults to
0829 <code>host</code>, i.e. the Mesos fetcher tries to get the resource from the host's file system.
0830 If the value is unknown, a warning is printed in the dispatcher logs and the value defaults to <code>host</code>.
0831 If the value is <code>container</code>, spark-submit in the container will use the jar at the container's path
0832 <code>/path/to/jar</code>.
0833 </td>
0834 <td>2.4.0</td>
0835 </tr>
0836 </table>
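
To see how several of the properties above combine in practice, here is a hypothetical `spark-defaults.conf` fragment for a cluster-mode job that mounts a file-based secret and joins a CNI network. The secret name, file name, environment variable, and network values are made-up examples, not defaults; substitute names from your own secret store and CNI setup.

```
# Hypothetical example values -- adjust for your cluster.
spark.mesos.driver.secret.names      dbpassword
spark.mesos.driver.secret.filenames  dbpassword.txt
spark.mesos.driverEnv.APP_ENV        staging
spark.mesos.network.name             my-cni-network
spark.mesos.network.labels           team:data,env:staging
```

The same properties can equally be passed as individual `--conf` flags on `spark-submit`.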
0837
0838 # Troubleshooting and Debugging
0839
0840 A few places to look during debugging:
0841
0842 - Mesos master on port `:5050`
0843 - Slaves should appear in the slaves tab
0844 - Spark applications should appear in the frameworks tab
0845 - Tasks should appear in the details of a framework
0846 - Check the stdout and stderr of the sandbox of failed tasks
0847 - Mesos logs
0848 - Master and slave logs are both in `/var/log/mesos` by default
0849
0850 And common pitfalls:
0851
0852 - Spark assembly not reachable/accessible
0853 - Slaves must be able to download the Spark binary package from the `http://`, `hdfs://` or `s3n://` URL you gave
0854 - Firewall blocking communications
0855 - Check for messages about failed connections
0856 - Temporarily disable firewalls for debugging and then poke appropriate holes
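
Much of the state listed above can also be inspected over the Mesos master's HTTP endpoints instead of the web UI. A small diagnostic sketch, assuming a placeholder master address (substitute your own host and port):

```shell
# Placeholder address -- replace with your Mesos master's host:port.
MASTER=http://mesos-master.example.com:5050

# Connected slaves should appear here.
curl -s "$MASTER/master/slaves"

# Registered frameworks -- running Spark applications should be listed.
curl -s "$MASTER/master/frameworks"

# Full cluster state, including completed frameworks and tasks.
curl -s "$MASTER/master/state"
```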