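- Directory where Hive stores the temporary files produced by the intermediate stages of the map/reduce jobs it submits.
- Sets the Hive execution engine. Besides spark, the value can be tez or mr; mr is the default. Related sub-parameters: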
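- spark.executor.memory: amount of memory to use per executor process
- spark.executor.cores: number of cores each executor uses
- spark.driver.memory: amount of memory assigned to the Remote Spark Context (RSC)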
- Default file format for tables created in Hive (default value: TextFile).
- For dynamic partition inserts with no static partition, hive.exec.dynamic.partition.mode should be set to nonstrict.
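As a concrete reference for the list above, here is a minimal Python sketch that renders such settings as Hive `set key=value;` statements. The list does not name most of the properties, so the keys used here (`hive.exec.scratchdir`, `hive.execution.engine`, `hive.default.fileformat`, `hive.exec.dynamic.partition.mode`) are my reading of which Hive properties are meant. The helper `render_settings`, the scratch path, and the Spark values are illustrative assumptions, not recommendations from these notes.

```python
# Engines named in the notes: mr (default), tez, spark.
ALLOWED_ENGINES = {"mr", "tez", "spark"}


def render_settings(settings):
    """Turn a dict of properties into `set key=value;` lines (hypothetical helper)."""
    engine = settings.get("hive.execution.engine")
    if engine is not None and engine not in ALLOWED_ENGINES:
        raise ValueError(f"unknown execution engine: {engine}")
    return [f"set {key}={value};" for key, value in settings.items()]


# Property names are assumed from the descriptions; values are examples only.
session_settings = {
    "hive.exec.scratchdir": "/tmp/hive",              # assumed example path
    "hive.execution.engine": "spark",
    "spark.executor.memory": "4g",                    # hypothetical value
    "spark.executor.cores": "2",                      # hypothetical value
    "hive.default.fileformat": "TextFile",            # default per the notes
    "hive.exec.dynamic.partition.mode": "nonstrict",
}

for line in render_settings(session_settings):
    print(line)
```

The engine check is there only to mirror the three values the notes mention; other properties are passed through unvalidated.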
Session will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero or a negative value.
For example, a value of “86400000” means the session times out after 1 day of inactivity.
The check interval for session/operation timeout, in milliseconds, which can be disabled by setting to zero or a negative value.
For example, a value of “3600000” means sessions are checked every hour.
Operation will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero. For a positive value, checked for operations in terminal state only (FINISHED, CANCELED, CLOSED, ERROR). For a negative value, checked for all of the operations regardless of state.
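For example, a value of “7200000” means an operation in a terminal state is closed after 2 hours without being accessed. Per the rule above, a positive value does not time out an operation that is still running; that requires a negative value.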
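The sign rules for these timeouts are easy to misread, so here is a small Python model of them. It is a sketch of the rules as described above, not HiveServer2 code: the function names are hypothetical, and two details are my assumptions, namely that a negative operation timeout uses its absolute value as the duration and that the timeout fires once the idle time reaches the limit.

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR

# Terminal states listed in the operation timeout description.
TERMINAL_STATES = {"FINISHED", "CANCELED", "CLOSED", "ERROR"}


def session_timed_out(timeout_ms, idle_ms):
    """Session rule: zero or a negative value disables the timeout."""
    if timeout_ms <= 0:
        return False
    return idle_ms >= timeout_ms


def operation_timed_out(timeout_ms, idle_ms, state):
    """Operation rule: zero disables; positive checks terminal states only;
    negative checks every state (duration assumed to be the absolute value)."""
    if timeout_ms == 0:
        return False
    if timeout_ms > 0:
        return state in TERMINAL_STATES and idle_ms >= timeout_ms
    return idle_ms >= -timeout_ms


idle = 3 * MS_PER_HOUR
print(session_timed_out(MS_PER_DAY, idle))
print(operation_timed_out(2 * MS_PER_HOUR, idle, "RUNNING"))
print(operation_timed_out(2 * MS_PER_HOUR, idle, "FINISHED"))
print(operation_timed_out(-2 * MS_PER_HOUR, idle, "RUNNING"))
```

Under this reading, a 2-hour positive timeout leaves a running operation alone after 3 idle hours but closes a finished one, while the negative value covers both.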
Scratch space for Hive jobs. This directory is used by Hive to store the plans for the different map/reduce stages of a query, as well as the intermediate outputs of these stages.
spark.executor.memory: Amount of memory to use per executor process.
spark.executor.cores: Number of cores per executor.
spark.yarn.executor.memoryOverhead: The amount of off-heap memory (in megabytes) to be allocated per executor when running Spark on YARN. This memory accounts for things like VM overheads, interned strings, and other native overheads. In addition to the executor's memory, the container in which the executor is launched needs some extra memory for system processes, and this overhead covers it.
spark.executor.instances: The number of executors assigned to each application.
spark.driver.memory: The amount of memory assigned to the Remote Spark Context (RSC). We recommend 4GB.
spark.yarn.driver.memoryOverhead: We recommend 400 (MB).
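Since each container needs the process memory plus its overhead, the sizing is simple addition. The Python sketch below works through it under stated assumptions: the driver figures are the recommendations above (4 GB and 400 MB), while the executor memory, executor overhead, core count, and instance count are hypothetical values chosen only for the arithmetic. It also assumes the driver runs in its own YARN container and ignores any rounding YARN applies to container sizes.

```python
MB_PER_GB = 1024


def container_mb(process_memory_mb, overhead_mb):
    """Memory a container needs: process memory plus off-heap overhead."""
    return process_memory_mb + overhead_mb


# Driver (RSC): values recommended in the notes.
driver_mb = container_mb(4 * MB_PER_GB, 400)      # spark.driver.memory + driver overhead

# Executors: hypothetical values for illustration only.
executor_memory_mb = 8 * MB_PER_GB                # spark.executor.memory
executor_overhead_mb = 1024                       # spark.yarn.executor.memoryOverhead
executor_cores = 4                                # spark.executor.cores
executor_instances = 10                           # spark.executor.instances

executor_mb = container_mb(executor_memory_mb, executor_overhead_mb)
total_mb = driver_mb + executor_instances * executor_mb
total_cores = executor_instances * executor_cores

print(f"driver container:   {driver_mb} MB")
print(f"executor container: {executor_mb} MB")
print(f"application total:  {total_mb} MB, {total_cores} executor cores")
```

With these numbers the driver container comes to 4496 MB, each executor container to 9216 MB, and the whole application to 96656 MB.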
One situation that dynamic partition inserts need to guard against is that the user may accidentally make all partitions dynamic, without specifying a single static partition, when the original intention was only to overwrite the sub-partitions of one root partition. The parameter hive.exec.dynamic.partition.mode=strict prevents this all-dynamic case: in strict mode, you have to specify at least one static partition. The default mode is strict. In addition, the parameter hive.exec.dynamic.partition=true/false controls whether dynamic partitions are allowed at all. Its default value is false prior to Hive 0.9.0 and true in Hive 0.9.0 and later.
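The interaction of the two parameters can be sketched as a small check. This Python model only mirrors the rules stated above and is not Hive's implementation; the helpers `check_partition_spec` and `partition_clause`, and the partition columns `ds` and `country`, are hypothetical. A partition spec maps each partition column to a static value, or to `None` when the column is dynamic.

```python
def check_partition_spec(spec, dynamic_enabled=True, mode="strict"):
    """Raise ValueError if the spec breaks the rules described above.

    dynamic_enabled models hive.exec.dynamic.partition,
    mode models hive.exec.dynamic.partition.mode.
    """
    dynamic_cols = [col for col, value in spec.items() if value is None]
    if not dynamic_cols:
        return  # fully static inserts are not affected by either parameter
    if not dynamic_enabled:
        raise ValueError("dynamic partitions are disabled")
    if mode == "strict" and len(dynamic_cols) == len(spec):
        raise ValueError("strict mode requires at least one static partition")


def partition_clause(spec):
    """Render the PARTITION clause: static columns get a value, dynamic ones do not."""
    parts = [col if value is None else f"{col}='{value}'" for col, value in spec.items()]
    return "PARTITION (" + ", ".join(parts) + ")"


mixed = {"ds": "2010-03-03", "country": None}   # one static, one dynamic
all_dynamic = {"ds": None, "country": None}     # the case strict mode rejects

check_partition_spec(mixed)                          # allowed in strict mode
check_partition_spec(all_dynamic, mode="nonstrict")  # allowed only in nonstrict mode
print(partition_clause(mixed))
print(partition_clause(all_dynamic))
```

Calling `check_partition_spec(all_dynamic)` with the default strict mode raises, which is the accidental overwrite the strict setting is meant to stop.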