talk-data.com
Managing Straggler Executors at Apache Spark 3.3
Topics
Description
Tuning high-performance Apache Spark applications to handle mis-behaving executors is at best challenging and at worst impossible. Apache Spark does provide some built-in support to kill and recreate new executors under certain conditions such as long GC delays or due to application errors. However this still leaves-open various scenarios where slow-running executors can impact the overall performance of your application even when you enable features such as task speculation. In this talk, we are going to describe Apache Spark 3.3’s new feature, Executor Rolling. Apache Spark 3.3 (SPARK-37810) provides a built-in executor rolling driver plugin with three configurations.
spark.kubernetes.executor.rollInterval (default: '0s' which means being disabled.)
spark.kubernetes.executor.rollPolicy (default: OUTLIER)
spark.kubernetes.executor.minTasksPerExecutorBeforeRolling (default: 0)
This driver plugin tries to choose and decommission a single executor at every interval with the given policy. The followings are the built-in policies and their targets.
- ID: An executor with the smallest executor ID
- ADD_TIME: An executor with the smallest add-time
- TOTAL_GC_TIME: An executor with the biggest GC time
- TOTAL_DURATION: An executor with the biggest total task time
- AVERAGE_DURATION: An executor with the biggest average task duration
- FAILED_TASKS: An executor with the largest number of failed tasks
- OUTLIER: An outlier executor or the biggest total task time
In short, Apache Spark 3.3 maintains the set of live executors literally freshly and reduces much engineering burdens to handle executors’ JVM misbehavior at diverse production jobs by utilizing the proposed built-in executor rolling policies in advance.
Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/