
Spark is introduced


May 17, 2021 Spark Programming guide



Spark 1.2.0 uses Scala 2.10. To write applications in Scala, you need to use a compatible Scala version (e.g., 2.10.x).

To write a Spark application, you need to add a Maven dependency on Spark, which is available through the Maven Central Repository:

groupId = org.apache.spark
artifactId = spark-core_2.10
version = 1.2.0
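
If you build with sbt (an assumption here; the coordinates above work equally well in a Maven pom.xml), the dependency can be declared in one line. The %% operator appends the Scala version suffix (_2.10) to the artifact name:

// build.sbt — a minimal sketch of the spark-core dependency
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"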

In addition, if you want to access an HDFS cluster, you need to add a dependency on hadoop-client matching your version of HDFS. Some common HDFS version tags are listed on the third-party distributions page.

groupId = org.apache.hadoop
artifactId = hadoop-client
version = <your-hdfs-version>
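
In the same sbt sketch, the hadoop-client dependency could be added as follows; the version placeholder must be replaced with whatever your cluster actually runs:

// Replace <your-hdfs-version> with your cluster's Hadoop version
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "<your-hdfs-version>"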

Finally, you need to import some Spark classes and implicit conversions into your program. Just add the following lines:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
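
With those imports in place, a program typically starts by building a SparkConf and a SparkContext. A minimal sketch follows; the app name "Simple App" and the local[2] master URL are illustrative placeholders, not values from this guide:

// Configure the application; setMaster can be omitted when the
// master is supplied via spark-submit instead.
val conf = new SparkConf().setAppName("Simple App").setMaster("local[2]")
val sc = new SparkContext(conf)

// Use sc to create and operate on RDDs, then shut it down when finished.
sc.stop()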