May 17, 2021 Spark SQL Programming Guide
The entry point into all of Spark SQL's functionality is the SQLContext class, or one of its subclasses. To create a basic SQLContext, all you need is an existing SparkContext.
val sc: SparkContext // An existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
import sqlContext.createSchemaRDD
In addition to the basic SQLContext, you can also create a HiveContext, which provides a superset of the functionality of the basic SQLContext. Its additional features include the ability to write queries using the more complete HiveQL parser, access to Hive UDFs, and the ability to read data from Hive tables. You do not need an existing Hive installation to use a HiveContext, and all of the data sources available to a SQLContext are still available to a HiveContext. HiveContext is packaged separately so that Spark's default build does not pull in all of Hive's dependencies. If these dependencies are not a problem for your application, using HiveContext is recommended as of Spark 1.2. Future stable releases will focus on bringing SQLContext up to feature parity with HiveContext.
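Creating a HiveContext looks just like creating a SQLContext. A minimal sketch, assuming the Spark 1.x API in which org.apache.spark.sql.hive.HiveContext takes a SparkContext:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

val sc: SparkContext // An existing SparkContext.

// HiveContext extends SQLContext, so everything a SQLContext can do
// (including all of its data sources) remains available here.
val hiveContext = new HiveContext(sc)
```

No hive-site.xml or running Hive metastore is required; without one, Spark uses a local metastore in the current directory.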
The specific variant of SQL used to parse queries can be selected with the spark.sql.dialect option. This parameter can be changed in two ways: programmatically, with the setConf method, or from SQL, with the command SET key=value. For a SQLContext, the only available dialect is "sql", a simple SQL parser provided by Spark SQL. For a HiveContext, "sql" is also supported, but the default dialect is "hiveql", because the HiveQL parser is much more complete. "hiveql" is therefore recommended for most use cases.