
Spark SQL data types


May 17, 2021 Spark Programming guide



  • Numeric types

  • ByteType: Represents 1-byte signed integers. The range is -128 to 127
  • ShortType: Represents 2-byte signed integers. The range is -32768 to 32767
  • IntegerType: Represents 4-byte signed integers. The range is -2147483648 to 2147483647
  • LongType: Represents 8-byte signed integers. The range is -9223372036854775808 to 9223372036854775807
  • FloatType: Represents 4-byte single-precision floating-point numbers
  • DoubleType: Represents 8-byte double-precision floating-point numbers
  • DecimalType: Represents arbitrary-precision signed decimal numbers. Backed internally by java.math.BigDecimal. A BigDecimal consists of an arbitrary-precision integer unscaled value and a 32-bit integer scale

  • String type

  • StringType: Represents string values

  • Binary type

  • BinaryType: Represents byte sequence values

  • Boolean type

  • BooleanType: Represents Boolean values

  • Datetime type

  • TimestampType: Represents values comprising the fields year, month, day, hour, minute, and second
  • DateType: Represents values comprising the fields year, month, and day

  • Complex types

  • ArrayType(elementType, containsNull): Represents values comprising a sequence of elements of type elementType. containsNull indicates whether the elements of an ArrayType value can be null
  • MapType(keyType, valueType, valueContainsNull): Represents values comprising a set of key-value pairs. Keys have the type keyType and values have the type valueType. valueContainsNull indicates whether the values of a MapType value can be null
  • StructType(fields): Represents values with the structure described by a sequence of StructFields (fields)

  • StructField(name, dataType, nullable): Represents a field in a StructType. name is the name of the field, dataType is its data type, and nullable indicates whether values of the field can be null
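The integer ranges and the BigDecimal structure described in the list can be checked with plain Scala, without any Spark dependency. The sketch below is illustrative only: the object name NumericRanges and the sample value 123.45 are made up for this example.

```scala
// Plain Scala, no Spark needed: the JVM types behind the integral and decimal types.
object NumericRanges {
  // ByteType, ShortType, IntegerType, and LongType cover the same ranges
  // as the Scala types Byte, Short, Int, and Long.
  val byteRange: (Byte, Byte) = (Byte.MinValue, Byte.MaxValue)       // 1 byte
  val shortRange: (Short, Short) = (Short.MinValue, Short.MaxValue)  // 2 bytes
  val intRange: (Int, Int) = (Int.MinValue, Int.MaxValue)            // 4 bytes
  val longRange: (Long, Long) = (Long.MinValue, Long.MaxValue)       // 8 bytes

  // A BigDecimal is an arbitrary-precision unscaled integer plus a 32-bit scale:
  // value = unscaledValue * 10^(-scale)
  val price = new java.math.BigDecimal("123.45")  // hypothetical sample value
  val unscaled: java.math.BigInteger = price.unscaledValue()  // 12345
  val scale: Int = price.scale()                              // 2
  val precision: Int = price.precision()                      // 5 significant digits
}
```

Here 123.45 is stored as the unscaled integer 12345 with scale 2, which is the "unscaled value plus scale" pair that DecimalType relies on.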

All Spark SQL data types are defined in the package org.apache.spark.sql.types (in Spark releases before 1.3 they lived directly in org.apache.spark.sql). You can access them with import org.apache.spark.sql.types._
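As a minimal sketch of how the types are referenced after the import, the snippet below assumes Spark 1.3 or later (where the types live in org.apache.spark.sql.types) and the spark-sql artifact on the classpath. The object name TypeExamples and the field names are hypothetical.

```scala
import org.apache.spark.sql.types._

object TypeExamples {
  // Simple types are singleton objects and are referenced by name.
  val age: DataType = IntegerType

  // DecimalType takes a precision (total digits) and a scale (digits after the point).
  val amount: DecimalType = DecimalType(10, 2)

  // containsNull defaults to true when it is omitted.
  val tags: ArrayType = ArrayType(StringType)
  val strictTags: ArrayType = ArrayType(StringType, containsNull = false)

  // valueContainsNull defaults to true when it is omitted.
  val scores: MapType = MapType(StringType, DoubleType)

  // A StructType is built from a sequence of StructFields; keep the names distinct.
  val person: StructType = StructType(Seq(
    StructField("name", StringType, nullable = false),
    StructField("age", age, nullable = true),
    StructField("tags", tags, nullable = true),
    StructField("scores", scores, nullable = true)
  ))
}
```

Calling TypeExamples.person.fieldNames lists the field names in declaration order, which is a quick way to confirm the schema looks as intended.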

Data type | Value type in Scala | API to access or create the data type
--- | --- | ---
ByteType | Byte | ByteType
ShortType | Short | ShortType
IntegerType | Int | IntegerType
LongType | Long | LongType
FloatType | Float | FloatType
DoubleType | Double | DoubleType
DecimalType | scala.math.BigDecimal | DecimalType
StringType | String | StringType
BinaryType | Array[Byte] | BinaryType
BooleanType | Boolean | BooleanType
TimestampType | java.sql.Timestamp | TimestampType
DateType | java.sql.Date | DateType
ArrayType | scala.collection.Seq | ArrayType(elementType, [containsNull]). Note that containsNull defaults to true
MapType | scala.collection.Map | MapType(keyType, valueType, [valueContainsNull]). Note that valueContainsNull defaults to true
StructType | org.apache.spark.sql.Row | StructType(fields). Note that fields is a Seq of StructFields, and two StructFields with the same name are not allowed
StructField | The Scala value type of this field's data type (for example, Int for a StructField with the data type IntegerType) | StructField(name, dataType, nullable)
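To make the mapping in the table concrete, the following sketch pairs a schema with a Row whose values use the listed Scala types. It assumes Spark 1.3 or later with spark-sql on the classpath. The object name RowExample, the field names, and all sample values are invented for illustration.

```scala
import java.sql.{Date, Timestamp}
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

object RowExample {
  // Schema covering one field per kind of data type from the table.
  val schema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("name", StringType, nullable = true),
    StructField("balance", DecimalType(10, 2), nullable = true),
    StructField("payload", BinaryType, nullable = true),
    StructField("active", BooleanType, nullable = false),
    StructField("created", TimestampType, nullable = true),
    StructField("birthday", DateType, nullable = true),
    StructField("tags", ArrayType(StringType), nullable = true),
    StructField("scores", MapType(StringType, DoubleType), nullable = true)
  ))

  // Each position holds the Scala value type listed for that field's data type.
  val row: Row = Row(
    1L,                                        // LongType      -> Long
    "Ada",                                     // StringType    -> String
    BigDecimal("12.50"),                       // DecimalType   -> scala.math.BigDecimal
    Array[Byte](1, 2, 3),                      // BinaryType    -> Array[Byte]
    true,                                      // BooleanType   -> Boolean
    Timestamp.valueOf("2021-05-17 10:30:00"),  // TimestampType -> java.sql.Timestamp
    Date.valueOf("2021-05-17"),                // DateType      -> java.sql.Date
    Seq("a", "b"),                             // ArrayType     -> scala.collection.Seq
    Map("math" -> 9.5)                         // MapType       -> scala.collection.Map
  )
}
```

The Row itself carries no schema here; the pairing only shows which Scala value belongs in each position when the two are later combined, for example when building a DataFrame.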