DataFrame
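The snippets on this page call `toDF`, `struct`, and `lit` without showing their setup. The following is a minimal sketch of what they presumably rely on. It assumes a local `SparkSession` and a hypothetical `Apple` case class with `color` and `weight` fields; neither is shown on this page, and the `master` and `appName` values are purely illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, struct}

// Hypothetical case class (an assumption, not from the original page).
// It is declared at the top level so Spark can derive an encoder for it.
case class Apple(color: String, weight: Int)

object DataFrameSetup {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation; master and appName are illustrative.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataframe-creation")
      .getOrCreate()

    // Brings toDF() and the implicit encoders into scope.
    import spark.implicits._

    // Zero rows, with columns taken from the case class fields.
    val empty = Seq.empty[Apple].toDF()
    empty.printSchema()

    // Two rows built from tuples, with explicit column names.
    val apples = List(("green", 70), ("red", 110)).toDF("color", "weight")
    apples.show()

    // struct and lit come from org.apache.spark.sql.functions.
    val nested = apples.withColumn("apple", struct(lit("green") as "color", lit(110) as "weight"))
    nested.printSchema()

    spark.stop()
  }
}
```

In `spark-shell` the session and `spark.implicits._` are already available, so only the `functions` import and the case class would be needed there.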
Empty with predefined structure
val df = Seq.empty[(String, Int)].toDF("color", "weight")
Empty with case class structure
// Assumes a top-level case class, e.g. case class Apple(color: String, weight: Int)
val df = Seq.empty[Apple].toDF()
Empty with struct field
val df = spark.emptyDataFrame
  .withColumn("apple", struct(lit("green") as "color", lit(110) as "weight"))
From primitive list
val df = List(1, 2, 3).toDF("values")
From tuples (often used)
val df = List(("green", 70), ("red", 110)).toDF("color", "weight")
Array field
val df = List(
  Array("red", "green", "yellow"),
  Array("green", "yellow")
).toDF()
With null values
val df = List(null.asInstanceOf[Integer]).toDF("color")
Example: DataFrameCreationSpec.scala