How to get the last item from an array using PySpark

import pyspark.sql.functions as F
df = spark.createDataFrame([[['A', 'B', 'C', 'D']], [['E', 'F']]], ['split'])

df.show()
+------------+
|       split|
+------------+
|[A, B, C, D]|
|      [E, F]|
+------------+

df.withColumn('lastItem', df.split.getItem(F.size(df.split) - 1)).show()
+------------+--------+
|       split|lastItem|
+------------+--------+
|[A, B, C, D]|       D|
|      [E, F]|       F|
+------------+--------+
