Spark DataFrame order preservation: does calling save on an orderBy DataFrame preserve the ordering?
I ran some test cases in the Spark shell. The statement I executed was of the form read.orderBy($"p_int".asc).write.format("com.databricks.spark.csv").save("file:///tmp/output.txt"), and the content of the output directory appears to be sorted. However, I cannot find anything in the Spark documentation about guarantees provided by DataFrameWriter for preserving either partition order or row order.

The question: can I expect the data in the target file to be sorted? Please include a link to the relevant documentation.

If I coalesce to 1 partition before saving, the output is sorted. On further thought, when reading the .csv back in Spark, if spark.default.parallelism in the Spark config is greater than 1, the ordering is lost.
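
For reference, the single-partition variant mentioned above can be sketched as follows. This is only a minimal sketch under assumptions that are not in my original test: the sample data, the object name SortedCsvSketch, the output path file:///tmp/output_sorted, and the use of SparkSession with the built-in "csv" format (Spark 2.x or later) are all hypothetical stand-ins. It reproduces the observed behaviour and does not show a documented guarantee.

```scala
import org.apache.spark.sql.SparkSession

object SortedCsvSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local Spark 2.x+ session; the original test ran in spark-shell.
    val spark = SparkSession.builder()
      .appName("sorted-csv-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with the column name used in the question.
    val df = Seq(3, 1, 2).toDF("p_int")

    df.orderBy($"p_int".asc)  // global sort on p_int
      .coalesce(1)            // collapse to one partition, so one part file is written
      .write
      .mode("overwrite")
      .format("csv")          // assumption: built-in CSV source instead of com.databricks.spark.csv
      .save("file:///tmp/output_sorted") // hypothetical output directory

    spark.stop()
  }
}
```

With a single partition there is only one part file in the output directory, so the question reduces to whether row order within that partition is kept on write, which is the part I cannot find documented.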