New features in Apache Spark 1.4

Here at LTS Computing LLC, we’ve been using Apache Spark mostly for ETL work, and we like it a lot. The 1.4 release made it even more attractive.

Here are some of the key new features:

  • SparkR – an R binding for Spark. SparkR gives R users access to Spark’s scale-out parallel run-time along with all of Spark’s input and output formats.
  • Mathematical functions in DataFrames
  • Window functions in Spark SQL and DataFrames
  • Rollup and cube functions
  • Summary and descriptive statistics
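
To make the rollup and cube item a little more concrete, here is a minimal plain-Python sketch of what the two operations compute. It is not Spark code and not how Spark implements them; the sample rows, the column names (dept, region) and the helper functions are all made up for illustration. In Spark itself, the DataFrame API exposes these as the rollup and cube methods.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: (department, region, sales)
ROWS = [
    ("toys", "east", 10),
    ("toys", "west", 20),
    ("books", "east", 5),
]

COLS = ("dept", "region")


def rollup_sets(cols):
    """Grouping sets for a rollup: every prefix of cols, down to the grand total."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """Grouping sets for a cube: every subset of cols."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]


def aggregate(rows, cols, grouping_sets):
    """Sum sales per grouping set; None marks a column that was rolled up."""
    totals = defaultdict(int)
    for dept, region, sales in rows:
        record = {"dept": dept, "region": region}
        for grouping_set in grouping_sets:
            key = tuple(record[c] if c in grouping_set else None for c in cols)
            totals[key] += sales
    return dict(totals)


rollup_result = aggregate(ROWS, COLS, rollup_sets(COLS))
cube_result = aggregate(ROWS, COLS, cube_sets(COLS))

if __name__ == "__main__":
    for name, result in (("rollup", rollup_result), ("cube", cube_result)):
        print(name)
        for key, total in result.items():
            print("  ", key, total)
```

The rollup gives per-(dept, region) totals, per-dept subtotals and a grand total. The cube adds the per-region subtotals that the rollup leaves out, so it is the one to use when every combination of the grouping columns matters.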

The full 1.4 release notes are here.