Spark (Connectors)!

Spark, a distributed computing framework that runs on the JVM, has connectors that allow direct access to a wide range of databases. This eliminates the need for intermediate layers between the computation and the final storage location. Of course, reading or writing from Spark can flood a database with connections, since many tasks hit it in parallel, which can look a lot like a DDoS. However, these connectors limit their number of concurrent connections, so they are safe to use.

The following are the Spark – DB connectors that I have recently implemented:

  1. Spark – Aerospike
  2. Spark – MongoDB (jar file, maven)
  3. Spark – Redshift

Basically, it's as simple as adding the connector's jar file to the Spark classpath. Of course, the jar needs to be on the classpath of every Spark worker, not just the driver.
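If you launch jobs with `spark-submit`, one way to get the jar onto every worker without copying it by hand is the `--jars` option, which takes a comma-separated list of jars and ships them to the driver and executors. Below is a minimal sketch of a helper that builds such a command. The helper name `build_spark_submit`, the default master, and the jar paths are all made up for illustration, so adjust them to your own setup.

```python
import shlex


def build_spark_submit(app, jars, master="yarn"):
    """Build the argv list for a spark-submit call that ships connector jars.

    `app`, `jars` and `master` are illustrative parameters, not part of Spark.
    """
    if not jars:
        raise ValueError("at least one connector jar is required")
    # --jars expects a single comma-separated value; Spark distributes
    # the listed jars to the driver and to every executor.
    return ["spark-submit", "--master", master, "--jars", ",".join(jars), app]


# Hypothetical paths, for illustration only.
cmd = build_spark_submit(
    "job.py",
    ["/opt/jars/connector.jar", "/opt/jars/driver.jar"],
)
print(shlex.join(cmd))
```

You can then run the printed command in a shell, or pass `cmd` directly to `subprocess.run`. The same list of jars can also be set through the `spark.jars` configuration property instead of the command-line flag.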
