Webinar: Learn about Apache Spark and Big Data on Amazon Web Services (AWS)

Webinar on Apache Spark and Big Data on AWS

stackArmor is an Advanced AWS Partner that enables security-focused customers in the Financial Services, Healthcare, Telecom, Non-profit, and Public Sector markets to rapidly set up and deploy analytics environments.
This webinar is an introduction to Apache Spark on Amazon Web Services. Some of the topics we covered are described below.

– Apache Spark and Big Data Ecosystem Overview
– Role of Spark with respect to Hadoop, AWS, EMR, and popular big data technologies
– Analytics and ETL with SparkSQL and DataFrame/Dataset APIs
– Basics of Spark Execution and Memory
– Visualizing Data with Zeppelin (and possibly Tableau, time permitting)
– Intro to Machine Learning with SparkML
– Intro to Spark Streaming
– Spark on YARN: Clustering and Operations within EMR
– Business Cases and Architecture Patterns with Spark

Some of the technologies discussed and demonstrated include:
– Amazon EMR clusters supporting Apache Spark 2.0, HDFS and/or EMRFS, and Apache Zeppelin with support for at least the Scala (Spark), PySpark, (Spark)SQL, sh, and hdfs interpreters


Gaurav “GP” Pal is the Founder of stackArmor and a well-known expert in big data architectures on cloud-based platforms such as AWS, with many years of implementation experience on large data-centric platforms.

Adam Breindel is a stackArmor Big Data Consultant focused on consulting and teaching Apache Spark. Adam’s experience includes work with banks on neural-net fraud detection, streaming analytics, cluster management code, and web apps, as well as development at a variety of startup and established companies in the travel, productivity, and entertainment industries. He is excited by the way that Spark and other modern big-data tech remove so many old obstacles to system design and make it possible to explore new categories of interesting, fun, hard problems.

Learn more about stackArmor and our Analytics offerings on our website.