Spark and Scala Online & Classroom Training

Upcoming Batches

Batch Type    Date         Time (IST)
Weekday       Saturday     10:00 AM
Weekday       Sunday       10:00 AM
Weekday       Wednesday    08:00 AM

Training Mode: Classroom / Online

About Spark and Scala Course

Scala is a type-safe, general-purpose programming language that helps developers write concise, elegant programs. It combines object-oriented and functional programming, which helps programmers stay efficient and productive and keeps application development turnaround times manageable. Scala was introduced on the JVM platform in January 2004 and on .NET in June 2004.
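
As a rough illustration of what "concise and type-safe" means in practice, here is a minimal sketch in plain Scala. The names `Course`, `ConciseScala` and `shortCourses` are hypothetical and chosen only for this example; they are not part of the course material.

```scala
// Hypothetical data type for illustration: an immutable record declared in one line.
final case class Course(name: String, hours: Int)

object ConciseScala {
  // The compiler checks that a List[Course] goes in and a List[String] comes out.
  def shortCourses(courses: List[Course], maxHours: Int): List[String] =
    courses.filter(_.hours <= maxHours).map(_.name)

  def main(args: Array[String]): Unit = {
    val courses = List(Course("Scala Basics", 8), Course("Spark SQL", 12), Course("Kafka", 6))

    // Higher-order functions (filter, map) replace explicit loops.
    println(shortCourses(courses, 8)) // prints List(Scala Basics, Kafka)

    // Type safety: the next line would be rejected at compile time,
    // because there is no way to sum a list of strings.
    // val total: Int = courses.map(_.name).sum
  }
}
```

The case class and the collection methods keep the program short, while the commented-out line shows the kind of mistake the compiler catches before the program ever runs.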

Spark is an open-source, scalable, massively parallel, in-memory engine for running analytical applications on big data. Spark applications work on data sources distributed across a network to process large volumes of data. Because Spark processes data in memory, analytical applications run faster and more efficiently.

Primary Objectives of This Course

IT Corp Analytics' Apache Spark and Scala Certification Training is designed to give you the knowledge and skills required to become a successful Spark developer and to prepare you for the Cloudera CCA Spark and Hadoop Developer certification exam (CCA175). Throughout the training, you will gain in-depth knowledge of Apache Spark and the Spark ecosystem, including Spark RDDs, Spark SQL, Spark MLlib and Spark Streaming. You will also gain comprehensive knowledge of the Scala programming language, HDFS, Sqoop, Flume, Spark GraphX and messaging systems such as Kafka.

How will Spark and Scala Training help your Career?

Spark and Scala are among the most sought-after technologies in the market today. As industries move their IT needs to the cloud, more businesses adopt e-commerce, and web applications grow more complex, the demand for Spark and Scala engineers and IT professionals has grown. IT Corp understands this potential and helps students gain relevant industry knowledge to excel in their careers.

Who Should Do This Course?

The course is applicable to:

  • Engineering graduates
  • Working IT professionals from programming, web development and DBA fields
  • Software programmers
  • Java developers
  • .NET developers

Spark and Scala Curriculum

Introduction to Scala and Spark
  • What is Scala?
  • Why Scala for Spark?
  • Scala in other frameworks
  • Introduction to Scala REPL
  • Basic Scala operations
  • Variable Types in Scala
  • Control Structures in Scala
  • Foreach loop, Functions and Procedures
  • Collections in Scala: Array
  • ArrayBuffer, Map, Tuples, Lists, and more
OOP and Functional Programming in Scala
  • Class in Scala
  • Getters and Setters
  • Custom Getters and Setters
  • Properties with only Getters
  • Auxiliary Constructor and Primary Constructor
  • Singletons
  • Extending a Class
  • Overriding Methods
  • Traits as Interfaces and Layered Traits
  • Functional Programming
  • Higher Order Functions
  • Anonymous Functions, and more
Introduction to Big Data and Hadoop
  • What is Big Data?
  • Big Data Customer Scenarios
  • Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case
  • How Hadoop Solves the Big Data Problem
  • What is Hadoop?
  • Hadoop’s Key Characteristics
  • Hadoop Ecosystem and HDFS
  • Hadoop Core Components
  • Rack Awareness and Block Replication
  • HDFS Read/Write Mechanism
  • YARN and Its Advantage
  • Hadoop Cluster and Its Architecture
  • Hadoop: Different Cluster Modes
  • Data Loading using Sqoop
Apache Spark Framework
  • Big Data Analytics with Batch & Real-Time Processing
  • Why Spark is Needed?
  • What is Spark?
  • How Spark Differs from Its Competitors?
  • Spark at eBay
  • Spark’s Place in Hadoop Ecosystem
  • Spark Components & Its Architecture
  • Running Programs on Scala IDE & Spark Shell
  • Spark Web UI
  • Configuring Spark Properties
Playing with RDDs
  • Challenges in Existing Computing Methods
  • Probable Solution & How RDD Solves the Problem
  • What is an RDD? Its Functions, Transformations & Actions
  • Data Loading and Saving Through RDDs
  • Key-Value Pair RDDs and Other Pair RDDs
  • RDD Lineage
  • RDD Persistence
  • WordCount Program Using RDD Concepts
  • RDD Partitioning & How It Helps Achieve Parallelization
DataFrames and Spark SQL
  • Need for Spark SQL
  • What is Spark SQL?
  • Spark SQL Architecture
  • SQLContext in Spark SQL
  • DataFrames & Datasets
  • Interoperating with RDDs
  • JSON and Parquet File Formats
  • Loading Data through Different Sources
Machine Learning Using Spark MLlib
  • What is Machine Learning?
  • Where is Machine Learning Used?
  • Different Types of Machine Learning Techniques
  • Face Detection: USE CASE
  • Understanding MLlib
  • Features of Spark MLlib and MLlib Tools
  • Various ML algorithms supported by Spark MLlib
  • K-Means Clustering & How It Works with MLlib
  • Analysis on US Election Data: K-Means Spark MLlib USE CASE
Understanding Apache Kafka and Kafka Cluster
  • Need for Kafka
  • What is Kafka?
  • Core Concepts of Kafka
  • Kafka Architecture
  • Where is Kafka Used?
  • Understanding the Components of Kafka Cluster
  • Configuring Kafka Cluster
  • Producer and Consumer
Capturing Data with Apache Flume and Integration with Kafka
  • Need for Apache Flume
  • What is Apache Flume?
  • Basic Flume Architecture
  • Flume Sources
  • Flume Sinks
  • Flume Channels
  • Flume Configuration
  • Integrating Apache Flume and Apache Kafka
Apache Spark Streaming
  • Drawbacks in Existing Computing Methods
  • Why Streaming is Necessary?
  • What is Spark Streaming?
  • Spark Streaming Features
  • Spark Streaming Workflow
  • How Uber Uses Streaming Data
  • Streaming Context & DStreams
  • Transformations on DStreams
  • WordCount Program using Spark Streaming
  • Windowed Operators and Why They Are Useful
  • Important Windowed Operators
  • Slice, Window and ReduceByWindow Operators
  • Stateful Operators
  • Perform Twitter Sentiment Analysis Using Spark Streaming

TESTIMONIALS