Details
Introduction to Big Data and Hadoop
- Understanding Big Data
- Challenges in processing Big Data
- 3V Characteristics (Volume, Variety and Velocity)
- Brief history of Hadoop
- How Hadoop addresses Big Data
- Core Hadoop Daemons
- Hadoop ecosystem
- Hadoop Clusters
Linux Commands Hands-on
HDFS (Hadoop Distributed File System)
- HDFS Overview and Architecture
- HDFS terminology: NameNode, DataNode, heartbeat, etc.
- Configuring HDFS
- Data Flows (Read and Write)
- HDFS Permissions and Security
- HDFS commands
- HDFS from an administrator's standpoint
- Rack Awareness
MapReduce
- Basics of MapReduce
- MapReduce Data Flow
- Solving the Word Count example
- Developing a MapReduce Application
- Configuring MapReduce
- Two ways of executing a MapReduce program
- Input and Output file formats
- Driver, Mapper and Reducer code walkthrough
- Hadoop Integration with Eclipse in Linux
- Partitioners
- MapReduce Web UI
- Joins, Distributed cache
- Compression techniques in MapReduce
How MapReduce Works
- Classic MapReduce (MapReduce 1)
- YARN (MapReduce 2)
- Shuffle and Sort
- Job Chaining
- Input formats – Input splits & custom file input formats
- Output formats – text output, custom file output formats
- Hands-on
Hadoop Ecosystem: PIG
- Overview of PIG
- PIG Latin
- Why PIG?
- Loading and storing data
- 21 Transformations of PIG
- Local and HDFS modes of PIG
- Grunt Shell
- Script and Embedded modes of processing using PIG
- Understanding Complex data types of PIG
- Word Count using PIG
- Hands-on
HIVE
- Overview of HIVE
- PIG vs HIVE
- HiveQL
- Managed and External Tables
- LOAD vs INSERT
- Views
- CTAS
- Partitioning
- Bucketing
- Dynamic partitioning vs Bucketing
- OVERWRITE keyword
- Collection Data types in HIVE
- Date type in HIVE
- ORC File Format and other File Formats
- Understanding SerDe
- Types of Hive JOINS
- Tuning Hive JOINS
- Vectorization
- Exploring HIVE User Defined Functions
- HIVE Unions
- Hands-on
- Temporary Tables
- Delete, Update Operations
HBASE
- Overview of HBASE
- NoSQL vs RDBMS
- HBASE vs HDFS
- HBASE Shell
- CRUD with JAVA API
- Hands-on
SQOOP
- Overview
- Data Ingestion mechanisms
- Getting access granted in MySQL
- Importing from MySQL with SQOOP
- Exporting to MySQL with SQOOP
- Incremental append
- Working with SQOOP jobs
Understanding OOZIE with use cases
Understanding FLUME with use cases
Hadoop Developer Admin
- Single Node Hadoop Cluster setup
- OS installation
- SSH Setup
- Java Setup
- Hadoop Installation
- Configuring Hadoop
- Multi Node Hadoop Cluster setup
- Installation of PIG, HIVE, SQOOP2 Components
Assignments at the end of every Component
5 POCs (proofs of concept)
Real-time project explanation
Training Details:
Course Duration: 40-50 days (by mutual agreement) + assignments
Note: We provide soft copies of the materials and recorded videos of all classes directly to students. If there are any issues, we will hold an internal meeting with the trainers.
Pre-Requisites to Learn Hadoop:
Basic knowledge of:
- Core Java
- SQL
- Linux commands
Who Should Join this Course:
This course is designed for people who want to learn and work in the Big Data world using the Hadoop framework and become Hadoop Developers. IT freshers, graduates and postgraduates from other domains who know the prerequisites, software professionals, analytics professionals, and ETL developers are the key beneficiaries of this course.
Eligibility:
Graduate/Post Graduate Degree (B.Tech, MCA, M.Sc, M.S, B.Sc, BCA), MBA, Computer professional.
Duration: 40 Days
Fee: Rs.25,000/-