
Details

The Hadoop Cluster Administration training course is designed to provide the knowledge and skills needed to become a successful Hadoop Architect. It starts with the fundamental concepts of Apache Hadoop and the Hadoop cluster, then covers how to deploy, configure, manage, monitor, and secure a Hadoop cluster. The course also covers HBase administration and includes many challenging, practical, and focused hands-on exercises. By the end of this training, you will be prepared to understand and solve real-world problems that you may come across while working on a Hadoop cluster.

WHO SHOULD ATTEND

  • Systems administrators
  • Windows administrators
  • Linux administrators
  • Infrastructure engineers
  • Database administrators
  • Big Data architects
  • Mainframe professionals and IT managers



COURSE OUTLINE

Module 1 – Introduction to Hadoop

  • The scale of data being generated and processed today
  • What Hadoop is and why it is important
  • Hadoop comparison with traditional systems
  • Hadoop history
  • Hadoop main components and architecture

Module 2 – Hadoop Distributed File System (HDFS)

  • HDFS overview and design
  • HDFS architecture
  • HDFS file storage
  • Component failures and recoveries
  • Block placement
  • Balancing the Hadoop cluster

Module 3 – Planning your Hadoop cluster

  • Planning a Hadoop cluster and its capacity (a back-of-the-envelope sketch follows this list)
  • Hadoop software and hardware configuration
  • HDFS Block replication and rack awareness
  • Network topology for Hadoop cluster
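
To give a taste of the capacity-planning exercise above, here is a back-of-the-envelope sketch; the data volume, replication factor, temp-space reserve, and per-node disk layout are illustrative assumptions, not figures from the course.

```java
// Back-of-the-envelope capacity planning (all numbers are illustrative assumptions):
// roughly how much raw disk a cluster needs for a given amount of user data.
public class CapacityEstimate {
    public static void main(String[] args) {
        double userDataTB = 100.0;        // assumed volume of data to store
        int replicationFactor = 3;        // HDFS default block replication
        double tempSpaceFraction = 0.25;  // assumed headroom for intermediate/temp data

        double rawTB = userDataTB * replicationFactor * (1 + tempSpaceFraction);
        System.out.printf("Raw HDFS capacity needed: %.1f TB%n", rawTB);

        // Assuming, say, 12 x 4 TB disks per DataNode:
        double perNodeTB = 12 * 4.0;
        System.out.printf("Approximate DataNodes required: %.0f%n", Math.ceil(rawTB / perNodeTB));
    }
}
```

The same arithmetic generalizes: multiply the data you expect to store by the replication factor, add headroom for intermediate job output, and divide by the usable disk per DataNode.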

Module 4 – Hadoop Deployment

  • Different Hadoop deployment types
  • Hadoop distribution options
  • Hadoop competitors
  • Hadoop installation procedure
  • Distributed cluster architecture

Lab: Hadoop Installation
Module 5 – Working with HDFS

  • Ways of accessing data in HDFS
  • Common HDFS operations and commands (see the sketch after this list)
  • Key HDFS shell commands
  • Internals of a file read in HDFS
  • Data copying with ‘distcp’
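
To give a flavour of the hands-on work in this module, the sketch below performs a few common HDFS operations through Hadoop's Java FileSystem API instead of the 'hdfs dfs' shell; the NameNode host name and paths are placeholder assumptions. Cluster-to-cluster bulk copies are done with the 'distcp' tool mentioned above rather than API calls.

```java
// A minimal sketch (not from the course materials) of common HDFS operations
// via Hadoop's Java FileSystem API. Host and paths are placeholder assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBasics {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Normally picked up from core-site.xml; set explicitly here for clarity.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");

        FileSystem fs = FileSystem.get(conf);

        // Equivalent of 'hdfs dfs -mkdir -p /user/student/demo'
        Path dir = new Path("/user/student/demo");
        fs.mkdirs(dir);

        // Equivalent of 'hdfs dfs -put localfile.txt /user/student/demo'
        fs.copyFromLocalFile(new Path("localfile.txt"), dir);

        // Equivalent of 'hdfs dfs -ls /user/student/demo'
        for (FileStatus status : fs.listStatus(dir)) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }

        fs.close();
    }
}
```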

Module 6 – MapReduce Abstraction

  • What MapReduce is and why it is popular
  • The big picture of MapReduce
  • MapReduce process and terminology
  • MapReduce components failures and recoveries
  • Working with MapReduce (see the word-count sketch after this list)
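
To make the map and reduce phases concrete, here is the classic word-count job written against Hadoop's Java MapReduce API; it is an illustrative sketch, not the course's lab code.

```java
// Classic word count: the map phase emits (word, 1) pairs, the reduce phase
// sums the counts per word after the shuffle/sort. Illustrative sketch only.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every token in the input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Input and output HDFS paths are passed on the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A packaged job of this kind is typically submitted with 'hadoop jar <your-jar> WordCount <input-path> <output-path>' (the jar and class names here are illustrative).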

Module 7 – Hadoop Cluster Configuration

  • Hadoop configuration overview and important configuration files
  • Configuration parameters and values
  • HDFS parameters and MapReduce parameters (see the sketch after this list)
  • Hadoop environment setup
  • ‘Include’ and ‘Exclude’ configuration files
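
As an illustration of the parameters this module covers, the sketch below reads and overrides a few well-known HDFS and MapReduce settings programmatically; on a real cluster these values belong in core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml, and the host name and file paths shown are assumptions.

```java
// Illustrative sketch: inspecting and overriding common Hadoop parameters.
// On a real cluster these are set in the *-site.xml files, not in code.
import org.apache.hadoop.conf.Configuration;

public class ShowConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Core/HDFS parameters (host name and values are illustrative)
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");
        conf.set("dfs.replication", "3");        // HDFS block replication factor
        conf.set("dfs.blocksize", "134217728");  // 128 MB block size, in bytes

        // MapReduce on YARN
        conf.set("mapreduce.framework.name", "yarn");

        // 'Include' and 'Exclude' host lists used when commissioning/decommissioning
        // DataNodes (paths are assumed; 'hdfs dfsadmin -refreshNodes' applies edits).
        conf.set("dfs.hosts", "/etc/hadoop/conf/dfs.include");
        conf.set("dfs.hosts.exclude", "/etc/hadoop/conf/dfs.exclude");

        System.out.println("fs.defaultFS             = " + conf.get("fs.defaultFS"));
        System.out.println("dfs.replication          = " + conf.get("dfs.replication"));
        System.out.println("dfs.blocksize            = " + conf.get("dfs.blocksize"));
        System.out.println("mapreduce.framework.name = " + conf.get("mapreduce.framework.name"));
    }
}
```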

Lab: MapReduce Performance Tuning
Module 8 – Hadoop Administration and Maintenance

  • NameNode/DataNode directory structures and files
  • File system image and Edit log
  • The Checkpoint Procedure
  • NameNode failure and recovery procedure
  • Safe Mode
  • Metadata and Data backup
  • Potential problems and solutions / what to look for
  • Adding and removing nodes

Lab: MapReduce File System Recovery
Module 9 – Hadoop Monitoring and Troubleshooting

  • Best practices for monitoring a Hadoop cluster
  • Using logs and stack traces for monitoring and troubleshooting
  • Using open-source tools to monitor a Hadoop cluster (a small sketch follows this list)
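
As a small illustration of programmatic monitoring, the sketch below pulls the NameNode's JMX metrics over HTTP, which is how many open-source monitoring tools gather Hadoop metrics; the host is a placeholder, and port 9870 assumes a Hadoop 3.x NameNode web UI (2.x used 50070).

```java
// Illustrative sketch: fetch the NameNode's JMX metrics dump over HTTP.
// Host is a placeholder; port 9870 assumes a Hadoop 3.x NameNode web UI.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class NameNodeMetrics {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://namenode-host:9870/jmx");  // JSON dump of metrics beans
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            // Print the raw JSON; a real monitoring tool would parse and graph it.
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```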

Module 10 – Job Scheduling

  • How to schedule multiple Hadoop jobs on the same cluster
  • The default Hadoop FIFO Scheduler
  • The Fair Scheduler and its configuration (see the sketch after this list)
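
As a pointer to what Fair Scheduler configuration involves on a YARN-based cluster (an assumption; older MRv1 clusters configure it differently), the sketch below shows the two key properties; in practice they are set in yarn-site.xml, and the allocation-file path is an illustrative example.

```java
// Illustrative sketch: the YARN properties that switch the ResourceManager
// to the Fair Scheduler. On a real cluster these go in yarn-site.xml.
import org.apache.hadoop.conf.Configuration;

public class FairSchedulerConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Use the Fair Scheduler instead of the cluster's current scheduler.
        conf.set("yarn.resourcemanager.scheduler.class",
                 "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler");

        // Queue definitions (weights, min/max shares) live in an allocation
        // file; the path below is an assumed example.
        conf.set("yarn.scheduler.fair.allocation.file",
                 "/etc/hadoop/conf/fair-scheduler.xml");

        System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
    }
}
```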

Module 11 – Hadoop Multi-Node Cluster Setup and Running MapReduce Jobs on Amazon EC2

  • Hadoop multi-node cluster setup on Amazon EC2 – creating a 4-node cluster
  • Running MapReduce jobs on the cluster

Module 12 – High Availability, Federation, YARN, and Security


Graspskills is an IT training and consulting organization. We conduct corporate training and open-house workshops in information technology, and we provide training in management, Agile, IT, and other courses.
