Hadoop Beginner's Guide [electronic resource].
ISBN:
9781849517317
9781628700237
Title:
Hadoop Beginner's Guide [electronic resource].
Author:
Turkington, Garry.
Publication Information:
Birmingham : Packt Pub., 2013.
Physical Description:
1 online resource (911 pages)
Contents:
Table of Contents; Hadoop Beginner's Guide; Hadoop Beginner's Guide; Credits; About the Author; About the Reviewers; www.PacktPub.com; Support files, eBooks, discount offers and more; Why Subscribe?; Free Access for Packt account holders; Preface; What this book covers; What you need for this book; Who this book is for; Conventions; Time for action -- heading; What just happened?; Pop quiz -- heading; Have a go hero -- heading; Reader feedback; Customer support; Downloading the example code; Errata; Piracy; Questions; 1. What It's All About; Big data processing; The value of data.

Historically for the few and not the many; Classic data processing systems; Scale-up; Early approaches to scale-out; Limiting factors; A different approach; All roads lead to scale-out; Share nothing; Expect failure; Smart software, dumb hardware; Move processing, not data; Build applications, not infrastructure; Hadoop; Thanks, Google; Thanks, Doug; Thanks, Yahoo; Parts of Hadoop; Common building blocks; HDFS; MapReduce; Better together; Common architecture; What it is and isn't good for; Cloud computing with Amazon Web Services; Too many clouds; A third way; Different types of costs.

AWS -- infrastructure on demand from Amazon; Elastic Compute Cloud (EC2); Simple Storage Service (S3); Elastic MapReduce (EMR); What this book covers; A dual approach; Summary; 2. Getting Hadoop Up and Running; Hadoop on a local Ubuntu host; Other operating systems; Time for action -- checking the prerequisites; What just happened?; Setting up Hadoop; A note on versions; Time for action -- downloading Hadoop; What just happened?; Time for action -- setting up SSH; What just happened?; Configuring and running Hadoop; Time for action -- using Hadoop to calculate Pi; What just happened?; Three modes.

Time for action -- configuring the pseudo-distributed mode; What just happened?; Configuring the base directory and formatting the filesystem; Time for action -- changing the base HDFS directory; What just happened?; Time for action -- formatting the NameNode; What just happened?; Starting and using Hadoop; Time for action -- starting Hadoop; What just happened?; Time for action -- using HDFS; What just happened?; Time for action -- WordCount, the Hello World of MapReduce; What just happened?; Have a go hero -- WordCount on a larger body of text; Monitoring Hadoop from the browser; The HDFS web UI.

The MapReduce web UI; Using Elastic MapReduce; Setting up an account in Amazon Web Services; Creating an AWS account; Signing up for the necessary services; Time for action -- WordCount on EMR using the management console; What just happened?; Have a go hero -- other EMR sample applications; Other ways of using EMR; AWS credentials; The EMR command-line tools; The AWS ecosystem; Comparison of local versus EMR Hadoop; Summary; 3. Understanding MapReduce; Key/value pairs; What it mean; Why key/value data?; Some real-world examples; MapReduce as a series of key/value transformations.

Pop quiz -- key/value pairs.
Local Note:
eBooks on EBSCOhost
Format:
Electronic Resources
Publication Date:
2013