How To Create a Hadoop Cluster in AWS

One of the great things about Amazon Web Services (AWS) is that it makes it easy to build structures in the cloud that would be extremely tedious and time-consuming to create on-premises. For example, with Amazon Elastic MapReduce (Amazon EMR), you can build a Hadoop cluster within AWS without the expense and hassle of provisioning physical machines.

Before I show you how to create a Hadoop cluster in the cloud, I need to cover a couple of prerequisites. If you plan to run Hive queries against the cluster, you’ll need to dedicate an Amazon Simple Storage Service (Amazon S3) bucket to storing the query results. It’s critically important to give this bucket a name that complies with both Amazon’s naming requirements and Hadoop’s. Specifically, your bucket name must meet two criteria. First, it can contain only lowercase letters, numbers, periods, and hyphens; uppercase letters aren’t allowed. Second, it cannot end in a number.
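If you'd like to check a candidate name before creating the bucket, the two criteria can be sketched in a few lines of Python. This is only an illustration: the function name `is_valid_emr_bucket_name` is hypothetical, and the check covers just the two rules described here. Amazon S3 enforces further naming rules (length limits, for example) that this sketch does not test.

```python
import re

# Rule 1: only lowercase letters, digits, periods, and hyphens are allowed.
_ALLOWED_CHARS = re.compile(r"[a-z0-9.-]+")


def is_valid_emr_bucket_name(name: str) -> bool:
    """Check a bucket name against the two criteria described above.

    Hypothetical helper: it does not cover the rest of Amazon S3's
    bucket naming rules.
    """
    # fullmatch also rejects the empty string, so name[-1] below is safe.
    if not _ALLOWED_CHARS.fullmatch(name):
        return False
    # Rule 2: the name must not end in a number.
    return not name[-1].isdigit()
```

For example, a name such as `hive-query-results` passes both checks, while `Hive-Results` fails the first rule and `hive-results-2` fails the second.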
