HDInsight

Provision cloud Hadoop, Spark, HBase, and Storm clusters

Azure HDInsight is a fully-managed cloud service that makes it easy, fast, and cost-effective to process massive amounts of data. Use the most popular open-source frameworks such as Hadoop, Spark, Hive, LLAP, Kafka, Storm, HBase & more. Azure HDInsight enables a broad range of scenarios such as ETL, Data Warehousing, IoT and more.

Service features

Preconfigured clusters optimized for different big data scenarios

99.9% SLA on the cluster

High Availability

Cost-effective for cloud scale

Network security: secure gateway, Azure VNET support

Data security: encryption + role-based access control on Storage

Integration: Azure Cosmos DB and other Azure data services

Components

Hadoop

Spark

Interactive Query

Kafka

HBase

Storm

Extend HDInsight to install any open source engine (1)

Enterprise Security Package

(1) Microsoft Azure operated by 21Vianet provides no support or SLA for these open source apps. Support and SLA are provided only for the workloads listed above.

Pricing features

Azure HDInsight Clusters

Clusters are billed per minute and run a group of nodes that depends on the component. Nodes vary by group (e.g. Worker Node, Head Node, etc.), quantity, and instance type (e.g. D1v2).

Refer to the FAQ below for details on workloads and the required nodes. Customers will be billed for each node for the duration of the cluster's life.

Pricing Details

An HDInsight cluster is composed of a group of nodes, and customers pay for these nodes throughout the lifecycle of the cluster. Billing starts when the cluster is created and ends when the cluster is deleted. Charges are prorated by the minute.
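As a rough illustration of per-minute proration, the sketch below computes the charge for a group of identical nodes from their hourly price. The function name and the 90-minute scenario are hypothetical; the hourly price is the D3 v2 price listed later on this page. How the final invoice rounds fractional amounts is not stated here, so the sketch returns the unrounded value.

```python
from decimal import Decimal

def prorated_cost(hourly_price: Decimal, node_count: int, minutes: int) -> Decimal:
    """Charge for `node_count` identical nodes that existed for `minutes` minutes.

    Assumes the per-minute rate is simply the hourly price divided by 60.
    """
    # Multiply first, divide last, so exact decimal prices stay exact.
    return hourly_price * node_count * minutes / Decimal(60)

# Hypothetical scenario: four D3 v2 nodes (¥2.4725/hour) kept for 90 minutes.
cost = prorated_cost(Decimal("2.4725"), node_count=4, minutes=90)
print(f"¥{cost}")
```

Using `Decimal` rather than `float` avoids binary rounding noise in currency arithmetic.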

Pricing Method

Component | Pricing
Hadoop, Spark, Interactive Query, Kafka*, Storm, HBase | Base price/node-hour + ¥0/core-hour
Enterprise Security Package | Base price/node-hour + ¥0.06/core-hour

*Kafka requires Managed Disks. Customers can select standard Managed Disks. For Managed Disks pricing, see the Azure Storage Pricing Details page.
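The pricing method above can be read as "hourly node price = base price + per-core surcharge × cores". The sketch below encodes that reading. It assumes that the hourly prices in the node tables on this page are the base price per node-hour, and that the Enterprise Security Package surcharge applies to every core of the node; neither assumption is spelled out in the table. The function name is hypothetical.

```python
from decimal import Decimal

# Per-core-hour surcharge for the Enterprise Security Package (from the table above).
ESP_SURCHARGE_PER_CORE_HOUR = Decimal("0.06")

def node_hourly_price(base_price: Decimal, cores: int, esp: bool = False) -> Decimal:
    """Hourly price of one node: base price plus an optional per-core surcharge."""
    surcharge = ESP_SURCHARGE_PER_CORE_HOUR if esp else Decimal("0")
    return base_price + surcharge * cores

# D13 v2: 8 cores, base price ¥7.8867/hour.
print(node_hourly_price(Decimal("7.8867"), cores=8))            # 7.8867
print(node_hourly_price(Decimal("7.8867"), cores=8, esp=True))  # 8.3667
```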

Memory Optimized nodes for HDInsight

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
Instances | Number of cores | RAM | Disk size | Pricing
E2 v3 | 2 | 16 GB | 50 GB | ¥1.31/hour (about ¥974.64/month)
E4 v3 | 4 | 32 GB | 100 GB | ¥2.63/hour (about ¥1,956.72/month)
E8 v3 | 8 | 64 GB | 200 GB | ¥5.27/hour (about ¥3,920.88/month)
E16 v3 | 16 | 128 GB | 400 GB | ¥10.54/hour (about ¥7,841.76/month)
E20 v3 | 20 | 160 GB | 500 GB | ¥16.92/hour (about ¥12,588.48/month)
E32 v3 | 32 | 256 GB | 800 GB | ¥21.08/hour (about ¥15,683.52/month)
E64i v3 | 64 | 432 GB | 1,600 GB | ¥42.14/hour (about ¥31,352.16/month)
E64 v3 | 64 | 432 GB | 1,600 GB | ¥42.14/hour (about ¥31,352.16/month)
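The monthly figures in the pricing tables on this page follow from the hourly price multiplied by 744 hours (31 days × 24 hours). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
from decimal import Decimal

HOURS_PER_MONTH = 744  # 31 days x 24 hours, the basis stated for the estimates

def monthly_estimate(hourly_price: Decimal) -> Decimal:
    """Estimated monthly price of one node that runs for the whole month."""
    return hourly_price * HOURS_PER_MONTH

# E2 v3 at ¥1.31/hour
print(monthly_estimate(Decimal("1.31")))  # 974.64
```

A node that runs for only part of the month costs proportionally less, since billing is per minute.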

Compute Optimized nodes for HDInsight

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
Instances | Number of cores | RAM | Disk size | Pricing
F1 | 1 | 2 GB | 16 GB | ¥0.531/hour (about ¥395.064/month)
F2 | 2 | 4 GB | 32 GB | ¥1.102/hour (about ¥819.888/month)
F4 | 4 | 8 GB | 64 GB | ¥2.193/hour (about ¥1,631.592/month)
F8 | 8 | 16 GB | 128 GB | ¥4.387/hour (about ¥3,263.928/month)
F16 | 16 | 32 GB | 256 GB | ¥8.763/hour (about ¥6,519.672/month)
F16s v2 | 16 | 32 GB | 256 GiB | ¥10.597/hour (about ¥7,884.168/month)

General Purpose nodes for HDInsight

Av2 HDInsight nodes run on Av2 Standard VMs, the latest generation of A-series virtual machines, with similar CPU performance and faster disks.

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
Instances | Number of cores | RAM | Disk size | Pricing
A1 v2 | 1 | 2 GB | 10 GB | ¥0.545/hour (about ¥405.48/month)
A2 v2 | 2 | 4 GB | 20 GB | ¥1.079/hour (about ¥802.776/month)
A2m v2 | 2 | 16 GB | 20 GB | ¥2.079/hour (about ¥1,546.776/month)
A4 v2 | 4 | 8 GB | 40 GB | ¥2.169/hour (about ¥1,613.736/month)
A4m v2 | 4 | 32 GB | 40 GB | ¥4.166/hour (about ¥3,099.504/month)
A8 v2 | 8 | 16 GB | 80 GB | ¥4.326/hour (about ¥3,218.544/month)
A8m v2 | 8 | 64 GB | 80 GB | ¥8.328/hour (about ¥6,196.032/month)

A Series Universal Nodes

The A3 node is an economical option for general-purpose workloads. Customers running basic query applications and models on Hadoop will benefit from using the A Series.

The A1 node can only be used as a Zookeeper node for Storm. The A2 node can only be used as a Zookeeper node for HBase and Storm.

A Series nodes cannot be used as data nodes in Linux clusters. They can only be used as head nodes and Zookeeper nodes, and only A1, A2 and A3 are available.

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
Instances | Number of cores | RAM | Disk size | Pricing per node
A1 | 1 | 1.75 GB | 70 GB | ¥0.3981/hour (about ¥296.1864/month)
A2 | 2 | 3.5 GB | 135 GB | ¥0.9662/hour (about ¥718.8528/month)
A3 | 4 | 7 GB | 285 GB | ¥1.9425/hour (about ¥1,445.22/month)

D Series Nodes: 60% faster CPU, more memory, local SSD

D1, D2 and D11 nodes can only be used as Zookeeper nodes for HBase and Storm.

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
Instances | Number of cores | Memory | Disk size | Pricing
D1 | 1 | 3.5 GB | 50 GB | ¥0.5981/hour (about ¥444.9864/month)
D2 | 2 | 7 GB | 100 GB | ¥1.2362/hour (about ¥919.7328/month)
D3 | 4 | 14 GB | 200 GB | ¥2.4725/hour (about ¥1,839.54/month)
D4 | 8 | 28 GB | 400 GB | ¥4.965/hour (about ¥3,693.96/month)
D5 | 16 | 56 GB | 800 GB | ¥9.8999/hour (about ¥7,365.5256/month)
D11 | 2 | 14 GB | 100 GB | ¥1.9717/hour (about ¥1,466.9448/month)
D12 | 4 | 28 GB | 200 GB | ¥3.9434/hour (about ¥2,933.8896/month)
D13 | 8 | 56 GB | 400 GB | ¥7.8867/hour (about ¥5,867.7048/month)
D14 | 16 | 112 GB | 800 GB | ¥11.2234/hour (about ¥8,350.2096/month)

Dv2 Series Optimized Nodes: A New Generation of CPU

The Dv2 Series is the new generation of D Series instances. It has a more powerful CPU and the same memory and disk configurations as the D Series. Dv2 Series instances are based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, which can reach up to 3.2 GHz with Intel Turbo Boost Technology 2.0. The Dv2 Series suits customers whose applications need low latency, local SSD access, or a faster CPU. The larger memory of the D Series and Dv2 Series can improve performance for customers using HDInsight HBase. Customers using HDInsight Storm and Spark can load larger reference data sets into the larger memory and achieve higher throughput with the faster CPU.

The D Series remains available, but the Dv2 Series is recommended.

*The following prices are tax-inclusive.

*Monthly pricing estimates are based on 744 hours of usage per month.
*The pricing and billing model of the HDInsight Dv2 Series is the same as that of the HDInsight D Series. The HDInsight Dv2 Series appears as HDInsight D Series on the bill.
Instances | Number of cores | Memory | Disk size | Pricing
D1 v2 | 1 | 3.5 GB | 50 GB | ¥0.5981/hour (about ¥444.9864/month)
D2 v2 | 2 | 7 GB | 100 GB | ¥1.2362/hour (about ¥919.7328/month)
D3 v2 | 4 | 14 GB | 200 GB | ¥2.4725/hour (about ¥1,839.54/month)
D4 v2 | 8 | 28 GB | 400 GB | ¥4.965/hour (about ¥3,693.96/month)
D5 v2 | 16 | 56 GB | 800 GB | ¥9.8999/hour (about ¥7,365.5256/month)
D11 v2 | 2 | 14 GB | 100 GB | ¥1.9717/hour (about ¥1,466.9448/month)
D12 v2 | 4 | 28 GB | 200 GB | ¥3.9434/hour (about ¥2,933.8896/month)
D13 v2 | 8 | 56 GB | 400 GB | ¥7.8867/hour (about ¥5,867.7048/month)
D14 v2 | 16 | 112 GB | 800 GB | ¥11.2234/hour (about ¥8,350.2096/month)

FAQ

  • How are the different HDInsight cluster types billed?

    HDInsight deploys a different number of nodes for each cluster type. Within a given cluster type, nodes have different roles, so a customer can size the nodes in each role to suit their workload. For example, a Hadoop cluster can have its worker nodes provisioned with a large amount of memory if the analytics being performed is memory intensive.

    An HDInsight Hadoop cluster deploys three kinds of roles:

    • Head node (2 nodes)
    • Data node (at least 1 node)
    • Zookeeper nodes (3 nodes)

    An HDInsight HBase cluster deploys three kinds of roles:

    • Head server (2 nodes)
    • Region server (at least 1 node)
    • Master/Zookeeper node (3 nodes)

    An HDInsight Storm cluster deploys three kinds of roles:

    • Nimbus node (2 nodes)
    • Supervisor server (at least 1 node)
    • Zookeeper node (3 nodes)
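The role lists above can be expressed as a small table of node counts per role and used to sanity-check a planned cluster layout. The sketch below does this for the Hadoop cluster only. It assumes that "2 nodes" and "3 nodes" are fixed counts and that "at least 1 node" has no upper bound other than the subscription limit; the names `HADOOP_ROLES` and `check_layout` are hypothetical.

```python
# Node counts per role for a Hadoop cluster, as listed above.
# Each value is (minimum, maximum); None means no upper bound is stated.
HADOOP_ROLES = {
    "head": (2, 2),
    "data": (1, None),
    "zookeeper": (3, 3),
}

def check_layout(layout, roles=HADOOP_ROLES):
    """Return a list of problems with a planned layout; empty means it fits."""
    problems = []
    for role, (minimum, maximum) in roles.items():
        count = layout.get(role, 0)
        if count < minimum:
            problems.append(f"{role}: need at least {minimum}, got {count}")
        elif maximum is not None and count > maximum:
            problems.append(f"{role}: expected at most {maximum}, got {count}")
    return problems

print(check_layout({"head": 2, "data": 4, "zookeeper": 3}))  # []
print(check_layout({"head": 2, "data": 0, "zookeeper": 3}))
```

Under these assumptions, the smallest Hadoop cluster has six billable nodes: two head nodes, one data node, and three Zookeeper nodes.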
  • If my cluster ran for less than an hour, how much would I get billed?

    We charge for the number of minutes your cluster runs, rounded to the nearest minute rather than to the nearest hour.

  • Could you give me an example on how billing works?

    If you run a cluster for 100 hours in US East with two D13 v2 head nodes, three D12 v2 data nodes, and three D11 v2 Zookeeper nodes, the bill would be as follows:

    On a Standard HDInsight cluster: 100 hours x (2 x ¥7.8867/hour + 3 x ¥3.9434/hour + 3 x ¥1.9717/hour) = ¥3,351.87
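The same calculation can be reproduced in a few lines of code, which makes it easy to try other node mixes. This is a sketch: the `cluster_cost` helper and the tuple layout are hypothetical, and the prices are the Dv2 hourly prices listed on this page.

```python
from decimal import Decimal

# (role, node count, hourly price per node) for the example cluster.
EXAMPLE_NODES = [
    ("head", 2, Decimal("7.8867")),       # D13 v2
    ("data", 3, Decimal("3.9434")),       # D12 v2
    ("zookeeper", 3, Decimal("1.9717")),  # D11 v2
]

def cluster_cost(nodes, hours):
    """Total charge for running every listed node for `hours` hours."""
    hourly_total = sum(count * price for _, count, price in nodes)
    return hourly_total * hours

total = cluster_cost(EXAMPLE_NODES, hours=100)
print(total.quantize(Decimal("0.01")))  # 3351.87
```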

  • How can I check that I have properly stopped an HDInsight cluster and that I am not being billed for it?

    To stop an HDInsight cluster, and to stop being billed for it, you must delete the cluster. By default, all data that an HDInsight cluster operates on resides in Azure Blob storage, so deleting the cluster does not affect your data. If you want to preserve your Hive metadata (tables, schemas), provision the cluster with an external metadata store. You can find more details in the documentation.

  • How many data nodes do I need for my HDInsight cluster?

    The number of data nodes will vary depending on your needs. With the elasticity available in Azure cloud services, you can try a variety of cluster sizes to determine your own optimal mix of performance and cost, and only pay for what you use at any given time. Clusters can also be scaled on demand to grow and shrink to match the requirements of your workload.

  • What if I need more HDInsight data nodes than my subscription allows?

    Each subscription has a default limit on how many HDInsight data nodes can be created. If you need to create a larger HDInsight cluster, or multiple HDInsight clusters that together exceed your current subscription maximum, you can request that your subscription's billing limits be increased. Please open a support ticket and select the appropriate "Support Type" to make the request. Depending on the maximum number of nodes per subscription that you request, you may be asked for additional information that will allow us to optimize your deployment(s).

  • How much would a cluster with "x" data nodes cost?

    To estimate the cost of clusters of various sizes, try the Azure Calculator.

  • How can I reduce costs on clusters I use infrequently?

    There are several ways to reduce costs:

    • Drive higher utilization of your existing clusters.

      1. Delete clusters when they are not in use. For more information about deleting a cluster, see "Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI".

      2. Scale down. For more information about manually scaling clusters, see "Scale HDInsight clusters".

    • Deploy clusters at lower cost. Plan how many nodes to use, which node types to use for head nodes and worker nodes, and which region to launch the cluster in. HDInsight offers many node types across a range of prices. Review the Base price/node-hour section of this article for pricing, and for more information see "Capacity planning for HDInsight clusters".

Support & SLA

If you have any questions or need help, please visit Azure Support and select self-help service or any other method to contact us for support.

For HDInsight, we guarantee that any HDInsight cluster you deploy will have external connectivity at least 99.9% of the time during the monthly billing cycle. To learn more about the details of our Service Level Agreement, please visit the Service Level Agreements page.