
[ Data Storage in Hadoop cluster ]

This is a question from a Hadoop book. The answer I thought was 200, but that is not correct. Can anyone explain?

Assume that there are 50 nodes in your Hadoop cluster with a total of 200 TB (4 TB per node) of raw disk space allocated to HDFS storage. Assuming Hadoop's default configuration, how much data will you be able to store?

Answer 1


HDFS has the default replication factor set to 3, so each block of your data will have 3 copies in HDFS unless a different replication factor is specified at the time the file is created.

Therefore, under the default HDFS configuration, you can only store 200/3 ≈ 66.7 TB of actual data.
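
As a quick sanity check, here is a minimal sketch of the arithmetic. The 50-node and 4 TB figures come from the question; the replication factor of 3 is HDFS's default (the dfs.replication property):

```python
# Back-of-envelope usable HDFS capacity under the default replication factor.
# Node count and per-node disk are taken from the question above.

NODES = 50
RAW_TB_PER_NODE = 4
REPLICATION_FACTOR = 3  # HDFS default (dfs.replication)

raw_capacity_tb = NODES * RAW_TB_PER_NODE            # 200 TB of raw disk
usable_capacity_tb = raw_capacity_tb / REPLICATION_FACTOR

print(f"Raw capacity:    {raw_capacity_tb} TB")
print(f"Usable capacity: {usable_capacity_tb:.1f} TB")  # ~66.7 TB
```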