[ Data Storage in Hadoop cluster ]
This is a question from a Hadoop book, and the answer I thought was 200 TB, but that is not correct. Can anyone explain?
Assume that there are 50 nodes in your Hadoop cluster with a total of 200 TB (4 TB per node) of raw disk space allocated to HDFS storage. Assuming Hadoop's default configuration, how much data will you be able to store?
Answer 1
HDFS has the default replication factor
set to 3; therefore, each block of your data will have 3 copies in HDFS unless a different replication factor is explicitly specified at file creation time.
Therefore, under the default HDFS configuration, you can store only 200/3 ≈ 66.7 TB of actual data.
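As a rough sketch of the arithmetic (ignoring non-HDFS overhead such as reserved disk space and space for intermediate job data), the usable capacity can be estimated like this; the variable names are purely illustrative:

    nodes = 50              # DataNodes in the cluster
    disk_per_node_tb = 4    # raw disk allocated to HDFS per node, in TB
    replication = 3         # HDFS default replication factor (dfs.replication)

    raw_capacity_tb = nodes * disk_per_node_tb          # 200 TB of raw disk
    usable_capacity_tb = raw_capacity_tb / replication  # each block is stored 3 times

    print(f"Raw HDFS capacity:    {raw_capacity_tb} TB")
    print(f"Usable data capacity: {usable_capacity_tb:.1f} TB")  # ~66.7 TB

In practice the real figure is a bit lower, since some disk is typically reserved for the operating system, logs, and MapReduce intermediate output, but 200/3 TB is the answer the book is looking for.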