Tasks and conclusion

Post-training tasks:

  1. Try setting up your own three-node Hadoop cluster.
    1. A VM-based solution can be found here
  2. Write a simple Spark or MapReduce (MR) job of your choice and use it to generate analytics from data.
    1. A sample dataset can be found here
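
As a starting point for the second task, here is a minimal sketch of the MapReduce pattern as a word count, written in plain Python so it runs without a cluster. It does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample lines are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key; on a real cluster the framework does this step."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical input; replace with lines read from your own dataset.
    sample = ["hadoop stores data in hdfs", "yarn schedules hadoop jobs"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the map and reduce functions would run in parallel on different nodes and the shuffle would move data across the network, but the shape of the computation stays the same.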

References:

  1. Hadoop documentation
  2. HDFS Architecture
  3. YARN Architecture
  4. Google File System (GFS) paper