Learning Hadoop Virtual Machine Setup

in Proof of Brain, 2 years ago

I love big data. It's an interesting field. And with or without AI, there are a lot of things you can do with it, like spending time running a variety of big data jobs. That would help you understand the field better.

So recently I had a chance to test out some Hadoop images, and there are plenty of good options out there. The most common are Hortonworks, CDP, and many other private virtual machine images that run Hadoop. I just don't know how many others are out there.

So let's discuss how to set up a VM for Hadoop.

Cloudera or CentOS Image?


There are many images that have Hadoop preinstalled. You just have to run the image and access Hadoop from the URL that is shown after you log in to that OS image. Depending on which distribution you choose, things will be a bit different.
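If the URL does not open in your browser, a quick check from the host machine can tell you whether the web interfaces inside the VM are reachable at all. Here is a minimal sketch of that check. The host `127.0.0.1` and the port numbers are my assumptions (common defaults for Ambari, the NameNode UI, and the YARN ResourceManager UI), and the helper names `is_port_open` and `check_services` are hypothetical. Your image may forward different ports, so use whatever it prints after login.

```python
import socket

# Assumed default ports; replace with the ones your VM image actually shows.
ASSUMED_PORTS = {
    "Ambari (HDP sandbox)": 8080,
    "NameNode UI (Hadoop 2.x)": 50070,
    "NameNode UI (Hadoop 3.x)": 9870,
    "YARN ResourceManager UI": 8088,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host, ports=ASSUMED_PORTS, timeout=2.0):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port, timeout) for name, port in ports.items()}


if __name__ == "__main__":
    # 127.0.0.1 assumes the VM forwards its ports to the host machine.
    for name, up in check_services("127.0.0.1").items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only means something is listening there. It does not prove that Hadoop itself is healthy, so you still need to open the URL and look.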

For example, some images make it easy to use Hadoop as you go through the installation phase, and some make it easier if you choose the install-on-Linux setup option. I realize that not everyone can make use of that, but if you use it with Cloudera things are easy.

Pre-configured images?


I feel that using a pre-configured image is the much better option. It lets you use settings that follow an educational syllabus, and the images are also easier to try out and use. So that would be one of the options at the start.

My personal favorite is a pre-configured image that lets you get the work done and also test out all the features. In that case the Cloudera images would work out. I would say that Hortonworks is a good option too.


The most logical option would be Cloudera's HDP (Hortonworks Data Platform). It allows you to make changes to the platform and also make sure those changes are properly configured and run. So in that sense I would say you should consider using HDP.


Hadoop is an interesting thing to learn. I need to find a real-life client project though. I can test things out and get them working with the tutorials, but having help from other people on a project would be a good thing. So that's it for the day.
