Hadoop Local Mode, Distributed Mode and Pseudo-Distributed Mode
Hadoop can run in three different modes:
- Local (Standalone) Mode Setup
- Pseudo-Distributed Mode Setup
- Distributed Mode Setup
Let's describe these terms in more detail.
What is Local or Standalone Mode in Hadoop?
- This is Hadoop's default mode. It uses the local filesystem instead of HDFS.
- A job run in local mode executes entirely on the local machine, inside a single JVM process. No Hadoop daemons are started.
- Because the default configuration is already set up for local (standalone) mode, you can go straight to Hadoop's bin directory and run the bundled examples.
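Note – Alternative example: “hadoop jar myjar.jar ….”
The same “hadoop jar” command is used in every mode. The hadoop command has no “-jar” option with a hyphen; that form belongs to “java -jar”. Whether the job runs locally or on a cluster depends on the configuration: with the default settings it runs locally, and it runs on a cluster once the configuration points to a JobTracker and a NameNode.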
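To make the role of configuration concrete, here is a minimal Python sketch of how the execution mode follows from the settings. It is an illustration, not Hadoop code: the helper name execution_mode is hypothetical, and it assumes the Hadoop 1.x property names fs.default.name and mapred.job.tracker, whose default values are file:/// and local.

```python
# Hypothetical helper: models how configuration decides where a job runs.
# Assumes Hadoop 1.x property names and their default values.
DEFAULTS = {
    "fs.default.name": "file:///",   # local filesystem, no HDFS
    "mapred.job.tracker": "local",   # in-process job runner, single JVM
}


def execution_mode(site_config):
    """Return 'local' or 'cluster' for the given site overrides."""
    conf = {**DEFAULTS, **site_config}
    return "local" if conf["mapred.job.tracker"] == "local" else "cluster"


# Nothing overridden: the job runs locally.
print(execution_mode({}))
# JobTracker address configured (host and port are example values).
print(execution_mode({"mapred.job.tracker": "localhost:9001"}))
```

The first call prints local and the second prints cluster, even though the job would be submitted with the same command in both cases.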
What is Distributed Mode in Hadoop?
- If daemons such as the NameNode and JobTracker run on different servers, the setup is called a distributed mode cluster. With several DataNodes available, the replication factor can be greater than one (the HDFS default is 3).
- In distributed mode you can take advantage of parallel processing and data replication.
- A Hadoop production cluster is a typical example.
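The effect of data replication can be sketched with a little arithmetic. The Python snippet below is only an illustration with hypothetical function names. It relies on one fact about HDFS: two replicas of the same block are never placed on the same DataNode, so the number of copies actually stored is capped by the number of live DataNodes.

```python
# Hypothetical helpers illustrating HDFS replication arithmetic.

def achievable_replicas(replication_factor, live_datanodes):
    """Copies of each block that can actually be stored."""
    # HDFS does not put two replicas of one block on the same DataNode.
    return min(replication_factor, live_datanodes)


def raw_storage_mb(file_size_mb, replication_factor, live_datanodes):
    """Disk space consumed across the cluster for one file."""
    return file_size_mb * achievable_replicas(replication_factor, live_datanodes)


# A 1024 MB file with replication factor 3 on a 5-node cluster.
print(raw_storage_mb(1024, 3, 5))
# The same file when only one DataNode exists.
print(raw_storage_mb(1024, 3, 1))
```

On the five-node cluster the file occupies 3072 MB of raw disk space and survives the loss of two DataNodes. With a single DataNode it occupies 1024 MB and has no redundancy.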
What is Pseudo-Distributed Mode in Hadoop?
- Pseudo-distributed mode is mainly used by developers to test code on their own machine before deploying it to a production cluster.
- In pseudo-distributed mode the JobTracker, NameNode, TaskTracker and DataNode services all run on a single machine, each in its own JVM process.
- The limitation is that, with only one DataNode, a replication factor greater than one has no effect: each block can be stored only once, so the replication factor is normally set to 1.
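As a rough sketch of what a pseudo-distributed setup involves, the Python snippet below renders the three site files in Hadoop's configuration XML layout. It assumes the Hadoop 1.x property names, and localhost:9000 and localhost:9001 are the conventional example addresses from the single-node setup guide, not required values. The function name render_site_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed Hadoop 1.x settings for a pseudo-distributed setup.
# Host names and ports are example values.
PSEUDO_DISTRIBUTED = {
    "core-site.xml": {"fs.default.name": "hdfs://localhost:9000"},
    "hdfs-site.xml": {"dfs.replication": "1"},  # only one DataNode exists
    "mapred-site.xml": {"mapred.job.tracker": "localhost:9001"},
}


def render_site_xml(properties):
    """Render a dict of properties as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


for filename, props in PSEUDO_DISTRIBUTED.items():
    print(filename)
    print(render_site_xml(props))
```

Each file is printed on a single line without indentation. In a real installation the same properties would go into the files under Hadoop's conf directory.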