I want to run my custom Java program on a single-node Hadoop cluster.
How do I run a Java program on a single-node Hadoop cluster? Do I need to package my Java code as a JAR file and then execute it?
Yes, you need to package your code as a .jar file. Here are the steps:
1) Write your Java code in the Eclipse IDE.
2) To create a jar of your project, follow this link
3) Copy your dataset to HDFS using the following command (`hadoop dfs` is deprecated on newer Hadoop releases, so `hadoop fs` is used here)
$ bin/hadoop fs -copyFromLocal /path/to/file/on/filesystem /path/to/input/on/hdfs
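4) Run your jar, passing the HDFS path of the dataset and an HDFS output directory. If the jar's manifest does not name a main class, put the fully qualified class name right after the jar path. Jobs that use the standard FileOutputFormat fail if the output directory already exists, so pick a new one.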
$ bin/hadoop jar path/to/jar/on/filesystem /path/to/input/on/hdfs /path/to/outputdir/on/hdfs
5) Verify that the result files are in the output folder with the following command
$ bin/hadoop fs -ls /path/to/outputdir/on/hdfs
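6) View the output in the part-00000 file with the following command. This file is written by the MapReduce job (one part file per reduce task), not by HDFS itself. Jobs built on the newer org.apache.hadoop.mapreduce API name it part-r-00000 instead.
$ bin/hadoop fs -cat /path/to/outputdir/on/hdfs/part-00000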
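If you run these steps often, steps 3 to 6 can be chained in a small shell script. This is only a sketch under a few assumptions: the function name `run_job` and the `HADOOP` variable are made up for illustration, the jar's manifest names the main class, and your program reads the input and output paths from its first two arguments. The `part-*` pattern is quoted so that Hadoop, not the local shell, expands it, which covers both part-00000 and part-r-00000 style names.

```shell
#!/bin/sh
# Sketch only: run_job and HADOOP are hypothetical names, not part of Hadoop.
# Stop at the first failing step so the job is not run on a missing input.
set -e

# Path to the hadoop launcher; override with HADOOP=... if it lives elsewhere.
HADOOP="${HADOOP:-bin/hadoop}"

run_job() {
    local_input="$1"   # dataset on the local filesystem
    jar_path="$2"      # jar created in step 2
    hdfs_input="$3"    # where the dataset should go on HDFS
    hdfs_output="$4"   # job output directory on HDFS

    # Step 3: copy the dataset to HDFS.
    "$HADOOP" fs -copyFromLocal "$local_input" "$hdfs_input"
    # Step 4: run the jar (assumes the manifest names the main class).
    "$HADOOP" jar "$jar_path" "$hdfs_input" "$hdfs_output"
    # Step 5: list the files the job wrote.
    "$HADOOP" fs -ls "$hdfs_output"
    # Step 6: print every part file; the glob is expanded by Hadoop.
    "$HADOOP" fs -cat "$hdfs_output/part-*"
}
```

To use it, save the script, source it from your Hadoop installation directory, and call the function with your own paths, for example `run_job /path/to/file/on/filesystem path/to/jar/on/filesystem /path/to/input/on/hdfs /path/to/outputdir/on/hdfs`.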
Hope this helps you.