Fun with MapReduce, Eclipse and Java

I started working on a lab assignment from my MapReduce course at Big Data University a week ago and finished it off today. Not that the lab was hard – you simply had to paste some supplied code into a few files, compile them, and run them against some test data they gave you. Simple enough.
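The supplied code was the usual word-count pattern. I won't reproduce BDU's version here, but a minimal sketch of that kind of job, assuming the standard Hadoop MapReduce API (class names are my own, not the lab's), looks roughly like this:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emit (word, 1) for every token in each input line
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sum the counts emitted for each distinct word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: wire the mapper and reducer together and submit the job
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Even this toy job pulls in a dozen Hadoop classes, which is part of the story below.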

Big Data University (BDU) supplied me with a login to a cloud server, so I set up the job via command-line tools, compiled it and tried to run it. It blew up with errors that seemed to be related to the environment – and since I couldn't get at the environment, I decided to go ahead and set up my own Hadoop environment. So I did. Three times.

At first, I just installed it on my laptop, but soon discovered that some components (I suspect YARN) were messing with my USB mounts, so I decided to install it on Cloud 9, which I did. Every time I brought Hadoop up, the server went down. Then I set up a VirtualBox install of a stripped-down version of Ubuntu called LXLE and installed Hadoop there. This time things stayed up, even when I installed and ran Eclipse.

Eclipse always gives me a headache, probably because it was written by aliens, and this time was no different. Even the simplest MapReduce program involves a lot of libraries to access the Hadoop file system, juggle all the components and run them through the required gyrations. And finding a list of those libraries? Good luck – they and their locations change every time.
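For what it's worth, the one trick that eventually saved me from hunting jars by hand was letting Hadoop report its own classpath. A sketch of that workflow from the command line (the file and class names are my own placeholders, and the HDFS paths are assumptions):

```shell
# Ask Hadoop which jars it was built against, instead of guessing
hadoop classpath

# Compile against that classpath and package the job
javac -classpath "$(hadoop classpath)" -d classes WordCount.java
jar cf wordcount.jar -C classes .

# Submit the job; input and output are HDFS paths
hadoop jar wordcount.jar WordCount /user/me/input /user/me/output
```

The same `hadoop classpath` output can be pasted into Eclipse's build path, which beats chasing the jars around the install directory.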

After several days of flailing I got everything together and working – flailing is actually quite educational.


