MapReduce Program to Find Maximum Temperature


Description

This repository contains a MapReduce program written in Java that finds the maximum temperature in a given dataset. The program uses the Hadoop MapReduce framework to process large amounts of data in parallel on a cluster of commodity hardware.

Requirements

  • Java 8 or higher
  • Hadoop 2.7.1 or higher

Input Format

The input dataset is assumed to be in the following format:
[Image: input record layout]

where station_name is a string representing the name of the weather station, year is a string representing the year in yyyy format, and max temperature is a float representing the temperature in Fahrenheit.
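
As a rough illustration of how a map step might read one such record, here is a minimal plain-Java sketch. The comma delimiter is an assumption (the actual layout is shown only in the image above), and `WeatherRecord` and `parse` are hypothetical names that are not taken from this repository.

```java
// Hypothetical record type; not part of this repository.
final class WeatherRecord {
    final String stationName;
    final String year;
    final float maxTemperature; // degrees Fahrenheit

    WeatherRecord(String stationName, String year, float maxTemperature) {
        this.stationName = stationName;
        this.year = year;
        this.maxTemperature = maxTemperature;
    }

    // Parses a line of the ASSUMED form "station_name,year,max_temperature".
    // Returns null for malformed lines so that a caller can skip them.
    static WeatherRecord parse(String line) {
        if (line == null) {
            return null;
        }
        String[] fields = line.trim().split(",");
        if (fields.length != 3) {
            return null;
        }
        String year = fields[1].trim();
        if (!year.matches("\\d{4}")) {
            return null; // year must be in yyyy format
        }
        try {
            float temperature = Float.parseFloat(fields[2].trim());
            return new WeatherRecord(fields[0].trim(), year, temperature);
        } catch (NumberFormatException e) {
            return null; // temperature field is not a number
        }
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        WeatherRecord r = parse("StationA,1950,71.5");
        System.out.println(r.stationName + " " + r.year + " " + r.maxTemperature);
    }
}
```

Returning null for bad lines lets the caller drop them quietly, which is a common choice when a single malformed record should not fail a whole job.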

Usage

To run the MapReduce program, you need to first create a jar file using the following command:

This will create a mapreduce-1.0-SNAPSHOT.jar file in the home directory. You can then run the program using the following command:

$ hadoop jar home/mt.jar MaxTemperature <input_path> <output_path>

where home/mt.jar is the path to the jar file you built, MaxTemperature is the main class, <input_path> is the path to the input dataset, and <output_path> is the path to the output directory where the maximum temperature will be written.

Output Format

The output of the program is a single line containing the maximum temperature and the date on which it occurred. The output is in the following format:
[Image: sample output]
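
The selection of that single line can be sketched in plain Java, without any Hadoop classes. This is only an illustration of the idea: the `Reading` type, the `max` and `format` helpers, the tab-separated layout, and the rule that ties keep the first reading seen are all assumptions, not details taken from this repository.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the final "pick the maximum" step; not this repository's code.
final class MaxTemperatureSketch {

    // A date paired with the temperature recorded on it.
    static final class Reading {
        final String date;
        final float temperature;

        Reading(String date, float temperature) {
            this.date = date;
            this.temperature = temperature;
        }
    }

    // Returns the reading with the highest temperature, or null if the list is empty.
    // On ties, the first reading encountered is kept (an assumption).
    static Reading max(List<Reading> readings) {
        Reading best = null;
        for (Reading r : readings) {
            if (best == null || r.temperature > best.temperature) {
                best = r;
            }
        }
        return best;
    }

    // Formats the result as "temperature<TAB>date" (assumed layout).
    static String format(Reading r) {
        return r.temperature + "\t" + r.date;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        List<Reading> sample = Arrays.asList(
                new Reading("1949", 68.0f),
                new Reading("1950", 71.5f),
                new Reading("1951", 70.25f));
        System.out.println(format(max(sample)));
    }
}
```

In a real Hadoop job the same comparison would typically run inside a reducer over the values grouped under one key, so only the running maximum needs to be held in memory.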

Contact

If you have any questions or suggestions, please feel free to contact us.