Development Test

EX1. Number of lines in large files

EX2. Top ten arrival airports by passenger number

EX3. Plot the monthly number of searches for selected airports
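For the first exercise, one way to count lines without loading a large file into memory is to read it in fixed-size binary chunks and count newline bytes. This is only a minimal sketch, not necessarily the approach used in the solution; the function name `count_lines` and the 1 MiB chunk size are assumptions made for illustration.

```python
def count_lines(path, chunk_size=1024 * 1024):
    """Count newline characters in a file, reading it chunk by chunk.

    `count_lines` and `chunk_size` are illustrative names, not taken from
    the original solution. A last line without a trailing newline is not
    counted, which matches the behaviour of `wc -l`.
    """
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += chunk.count(b"\n")
    return total
```

Memory use stays bounded by `chunk_size`, whatever the size of the file.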

The solution is written in Python, based on some internet searching about how to create an hdfs store and use Pandas.
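As an illustration of how Pandas can handle a file too large to load at once, the sketch below sums passengers per arrival airport chunk by chunk and keeps the ten largest totals. It is a hedged example rather than the actual solution: the column names `arrival_airport` and `passengers`, the function name, and the chunk size are all hypothetical and would need to match the real data file.

```python
import pandas as pd


def top_arrival_airports(path, n=10, chunksize=100_000, sep=","):
    """Return the n arrival airports with the most passengers.

    Column names `arrival_airport` and `passengers` are assumptions made
    for this sketch; adjust them to the real file header.
    """
    totals = pd.Series(dtype="float64")
    reader = pd.read_csv(
        path,
        sep=sep,
        usecols=["arrival_airport", "passengers"],
        chunksize=chunksize,  # yields DataFrames of at most this many rows
    )
    for chunk in reader:
        partial = chunk.groupby("arrival_airport")["passengers"].sum()
        # Merge the partial sums; airports missing on one side count as 0.
        totals = totals.add(partial, fill_value=0)
    return totals.sort_values(ascending=False).head(n)
```

Only one chunk and the running per-airport totals are held in memory at any time, so the cost of loading the whole file up front is avoided.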

I also tried R, which makes data manipulation easier.

The problem with R is that loading the large file into memory takes more time, but once the data is loaded the computation is a trivial operation.

In the end both solutions take almost the same time, and I would give the advantage to R, especially since I did not use its libraries for large data files.