-
Notifications
You must be signed in to change notification settings - Fork 0
nsinha-git/netflix
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
High level logic: 1. We materialize the files from IMDB into an in memory data structure at initialization time.We use TSVreader class to read files. 2. We use a system of hash maps as we are always searching based on id's of titles to store the title basic info, episodes info, rating info. We also use helper maps like reverse map or join map (e.g title to episode ids) to make the serving api as quick as possible. 3. To tackle concurrency issues the api are either stateless or use concurrent data structures for shared state. Hence all the maps that we use, are concurrent maps. How to run: 1. TestApi.java is junit test class to test the Api's. Please run `mvn test`. Used Classes Summary: 1. Api.java: Interface class for api 2. ApiImpl.java: Implements Api interface 3. TitleBasicInfo.java: POJO for title.basic.tsv 4. TitleEpisode.java: POJO for title.episode.tsv 5. TitleRating.java: POJO for title.rating.tsv 6. TsvReader.java: util class for tsv file parsing 7. InMemoryDataStore.java: This class when initialized reads all the required tsv files and initializes the in memory maps. It publishes methods for helping implementation of Api. 8. TestApi.java: junit test class. Issues: 1. This will not work as sizes of file increase. We will need to use distributed file system of databse like HBASE/HDFS in those cases. 2. API is by Id and name of Title. Flow of program:: 1. InMemdataStore reads title.basic.tsv and initializes a map containing ids against the titleBasiceInfo POJO. Also it creates a set of all id's for 2017 so that we can filter title not based on that year. We also maintain a titleName to id map to look titles by name. 2. It then reads title.episodes.tsv and initializes the map with id against episode pojo(filters on 2017). It also creates a join map for tvseriesId to list of its episode ids to quickly look through episodes of a series. 3. It then reads the ratings and initializes the id to rating pojo(filters on 2017). 4.At last updateRatingBySeason() is called. It ges through every title in titleMap at step 1 and finds if the title is a tvseries. If yes, then it looks at all the episodes and groups them on season. It refers ratings map and calculates an average rating of all episodes belonging to the season. A title to season to ratings map structure is created. Api: 1. Api to read the imdb rating by id and titleName is provided. 2. Api to read rating by id (also by Title Name) and season is provided. If id is a tvseries then season-title map in FlowOFProgram step 4 is used to provide answers. 3.Api to update rating is provided. It updates the titleRating map. If title is found to be episode of a tv series then we recalculate season ratings of that title again. Concurrency: 1. Shared state are based on concurrent data structures. No explicit locks are used. Algorithm for Season rating: 1. For a given season., all applicable episodes ratings are summed and then by divison a average is derived. This average rating is what is used for the season rating.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published