The Movie Quote project is a web scraping application designed to collect data from the website Kaakook. It allows users to search for movie or TV show information, including quotes, directors, release years, and more. This README provides an overview of the project and explains how to use it effectively.
The Movie Quote project scrapes data from kaakook.fr to create a database of movie and series quotes. The following steps outline how the project functions:
- Run the data_collector.py file: This script creates the initial database by collecting IDs corresponding to movies and series from the website. This database will be used to retrieve movie information later.
- Import the database module into your Python interpreter.
- Utilize the various methods available in the database module to interact with the data.
The database module contains several methods for interacting with the data:
Use the search method to search for a movie or series by its title and retrieve its corresponding ID:
search("movie_title")
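As a rough illustration of what a title lookup might do, the sketch below runs a case-insensitive substring match over a small in-memory dictionary. The record layout (title, type, year) and the helper find_ids_by_title are assumptions made for this example; they are not taken from the project's code, and the real search method may behave differently.

```python
# Hypothetical records keyed by ID; the real layout of data_movies.json may differ.
SAMPLE_MOVIES = {
    10: {"title": "Example Movie", "type": "Film", "year": 2005},
    11: {"title": "Example Series", "type": "Série", "year": 2010},
    12: {"title": "Another Title", "type": "Film", "year": 1999},
}


def find_ids_by_title(movies, query):
    """Return the sorted IDs whose title contains the query, ignoring case."""
    needle = query.casefold()
    return sorted(
        movie_id
        for movie_id, info in movies.items()
        if needle in info["title"].casefold()
    )
```

With the sample data above, find_ids_by_title(SAMPLE_MOVIES, "example") matches both "Example Movie" and "Example Series".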
Retrieve the number of quotes for a specific movie or TV show. You can pass additional arguments such as types ("Film" or "Série") or top:
search_num_quotes(200)
search_num_quotes(200, types="Film")
search_num_quotes(200, types="Série")
search_num_quotes(top=10)
Search for movies released in a specific year or within a range of years.
search_years(2005)
search_years(2005, sup="sup")
search_years(2005, inf="inf")
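The year filter can be pictured with a short sketch. It assumes that "sup" means "this year or later" and "inf" means "this year or earlier"; that reading, the inclusive bounds, the helper name filter_by_year, and the record layout are all assumptions for illustration, not the project's documented behavior.

```python
# Hypothetical records keyed by ID, each with a release year.
SAMPLE_YEARS = {
    1: {"title": "Older Title", "year": 1999},
    2: {"title": "Middle Title", "year": 2005},
    3: {"title": "Newer Title", "year": 2010},
}


def filter_by_year(movies, year, mode="exact"):
    """Return the sorted IDs whose release year satisfies the chosen mode.

    mode="exact": released in `year`
    mode="sup":   released in `year` or later (assumed meaning)
    mode="inf":   released in `year` or earlier (assumed meaning)
    """
    comparisons = {
        "exact": lambda y: y == year,
        "sup": lambda y: y >= year,
        "inf": lambda y: y <= year,
    }
    if mode not in comparisons:
        raise ValueError(f"unknown mode: {mode!r}")
    keep = comparisons[mode]
    return sorted(
        movie_id for movie_id, info in movies.items() if keep(info["year"])
    )
```

Keeping the three comparisons in one lookup table makes it easy to reject an unknown mode up front instead of silently returning an empty result.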
Find movies directed by a specific director.
search_director("director_name")
search_director("director_name", type="Film")
search_director("director_name", type="Série")
View all movies currently in the list.
view()
Add one or more movie IDs to the list.
add(10)
Remove one or more movie IDs from the list.
delete(10)
Execute the program to collect quotes for all movies in the list and store them in a database (or remove them if they are no longer in the list).
run()
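The list workflow (add, delete, view, then run) can be sketched as follows. This is a minimal in-memory stand-in, not the project's implementation: the WatchList class, the sync_quotes helper, and the fetch_quotes callback are hypothetical names, and the choice to keep already stored quotes rather than fetch them again is an assumption.

```python
class WatchList:
    """Hypothetical in-memory stand-in for the project's movie list."""

    def __init__(self):
        self.ids = set()

    def add(self, *movie_ids):
        # Accept one or more IDs as positional arguments.
        self.ids.update(movie_ids)

    def delete(self, *movie_ids):
        # IDs that are not in the list are ignored.
        self.ids.difference_update(movie_ids)

    def view(self):
        return sorted(self.ids)


def sync_quotes(stored_quotes, listed_ids, fetch_quotes):
    """Return a new quote store containing exactly the listed IDs.

    Quotes already stored for a listed ID are kept (an assumption),
    missing ones are fetched, and IDs no longer listed are dropped.
    """
    synced = {}
    for movie_id in sorted(listed_ids):
        if movie_id in stored_quotes:
            synced[movie_id] = stored_quotes[movie_id]
        else:
            synced[movie_id] = fetch_quotes(movie_id)
    return synced
```

Passing the fetch step in as a callback keeps the sketch independent of any network access; a real scraper function could be substituted for it.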
The movies, IDs, and other details collected by data_collector.py are stored in a JSON file named data_movies.json.
The quotes collected by the program are stored in a JSON file named data_quote.json.
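For readers who want to inspect or produce such files by hand, here is a minimal sketch of a JSON round trip using Python's standard json module. The record layout is hypothetical, and the helpers save_json, load_json, and round_trip_demo are illustrative names that do not come from the project.

```python
import json
import os
import tempfile


def save_json(path, data):
    """Write data as UTF-8 JSON, keeping accented characters such as "é" readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, ensure_ascii=False, indent=2)


def load_json(path):
    """Read a UTF-8 JSON file back into Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def round_trip_demo():
    """Save a sample record to a temporary file and read it back."""
    sample = {10: {"title": "Exemple", "type": "Série"}}  # hypothetical layout
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "data_movies.json")
        save_json(path, sample)
        return load_json(path)
```

One detail worth knowing: JSON object keys are always strings, so an integer ID such as 10 comes back as "10" after the round trip.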