Skip to content
This repository has been archived by the owner on May 19, 2023. It is now read-only.
/ flats-and-flatmates Public archive

UI scrapper/crawler to filter required posts in a facebook group. It could be used to filter posts in Flats and Flatmates groups to save post containing particular keywords(example: locality name)

License

Notifications You must be signed in to change notification settings

kc596/flats-and-flatmates

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Flats and Flatmates

Motivation

Searching a flat/flatmate in cities like Bengaluru is quite challenging. Brokerage is very high (generally 1 month rent). Fortunately, there are few facebook groups with aim for finding flats and flatmates.

Problem

There are many active Flats/Flatmates group and the frequency of posts in these groups are very high. Adding to this, most of the post are addressed to different location than we are looking for. Facebook APIs cannot be used for these groups to get required data unless you are an admin.

UI crawling and automation to rescue!

The crawler will browse mentioned facebook groups (in config) and check if the post contains any of mentioned keywords and not any exception words. If yes, the link to post, time of posting, time of crawling and keywords that match are stored in sqllite3 datbase.

Steps to setup and use

  1. Install Python3
  2. Run command : pip install -r requirements.txt   in root of project.
  3. Set parameters in config/config.yaml file and your facebook credentials in config/credentials.yaml file.
  4. Place chromedriver in drivers folder according to your operation system. From here
  5. Run python main.py

CAUTION

  1. 'male only' keyword and exceptions in config will also match 'female only' because the later contains the first. You can use spaces to handle these situation i.e., ' male only'.
  2. Facebook might detect unusual activity from your account if you overuse the crawler. It might log you out and restrict certain features of your account (example: like, comments, etc.) for some time. I won't be responsible for any harm done due to this application.

TODO

  1. Find better way of crawling recent post only. Research ways/APIs to find the same.
  2. Multithreaded crawling and starting from same place after timeout exception than from beginning.
  3. Regular expression instead of keyword match.
  4. Semantic analysis to get the summary of entire post in few lines?
If you face any problem in using this program or find any error, feel free to create an issue. I will try my best to look into it in my free time.

This project can crawl any facebook group in general and filter out post with specific keywords not containing exception words defined in config.

About

UI scrapper/crawler to filter required posts in a facebook group. It could be used to filter posts in Flats and Flatmates groups to save post containing particular keywords(example: locality name)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages