getting empty data #169
Comments
I have the exact same issue: I can see Selenium searching through tweets for the specified period, but no data is returned. This is my code:
console:
I managed to find the problem. First, go to the get_data function in 'Scweet\utils.py' and change every find_element_by_xpath('...') call to find_element('xpath', '...'), since find_element_by_xpath is no longer supported in recent versions of Selenium. Second, check that every call that returns an element from the HTML actually returns something: it appears that if even one element is missing (for example, if Selenium could not find the username of the tweet), the whole tweet is treated as null. To do this, check whether each XPath is still correct. I will give an example, but you should check all of them.
should actually be:
In my case I did not use all of the tweet metadata; I kept only the fields I needed and checked that their XPaths were correct. Here is the final code of the get_data() method:
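As a rough illustration of the two changes described above (this is a sketch, not the commenter's get_data() and not Scweet's actual code), the helper below uses the Selenium 4 call form find_element('xpath', ...) and returns a default value when an element is missing, so one failed lookup does not discard the whole tweet. The names safe_text and extract_fields, and the XPaths in the usage note, are hypothetical placeholders; the real XPaths have to be checked against the current page markup.

```python
try:
    from selenium.common.exceptions import NoSuchElementException
except ImportError:
    # Fallback so the helpers can be exercised without Selenium installed.
    class NoSuchElementException(Exception):
        pass


def safe_text(card, xpath, default=""):
    """Return the text of the first match for `xpath` under `card`, or `default`."""
    try:
        # Selenium 4 call form: find_element(by, value); "xpath" equals By.XPATH.
        return card.find_element("xpath", xpath).text
    except NoSuchElementException:
        # A missing field should not make the whole tweet unusable.
        return default


def extract_fields(card, xpaths):
    """Map each field name to its text, keeping fields that could not be found as ''."""
    return {name: safe_text(card, xpath) for name, xpath in xpaths.items()}
```

Assuming card is the WebElement for a single tweet, a call might look like extract_fields(card, {"timestamp": ".//time", "username": ".//span"}). Fields that come back empty point to the XPaths that need to be corrected.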
And does it work?
Thanks for your great work! When the result is I only got the "reply to @xxxxx",
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from Scweet.scweet import scrape

# Specify the parameters for scraping
username = "2MInteractive"
since_date = "2023-07-01"
until_date = "2023-07-11"
headless = True

# Set up the ChromeDriver service
service = Service("C:/Users/HP Probook/Downloads/chromedriver.exe")  # Replace with the actual path to chromedriver

# Set up the ChromeOptions
options = webdriver.ChromeOptions()
options.headless = headless

# Create the WebDriver
driver = webdriver.Chrome(service=service, options=options)

# Scrape the tweets by username
data = scrape(from_account=username, since=since_date, until=until_date, headless=headless, driver=driver)

# Print the scraped data
print(data)

# Close the WebDriver
driver.quit()
getting empty data

"C:\Users\HP Probook\PycharmProjects\firstproject\venv\Scripts\python.exe" "C:/Users/HP Probook/PycharmProjects/firstproject/TikTokScrap.py"
looking for tweets between 2023-07-01 and 2023-07-06 ...
path : https://twitter.com/search?q=(from%3A2MInteractive)%20until%3A2023-07-06%20since%3A2023-07-01%20&src=typed_query
scroll 1
scroll 2
looking for tweets between 2023-07-06 and 2023-07-11 ...
path : https://twitter.com/search?q=(from%3A2MInteractive)%20until%3A2023-07-11%20since%3A2023-07-06%20&src=typed_query
scroll 1
scroll 2
Empty DataFrame
Columns: [UserScreenName, UserName, Timestamp, Text, Embedded_text, Emojis, Comments, Likes, Retweets, Image link, Tweet URL]
Index: []
Process finished with exit code 0