Dataset issue #2
Comments
Hello, |
You may use https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html |
Do we know the actual results for the X-ray images? Or can I assume that all 125 X-ray images inside the Covid-19 folder are COVID-19 positive? Thanks |
Hi Muhammed, do you have code that implements K-Fold on the dataset? Thanks |
Hi Muhammed, the reason I am asking is that I trained your model on KFold splits of the dataset, but I am getting only 58% accuracy. Below is the output of one iteration; I think something is wrong somewhere in my code.
epoch | train_loss | valid_loss | accuracy | time
0 | 0.003996 | 0.006837 | 1.000000 | 02:26 |
It seems you are using the test set during the training. |
I did not create a separate test set during training. All I did was split the data with StratifiedKFold, holding out 20% each time, which means n_splits=5. In each iteration, 20% of my entire dataset acts as the test set. I used StratifiedKFold so that a portion of the data is available as test data in each training run. For example (train indices, then test indices): [ 25 26 27 28 ... 621 622 623 624] [ 0 1 2 3 ... 221 222 223 224] |
Also, the dataset I am using serves only as the training set and validation set. My test set consists of completely unseen X-ray images, and the accuracy I get on it is 67%. |
@Kannadasa can you please provide the code you used for KFolds? |
Please find below my code for KFolds.
    from sklearn.model_selection import KFold
    data= (ImageList.from_folder(path)
    df=data.to_df()
    for train_index, test_index in skf.split(df.index, df['y']):
|
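For anyone trying to reproduce the split: the snippet above imports `KFold` but calls an undefined `skf`, and the rest of the loop is cut off. Below is a minimal, self-contained sketch of the stratified 5-fold split described earlier in the thread. It is not the poster's actual code. The label array is a hypothetical stand-in for `df['y']` (125 COVID-19 images plus 500 others, inferred from the indices 0 to 624 quoted above), the class name `Normal` is a placeholder, and `shuffle=True` with a fixed seed is my own choice.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical stand-in for df['y']: 125 positive and 500 other images.
labels = np.array(["Covid-19"] * 125 + ["Normal"] * 500)
indices = np.arange(len(labels))

# StratifiedKFold keeps the class ratio the same in every fold.
# shuffle/random_state are assumptions, not taken from the thread.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

folds = []
for fold, (train_index, valid_index) in enumerate(skf.split(indices, labels)):
    folds.append((train_index, valid_index))
    n_covid = int((labels[valid_index] == "Covid-19").sum())
    print(f"fold {fold}: train={len(train_index)} "
          f"valid={len(valid_index)} covid_in_valid={n_covid}")
```

Note that every image ends up in the held-out part of exactly one fold, so a fold's held-out part is a validation set for that fold only. A truly unseen test set has to be set aside before running this loop.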
Thanks, much appreciated. On a side note, did you manage to get higher accuracy? I'm running the model now and it sits around 78% for the three-class model. |
Hi, I did not test 3 classes; I only tested 2 classes. My KFold code is also for 2 classes. I am not getting good accuracy on unseen data. With the training set and validation set the model works fine, but I am not getting good accuracy on new data which the model has not seen before. Thanks |
Are you using KFold to split the data for 3-class prediction? Is my KFold split code working for you with 3 classes? Thanks |
I am also facing the same issue. I hope you have fixed this problem by now. Please let me know how you are creating the folder structure and loading the images for the train and valid datasets. |
Hi, first of all you need to have directories called train and valid, because Fastai will look for these names while running the code. I am using KFold cross-validation to split the data into training and validation sets. Please find below my code for KFolds.
    from sklearn.model_selection import KFold
    data= (ImageList.from_folder(path)
    df=data.to_df()
    for train_index, test_index in skf.split(df.index, df['y']):
        print((train_index), (test_index))
        d = (ImageList.from_folder (path)
|
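Since the question was about the folder structure, here is a rough sketch of one way to turn a single fold's index lists into `train/<class>/` and `valid/<class>/` directories. The helper `write_fold`, the class names, and the placeholder files are all hypothetical and not from the thread; it uses only the standard library and copies files rather than moving them, so the original dataset stays untouched.

```python
import shutil
import tempfile
from pathlib import Path

def write_fold(files, labels, train_index, valid_index, out_dir):
    """Copy each file into out_dir/<train|valid>/<label>/ (hypothetical helper)."""
    out_dir = Path(out_dir)
    for split, idxs in (("train", train_index), ("valid", valid_index)):
        for i in idxs:
            dest = out_dir / split / labels[i]
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy(files[i], dest / Path(files[i]).name)
    return out_dir

# Tiny demo with empty placeholder files instead of real X-ray images.
src = Path(tempfile.mkdtemp())
files, labels = [], []
for label in ("Covid-19", "Normal"):
    (src / label).mkdir()
    for k in range(5):
        f = src / label / f"{label}_{k}.png"
        f.write_bytes(b"")
        files.append(f)
        labels.append(label)

# In practice these index lists would come from one iteration of skf.split(...).
train_index = [1, 2, 3, 4, 6, 7, 8, 9]
valid_index = [0, 5]
out = write_fold(files, labels, train_index, valid_index, tempfile.mkdtemp())
print(sorted(str(p.relative_to(out)) for p in out.rglob("*.png")))
```

Writing each fold to its own output directory keeps the folds from mixing; the resulting directory can then be passed to whatever folder-based loader you use.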
Thank you so much, my friend, for this valuable comment. I will try to split the train and validation sets as per your guidance. Thank you again. Let's collaborate to fight this pandemic. |
It works fine for me on the training set and validation set. If I show some unseen X-ray images to my model, the model does not predict well. I don't know how to fix this problem. Thanks |
Hey @Kannadasa, I successfully ran the code with a normal split: 80% for training, 10% for validation and 10% for testing. But I'm still facing issues with KFold cross-validation. After creating the train and valid directories, this code didn't produce anything. Could you please give a brief explanation of this code? Thanks in advance |
Hello! Why did you use the validation dataset as the test dataset? Or did you do something different that I don't understand? |
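For reference, the 80/10/10 split mentioned here can be sketched with two calls to scikit-learn's `train_test_split`. This is only an illustration under assumptions: the label array is the same hypothetical stand-in (125 COVID-19 images, 500 others, class name `Normal` is a placeholder), and the seed is arbitrary. Stratifying both calls keeps the class ratio roughly equal across the three parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels: 125 positive and 500 other images.
labels = np.array(["Covid-19"] * 125 + ["Normal"] * 500)
indices = np.arange(len(labels))

# Step 1: keep 80% for training, set 20% aside.
train_idx, rest_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=42
)

# Step 2: cut the remaining 20% in half: validation and test.
valid_idx, test_idx = train_test_split(
    rest_idx, test_size=0.5, stratify=labels[rest_idx], random_state=42
)

print(len(train_idx), len(valid_idx), len(test_idx))
```

The test indices are never touched during training or model selection. If K-fold is wanted as well, it would be run over `train_idx` and `valid_idx` combined, with `test_idx` kept aside for the final check.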
Hi,
I am testing your model, but I am not getting the desired output. I think I am not distributing the data properly between the train and valid folders.
Please let me know how you are creating the folder structure and loading the images for the train and valid datasets. This is for binary classification.