The data was generated by crawling Google and Reddit. It was manually curated by a non-expert after reading about cannabis diseases. Please consult a professional if your marijuana is sick.
One source identifies 30 types of cannabis ailments:
- boron deficiency
- broad mites
- bud rot
- calcium deficiency
- copper deficiency
- fungus gnats
- heat light stress
- iron deficiency
- yellow leaf spot leaf septoria
- light burn
- magnesium deficiency
- male plants bananas hermies
- manganese deficiency
- molybdenum deficiency
- nitrogen deficiency
- nitrogen toxicity
- nutrient burn
- overwatering
- bugs pests symptoms marijuana grow
- ph fluctuations
- phosphorus deficiency
- potassium deficiency
- root problems
- root rot
- spider mites
- sulfur deficiency
- underwatering
- white powdery mildew
- wind burn
- zinc deficiency
resources:
- https://www.pinterest.com/search/pins/?q=nitrogen%20deficiency%20canabis&rs=typed&term_meta[]=nitrogen%7Ctyped&term_meta[]=deficiency%7Ctyped&term_meta[]=canabis%7Ctyped
- http://www.marijuana-seeds.net/marijuana-plant-problems
- https://www.growweedeasy.com/cannabis-symptoms-pictures/
- https://towardsdatascience.com/build-a-taylor-swift-detector-with-the-tensorflow-object-detection-api-ml-engine-and-swift-82707f5b4a56
the "other" dataset was generated from:
- https://www.kaggle.com/alxmamaev/flowers-recognition/home
- https://www.kaggle.com/sayangoswami/reddit-memes-dataset/home
- https://www.kaggle.com/jessicali9530/caltech256/home

We wanted to build an "other" dataset with images that may fool the model. We collected images likely to be uploaded by random users from their phones, random objects, plants that aren't cannabis, and images with overlaid text. Many of the sick samples were from sites that tend to add overlay text describing the cannabis's condition.
images per class:
- healthy: 1120
- sick: 946
- not_weed: 964

train/validation/test split: 60/20/20
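
A minimal sketch of how a 60/20/20 split could be produced, using the class counts listed above. The helper name `split_dataset`, the file names, the fixed seed, and the choice to split each class separately are all assumptions for illustration; the notes do not say how the actual split was made.

```python
import random

def split_dataset(items, percents=(60, 20, 20), seed=0):
    """Shuffle items and split them into train/validation/test lists."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) * percents[0] // 100
    n_val = len(shuffled) * percents[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder absorbs rounding
    return train, val, test

# Class counts taken from the notes; file names are hypothetical.
counts = {"healthy": 1120, "sick": 946, "not_weed": 964}
split_sizes = {}
for label, n in counts.items():
    files = [f"{label}_{i}.jpg" for i in range(n)]
    train, val, test = split_dataset(files)
    split_sizes[label] = (len(train), len(val), len(test))
    print(label, len(train), len(val), len(test))
```

Splitting per class keeps the class proportions roughly equal across the three subsets, which may matter with only about 3,000 images in total.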
- data augmentation made validation accuracy worse on the first attempt, which suggests the transformations do not produce images representative of valid data for our problem
- find transformations that produce valid data for our problem
- ResNet50 may overfit on datasets with fewer than 10K images
- try VGG16
- visualize misclassified samples to gain intuition about where the model is struggling
- curate more data
- consider dropping photos of groups of plants
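
As a rough sketch of the "visualize misclassified samples" item above, the first step is collecting the samples the model got wrong and counting which label pairs get confused. The function name `misclassified`, the file names, and the predictions below are hypothetical; in practice the predicted labels would come from the trained model.

```python
from collections import Counter

def misclassified(paths, true_labels, predicted_labels):
    """Return (path, true, predicted) for every sample the model got wrong."""
    return [
        (path, true, pred)
        for path, true, pred in zip(paths, true_labels, predicted_labels)
        if true != pred
    ]

# Hypothetical file names and predictions, for illustration only.
paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
true_labels = ["healthy", "sick", "sick", "not_weed", "healthy"]
predicted_labels = ["healthy", "healthy", "sick", "sick", "healthy"]

errors = misclassified(paths, true_labels, predicted_labels)
# Count which (true, predicted) pairs are confused most often.
confusions = Counter((true, pred) for _, true, pred in errors)

for path, true, pred in errors:
    print(f"{path}: labelled {true}, predicted {pred}")
```

The resulting paths can then be opened in any image viewer or plotted in a grid to look for patterns such as overlay text or photos of plant groups.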
- see projects
gcloud projects list
- see current project
gcloud config list project
- change project
gcloud config set project <project name>
- set bucket name
BUCKET_NAME="keras-class-191806"
- set your preferred region
REGION=us-east1
- copy model over
gsutil cp src gs://$BUCKET_NAME/saved_model.pb
NOTE: ml-engine expects your model's protocol buffer file to be named saved_model.{pb,pbtxt}
- register ml-engine entry
gcloud ml-engine models create <model name>
- deploy version of model
gcloud ml-engine versions create v4 --model=plantDisease01 --origin=gs://keras-class-191806/plantDisease01/vgg16_data_augmented-tf --runtime-version=1.4
the origin arg must be a directory containing the full tf model (protocol buffer and variables)
- see models
gcloud ml-engine models list
- craft request payload
python -c 'import base64, sys, json; img = base64.b64encode(open(sys.argv[1], "rb").read()).decode(); print(json.dumps({"foo-input": {"b64": img}}))' test_healthy.jpg > request.json
- craft float32 request payload
python -c 'print([0.2] * (224 * 224))' > request-float32.json
- inference
gcloud ml-engine predict --model=plantDisease01 --json-instances=request.json
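
The base64 payload one-liner above can also be written as a small script, which may be easier to read and reuse. This is a sketch only: `build_instance` is a hypothetical helper name, and the `"foo-input"` key is copied from the one-liner and is presumably a placeholder for the input name of the deployed model. The three bytes used here stand in for a real image file.

```python
import base64
import json

def build_instance(image_bytes, input_name="foo-input"):
    """Wrap raw image bytes as one base64-encoded prediction instance."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {input_name: {"b64": encoded}}

# The first three bytes of a JPEG file, standing in for
# open("test_healthy.jpg", "rb").read().
image_bytes = b"\xff\xd8\xff"

instance = build_instance(image_bytes)
# --json-instances reads newline-delimited JSON: one instance per line.
line = json.dumps(instance)
print(line)
```

Writing `line` to `request.json` gives the same file shape as the one-liner; several images can be sent in one request by writing one such line per image.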