[`semester-upkeep`] Enable YAML control of Kaggle's GPU setting #30

jmuchovej · 2019-12-17T03:43:54Z

Just a note, when referring to groups... groups := {intelligence, core, data-science, supplementary}

Feature Request, for `autobot`

Description

Not all Kaggle Kernels need GPUs. More importantly, Kaggle limits us to 2 active GPU Kernels, while we're allowed up to 10 CPU Kernels. Based on this, it makes sense to toggle GPUs from syllabus.yml – this also allows for toggling the meeting's info if there's a need to rapidly iterate on the Kaggle Kernel.

Needs

Follow the parameter settings and toggle GPU usage

Initial Comments

syllabus.yml specifies all the parameters needed for a given meeting's setup. Within syllabus.yml, one should find...

- required:
    # ... lots going on here
  optional:
    # ... lots going on here
    kaggle:
      datasets: []
      competitions: []
      kernels: []
      gpu: true  # <- pay attention to this

You'll want to make sure that this parameter is both parsed and later propagated to a meeting's kernel-metadata.json file – the template file is here:

{
    "id": "ucfaibot/{{ slug }}",
    "title": "{{ slug }}",
    "code_file": "{{ notebook }}.ipynb",
    "language": "python",
    "kernel_type": "notebook",
    "is_private": false,
    "enable_gpu": false,
    "enable_internet": true,
    "dataset_sources": {{ kaggle.datasets }},
    "competition_sources": {{ kaggle.competitions }},
    "kernel_sources": {{ kaggle.kernels }}
}
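
One way to picture the propagation step, as a rough sketch rather than the bot's actual code: read the `gpu` flag from a meeting's `optional.kaggle` block and carry it into the metadata as `enable_gpu`. The function name `build_kernel_metadata` and its `slug`/`notebook` arguments are hypothetical, the meeting dict stands in for one parsed syllabus.yml entry, and a missing flag is assumed to mean "no GPU".

```python
import json


def build_kernel_metadata(meeting: dict, slug: str, notebook: str) -> dict:
    """Build the kernel-metadata.json contents for one meeting (hypothetical helper)."""
    # `optional.kaggle` mirrors the syllabus.yml layout shown above.
    kaggle = meeting.get("optional", {}).get("kaggle", {})
    return {
        "id": f"ucfaibot/{slug}",
        "title": slug,
        "code_file": f"{notebook}.ipynb",
        "language": "python",
        "kernel_type": "notebook",
        "is_private": False,
        # Assumption: a missing `gpu` key means the kernel runs on CPU.
        "enable_gpu": bool(kaggle.get("gpu", False)),
        "enable_internet": True,
        "dataset_sources": list(kaggle.get("datasets", [])),
        "competition_sources": list(kaggle.get("competitions", [])),
        "kernel_sources": list(kaggle.get("kernels", [])),
    }


meeting = {"required": {}, "optional": {"kaggle": {"datasets": [], "gpu": True}}}
metadata = build_kernel_metadata(meeting, slug="example-meeting", notebook="example-meeting")
print(json.dumps(metadata, indent=4))
```

Because `json.dumps` writes Python's `True`/`False` as JSON `true`/`false`, the resulting file stays valid for the Kaggle CLI to read.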


bb912 · 2019-12-18T01:44:07Z

Would adding a GPU boolean parameter to meeting objects be a sound start to a solution?

and/or changing the parse_yaml and write_yaml functions to allow for this indication?

bb912 · 2019-12-18T04:09:17Z

I believe all this requires is to add {{ kaggle.enable_gpu }} to the template.

{
    "id": "ucfaibot/{{ slug }}",
    "title": "{{ slug }}",
    "code_file": "{{ notebook }}.ipynb",
    "language": "python",
    "kernel_type": "notebook",
    "is_private": false,
    "enable_gpu": {{ kaggle.enable_gpu }},
    "enable_internet": true,
    "dataset_sources": {{ kaggle.datasets }},
    "competition_sources": {{ kaggle.competitions }},
    "kernel_sources": {{ kaggle.kernels }}
}
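
A caveat worth checking when wiring this up: Python's `True` prints as `True`, which is not valid JSON (`true` is). The sketch below serializes each value with `json.dumps` before substitution. The `render` function is a hypothetical regex-based stand-in for the bot's real `Template` class, used only to keep the example self-contained; whether the bot already handles this elsewhere is not confirmed by this thread.

```python
import json
import re

TEMPLATE = '{"enable_gpu": {{ kaggle.enable_gpu }}, "dataset_sources": {{ kaggle.datasets }}}'


def render(template: str, values: dict) -> str:
    """Replace each `{{ dotted.name }}` with the JSON encoding of its value."""

    def substitute(match: "re.Match[str]") -> str:
        value = values
        # Walk dotted names such as `kaggle.enable_gpu` through nested dicts.
        for key in match.group(1).split("."):
            value = value[key]
        return json.dumps(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


rendered = render(TEMPLATE, {"kaggle": {"enable_gpu": True, "datasets": ["owner/data"]}})
parsed = json.loads(rendered)  # raises ValueError if the output is not valid JSON
print(rendered)
```

Round-tripping the rendered text through `json.loads` is a cheap way to catch a template that produces malformed metadata before it is pushed.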

bb912 · 2019-12-18T04:23:57Z

I'm not seeing in meetings.py line 166-167 ish

kernel_metadata_path = paths.repo_meeting_folder(meeting) / "kernel-metadata.json"
kernel_metadata = Template(open(kernel_metadata_path).read())

after we write to this metadata, (enable_gpu should be written with a changed template described above), where/when/how does the kernel-metadata.json actually help the new notebook get published using kaggle API?

bb912 · 2019-12-18T04:30:40Z

I am also confused by the purpose of the write_yaml function in the meeting meta, where else is it called?

bb912 · 2019-12-18T05:13:02Z

so, i believe we are going to be accessing the kernel metadata in the push_kernel function of autobot/lib/apis/kaggle.py. instead of doing a subprocess.call("kaggle k push", shell=True)

we do a

subprocess.call("kaggle k push -p templates/seed/meeting/kernel-metadata.json", shell=True)

if this is the case, we would only be pushing the kernel specified by this metadata (and pushing it WITH this metadata, so we can push it with enable_gpu as true).

What I don't fully understand yet: how was this kernel-metadata.json file getting pushed before, without this subprocess call edit?

jmuchovej · 2019-12-18T11:36:26Z

Would adding a GPU boolean parameter to meeting objects be a sound start to a solution?
...
I believe all this requires is to add {{ kaggle.enable_gpu }} to the template.

{
    "id": "ucfaibot/{{ slug }}",
    "title": "{{ slug }}",
    "code_file": "{{ notebook }}.ipynb",
    "language": "python",
    "kernel_type": "notebook",
    "is_private": false,
    "enable_gpu": {{ kaggle.enable_gpu }},
    "enable_internet": true,
    "dataset_sources": {{ kaggle.datasets }},
    "competition_sources": {{ kaggle.competitions }},
    "kernel_sources": {{ kaggle.kernels }}
}

looks good! (this is similar to the solution i was thinking of.) 😅 you'll also want to modify the syllabus.yml to make sure the parameter is present. i think it makes sense to set Kaggle GPUs to false by default.

@brandons209 (thoughts on Kaggle GPU setting to true/false by default?)

and/or changing the parse_yaml and write_yaml functions to allow for this indication?
...
I am also confused by the purpose of the write_yaml function in the meeting meta, where else is it called?

parse_yaml definitely needs to be changed, if memory serves.
write_yaml doesn't really get used – it was used in an older version of the bot, but hasn't been removed.

I'm not seeing in meetings.py line 166-167 ish

kernel_metadata_path = paths.repo_meeting_folder(meeting) / "kernel-metadata.json"
kernel_metadata = Template(open(kernel_metadata_path).read())

those lines are loading that template JSON. the actual substitutions are happening on line 173: https://github.com/ucfai/bot/blob/409902fc7fa352d0e619ceaf661dd837f91d160f/autobot/lib/utils/meetings.py#L173

after we write to this metadata, (enable_gpu should be written with a changed template described above), where/when/how does the kernel-metadata.json actually help the new notebook get published using kaggle API?

based on how the kaggle CLI works, kernel-metadata.json needs to be in the meeting's folder. an example: https://github.com/ucfai/core/tree/master/fa19/2019-10-16-cnns

i avoided the subprocess modification you proposed because putting kernel-metadata.json in the meeting's folder allows for two things:

- explicitly states the configuration of the Kaggle Kernel for the given meeting.
- allows for emergency edits/pushing without a need to use the bot, provided they have the correct key. (that's the decoded version of kaggle.json.gpg)
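
To make the folder-based workflow concrete, here is a small sketch of a push that points the Kaggle CLI at the meeting's folder, where it looks for kernel-metadata.json. `kaggle kernels push -p <folder>` is the documented CLI form (`k` is its short alias); the `push_command` and `push_kernel` helpers below are hypothetical and not taken from autobot.

```python
import subprocess
from pathlib import Path


def push_command(meeting_folder: Path) -> list:
    """Build the argv for pushing the kernel described in `meeting_folder`."""
    # -p takes the folder that holds kernel-metadata.json, not the JSON file itself.
    return ["kaggle", "kernels", "push", "-p", str(meeting_folder)]


def push_kernel(meeting_folder: Path) -> None:
    """Push one meeting's kernel; needs the Kaggle CLI installed and valid credentials."""
    if not (meeting_folder / "kernel-metadata.json").exists():
        raise FileNotFoundError(f"no kernel-metadata.json in {meeting_folder}")
    # An argv list avoids shell=True, so the path is never re-parsed by a shell.
    subprocess.run(push_command(meeting_folder), check=True)


print(push_command(Path("fa19/2019-10-16-cnns")))
```

Checking for the metadata file first gives a clearer error than letting the CLI fail, and the same command works by hand for the emergency pushes mentioned above.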

all told, looks like you're on the right track. i'll need to help you decode the JSON configuration file, DM on discord for that. There's also some playing around to be done with reacting to Kaggle (but i think this belongs in a separate issue (#39), since it's not required).

bb912 · 2019-12-20T17:36:36Z

I think I got this figured out, but unsure why you think changing parse_yaml is necessary. We are holding this data in the meeting object.optional.kaggle. It is only ever(?) obtained from the syllabus.yml file on line 90, and the syllabus.yml data is used as the dictionary "meeting" parameter to make a meeting object on line 97 of ops.py.

jmuchovej · 2019-12-22T14:52:15Z

I think I got this figured out, but unsure why you think changing parse_yaml is necessary. ...

So, parse_yaml was a method I used to do some of the parameter enforcement that I believe now gets done in Meeting.__init__(..); so it's possible that we may not need this.

We are holding this data in the meeting object.optional.kaggle. It is only ever(?) obtained from the syllabus.yml file on line 90, and the syllabus.yml data is used as the dictionary "meeting" parameter to make a meeting object on line 97 of ops.py.

You're correct that all this data is stored in syllabus.yml and only needs to be accessed when updating meetings.

So, we could just migrate parse_yaml to "let's remove it" – but that's something that a code review can do, too. 😅
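
As an illustration of the kind of parameter enforcement a constructor can take over from parse_yaml, the sketch below checks the Kaggle options when the object is built. `KaggleOptions` is a hypothetical class, not the bot's `Meeting`; the field names follow the syllabus.yml excerpt earlier in the thread, and the CPU default is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KaggleOptions:
    """Hypothetical container for the `optional.kaggle` block of syllabus.yml."""

    datasets: List[str] = field(default_factory=list)
    competitions: List[str] = field(default_factory=list)
    kernels: List[str] = field(default_factory=list)
    gpu: bool = False  # assumed default: CPU only

    def __post_init__(self) -> None:
        # A quoted YAML value such as `gpu: "yes"` loads as a string,
        # so reject anything that is not a real boolean.
        if not isinstance(self.gpu, bool):
            raise TypeError(f"kaggle.gpu must be true or false, got {self.gpu!r}")


options = KaggleOptions(**{"datasets": [], "gpu": True})
print(options.gpu)
```

Failing at construction time keeps a malformed flag from reaching kernel-metadata.json, which is where it would otherwise surface as a confusing push error.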

issue #30 kaggleGPU .YML control. only change needed in template

jmuchovej added 📝 todo Items in still in ideation, discovery, or planning "mode." 🔑 required Tasks that **need** to be completed, ASAP. 🍜 nice to have Tasks that should be completed, but not necessarily ASAP. labels Dec 17, 2019

jmuchovej added this to the Winter 2019 Upgrade milestone Dec 17, 2019

jmuchovej assigned jmuchovej, bb912, SirRoboto and Ch1pless Dec 17, 2019

jmuchovej added the 🎆 feature-request label Dec 17, 2019

jmuchovej removed the 🍜 nice to have Tasks that should be completed, but not necessarily ASAP. label Dec 18, 2019

bb912 added the 🚧 in progress Moved from TODO-like state to actual development label Dec 18, 2019

bb912 added a commit that referenced this issue Dec 23, 2019

Merge pull request #42 from ucfai/30 for issue#30

fca1b4b

issue #30 kaggleGPU .YML control. only change needed in template

jmuchovej closed this as completed Dec 28, 2019

jmuchovej added 🚀 done ship it! and removed 🚧 in progress Moved from TODO-like state to actual development 📝 todo Items in still in ideation, discovery, or planning "mode." labels Jan 9, 2020
