Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

mfekadu
Copy link
Member

@mfekadu mfekadu commented Mar 5, 2020

What's New?

  • Instagram made a cool tool called MonkeyType
  • MonkeyType automatically generated static type annotations by collecting runtime types on the nimbus.py script
  • type annotations added to the following modules
    • QA
    • database_wrapper
    • nimbus_nlp.NIMBUS_NLP
    • nimbus_nlp.question_classifier
    • nimbus_nlp.save_and_load_model
    • BUT not auto-generated on the following modules
      • modules.validators
      • modules.formatters
      • flask_api
  • Pipfile && Pipfile.lock now have MonkeyType in the dev-dependencies

Awesome Code Talk

How did I do this?

Installing MonkeyType

$ pipenv install MonkeyType --dev
$ pipenv shell

Running MonkeyType, which generates a monkeytype.sqlite3

$ monkeytype --verbose run nimbus.py

Then I asked 2 simple questions "What is Foaad's email?" and "Who is the contact for Color Coded?"

Checking for what modules MonkeyType explored while the nimbus.py code was running

$ monkeytype list-modules
QA
database_wrapper
nimbus_nlp.NIMBUS_NLP
nimbus_nlp.question_classifier
nimbus_nlp.save_and_load_model

Telling MonkeyType to generate type annotations

$ monkeytype --verbose stub QA
$ monkeytype --verbose stub database_wrapper
$ monkeytype --verbose stub nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose stub nimbus_nlp.question_classifier
$ monkeytype --verbose stub nimbus_nlp.save_and_load_model

Telling MonkeyType to edit our code and apply the type annotations

$ monkeytype --verbose apply QA
$ monkeytype --verbose apply database_wrapper
$ monkeytype --verbose apply nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose apply nimbus_nlp.question_classifier
$ monkeytype --verbose apply nimbus_nlp.save_and_load_model

Addresses #14 but does not fix

Type of change (pick-one)

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • I am unsure if the functionality is the same. Hence the [WIP] (work-in-progress) tag.
  • This change requires a documentation update

How Has This Been Tested?

It has not. Let's not merge this until we have tests to verify the same functionality.

However, the auto-generated type annotations can spark discussion on the functions we have written

For example, the monkeytype annotations of these function arguments mismatch the docstring Args in this __init__

-    def __init__(self, q_format, db_query, format_answer):
+    def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None:
         """
         Args:
             q_format (str): Question format string
@@ -49,13 +51,13 @@ class QA:
         self.db_query = db_query
         self.format_answer = format_answer

api/QA.py

Lines 38 to 48 in 41945a0

def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None:
"""
Args:
q_format (str): Question format string
db (NimbusDatabase): Object used to access remote database
db_query (DB_Query): Function used to get data from database. Takes
a dict of extracted variables and returns a dict of variables
from the database.
format_function (Answer_Formatter): Function used to format answer
string. Takes two dicts--one of extracted variables and one of
data retrieved from the database--and returns a str.

MonkeyType noticed a string returned from this function when called just once. Is the return value always a string?

-    def _get_data_from_db(self, extracted_vars):
+    def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str:
         return self.db_query(extracted_vars)

api/QA.py

Lines 54 to 55 in 41945a0

def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str:
return self.db_query(extracted_vars)

Ah, nice! It is reassuring to see the docstring match the generated type annotation

-def create_qa_mapping(qa_list):
+def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]:
     """
     Creates a dictionary whose values are QA objects and keys are the question
     formats of those QA objects.
@@ -146,18 +148,18 @@ def create_qa_mapping(qa_list):
 #     return functools.partial(_single_var_string_sub, a_format)

api/QA.py

Lines 71 to 74 in 41945a0

def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]:
"""
Creates a dictionary whose values are QA objects and keys are the question
formats of those QA objects.

hmm.. interesting type: DUMMY_NAME

+from google.cloud.automl_v1.types import PredictResponse
+from monkeytype.encoding import DUMMY_NAME
+from typing import Dict
-    def inline_text_payload(self, sent):
+    def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]:
         '''
         Converts the input sentence into GCP's callable format

@@ -82,7 +85,7 @@ class Variable_Extraction:

         return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]:
'''
Converts the input sentence into GCP's callable format
Args: sent (string) - input sentence
Return: (dict) - GCP NER input format
'''
return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

Ah, nice! Another return type that seems legit. Or is it? Haha, I hope so!!

+from sklearn.neighbors.classification import KNeighborsClassifier
-def load_latest_model():
+def load_latest_model() -> KNeighborsClassifier:
     # https://stackoverflow.com/a/39327156
     train_path = PROJECT_DIR + '/models/classification/*'
     list_of_files = glob.glob(train_path)

def load_latest_model() -> KNeighborsClassifier:

There's more, check out the files changed. I have not gone through everything.

Checklist (check-all-before-merge)

formatting help: - [x] means "checked' and - [ ] means "unchecked"

  • I documented my code according to the Google Python Style Guide

  • I ran ./build_docs.sh and the docs look fine

  • I ran ./type_check.sh and got no errors

  • I ran ./format.sh because it automatically cleans my code for me 😄

  • I ran ./lint.sh to check for what "format" missed

  • I added my tests to the /tests directory

  • I ran ./run_tests.sh and all the tests pass

mfekadu added 3 commits March 4, 2020 18:51
pipenv install MonkeyType --dev
pipenv shell
monkeytype --verbose run nimbus.py
monkeytype docs: https://github.com/Instagram/MonkeyType
***************************************************
pipenv install MonkeyType --dev
pipenv shell
monkeytype --verbose run nimbus.py

```
$ monkeytype list-modules
QA
database_wrapper
nimbus_nlp.NIMBUS_NLP
nimbus_nlp.question_classifier
nimbus_nlp.save_and_load_model
```

monkeytype --verbose stub QA
monkeytype --verbose stub database_wrapper
monkeytype --verbose stub nimbus_nlp.NIMBUS_NLP
monkeytype --verbose stub nimbus_nlp.question_classifier
monkeytype --verbose stub nimbus_nlp.save_and_load_model

monkeytype --verbose apply QA
monkeytype --verbose apply database_wrapper
monkeytype --verbose apply nimbus_nlp.NIMBUS_NLP
monkeytype --verbose apply nimbus_nlp.question_classifier
monkeytype --verbose apply nimbus_nlp.save_and_load_model
@mfekadu mfekadu changed the title [WIP] Adds type checking annotations via MonkeyType [WIP] Adds type checking annotations auto-generated via MonkeyType Mar 5, 2020
@mfekadu mfekadu changed the title [WIP] Adds type checking annotations auto-generated via MonkeyType [WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package Mar 5, 2020
@mfekadu mfekadu mentioned this pull request Mar 7, 2020
7 tasks
@mfekadu mfekadu marked this pull request as ready for review March 29, 2020 20:38
Jason-Ku
Jason-Ku previously approved these changes Mar 31, 2020
Copy link
Contributor

@Jason-Ku Jason-Ku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a fine addition you have here to the code home

EDIT: I just realized this needs a rebase (the tests aren't running)

@Jason-Ku Jason-Ku self-requested a review March 31, 2020 05:08
@Jason-Ku Jason-Ku dismissed their stale review March 31, 2020 05:08

Needs rebase

Copy link
Contributor

@Jason-Ku Jason-Ku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase, but otherwise looks exquisite

@mfekadu
Copy link
Member Author

mfekadu commented Apr 3, 2020

Will do!

By the way, what’s a rebase? @Jason-Ku

I’ve only ever done “resolve conflicts”
That’s the extent of my git-knowledge

@snekiam
Copy link
Member

snekiam commented Apr 5, 2020

@mfekadu Here's github's page on rebase
The TL;DR for this situation is that it allows you to squash a bunch of commits into one, which will make the repo cleaner, because then if there's a bug we discover in this code its all in one commit.

There's another page here with some nice ASCII art of some other reasons you might want to use a rebase. Hope this helps 😀

Copy link
Member

@snekiam snekiam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, although I agree that a rebase would be nice. Our discussion about git rebase reminds me that we may want to have some more PR documentation or have some kind of github crash course (maybe next fall, it would be good for any new members next year to be a part of it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants