[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

mfekadu · 2020-03-05T03:26:29Z

What's New?

Instagram made a cool tool called MonkeyType
MonkeyType automatically generated static type annotations by collecting runtime types on the nimbus.py script
type annotations added to the following modules
- QA
- database_wrapper
- nimbus_nlp.NIMBUS_NLP
- nimbus_nlp.question_classifier
- nimbus_nlp.save_and_load_model
- BUT not auto-generated on the following modules
  - modules.validators
  - modules.formatters
  - flask_api
Pipfile && Pipfile.lock now have MonkeyType in the dev-dependencies

Awesome Code Talk

Carl Meyer - Type-checked Python in the real world - PyCon 2018

How did I do this?

Installing MonkeyType

$ pipenv install MonkeyType --dev
$ pipenv shell

Running MonkeyType, which generates a monkeytype.sqlite3

$ monkeytype --verbose run nimbus.py

Then I asked 2 simple questions "What is Foaad's email?" and "Who is the contact for Color Coded?"

Checking for what modules MonkeyType explored while the nimbus.py code was running

$ monkeytype list-modules
QA
database_wrapper
nimbus_nlp.NIMBUS_NLP
nimbus_nlp.question_classifier
nimbus_nlp.save_and_load_model

Telling MonkeyType to generate type annotations

$ monkeytype --verbose stub QA
$ monkeytype --verbose stub database_wrapper
$ monkeytype --verbose stub nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose stub nimbus_nlp.question_classifier
$ monkeytype --verbose stub nimbus_nlp.save_and_load_model

Telling MonkeyType to edit our code and apply the type annotations

$ monkeytype --verbose apply QA
$ monkeytype --verbose apply database_wrapper
$ monkeytype --verbose apply nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose apply nimbus_nlp.question_classifier
$ monkeytype --verbose apply nimbus_nlp.save_and_load_model

Addresses #14 but does not fix

Type of change (pick-one)

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
- I am unsure if the functionality is the same. Hence the [WIP] (work-in-progress) tag.
This change requires a documentation update

How Has This Been Tested?

It has not. Let's not merge this until we have tests to verify the same functionality.

**However, the auto-generated type annotations can spark discussion on the functions we have written**

For example, the monkeytype annotations of these function arguments mismatch the docstring Args in this `init`

-    def __init__(self, q_format, db_query, format_answer):
+    def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None:
         """
         Args:
             q_format (str): Question format string
@@ -49,13 +51,13 @@ class QA:
         self.db_query = db_query
         self.format_answer = format_answer

api/QA.py

Lines 38 to 48 in 41945a0

    
               def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None: 
        
                   """ 
        
                   Args: 
        
                       q_format (str): Question format string 
        
                       db (NimbusDatabase): Object used to access remote database 
        
                       db_query (DB_Query): Function used to get data from database. Takes 
        
                           a dict of extracted variables and returns a dict of variables 
        
                           from the database. 
        
                       format_function (Answer_Formatter): Function used to format answer 
        
                           string. Takes two dicts--one of extracted variables and one of 
        
                           data retrieved from the database--and returns a str.

MonkeyType noticed a string returned from this function when called just once. Is the return value always a string?

-    def _get_data_from_db(self, extracted_vars):
+    def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str:
         return self.db_query(extracted_vars)

api/QA.py

Lines 54 to 55 in 41945a0

    
           def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str: 
        
               return self.db_query(extracted_vars)

Ah, nice! It is reassuring to see the docstring match the generated type annotation

-def create_qa_mapping(qa_list):
+def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]:
     """
     Creates a dictionary whose values are QA objects and keys are the question
     formats of those QA objects.
@@ -146,18 +148,18 @@ def create_qa_mapping(qa_list):
 #     return functools.partial(_single_var_string_sub, a_format)

api/QA.py

Lines 71 to 74 in 41945a0

    
           def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]: 
        
               """ 
        
               Creates a dictionary whose values are QA objects and keys are the question 
        
               formats of those QA objects.

hmm.. interesting type: `DUMMY_NAME`

+from google.cloud.automl_v1.types import PredictResponse
+from monkeytype.encoding import DUMMY_NAME
+from typing import Dict

-    def inline_text_payload(self, sent):
+    def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]:
         '''
         Converts the input sentence into GCP's callable format

@@ -82,7 +85,7 @@ class Variable_Extraction:

         return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

api/nimbus_nlp/NIMBUS_NLP.py

Lines 76 to 86 in 41945a0

    
               def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]: 
        
                   ''' 
        
                   Converts the input sentence into GCP's callable format 
        
                   Args: sent (string) - input sentence 
        
                   Return: (dict) - GCP NER input format 
        
                   ''' 
        
                   return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

Ah, nice! Another return type that seems legit. Or is it? Haha, I hope so!!

+from sklearn.neighbors.classification import KNeighborsClassifier

-def load_latest_model():
+def load_latest_model() -> KNeighborsClassifier:
     # https://stackoverflow.com/a/39327156
     train_path = PROJECT_DIR + '/models/classification/*'
     list_of_files = glob.glob(train_path)

api/nimbus_nlp/save_and_load_model.py

Line 33 in 41945a0

def load_latest_model() -> KNeighborsClassifier:

There's more, check out the files changed. I have not gone through everything.

Checklist (check-all-before-merge)

formatting help: - [x] means "checked' and - [ ] means "unchecked"

I documented my code according to the Google Python Style Guide
I ran ./build_docs.sh and the docs look fine
I ran ./type_check.sh and got no errors
I ran ./format.sh because it automatically cleans my code for me 😄
I ran ./lint.sh to check for what "format" missed
I added my tests to the /tests directory
I ran ./run_tests.sh and all the tests pass

pipenv install MonkeyType --dev pipenv shell monkeytype --verbose run nimbus.py

monkeytype docs: https://github.com/Instagram/MonkeyType *************************************************** pipenv install MonkeyType --dev pipenv shell monkeytype --verbose run nimbus.py ``` $ monkeytype list-modules QA database_wrapper nimbus_nlp.NIMBUS_NLP nimbus_nlp.question_classifier nimbus_nlp.save_and_load_model ``` monkeytype --verbose stub QA monkeytype --verbose stub database_wrapper monkeytype --verbose stub nimbus_nlp.NIMBUS_NLP monkeytype --verbose stub nimbus_nlp.question_classifier monkeytype --verbose stub nimbus_nlp.save_and_load_model monkeytype --verbose apply QA monkeytype --verbose apply database_wrapper monkeytype --verbose apply nimbus_nlp.NIMBUS_NLP monkeytype --verbose apply nimbus_nlp.question_classifier monkeytype --verbose apply nimbus_nlp.save_and_load_model

Jason-Ku

Quite a fine addition you have here to the code home

EDIT: I just realized this needs a rebase (the tests aren't running)

Needs rebase

Jason-Ku

Rebase, but otherwise looks exquisite

mfekadu · 2020-04-03T02:04:25Z

Will do!

By the way, what’s a rebase? @Jason-Ku

I’ve only ever done “resolve conflicts”
That’s the extent of my git-knowledge

snekiam · 2020-04-05T05:07:37Z

@mfekadu Here's github's page on rebase
The TL;DR for this situation is that it allows you to squash a bunch of commits into one, which will make the repo cleaner, because then if there's a bug we discover in this code its all in one commit.

There's another page here with some nice ASCII art of some other reasons you might want to use a rebase. Hope this helps 😀

snekiam

Looks good to me, although I agree that a rebase would be nice. Our discussion about git rebase reminds me that we may want to have some more PR documentation or have some kind of github crash course (maybe next fall, it would be good for any new members next year to be a part of it).

mfekadu added 3 commits March 4, 2020 18:51

install monkey type as dev-dependency, ignore monkeytype.sqlite3

4d1d9f1

pipenv install MonkeyType --dev pipenv shell monkeytype --verbose run nimbus.py

allow any version of pyre because its a dev-dependency not production

1b83bc6

mfekadu requested review from adamperlin and cameron-toy March 5, 2020 03:26

mfekadu changed the title ~~[WIP] Adds type checking annotations via MonkeyType~~ [WIP] Adds type checking annotations auto-generated via MonkeyType Mar 5, 2020

mfekadu changed the title ~~[WIP] Adds type checking annotations auto-generated via MonkeyType~~ [WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package Mar 5, 2020

mfekadu requested review from snekiam, Jason-Ku, zpdeng and austinsilveria March 5, 2020 03:52

mfekadu mentioned this pull request Mar 7, 2020

Nimbus v1.0 #112

Draft

7 tasks

mfekadu marked this pull request as ready for review March 29, 2020 20:38

Jason-Ku previously approved these changes Mar 31, 2020

View reviewed changes

Jason-Ku self-requested a review March 31, 2020 05:08

Jason-Ku requested changes Mar 31, 2020

View reviewed changes

snekiam approved these changes Apr 5, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

mfekadu commented Mar 5, 2020 •

edited

Loading

Jason-Ku left a comment •

edited

Loading

Jason-Ku left a comment

mfekadu commented Apr 3, 2020 •

edited

Loading

snekiam commented Apr 5, 2020 •

edited

Loading

snekiam left a comment

	def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None:
	"""
	Args:
	q_format (str): Question format string
	db (NimbusDatabase): Object used to access remote database
	db_query (DB_Query): Function used to get data from database. Takes
	a dict of extracted variables and returns a dict of variables
	from the database.
	format_function (Answer_Formatter): Function used to format answer
	string. Takes two dicts--one of extracted variables and one of
	data retrieved from the database--and returns a str.

	def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str:
	return self.db_query(extracted_vars)

	def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]:
	"""
	Creates a dictionary whose values are QA objects and keys are the question
	formats of those QA objects.

	def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]:
	'''
	Converts the input sentence into GCP's callable format

	Args: sent (string) - input sentence

	Return: (dict) - GCP NER input format

	'''

	return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

Are you sure you want to change the base?

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

Conversation

mfekadu commented Mar 5, 2020 • edited Loading

What's New?

Awesome Code Talk

How did I do this?

Addresses #14 but does not fix

Type of change (pick-one)

How Has This Been Tested?

However, the auto-generated type annotations can spark discussion on the functions we have written

For example, the monkeytype annotations of these function arguments mismatch the docstring Args in this __init__

MonkeyType noticed a string returned from this function when called just once. Is the return value always a string?

Ah, nice! It is reassuring to see the docstring match the generated type annotation

hmm.. interesting type: DUMMY_NAME

Ah, nice! Another return type that seems legit. Or is it? Haha, I hope so!!

There's more, check out the files changed. I have not gone through everything.

Checklist (check-all-before-merge)

Jason-Ku left a comment • edited Loading

Choose a reason for hiding this comment

Jason-Ku left a comment

Choose a reason for hiding this comment

mfekadu commented Apr 3, 2020 • edited Loading

snekiam commented Apr 5, 2020 • edited Loading

snekiam left a comment

Choose a reason for hiding this comment

mfekadu commented Mar 5, 2020 •

edited

Loading

**However, the auto-generated type annotations can spark discussion on the functions we have written**

For example, the monkeytype annotations of these function arguments mismatch the docstring Args in this `init`

hmm.. interesting type: `DUMMY_NAME`

Jason-Ku left a comment •

edited

Loading

mfekadu commented Apr 3, 2020 •

edited

Loading

snekiam commented Apr 5, 2020 •

edited

Loading