Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Generating fingerprinting list #128

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

svensevenslow
Copy link

@svensevenslow svensevenslow commented Mar 31, 2020

This a WIP PR. Currently I have just added the code to generate the json file.

I had a few questions:

  1. Regarding generating the PR would creating a fresh branch of shavar-prod-lists, pushing the base-fingerprinting-track.json file and then creating the PR using the Github API be the correct way to go about it?
  2. Also doing everything on the remote branch and not making a local copy of the repo would be the correct way to do it?
  3. Also what exactly files are checked mean in create a PR to shavar-prod-lists when the files are checked and need to be published in publish2cloud.py
  4. Also what exactly do you mean by using the values in Meaning base-fingerprinting-track.json should be generated using values in the respective versioned branch.
  5. Also do we need to compare the existing JSON file with the generated JSON file and only create PR if they are different?
  6. What exactly does this part of the code do?

@say-yawn
Copy link
Contributor

say-yawn commented Apr 2, 2020

Hi @svensevenslow

1. Regarding generating the PR would creating a fresh branch of shavar-prod-lists, pushing the `base-fingerprinting-track.json` file and then creating the PR using the Github API  be the correct way to go about it?

Yes, I think that the is appropriate way to update the fingerprinting JSON files to the versioned branches in shavar-prod-lists repo.

2. Also doing everything on the remote branch and not making a local copy of the repo would be the correct way to do it?

I do not think there is a need to create a local copy of the repo. Can you explain more in detail why you think we would need the local copy?

3. Also what exactly  `files are checked` mean in `create a PR to shavar-prod-lists when the files are checked and need to be published in publish2cloud.py`

"(F)iles are checked" as in we should check to see if the fingerprinting JSON file should be updated just like in the publish2cloud.py here

4. Also what exactly do you mean by `using the values `in `Meaning base-fingerprinting-track.json should be generated using values in the respective versioned branch`.

Since we have versioned lists in shavar-prod-lists repo the creation of fingerprinting JSON should use the values/domains in the respective versioned branch.

5. Also do we need to compare the existing JSON file with the generated JSON file and  only create PR if they are different?

Yes, but I think if the check to upload new data to S3 for the fingerprinting file returns True I think we can assume the upload is also needed. The edge case would be the first time the fingerprinting JSON file is created and is not already uploaded to the shavar-prod-lists repo.

6. What exactly does [this](https://github.com/mozilla-services/shavar-list-creation/blob/master/publish2cloud.py#L221) part of the code do?

publish_to_remote_settings method publishes/uploads the files to the Remote Settings.

@svensevenslow
Copy link
Author

@skim1102 Sorry for the delay on this issue, wasn't able to work on it earlier. Would it be preferred to use the Github Api using requests library or use something like PyGithub (or anything else you might have in mind)?

@say-yawn
Copy link
Contributor

@svensevenslow, I was thinking of using just the GitHub API but using PyGithub, as you mentioned, may be better since we will need to "commit" and create a "PR". Initially I was thinking of creating a placeholder branch, but using PyGithub, I think we should be able to create a branch from the latest versioned branch, commit the latest fingerprint JSON file, and create a PR against the versioned branch (deleting the branch can happen after the PR is merged by the repo maintainer).

@svensevenslow
Copy link
Author

  1. @skim1102 I've updated my pr. I've added a function to create a new branch, create a new commit and open a pull request. Also the access-token I used I have omitted out while committing the code, let me know which one I should use.
  2. You can see the pull requests I generated here and the new branches created here
  3. Also I think the call to the function should me made like this in the publish_to_cloud method
if s3_upload_needed or rs_upload_needed: 
    update_fingerprinting_json():

Let me know what you think.

@svensevenslow
Copy link
Author

@skim1102 could you please have a look

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants