Some researchers unsure of difference between "Private URL" and "Anonymous Private URL" #8185

Open
jggautier opened this issue Oct 25, 2021 · 13 comments · May be fixed by #10961
Labels
Size: Queued (PM has called this issue out specifically for sizing)
UX & UI: Design (This issue needs input on the design of the UI and from the product owner)

Comments

@jggautier
Contributor

jggautier commented Oct 25, 2021

While reviewing the "Anonymous Private URL" feature (#1724), members of the Harvard Dataverse Repository curation team saw that some researchers were unsure of the difference between the "Private URL" and the "Anonymous Private URL". In particular, one or more researchers weren't sure if the feature would let them share an anonymized version of the data in a dataset's data files.

The feature only removes from the page dataset metadata that the repository administrators have decided could reveal the dataset author's identity. The User Guides mention this (item 4), and the popup includes a link that points to that section of the User Guides.

But it's unclear if most users will click on that link or if that User Guide section makes the purpose of the Anonymous Private URL clear enough.
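To make the distinction above concrete, here is a minimal sketch of the behavior as described in this issue: the anonymized view withholds selected dataset metadata, while the contents of the data files are left untouched. The field names, the `anonymized_preview` function, and the dictionary layout are hypothetical and chosen only for illustration; they are not Dataverse's actual code or configuration.

```python
from copy import deepcopy

# Hypothetical list of metadata fields an administrator chose to withhold.
# The real list is installation-specific and is not given in this issue.
WITHHELD_FIELDS = {"author", "datasetContact"}


def anonymized_preview(dataset, withheld_fields=WITHHELD_FIELDS):
    """Return a copy of `dataset` with the configured metadata fields withheld.

    Only the "metadata" entries are changed; the "files" entries are copied as-is.
    """
    preview = deepcopy(dataset)
    for name in preview["metadata"]:
        if name in withheld_fields:
            preview["metadata"][name] = "withheld"
    return preview


dataset = {
    "metadata": {
        "title": "Survey results",
        "author": "Jane Doe",
        "datasetContact": "jane@example.org",
    },
    "files": [
        {"name": "responses.csv", "content": "respondent,name\n1,Jane Doe\n"},
    ],
}

preview = anonymized_preview(dataset)
print(preview["metadata"]["author"])         # withheld
print(preview["files"] == dataset["files"])  # True: file contents are unchanged
```

In this sketch the author's name still appears inside `responses.csv`, which is the kind of misunderstanding the researchers' question points to: the data in the files is not anonymized.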

@djbrooke
Contributor

djbrooke commented Oct 25, 2021

@jggautier - thanks to you and the team for the feedback. If you have some text I'm happy to work with you on a PR (or you can provide the text to me and I can make a PR)

@jggautier
Contributor Author

jggautier commented Oct 25, 2021

Thanks @djbrooke. I don't have text. I think more research could be done before figuring out what changes to make, if any. The number of researchers we spoke with was small, and I'm not sure how many of those were unsure of the difference between the two URLs, how many noticed and clicked on the User Guide link, how many found the right section (since the link in this AWS instance doesn't point to the section), and how many were still unsure of the difference after reading that section in the User Guide. @kmika11 spoke with researchers so I'm following up with her for more info.

@djbrooke
Contributor

@jggautier thanks - I'm happy to prioritize this and #8184 whenever there's text to be added.

@kmika11
Contributor

kmika11 commented Oct 27, 2021

I agree with @jggautier - I don't think we have enough information to recommend any more changes to the feature. It could be worth keeping an eye on once implemented (e.g., if we get a lot of support tickets saying "Hey, I thought this anonymized my data!"). We can potentially add language to the user guide that makes it clear that the dataset metadata is what is withheld. Is there anywhere that we could publish the list of metadata fields that are anonymized in Harvard Dataverse?

@jggautier
Contributor Author

We tested a redesigned popup with 6 users. They were able to distinguish between the two types of URLs. (More about the review is described in a comment in another GitHub issue.)

@jggautier
Contributor Author

@TaniaSchlatter agrees that the redesign of the feature name, the popup, banner messages, and relevant guide pages is done and can be moved to development when possible.

The changes are illustrated in mockups in an image and in a section of a virtual whiteboard. They include changes to:

  • The name of the feature in the Edit Dataset dropdown on the dataset page
  • The text, layout, and interaction of the feature's popup
  • The text of the "Disable URL" confirmation popup
  • The name of the feature in the URL (e.g. previewurl in https://demo.dataverse.org/previewurl.xhtml?token=39b07d51-e0aa-4a89-a179-cacd63c94d72)
  • The banner messages shown when using the feature

The changes to the guide pages - pages in the User, API, Installation, Developer and Style guides - are in the Google Doc at https://docs.google.com/document/d/1bn4fIPr_yhOj_DYDldzdKEZjmETV-WLYc98sWTgcg58

The change to the name of the feature will require changes to the names of associated code files, e.g. PrivateUrlUtil.java

@mreekie moved this to Harvard Dataverse Instance (Julian) in IQSS Dataverse Project on Nov 5, 2022
@mreekie

mreekie commented Dec 1, 2022

Discussion with Sonia:
For Harvard Dataverse, we need to discuss and come up with a plan for how this feature will be used in the Harvard repository.

If someone creates a collection, then this will expose information that should not be exposed.
So we need to either fix the feature, or make a policy of how this feature will be used.

This will need further discussion as to which approach to take.
We can't move this to dev for sizing until we make the decision.

Next steps for getting this sized:

  • Share the question with the community for input. Community input on which way to go is important. Do others use this the way Harvard Dataverse does? We know that some others more tightly control the creation of collections.
  • Collections attached to datasets is the specific case where the leakage can occur.
  • The issue is that the collection level information provides identifiable information. There is currently no way to prevent this with the anonymous peer review link.

@mreekie added the "Size: Queued" label (PM has called this issue out specifically for sizing) on Jan 23, 2023
@mreekie

mreekie commented Jan 23, 2023

sizing:

  • PM added to ordered sizing queue

@mreekie

mreekie commented Jan 23, 2023

Sizing:

  • The first step, the spike for getting input, is lacking an assigned person and it's not clear who would collect the information.
  • Once requirements take shape, we will need dev requirements.

Discussion:

@jggautier
Contributor Author

What about changing the wording? Could that be a simple fix to change the context that it communicates?

Just refreshed my memory of the changes we proposed and yeah, we proposed changing the name of the feature from "Private URL" to "Preview URL". The Miro board at https://miro.com/app/board/o9J_leCZVUU=/?moveToWidget=3074457366769811874&cot=14 I think gives an overview of how the feature works now and the changes we proposed.

@mreekie

mreekie commented Mar 13, 2023

Last sizing meeting:

  • The next step will be a meeting to discuss how to do this.

@cmbz

cmbz commented Oct 2, 2023

2023/10/02

@cmbz

cmbz commented Oct 16, 2023

2023/10/16:

  • @sbarbosadataverse and @jggautier discussed the topic and determined that more examples and feedback from people using the feature are needed, e.g. examples of how Borealis and Philipp's group are using it. Note that these installations may have managed collections that make activating the feature more straightforward.
  • Therefore, this feature cannot be added to HDV until more investigation is conducted.
  • Re. HDV: How best to prevent collections from disclosing personally identifying information using this feature

@DS-INRAE moved this to ⚠️ Needed/Important in Recherche Data Gouv on Jul 10, 2024