Does MIDIPort.id need to be origin-scoped and/or regenerated with cookies? #48

Open
cwilso opened this issue Apr 16, 2013 · 35 comments
Labels
class: substantive https://www.w3.org/policies/process/#correction-classes Needs Edits https://speced.github.io/spec-maintenance/about/

@cwilso
Contributor

cwilso commented Apr 16, 2013

From AnneVK:

You need to state how MIDIPort.id is scoped (prolly to origin, no?)
and you should state that it should not be the same across origins. We
don't want to make it easy to track users using these new identifiers.
Furthermore, once the user clears cookies these MIDIPort.id thingies
need to be regenerated too. (I gave similar feedback to the WebRTC
guys.)

@annevk

annevk commented Apr 22, 2013

I guess this is also a concern with name and manufacturer, to a lesser extent. If you have a unique device or are a user of a rather unique localization, tracking will be easier. Of course, it's opt-in, but calling it out and having people look at it would be good. I've been told we lost the war on preventing tracking, but identification is hopefully not entirely lost yet.

@cwilso
Contributor Author

cwilso commented Apr 24, 2013

We've lost that battle. There are plenty of other APIs that expose system-specific data like this - the Gamepad API, for example, has precisely the same data.

I'm not sure what you mean by "user of a rather unique localization" - the name and manufacturer are coming from the device, so I presume you mean "user of a relatively rare device produced with a relatively rare localized USB device name", or something like that? This has gotten extremely narrow - you would catch far more users looking for a
MIDI devices are relatively mainstream in production - you don't find a lot of one-off device manufacturers/ids - and they are also frequently unplugged - you can't rely on them always being there; my MIDI configuration changes constantly.

Actually, the harder I've thought about this, the more I think we should NOT regenerate with cookies. Origin-scoped is okay - you wouldn't be handing these identifiers across domains - but other than that, it's not worse than index and name (in fact, my polyfill just generates an ID from the index and the name).
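
A minimal sketch of an ID built from index and name, as described above. The exact scheme the polyfill uses is not given in this thread; `makePortId` and the `index:name` format are assumptions for illustration only.

```javascript
// Hypothetical helper (not taken from the polyfill): derives a best-effort
// ID from a port's position in the list and its reported name.
function makePortId(index, name) {
  return index + ":" + name;
}

// Example port list with made-up names; two identical interfaces are
// only told apart by their position.
const examplePorts = [
  { name: "Launchpad" },
  { name: "USB MIDI Interface" },
  { name: "USB MIDI Interface" },
];

const exampleIds = examplePorts.map((port, index) => makePortId(index, port.name));
console.log(exampleIds);
```

Such an ID carries no more information than index and name already do, and it changes whenever the enumeration order changes.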

@annevk

annevk commented Apr 25, 2013

If you do not regenerate with cookies you can revive the cookies if the ID is a uuid.

@cwilso
Contributor Author

cwilso commented Apr 25, 2013

It's not intended to be a UUID, just a GUID (that can be used to revive the right connections when an app is run a subsequent time). Perhaps a comment to that effect would be best.

@annevk

annevk commented Apr 25, 2013

It still seems like that would allow for reviving given enough other fingerprinting data. If you clear cookies you really want that particular site to have forgotten about you and have no information retained.

@cwilso
Contributor Author

cwilso commented Apr 25, 2013

I'm not clear I understand the issue well enough then, because it seems
like there is far more than enough other fingerprinting data elsewhere to
do this. Can you point me to where to understand this better?

@annevk

annevk commented Apr 25, 2013

See e.g. HTML for "fingerprinting".

@jussi-kalliokoski
Member

I'm fine with the idea of resetting the ID with cookies. This is pretty simple to implement by just salting the IDs with something like the timestamp when cookies were last created, which would be just now in case of private browsing for example. I agree with Anne that even though the API is opt-in, it's better that there's as little linking to the user's identity as possible when cookies are cleared.
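
A rough sketch of the salting idea above, assuming the browser keeps some stable internal key per device and a per-origin salt that is replaced whenever cookies are cleared. All names here (`saltedPortId`, `deviceKey`, the sample origins and timestamps) are hypothetical, and the FNV-1a hash is only a dependency-free stand-in: a real user agent would presumably use a cryptographic hash so the underlying device key cannot be recovered.

```javascript
// 32-bit FNV-1a over UTF-16 code units. Non-cryptographic; used here only
// so the example runs without any library.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// The exposed ID depends on the device, the requesting origin, and a salt
// such as the time cookies were last created for that origin.
function saltedPortId(deviceKey, origin, salt) {
  return fnv1a([deviceKey, origin, salt].join("|"));
}

const idBefore = saltedPortId("device-A", "https://a.example", "2013-04-28T09:43:00Z");
const idSameSession = saltedPortId("device-A", "https://a.example", "2013-04-28T09:43:00Z");
const idAfterClear = saltedPortId("device-A", "https://a.example", "2013-05-01T10:27:00Z");
const idOtherOrigin = saltedPortId("device-A", "https://b.example", "2013-04-28T09:43:00Z");
```

With this shape the ID stays stable for one origin until the salt changes, and two origins never see the same value for the same device.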

That said, I don't see the advantage of extending this to the manufacturer and name, they become pretty useless if they're obfuscated.

@cwilso
Contributor Author

cwilso commented Apr 29, 2013

Well, I'd suggest this kinda makes the id pretty useless anyway - shouldn't
we just cut id, and rely on order, manufacturer and name, avoiding the
potential issue here?

@jussi-kalliokoski
Member

> Well, I'd suggest this kinda makes the id pretty useless anyway - shouldn't we just cut id, and rely on order, manufacturer and name, avoiding the potential issue here?

Huh?! What do you mean it makes the ID useless? It's not like users go incognito every time, expecting sites to remember their preferences.

@cwilso
Contributor Author

cwilso commented May 1, 2013

If I'm an ISV writing MIDI software, I'm going to have to rely on a weird
combination of ID, order, manufacturer and name. It just doesn't seem to
buy much utility anymore.

@jussi-kalliokoski
Member

> If I'm an ISV writing MIDI software, I'm going to have to rely on a weird combination of ID, order, manufacturer and name.

Why?

@cwilso
Contributor Author

cwilso commented May 1, 2013

ID can fail, and when it does, it's catastrophic (regen cookies will lose
all matching); order/manufacturer/name matching will at worst confuse two
of the same devices when one has been unplugged.

@jussi-kalliokoski
Member

> ID can fail, and when it does, it's catastrophic (regen cookies will lose all matching); order/manufacturer/name matching will at worst confuse two of the same devices when one has been unplugged.

But that's intended behavior. When a user cleans up cookies, the intent is to make sites forget their preferences. If a site somehow keeps the user's preferences anyway, for example by remembering the user's devices, it's a bit creepy.

@cwilso
Contributor Author

cwilso commented May 1, 2013

This is "once you've logged back in to the service" - otherwise, you'd have
no way to persist the order/man/name data. I'm presuming the site would be
storing this remotely, keyed off your auth, along with your other data
(e.g. your sequences/tracks/etc.)

If you just go to the site, with no auth, the site cannot have persisted
its knowledge of your preferences, fear not.

@jussi-kalliokoski
Member

That's silly. It makes no sense to store the users' device preferences remotely: it's not like the user would have the same devices on different computers (at least not with the same IDs). The site would either have to fingerprint the different machines the user logs onto the service with (definitely something we don't want to encourage) to know which preferences to apply, or store the preferences locally, in which case the preferences would be lost when the cookies are cleared anyway.

@cwilso
Contributor Author

cwilso commented May 1, 2013

It doesn't matter whether you encourage it or not, they will do it. It
provides a better user experience - and it will make no sense to the user
that their MIDI setup has been lost just because their roommate was surfing
porn and cleared cookies afterward. Hell, my bank fingerprints my machine.

@jussi-kalliokoski
Member

> and it will make no sense to the user that their MIDI setup has been lost just because their roommate was surfing porn.

The user really should tell his/her roommate to go into private browsing mode. xD Clearing cookies is so last season! Personally, in the last few years I've used clearing cookies only as a last resort for making a misbehaving site (usually one that I'm developing) forget about me, but I'm guessing there are people who have access to less biased data than my personal usage. :)

My bank recommends that I clear cookies after I log out. But then again, my bank also says I have to use passwords composed of four digits.

I don't think it provides a better user experience to keep a user's preferences when the user explicitly says to clear them.

@marcoscaceres
Contributor

FWIW, let's stop using "clearing cookies". I think what Anne meant was "clear private data" (which includes cookies). @jussi-kalliokoski brings up a good use case, nonetheless... it's basically running two browser sessions: one in private browsing mode and the other in "normal" mode.

@cwilso
Contributor Author

cwilso commented May 1, 2013

Happy to use that terminology. "Clear private data" does not mean "clear
all remote data". The remote data WILL likely include setup data; if only
because I will want to work on my sequence with my keyboard and my
Launchpad on my desktop, and then

Look, I'm not trying to argue this should be baked in to the API; I'm just
saying that from a usability perspective, if I was writing the software, I
would absolutely cache this stuff and recreate it. I can do that, no
matter if the API has indices or not; I'll just iterate through all the
ports and match name/manufacturer. (In fact, several of my demos DO this -
they iterate through doing name-matching, because I don't want to set up
every time, and I frequently switch computers.)
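
The name-matching approach described above could look roughly like the following. This is a sketch under assumptions, not code from the demos: `restoreSetup`, the role names, and the sample manufacturer and device names are all made up, and ports are modeled as plain objects with `manufacturer` and `name` fields.

```javascript
// Hypothetical helper: for each saved role, look for a currently connected
// port with the same manufacturer and name. Returns null for roles whose
// device is not present.
function restoreSetup(ports, savedSetup) {
  const restored = {};
  for (const [role, wanted] of Object.entries(savedSetup)) {
    const match = ports.find(
      (port) => port.manufacturer === wanted.manufacturer && port.name === wanted.name
    );
    restored[role] = match || null;
  }
  return restored;
}

// Setup cached earlier (for example, fetched from the user's account).
const savedSetup = {
  keys: { manufacturer: "ExampleCo", name: "Keyboard 61" },
  pads: { manufacturer: "ExampleCo", name: "Pad Grid" },
};

// Ports found on the current machine; the pad controller is unplugged.
const connectedPorts = [
  { manufacturer: "OtherCo", name: "USB MIDI Interface" },
  { manufacturer: "ExampleCo", name: "Keyboard 61" },
];

const restoredSetup = restoreSetup(connectedPorts, savedSetup);
```

This works without any ID at all, which is why it also survives a switch to another computer.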

Regardless, I'd like to get back to the core issue: I'd like to change
MIDIAccess to:

interface MIDIAccess : EventTarget {
sequence inputs ();
sequence outputs ();
attribute EventHandler onconnect;
attribute EventHandler ondisconnect;
};

Yes? Don't care that particularly if IDs are still present (and therefore,
an index into the sequence). Would slightly prefer cutting them at this
point, but seriously - don't care that much.

@cwilso
Contributor Author

cwilso commented May 1, 2013

Argh.

"and then, ... switch PCs."

@marcoscaceres
Contributor

Just rewriting @cwilso's proposal, because GH email support is currently broken:

interface MIDIAccess : EventTarget { 
   sequence<MIDIInput> inputs (); 
   sequence<MIDIOutput> outputs (); 
   attribute EventHandler onconnect; 
   attribute EventHandler ondisconnect; 
}; 

@cwilso
Contributor Author

cwilso commented May 1, 2013

Gack. Sorry about that.

@marcoscaceres
Contributor

So, can't we support both approaches? Crazy thought:

   sequence<MIDIInput> inputs (optional (DOMString or DOMString[]) ids); 
   sequence<MIDIOutput> outputs (optional (DOMString or DOMString[]) ids); 

So:

var all = midiaccess.inputs(); 
var some = midiaccess.inputs(["foo", "bar"]);
var one = midiaccess.inputs("foo"); 
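
For illustration, the optional `ids` argument in this proposal could be handled along these lines. `selectPorts` is a hypothetical stand-in for whatever a user agent would do internally; ports are modeled as plain objects with an `id` field.

```javascript
// Hypothetical sketch: returns all ports when no ids are given, otherwise
// only the ports whose id is in the requested set. A single string is
// treated like a one-element array.
function selectPorts(ports, ids) {
  if (ids === undefined) {
    return ports.slice();
  }
  const wanted = new Set(Array.isArray(ids) ? ids : [ids]);
  return ports.filter((port) => wanted.has(port.id));
}

const samplePorts = [{ id: "foo" }, { id: "bar" }, { id: "baz" }];

const allPorts = selectPorts(samplePorts);
const somePorts = selectPorts(samplePorts, ["foo", "bar"]);
const onePort = selectPorts(samplePorts, "foo");
```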

@cwilso
Contributor Author

cwilso commented May 1, 2013

Well, that's certainly crazy. :) (Come on, you were expecting that
response. I'm kidding.)

That seems like overengineering.

@toyoshim
Contributor

Some people want the Port ID to be permanently unique, while others want it to be randomized for privacy reasons.
An origin-based scheme may be a good goal, but I still feel it's a little overengineered.

Technically speaking, on some operating systems, continuing to use the same unique ID for the same device is not easy, and browsers may end up providing unreliable IDs that are only hopefully permanently unique. As a result, HTML applications may want to handle device identification by themselves using other port information. So I want to give up on having browsers provide unique IDs.

In addition, considering privacy, I'm planning to introduce MIDIAccess instance based port ID randomization. If we don't insist on permanent IDs, it looks easy and safe enough.

FYI, here is an interesting site to check how unique your browser is.
https://panopticlick.eff.org/
There is already too much information available to make your browser unique :(

@jussi-kalliokoski
Member

> considering privacy

MIDI access is inevitably a large source of entropy due to other identifiers in the MIDI ports; that's one of the reasons why there's the permission model. Getting rid of the pseudo-unique IDs won't solve that.

> continuing to use the same unique ID for the same device is not easy

Perfectly true, but it's intended to be a best guess, just like anything a web developer can come up with using the available information (except that the browser mostly has access to more information about the ports), and for most cases it should be enough.

> As a result, HTML applications may want to handle device identification by themselves using other port information.

If a developer feels that it just won't cut it, then there are the other properties for rolling your own identifier system, but I seriously doubt most people need it (or can make a better system) unless implementers make bad implementations.

> I'm planning to introduce MIDIAccess instance based port ID randomization

Randomized IDs are an implementation choice, but I don't think a very good one. If the implementer cares about privacy, they won't let web developers access this source of inevitable entropy without a permission.

@cwilso
Contributor Author

cwilso commented Dec 12, 2013

If an implementer really cares that deeply about privacy (i.e., they've tried to randomize and protect against all the things that https://panopticlick.eff.org/ comes up with), then they would only enable ANY MIDI access after prompting. The API explicitly allows for this; however, no one that I've spoken to in the security/privacy space thinks this is pragmatically a problem. It is adding a drop of water to the ocean.

At the same time, if IDs are truly randomized across instances, then they're worthless to store across instances, and really, they're probably not worth even exposing then (because the instance of the MIDIInput or MIDIOutput could be identifier enough). Obviously, this would require some slight rework of the design.

If this is the case, as Takashi said, each implementer would need to try to provide port identification using the other exposed information. Unfortunately, this is not possible in many common cases - for example, in my home studio, where I have two of the same USB MIDI interfaces connected. If order is not significant and persistent (and defining how order could be significant across OSes seems like a very steep challenge, if possible at all), then this is not possible, and we would not be able to make even a good attempt at enabling developers to cache persistent MIDI setups.

I think the best thing to do is say:
Identifiers SHOULD persist across instances (but are domain-randomized). In many instances, this will not be possible, so developers must be prepared to recover from IDs not being found. In addition, browsers may choose to reset IDs for privacy or other reasons.
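
On the application side, "prepared to recover from IDs not being found" might look something like this sketch. `restorePort` and its return shape are hypothetical; ports are modeled as plain objects with `id`, `manufacturer` and `name`.

```javascript
// Hypothetical helper: try the stored ID first, then fall back to
// manufacturer/name matching. With two identical interfaces connected the
// fallback is ambiguous, so the caller has to ask the user.
function restorePort(ports, saved) {
  const byId = ports.find((port) => port.id === saved.id);
  if (byId) {
    return { port: byId, matchedBy: "id" };
  }
  const candidates = ports.filter(
    (port) => port.manufacturer === saved.manufacturer && port.name === saved.name
  );
  if (candidates.length === 1) {
    return { port: candidates[0], matchedBy: "name" };
  }
  return { port: null, matchedBy: candidates.length > 1 ? "ambiguous" : "none" };
}

// The stored ID no longer exists (for example, the browser reset its IDs),
// and two identical interfaces are connected.
const savedPort = { id: "old-id", manufacturer: "ExampleCo", name: "USB MIDI Interface" };
const currentPorts = [
  { id: "new-1", manufacturer: "ExampleCo", name: "USB MIDI Interface" },
  { id: "new-2", manufacturer: "ExampleCo", name: "USB MIDI Interface" },
];

const restoreResult = restorePort(currentPorts, savedPort);
```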

@cwilso cwilso added the Agenda+ https://speced.github.io/spec-maintenance/about/ label Mar 6, 2015
@cwilso cwilso removed this from the V1 milestone Mar 6, 2015
@cwilso cwilso removed their assignment Mar 10, 2015
@cwilso
Contributor Author

cwilso commented Jun 2, 2015

"User agents SHOULD regenerate ids when privacy information is cleared."

@cwilso cwilso added Needs Edits https://speced.github.io/spec-maintenance/about/ and removed Agenda+ https://speced.github.io/spec-maintenance/about/ labels Jun 2, 2015
@cwilso cwilso added this to the V1 milestone Jun 2, 2015
@cwilso cwilso self-assigned this Jun 2, 2015
@annevk

annevk commented Jun 9, 2015

Why not MUST? If they have such a feature surely they can take this into account?

@cwilso
Contributor Author

cwilso commented Jun 9, 2015

That would require being far more specific about "when privacy information is cleared", which I'd rather avoid touching.

@annevk

annevk commented Jun 9, 2015

Hmm. We should get clearer about that. Basically it needs to be cleared whenever cookies/storage is cleared.

@agoode

agoode commented Jun 9, 2015

So, id should be:

  • generated on demand, unique if generated again even for the same hardware device
  • stored in an origin-based store
  • deleted upon clearing of that store

?

@annevk

annevk commented Jun 10, 2015

Yeah, and in particular identifiers need to be globally unique and cannot be reused across origins.

@agoode

agoode commented Jun 10, 2015

Ok. Something like a version 4 (random) UUID would work then.
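
The scheme agreed on in the last few comments (generate on demand, store per origin, delete when that store is cleared) can be sketched as below. This is an illustration under assumptions, not how any browser implements it: `OriginScopedPortIds`, `deviceKey`, and the sample origins are hypothetical, and the in-memory `Map` stands in for real origin-scoped storage.

```javascript
// Returns a version 4 (random) UUID. Uses crypto.randomUUID() where the
// runtime provides it; the Math.random() fallback is for illustration only
// and is not cryptographically strong.
function randomUuid() {
  if (globalThis.crypto && typeof globalThis.crypto.randomUUID === "function") {
    return globalThis.crypto.randomUUID();
  }
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    return (c === "x" ? r : (r & 0x3) | 0x8).toString(16);
  });
}

class OriginScopedPortIds {
  constructor() {
    // origin -> (internal device key -> exposed id)
    this.byOrigin = new Map();
  }

  // Generates the id on first use and returns the stored one afterwards.
  idFor(origin, deviceKey) {
    let ids = this.byOrigin.get(origin);
    if (!ids) {
      ids = new Map();
      this.byOrigin.set(origin, ids);
    }
    if (!ids.has(deviceKey)) {
      ids.set(deviceKey, randomUuid());
    }
    return ids.get(deviceKey);
  }

  // Called when the origin's cookies/storage are cleared.
  clear(origin) {
    this.byOrigin.delete(origin);
  }
}

const idStore = new OriginScopedPortIds();
const firstId = idStore.idFor("https://a.example", "device-A");
const repeatId = idStore.idFor("https://a.example", "device-A");
const otherOriginId = idStore.idFor("https://b.example", "device-A");
idStore.clear("https://a.example");
const regeneratedId = idStore.idFor("https://a.example", "device-A");
```

Because every id is freshly random, the same hardware never yields the same value for two origins, and clearing one origin's store does not affect another's.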
