Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identification of Units #3

Open
jonaskello opened this issue Sep 6, 2018 · 9 comments
Open

Identification of Units #3

jonaskello opened this issue Sep 6, 2018 · 9 comments

Comments

@jonaskello
Copy link
Member

jonaskello commented Sep 6, 2018

From offline discussion with @AdamLuotonen:

Regarding Unit and Units ....

Problems:

  1. PoundLbPerPoundLb and KilogramPerKilogram, etc. are reduced down to the same unit (One) because units are always reduced to their simplest form. This leads to trouble identifying which unit the user wants to display. The same value is displayed, but the unit label should be different (kg/kg vs lb/lb).

  2. Known units vs. unknown units. We define a large amount of known units (MeterPerSecond, Percent, LiterPerMinute, etc.). However, the library allows creation of new units by the functions of unit-times.ts and unit-divide.ts. Units are dynamic and can be created in the applications (but it is not used a lot in practice). When serializing these new units they cannot be serialized in the form 123.22:CentiMeter because that only works with known units.

  3. Performance. Today, we send around instances of Unit everywhere, mainly in Amount objects. Unit contains a lot of information and may be a deep hierarchy describing how the unit is derived. When searching for the known name for the unit (for example, serialization) we run the following code:

export function getStringFromUnit (unit: Unit.Unit <Quantity>): string {
  _ensureMetaAdded ();
  const name = _unitToString [getUnitKey (unit)];
  if (name === undefined) {
    throw new error ("unknown unit" + unit);
  }
  return name;
}

uom/src/units.ts

Line 1382 in 9c7a38c

export function getStringFromUnit(unit: Unit.Unit<Quantity>): string {

function getUnitKey (unit: Unit.Unit <Quantity>): string {
  // Iterates whole array. ES6 does not have find
  const foundUnit = getAllUnits () .filter (u => Unit.equals (u, unit)) [0];
  if (! foundUnit) {
    throw new error ("Unknown Unit" + JSON.stringify (unit));
  }

  return JSON.stringify (foundUnit);
}

uom/src/units.ts

Line 1432 in 9c7a38c

function getUnitKey(unit: Unit.Unit<Quantity>): string {

getUnitKey is expensive (and therefore getStringFromUnit is also expensive) because we have to walk through all units and compare the structure. Even if we did not compare the structure it would be expensive, there is still a loop over a growing list of units. Running JSON.stringify is also not free (imagine thousands of calls to this).

Well. My suggestion is that instead of passing around the representation of the units in the application, I suggest that we only pass around the known unit names. When you want to do something with the unit, the library internally merges the unit structure into a map, and performs the same code as before. Serializing and deserializing becomes straightforward, a short string of unit name.

A potential problem with this may be that some applications may be dependent on Unit's internal structure. For example, Unit.quantity is available today, but it could be replaced by a single function call getQuantityForUnit(unit). Another potential issue may be if someone serializes and saves Unit with JSON.stringify, and then expects to be able to deserialize it and use it with the new code. There are some tests that suggest that:

  • Base unit One should be equal. Order should not matter
  • Alternate unit compare different object references

A bonus with this new approach is that you can easily see which unit a volume has when debugging. Instead of a complicated JSON structure, one gets a name.

Basically, the problem is that we want to identify units by name, but at the same time have a dynamic unit system that can build new units on the fly. I think it's enough that an application could record some own units at boot, but do not use dynamic units in the calculations?

@jonaskello
Copy link
Member Author

Offline comment from @geon:

pass around the known unit names

Would it not be better to compare references to known objects?

@jonaskello
Copy link
Member Author

Offline comment from @AdamLuotonen:

That is what we are already doing, but if you check the unit tests we are supposed to support comparison of different references too. It would of course be most effective to just compare references, but that is not how the current thinking seems to be..

@jonaskello
Copy link
Member Author

Offline comment from @AdamLuotonen:

If one should identity the unit in another way than a reference it has to have some identifier (for example a unit name) or you will have to keep track of where the information about it came from. Then you can separate PoundLbPerPoundLb from KilogramPerKilogram. I think my vote is for the appraoch with identifier since we just use one set of known units everywhere anyway. And when you have a unique identifier in the unit you can put the derivation of the units in a registry in the library.

@jonaskello
Copy link
Member Author

My thinking is that it would be OK for an application to have a set of known units which it will have to keep within (which is what this proposal with identifier requires). In that case it should be up to each application to register the set of unit it wants to support. We could provide some units in the library, but the application could also add it's own to get a complete set. Perferable it would be possible to publish extra units as separate packages that are easily integrated with the uom core package.

@jonaskello
Copy link
Member Author

Here is a screenshot from an actual application to illustrate the problem:

image

@jonaskello
Copy link
Member Author

My suggestion is that instead of passing around the representation of the units in the application, I suggest that we only pass around the known unit names. When you want to do something with the unit, the library internally merges the unit structure into a map, and performs the same code as before. Serializing and deserializing becomes straightforward, a short string of unit name.

One solution that does not break compatibility could be to add an optional name to the unit while keeping the structural representation of the unit as it is today. That would probably clean up some of the code related to serialization. However this solution introduces more complexity as the unit now has another field and in some cases that field will not be set (when a new unit is the result of a calculation).

@Jontem
Copy link
Contributor

Jontem commented Oct 4, 2018

@jonaskello @AdamLuotonen so can we please implement the suggested solution? I could give it a try but I don't feel like I have the full understanding of how it works today and how to implement it. We really need it to be resolved in a month or so

@Jontem
Copy link
Contributor

Jontem commented Oct 31, 2018

I believe this is fixed now. Can we close this issue?

@jonaskello
Copy link
Member Author

While the problem of identification is "solved", I'm still unsure if the solution is a good one. Having a name for each unit makes two equal units unequal just because they have different names. This is useful for some cases such as picking from a list where you need to identify each unit even if they are equal. But makes other cases impossible, such as deriving a new unit and finding out if it was the same as an existing unit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants