-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Project references #156
Comments
Hello, with respect to to the disadvantage of approach 1, has a SQL-like implementation been considered? Without necessarily migrating to a database, we could implement something similar by maintaining a projectsDB.json:
|
@jwnpoh Yes, this is what I was thinking of. Would there be any difference between your approach and approach #3?
|
@dominikbraun It does seem to be the same. Would I be correct in understanding that with approach #3, renaming a project would still require going through all the records and changing the project key in the record files, thereby still resulting in an O(n) complexity? |
@jwnpoh In theory, no, because we'd store the project IDs in the records, not the project keys. Renaming a project would just mean to remove the key from the index and create a new key that points to the project ID. In practice however, I'm not sure how this could be done backwards-compatible so that old records with keys in them still work... 🤔 |
We could simply treat current project keys as those new numeric IDs and add the project key to the index. For new projects: {
"new-project": "abc123"
} For existing projects: {
"existing-project": "existing-project"
} That way, the project ID lookup is the same. |
This makes sense. I was confused by this line in the original proposal:
Can I clarify that this disadvantage:
results from the complexity of doing a "reverse lookup" in Go's map structure? Having to range through all the keys in the map to find a match for the value? If so, would it make sense, to maintain a two-way map in
When renaming a project, we first lookup the ID using the reference as the key, store the ID in a variable and set the new key, then use the ID stored in the variable as the key to lookup the reference and set the new value. For example, renaming project
There'll probably need to be an
|
Yes. And your approach where a two-way mapping stored would work fine - but I think it would be ok to load the individual projects by ID and display their project key. A two-way mapping would indeed remove the need for this as the project key would be directly available. However, this two-way mapping has to be taken into account by every command that adds, removes or changes a project key, which is a slight management overhead. I think loading all projects by ID for reading their key would suffice as long as we don't run into performance issues. |
The problem
When renaming a project, the project file - whose name corresponds to the project key - will be renamed as well (see #81). The project key is also stored for each record associated with that project. At the moment, when renaming the project, all project keys in the record files have to be renamed. In other words, renaming a project involves traversing and editing all record files since the project key is used to associate the record withe the project!
This is bad and doesn't scale well. I strive for a solution with O(1) runtime complexity for common scenarios.
The solution
The obvious solution to this problem is to separate the reference stored in the records from the project name. Such a separation can be accomplished by generating an ID for each project, which could be a simple hash value like
a3c03f8f
. Instead of using the project name in the record files, this ID would be used.I've found three approaches to implement this.
ID as filename, ID as reference
The easiest approach is to not store a project like
make-coffee
asmake-coffee.json
and refer to it asmake-coffee
, but to store it asa3c03f8f.json
and refer to it asa3c03f8f
instead.Advantages
This approach is simple to implement.
Disadvantages
When starting tracking using
timetrace start make-coffee
, the project ID formake-coffee
has to be determined in order to store it in the record file. This involves traversing all project files, opening them and checking their key. With a runtime complexity of O(n), this approach is out of question.Project key as filename, ID as reference
Another approach is to store the project
make-coffee
asmake-coffee.json
, refer to it asa3c03f8f
in record files and maintain a mapping of the project ID against the project key. For example, there would be a file calleda3c03f8f
which would containmake-coffee
as project key.Advantages
When displaying a record that stores a project reference like
a3c03f8f
, looking up the project data is an O(1) operation.Disadvantages
On the other hand, when starting tracking using
timetrace start make-coffee
, looking up the project ID formake-coffee
in order to refer to it in the record file still is an O(n) operation. This is also true for other commands likecreate record
orget project
.ID as filename, project key as reference
The previous approach can also be reversed: The
make-coffee
project can be stored asa3c03f8f.json
and referred asmake-coffee
in the record files while maintaining a mapping of the project key against the project ID. For example, there would be a file calledmake-coffee
containing the IDa3c03f8f
.Advantages
Looking up the project ID for a given key, for example when starting tracking or displaying a project, is an O(1) operation. Additionally, timetrace internally could exclusively use generated IDs, and project keys would merely be an "alias" or human-readable pointer to a project - just like a Git tag is a human-readable pointer to a commit.
In theory, this approach would still result in an O(n) complexity when projects have to be looked up the other way around, i.e. when finding the project name for a given ID. The
timetrace list records
command would be an example for this, as the project name for each reference stored in the records has to be displayed. In practice however, under the above mentioned assumption that timetrace internally exclusively uses IDs, the project can be loaded directly by its ID without any indirection.The result is that this approach scales with O(1) for all operations.
Disadvantages
At this time, I don't see any drawbacks.
Conclusion
My suggestion is to implement project references as explained in the third approach.
PTAL @aligator @joshuaherrera
The text was updated successfully, but these errors were encountered: