Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/#397 scenario duplication #2373

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from

Conversation

toan-quach
Copy link
Member

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

Related Tickets & Documents

#397

@toan-quach toan-quach marked this pull request as draft December 27, 2024 07:17
@toan-quach toan-quach force-pushed the feature/#397-duplicate-scenarios branch from 658f02f to 2644479 Compare December 27, 2024 07:17
Copy link
Contributor

github-actions bot commented Dec 27, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
19786 17236 87% 0% 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
taipy/core/_repository/_filesystem_repository.py 89% 🟢
taipy/core/data/_data_manager.py 99% 🟢
taipy/core/data/_file_datanode_mixin.py 98% 🟢
taipy/core/data/csv.py 98% 🟢
taipy/core/data/data_node.py 97% 🟢
taipy/core/data/excel.py 84% 🟢
taipy/core/data/json.py 97% 🟢
taipy/core/data/parquet.py 97% 🟢
taipy/core/data/pickle.py 100% 🟢
taipy/core/scenario/_scenario_manager.py 97% 🟢
taipy/core/taipy.py 91% 🟢
taipy/core/task/_task_manager.py 97% 🟢
taipy/core/task/task.py 98% 🟢
TOTAL 96% 🟢

updated for commit: ffeb1c8 by action🐍

@jrobinAV jrobinAV added Core Related to Taipy Core Core: Data node Core: 🎬 Scenario & Cycle 🟨 Priority: Medium Not blocking but should be addressed labels Jan 6, 2025
@toan-quach toan-quach force-pushed the feature/#397-duplicate-scenarios branch from 2644479 to f79d591 Compare January 13, 2025 07:26
@toan-quach toan-quach marked this pull request as ready for review January 15, 2025 13:10
taipy/core/scenario/_scenario_manager.py Outdated Show resolved Hide resolved
taipy/core/data/_data_manager.py Show resolved Hide resolved
taipy/core/data/_data_manager.py Outdated Show resolved Hide resolved
taipy/core/data/_file_datanode_mixin.py Outdated Show resolved Hide resolved
taipy/core/scenario/_scenario_manager.py Outdated Show resolved Hide resolved
taipy/core/scenario/_scenario_manager.py Show resolved Hide resolved
taipy/core/scenario/_scenario_manager.py Outdated Show resolved Hide resolved
taipy/core/scenario/_scenario_manager.py Outdated Show resolved Hide resolved
taipy/core/task/_task_manager.py Outdated Show resolved Hide resolved
taipy/core/data/csv.py Outdated Show resolved Hide resolved
@toan-quach toan-quach force-pushed the feature/#397-duplicate-scenarios branch from fe1826e to 5b53c5f Compare January 21, 2025 09:57
@jrobinAV jrobinAV self-requested a review January 22, 2025 08:29
jrobinAV
jrobinAV previously approved these changes Jan 22, 2025
@toan-quach toan-quach linked an issue Jan 23, 2025 that may be closed by this pull request
Copy link
Member

@trgiangdo trgiangdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are still a few areas in the code that can be improve

Comment on lines +343 to +348
def _duplicate_data(self):
new_data_path = self._duplicate_data_file(self.id)
if hasattr(self._properties, "_entity_owner"):
del self._properties._entity_owner
self._properties[self._PATH_KEY] = new_data_path
return new_data_path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this repetitive?
Can we put this in the parent _FileDataNodeMixin class and override it when necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm I think it's a yes and no answer. The self._properties attribute doesn't exist in _FileDataNodeMixin, but since ExcelDN, etc. are implementing _FileDataNodeMixin, it can access it. But the syntax feels off for me, and since _FileDataNodeMixin is a Mixin, I'm not comfortable putting self._properties in it. (I understand that we do have similar things for def path(self, value) (setter) in _FileDataNodeMixin, but I'm still not sure if it's the best way to do it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point. However, other methods in the Mixin class doesn't seem to fit as well.

I think we did discuss this once. @jrobinAV Let us know what you think about this

Comment on lines +229 to +231
if base_name.startswith(self.__TAIPY_CLONED_PREFIX):
base_name = "".join(base_name.split("_")[5:])
new_base_path = os.path.join(folder_path, f"{self.__TAIPY_CLONED_PREFIX}_{id}_{base_name}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have 2 questions.
What would be the new file name if a data node is duplicated twice?
Similarly, if a duplicated data node is duplicated (dn -> duplicated_dn -> duplicated duplicated_dn)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep so take example.csv as the file name, after you first duplicate the dn, it will create a new file TAIPY_CLONED_new_dn_id_example.csv. But if you want to duplicate this newly duplicated dn, another file will be created with the name pattern being TAIPY_CLONED_another_new_dn_id_example.csv. The TAIPY_CLONED prefix won't be repeated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So what about duplicating a data node twice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we will have TAIPY_CLONED_dn_duplicated_id_1_example.csv for the 1st dn, and for 2nd dn, we will have TAIPY_CLONED_dn_duplicated_id_2_example.csv

cycle = _CycleManagerFactory._build_manager()._get_or_create(frequency, creation_date) if frequency else None
cycle_id = cycle.id if cycle else None

# TODO: update sequences
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not handled yet right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yep, I'm not sure about it at the moment as if we want to duplicate a sequence, we either:

  • duplicate a scenario (which is weird)
  • duplicate the sequence and store it within the same scenario (this is also weird as not sure about its benefit)

return cloned_dn

@classmethod
def _can_duplicate(cls, dn: DataNode) -> ReasonCollection:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add a Reason here when the datanode is not a file-based one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well eventually we want to be able to duplicate all DNs so it will only be applicable for now and will be removed later. So I think it won't bring much value for now as we will change it later anyway.

@@ -869,3 +869,29 @@ def test_get_entities_by_config_id_in_multiple_versions_environment(self):
assert len(tp.get_scenarios()) == 5
assert len(tp.get_entities_by_config_id(scenario_config_1.id)) == 3
assert len(tp.get_entities_by_config_id(scenario_config_2.id)) == 2

def test_can_duplicate(self):
dn_config = Config.configure_in_memory_data_node("dn", 10)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an InMemory data node is duplicated, won't it raise the NotImplementedError?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, as we're currently focusing on file-based DN, I decided to ignore other cases and let NotImplementedError be raised

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return a Reason, not raise an exception though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yep, agree. But I leftNotImplementedError there as we will fill in for other data nodes eventually. By then, Reason will become outdated. So to avoid confusion in later date, I just raise NotImplementedError for now :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Core: Data node Core: 🎬 Scenario & Cycle Core Related to Taipy Core 🟨 Priority: Medium Not blocking but should be addressed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Possibility to duplicate scenarios
4 participants