Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

model config manager class #73

Merged
merged 70 commits into from
Feb 28, 2024

Conversation

Vela-zz
Copy link
Contributor

@Vela-zz Vela-zz commented Dec 10, 2023

Implement a model manager class for centralized type model management.

The model manager can manage transformer models used in different metric for different language. The manager support list_model_in_use, set_model_for_metric, reset, load_from_file, and save_to_disk, by using this, I think langcheck can

  • a centralized evalution model management
  • allow user to use the model they perfered, or let them built the eval pipeline with their own build.
  • evaluation version control, allow user save evaluation result align with the models evaluate them for better evaluation experinment management and repeated.
  • replace the pydoc in same metric in different language that only has a little differernce with the same one.

As it is a prototype, so I base it on zh-support branch to make it do not influence any other in-progress feature's updating.

It can be used like this,
屏幕截图 2023-12-10 175028

 Changes to be committed:
	modified:   src/langcheck/metrics/__init__.py
	new file:   src/langcheck/metrics/_model_management.py
	new file:   src/langcheck/metrics/modelconfig.ini
	modified:   src/langcheck/metrics/zh/reference_based_text_quality.py

 Changes to be committed:
	modified:   src/langcheck/metrics/__init__.py
	new file:   src/langcheck/metrics/_model_management.py
	new file:   src/langcheck/metrics/modelconfig.ini
	modified:   src/langcheck/metrics/zh/reference_based_text_quality.py
@Vela-zz Vela-zz marked this pull request as draft December 10, 2023 13:31
@Vela-zz Vela-zz changed the title model config Manager class model config manager class Dec 10, 2023
@yosukehigashi
Copy link
Contributor

Implement a model manager class for centralized type model management.

The model manager can manage transformer models used in different metric for different language. The manager support list_model_in_use, set_model_for_metric, reset, load_from_file, and save_to_disk, by using this, I think langcheck can

  • a centralized evalution model management
  • allow user to use the model they perfered, or let them built the eval pipeline with their own build.
  • evaluation version control, allow user save evaluation result align with the models evaluate them for better evaluation experinment management and repeated.
  • replace the pydoc in same metric in different language that only has a little differernce with the same one.

Thank you for writing this up!! I like the idea of a Model Manager, and agree that the properties you listed are all nice to have. I feel like it's not quite a "Model Manager" yet though, since it mostly just manages the model name/path.

Concretely, I think that it would be helpful to make the Model Manager return an actual model, rather than just the model name. Here’s an example of what I have in mind:

# Define various model loaders
def load_sentence_transformer(name):
    return SentenceTransformer(name)

def load_auto_model_for_sequence_classification(name):
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    return tokenizer, model

# Other loaders
# Model Manager loads a model based on the config
class ModelManager:
    def __init__(self, config):
        # Read config

    def get_model(self, language, metric):
        # NOTE: will also need to ensure that we don't reload models unnecessarily
        model_config = self.config[language][metric]
        if model_config['type'] == 'sentence_transformer':
            return load_sentence_transformer(model_config['name'])
        # Handle other types

    def validate_config(self, language='all', metric='all'):
        # Check that the model(s) can be loaded for the specified language and metric given the config

    # Other methods like set, reset, etc.

A couple additional benefits that I think this could bring

  • This makes it easy for us to add validation code. Otherwise, the user will need to actually run the metric to see if their config is valid or not
  • This unifies LangCheck code around loading local models

Let me know what you think!

Copy link
Contributor

@yosukehigashi yosukehigashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, this looks great overall!

I committed a few cleanup changes directly, and made a few minor comments in this initial review. I'll play around with the model manager a bit more and try out the caching / config interface and send another follow-up review

Copy link
Contributor

@yosukehigashi yosukehigashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @Vela-zz!! This is looking really good now 😄

(FYI I made a bunch of cleanup commits, docstrings and comments mostly)

@Vela-zz Vela-zz marked this pull request as ready for review February 19, 2024 10:00
Copy link
Contributor

@yosukehigashi yosukehigashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM! Couple last questions, but looks good to merge after that 😄

Copy link
Contributor

@yosukehigashi yosukehigashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for all of your work on this @Vela-zz!! I'll merge this to the zh branch once tests pass 🎉

@yosukehigashi yosukehigashi merged commit 14fa90e into citadel-ai:Vela-zz/zh-support Feb 28, 2024
2 checks passed
@Vela-zz Vela-zz deleted the zh_model_config branch March 5, 2024 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants