Categorical_Method LHS Input Encoding Error #265

HiddenBao · 2023-08-15T16:32:50Z

Operating System: Windows 11 OS
Python version: 3.9.17
summit version used: 0.8.9

Description

I am testing a planned experimental setup in which I use LHS as the strategy to generate initial experiments for a domain that contains both categorical and continuous variables. When I run the code, I get an error while generating the experimental conditions. I am working in JupyterLab. I have removed some of the names from the file path, but everything else is unchanged.

What I Did

import cython
import summit
from summit.benchmarks import ExperimentalEmulator
from summit.domain import *
from summit.utils.dataset import DataSet
from summit.strategies import SOBO, MultitoSingleObjective, LHS
import numpy as np
import pandas as pd
import pkg_resources
import pathlib

DATA_PATH = pathlib.Path("F:/Python Programs/NKData")
input_df = pd.read_csv(DATA_PATH / 'BoundariesV2.csv')

domain = Domain()
for idx, row in input_df.iterrows():
    name = row[0]  
    description = row[5]  
    data_type = row['Type']
    
    if data_type == 'Categorical':
        levels = row[2].split(',')  
        
        domain += CategoricalVariable(
            name=name,
            description=description,
            levels=levels
        )
    elif data_type == 'Continuous':
        bounds = [row[3], row[4]]
        
        domain += ContinuousVariable(
            name=name,
            description=description,
            bounds=bounds
        )
    elif data_type == 'Objective':
        bounds = [row[3], row[4]]
        maximize = row[6]
        
        domain += ContinuousVariable(
            name=name,
            description=description,
            bounds=bounds,
            is_objective=True,
            maximize=maximize
        )

domain

categorical_method: str = "one-hot"
StartStrat = LHS(domain, random_state = np.random.RandomState(808), categorical_method=categorical_method)
StartExp = StartStrat.suggest_experiments(10)
StartExp

Output

Name	Type	Description	Values
Temperature	continuous, input	Reaction temperature in degrees Celsius (ºC)	[40.0,80.0]
Catalyst_Amount	continuous, input	Catalyst amounts in molar equivalents (Equiv.)	[0.01,1.0]
Starting_Reagent	continuous, input	2-Methylimidozole amounts in molar equivalents (Equiv.)	[1.1,2.0]
Solvent	continuous, input	Solvent amount in milliliters (mL)	[0.1,0.35]
Time	continuous, input	Duration of reaction in hours (hr)	[2.0,24.0]
Base	continuous, input	Base amount in molar equivalents (Equiv.)	[1.0,5.0]
Catalyst_Type	categorical, input	Catalyst Types	3 levels
Main_Product	continuous, maximize objective	LCAP of Main Product	[0.0,1.0]
Main_Impurity	continuous, minimize objective	LCAP of Main Impurity	[0.0,1.0]

Error I get from running the very last cell

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 4
      2 categorical_method: str = "one-hot"
      3 StartStrat = LHS(domain, random_state = np.random.RandomState(808), categorical_method=categorical_method)
----> 4 StartExp = StartStrat.suggest_experiments(10)
      5 StartExp

File ~\AppData\Roaming\Python\Python39\site-packages\summit\strategies\random.py:286, in LHS.suggest_experiments(self, num_experiments, criterion, exclude, **kwargs)
    284 design = DataSet.from_df(design)
    285 design[("strategy", "METADATA")] = "LHS"
--> 286 return self.transform.un_transform(
    287     design, categorical_method=self.categorical_method
    288 )

File ~\AppData\Roaming\Python\Python39\site-packages\summit\strategies\base.py:324, in Transform.un_transform(self, ds, **kwargs)
    318 # Categorical variables using one-hot encoding
    319 elif (
    320     isinstance(variable, CategoricalVariable)
    321     and categorical_method == "one-hot"
    322 ):
    323     # Get one-hot encoder
--> 324     enc = self.encoders[variable.name]
    326     # Get array to be transformed
    327     one_hot_names = [f"{variable.name}_{l}" for l in variable.levels]

AttributeError: 'Transform' object has no attribute 'encoders'
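
The traceback points to a lazy-initialisation pattern: `un_transform` reads `self.encoders`, an attribute that does not exist on the `Transform` instance at that point. A plausible cause, not confirmed in this thread, is that the attribute is only created by a forward transform step that `LHS.suggest_experiments` never calls. The toy class below is a hypothetical stand-in (it is not Summit's `Transform`) that reproduces the same failure pattern in plain Python:

```python
class ToyTransform:
    """Hypothetical stand-in for illustration only; not Summit's Transform."""

    def __init__(self, levels):
        self.levels = list(levels)
        # Note: self.encoders is deliberately NOT created here.

    def transform(self, value):
        # The lookup table is only built when transform() runs.
        self.encoders = {level: i for i, level in enumerate(self.levels)}
        return self.encoders[value]

    def un_transform(self, index):
        # Assumes transform() has already run; fails otherwise.
        decoder = {i: level for level, i in self.encoders.items()}
        return decoder[index]


# Calling un_transform() on a fresh object raises AttributeError.
fresh = ToyTransform(["A", "B", "C"])
try:
    fresh.un_transform(0)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# After a forward transform, the reverse lookup works.
primed = ToyTransform(["A", "B", "C"])
primed.transform("B")
decoded = primed.un_transform(1)
print(decoded)
```

If this is indeed the mechanism, creating the attribute in `__init__` (or guarding the lookup) would be the usual remedy, but that is an assumption about the library's internals.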

I apologise for any formatting issues; I am new to this, but I would greatly appreciate any help or advice on workarounds. Thank you for providing this library, it is amazing.

marcosfelt · 2024-05-05T02:51:59Z

I am so sorry that I did not see this. If it is still relevant for me to take a look, please respond @HiddenBao

I will take a look later this week.

HiddenBao · 2024-05-09T14:59:43Z

No worries! I briefly went back and rewrote the code so that it no longer requires the .csv file, to check whether the file was the issue. However, I get the same error.

What I Did

import numpy as np  # needed for np.random.RandomState below
import summit
from summit.benchmarks import ExperimentalEmulator
from summit.domain import *
from summit.utils.dataset import DataSet
from summit.strategies import LHS, MTBO

domain = Domain()

domain += CategoricalVariable(
    name = "Catalyst", 
    description = "Test",
    levels = [
        "A",
        "B",
        "C",
        "D"
    ],
)

domain += ContinuousVariable(
    name = "Temperature",
    description = "Test",
    bounds = [40, 80]
)

domain += ContinuousVariable(
    name = "Catalyst_Amount",
    description = "Test",
    bounds = [0.01, 1.0]
)

domain += ContinuousVariable(
    name = "Reagent",
    description = "Test",
    bounds = [1.1, 2.0]
)

domain += ContinuousVariable(
    name = "Solvent",
    description = "Test",
    bounds = [0.1, 0.35]
)

domain += ContinuousVariable(
    name = "Time",
    description = "Test",
    bounds = [2.0, 24]
)

domain += ContinuousVariable(
    name = "Base",
    description = "Test",
    bounds = [1.0, 5.0]
)


domain += ContinuousVariable(
    name = "Main_Product",
    description = "Test",
    bounds = [0, 1],
    is_objective = True,
    maximize = True
)

domain += ContinuousVariable(
    name = "Main_Impurity",
    description = "Test",
    bounds = [0, 1],
    is_objective = True,
    maximize = False
)
domain

As far as I can tell, the domain is created without problems. I then tried running the transform in the LHS and MTBO strategies using the following.

strategy = LHS(domain,
                 random_state = np.random.RandomState(808),
                 categorical_method="one-hot"
                )
StartExp = StartStrat.suggest_experiments(10)
StartExp

strategy = MTBO(domain,
                 random_state = np.random.RandomState(808),
                 categorical_method="one-hot"
                )
StartExp = StartStrat.suggest_experiments(10)
StartExp

Both still return the same error.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[15], line 5
      1 strategy = MTBO(domain,
      2                  random_state = np.random.RandomState(808),
      3                  categorical_method="one-hot"
      4                 )
----> 5 StartExp = StartStrat.suggest_experiments(10)
      6 StartExp

File ~\AppData\Roaming\Python\Python39\site-packages\summit\strategies\random.py:286, in LHS.suggest_experiments(self, num_experiments, criterion, exclude, **kwargs)
    284 design = DataSet.from_df(design)
    285 design[("strategy", "METADATA")] = "LHS"
--> 286 return self.transform.un_transform(
    287     design, categorical_method=self.categorical_method
    288 )

File ~\AppData\Roaming\Python\Python39\site-packages\summit\strategies\base.py:324, in Transform.un_transform(self, ds, **kwargs)
    318 # Categorical variables using one-hot encoding
    319 elif (
    320     isinstance(variable, CategoricalVariable)
    321     and categorical_method == "one-hot"
    322 ):
    323     # Get one-hot encoder
--> 324     enc = self.encoders[variable.name]
    326     # Get array to be transformed
    327     one_hot_names = [f"{variable.name}_{l}" for l in variable.levels]

AttributeError: 'Transform' object has no attribute 'encoders'
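
One detail in this traceback is worth noting: the failing frame is `LHS.suggest_experiments` in `random.py`, even though the cell constructs an `MTBO` strategy. The cell assigns the new object to `strategy` but then calls `StartStrat.suggest_experiments(10)`, so it appears to reuse the `LHS` object left over from an earlier notebook cell. If so, the MTBO strategy was never actually exercised. The sketch below uses two hypothetical placeholder classes (not Summit's) to show the effect:

```python
class FakeLHS:
    """Hypothetical placeholder; not summit.strategies.LHS."""

    def suggest_experiments(self, n):
        return f"LHS suggested {n} experiments"


class FakeMTBO:
    """Hypothetical placeholder; not summit.strategies.MTBO."""

    def suggest_experiments(self, n):
        return f"MTBO suggested {n} experiments"


# Earlier notebook cell: the variable StartStrat is bound to an LHS object.
StartStrat = FakeLHS()

# Later cell: a new strategy is created under a different name...
strategy = FakeMTBO()

# ...but calling the old name still runs the old object.
stale_result = StartStrat.suggest_experiments(10)
fresh_result = strategy.suggest_experiments(10)

print(stale_result)
print(fresh_result)
```

Calling `strategy.suggest_experiments(10)` in each cell would ensure that the newly constructed strategy is the one being tested.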

Thank you again for creating this library, it is great and extremely useful. I greatly appreciate all the work that has been done for it.

marcosfelt · 2024-05-10T01:20:44Z

This definitely seems like a bug - I will take a look this weekend!

marcosfelt · 2024-05-11T15:10:13Z

Can confirm that I can reproduce the bug. Going to look into a fix now
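
Until a fix is available, one possible stopgap is to generate the initial design outside Summit. The following is a minimal sketch of Latin hypercube sampling in plain Python, with categorical levels cycled and shuffled so that each level is covered about equally. The function name `latin_hypercube_design` and its arguments are hypothetical, this is not Summit's LHS implementation, and the resulting rows would still need to be converted into whatever format the downstream strategy expects.

```python
import random


def latin_hypercube_design(continuous, categorical, n, seed=None):
    """Return n design points as a list of dicts.

    continuous:  dict mapping name -> (lower, upper)
    categorical: dict mapping name -> list of levels
    """
    rng = random.Random(seed)
    columns = {}

    for name, (lower, upper) in continuous.items():
        # Draw one random point inside each of n equal-width strata.
        points = [
            lower + (upper - lower) * (i + rng.random()) / n for i in range(n)
        ]
        # Shuffle so the strata are paired randomly across variables.
        rng.shuffle(points)
        columns[name] = points

    for name, levels in categorical.items():
        # Cycle through the levels so each appears about n / len(levels) times.
        values = [levels[i % len(levels)] for i in range(n)]
        rng.shuffle(values)
        columns[name] = values

    return [{name: col[i] for name, col in columns.items()} for i in range(n)]


design = latin_hypercube_design(
    continuous={"Temperature": (40.0, 80.0), "Time": (2.0, 24.0)},
    categorical={"Catalyst": ["A", "B", "C", "D"]},
    n=8,
    seed=808,
)
for row in design:
    print(row)
```

Each continuous variable gets exactly one sample per stratum, which is the defining property of a Latin hypercube; the categorical handling is a simple balancing choice rather than a one-hot encoding.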
