[ENH] to_df to also return the gaussian parameters #319

Open
chapochn opened this issue Mar 6, 2024 · 6 comments

Comments

@chapochn

chapochn commented Mar 6, 2024

Hello,
It would be great if there were an option for to_df to also return the gaussian parameters. How difficult would it be to add that?

@TomDonoghue
Member

Hey @chapochn - do you mean extracting the gaussian parameters as opposed to extracting the peak parameters, as it does currently? That's probably quite doable - it would probably mean updating model_to_dict (which does the work of organizing extracted parameters) to be able to specify what to extract.
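
To make the idea of "specifying what to extract" concrete, here is a minimal sketch in plain Python. It is not specparam's model_to_dict: the helper name params_to_rows, the kind argument, and the gaussian column labels are all hypothetical. It assumes each input row has the layout [p0, p1, p2, model_index], which is what a group-level call such as fg.get_params("gaussian_params") is expected to return, with gaussians stored as [mean, height, standard deviation].

```python
# Hypothetical column labels for illustration; not specparam's own naming.
LABELS = {
    "peak": ("CF", "PW", "BW"),
    "gaussian": ("mean", "height", "std"),
}


def params_to_rows(params, kind="peak"):
    """Turn rows of [p0, p1, p2, model_index] into labeled dicts.

    `kind` selects which set of column labels to apply.
    """
    if kind not in LABELS:
        raise ValueError(f"kind must be one of {sorted(LABELS)}")
    names = LABELS[kind]
    rows = []
    for row in params:
        *values, index = row  # last entry is the index of the model in the group
        record = dict(zip(names, values))
        record["ID"] = int(index)
        rows.append(record)
    return rows


# Made-up values: two gaussians for model 0, one for model 1.
example = [[10.0, 0.5, 2.0, 0], [20.0, 0.3, 4.0, 0], [9.5, 0.6, 1.5, 1]]
gauss_rows = params_to_rows(example, kind="gaussian")
```

The resulting list of dicts can be passed directly to pandas.DataFrame if a DataFrame is wanted.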

@chapochn
Author

chapochn commented Mar 6, 2024

Yes, that would be great!

@danieltomasz

Hi @chapochn, here is the workaround I use until the update lands:

import pandas as pd
import numpy as np
from specparam.core.funcs import infer_ap_func
from specparam.core.info import get_ap_indices


def specparam2pandas(fg):
    """
    Converts a SpectralGroupModel object into a pandas DataFrame, with peak parameters and
    corresponding aperiodic fit information.

    Args:
    -----
    fg : SpectralGroupModel
        The SpectralGroupModel object containing the fitting results.

    Returns:
    --------
    peaks_df : pandas.DataFrame
        A DataFrame with the peak parameters and corresponding aperiodic fit information.
        The columns are:
        - 'CF': center frequency of each peak
        - 'PW': power of each peak
        - 'BW': bandwidth of each peak
        - 'error': error of the model fit
        - 'r_squared': R-squared of the model fit
        - 'exponent': exponent of the aperiodic component
        - 'offset': offset of the aperiodic component
        - 'knee': knee parameter of the aperiodic component (only if the fit in fg includes a knee)

    Notes:
    ------
    This function creates two DataFrames. The first DataFrame `specparam_aperiodic`
    contains the aperiodic fit information and is based on the `aperiodic_params`
    attribute of the SpectralGroupModel object. The columns are inferred using the
    `get_ap_indices()` and `infer_ap_func()` functions from the specparam package.
    The second DataFrame contains the peak parameters and is based on the
    `peak_params` attribute of the SpectralGroupModel object. Its columns are renamed
    to 'CF', 'PW', 'BW', and 'ID', and the 'ID' column is cast to integer.
    The two DataFrames are then joined on the shared 'ID' column.
    """

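    # Aperiodic parameters: one row per model, with columns named after the
    # inferred aperiodic mode ('offset', 'exponent', plus 'knee' if present).
    ap_params = fg.get_params("aperiodic_params")
    specparam_aperiodic = (
        pd.DataFrame(
            ap_params,
            columns=list(get_ap_indices(infer_ap_func(np.transpose(ap_params)))),
        )
        .assign(error=fg.get_params("error"), r_squared=fg.get_params("r_squared"))
        .reset_index(names=["ID"])  # the model index becomes the join key
    )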
    return (
        pd.DataFrame(fg.get_params("peak_params"))  # prepare peaks dataframe
        .set_axis(["CF", "PW", "BW", "ID"], axis=1)  # rename cols
        .astype({"ID": int})
        .join(specparam_aperiodic.set_index("ID"), on="ID")
    )

@danieltomasz

danieltomasz commented Mar 27, 2024

@TomDonoghue did you consider adding an option to convert the fitResults object to a dict or DataFrame "losslessly", as it is, without specifying the number of peaks or bands of interest (when peak_org is None)?

@TomDonoghue
Member

@danieltomasz - is the output you're thinking of in such a case equivalent to setting peak_org to the maximum number of peaks? Then you would have the full set of results in the dataframe. I'm not sure I want that to be what happens if peak_org is None, but maybe we could add a special value (e.g. peak_org='all') to do that as a convenience?
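
As a rough illustration of what such an "all peaks" layout could look like, here is a sketch in plain Python. It is not how specparam implements peak_org: the helper name widen_peaks and the cf_0 / pw_0 / bw_0 column naming are assumptions. Every model gets one row, padded with NaN up to the largest peak count found in the group.

```python
import math


def widen_peaks(peaks_by_model):
    """Build one wide row per model, padded to the largest peak count.

    `peaks_by_model` is a list with one entry per model; each entry is
    that model's list of [CF, PW, BW] peaks (possibly empty).
    """
    n_max = max((len(peaks) for peaks in peaks_by_model), default=0)
    rows = []
    for peaks in peaks_by_model:
        row = {}
        for i in range(n_max):
            # Missing peaks are filled with NaN so all rows share the same columns.
            cf, pw, bw = peaks[i] if i < len(peaks) else (math.nan,) * 3
            row[f"cf_{i}"] = cf
            row[f"pw_{i}"] = pw
            row[f"bw_{i}"] = bw
        rows.append(row)
    return rows


# Made-up values: models with two peaks, one peak, and no peaks.
wide = widen_peaks([
    [[10.0, 0.5, 2.0], [20.0, 0.3, 4.0]],
    [[9.5, 0.6, 1.5]],
    [],
])
```

Nothing is dropped in this layout, at the cost of NaN-filled columns for models with fewer peaks than the maximum.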

@danieltomasz

danieltomasz commented Mar 27, 2024

@TomDonoghue yes, I meant the maximum number of peaks :) peak_org='all' will do the job. I have recently been using None as a default argument (the function then uses the predefined parameters from the ), but indeed explicit is better than implicit, so `peak_org='all'` is better.

I am able to use my custom functions to extract it, but as you plan to update model_to_dict, this might be a good moment to consider whether it would benefit more users (the current cases cover most needs, but I prefer to see/visualise all results with other Python packages or R and then filter the results by specific criteria).
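
A small sketch of that "export everything, filter afterwards" workflow, in plain Python. The helper name strongest_peak_in_band and the row layout (dicts with 'CF', 'PW', 'BW', and 'ID' keys, one per peak) are assumptions for illustration, and the values are made up.

```python
def strongest_peak_in_band(rows, band, min_power=0.0):
    """Return, per model ID, the highest-power peak whose CF lies in `band`."""
    low, high = band
    best = {}
    for row in rows:
        if low <= row["CF"] <= high and row["PW"] >= min_power:
            current = best.get(row["ID"])
            if current is None or row["PW"] > current["PW"]:
                best[row["ID"]] = row
    return best


# Long-format results: one dict per peak, several peaks per model.
all_peaks = [
    {"CF": 10.0, "PW": 0.5, "BW": 2.0, "ID": 0},
    {"CF": 11.0, "PW": 0.7, "BW": 1.0, "ID": 0},
    {"CF": 20.0, "PW": 0.9, "BW": 4.0, "ID": 0},
    {"CF": 9.5, "PW": 0.1, "BW": 1.5, "ID": 1},
]

# Keep only 8-12 Hz peaks with power of at least 0.2.
alpha = strongest_peak_in_band(all_peaks, band=(8.0, 12.0), min_power=0.2)
```

Models without a qualifying peak are simply absent from the result, so the criteria can be changed later without re-extracting anything.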
