
Merging Waves and Discarding Time Indices

What is the MerWavTimeMinus module?

The MerWavTimeMinus module transforms longitudinal data by merging all features across waves into a single set, discarding temporal information. This simplifies the dataset for traditional machine learning algorithms but loses temporal dependencies. It provides methods for data preparation and transformation, including prepare_data and transform.

What are features_group and non_longitudinal_features?

Two key attributes, features_group and non_longitudinal_features, enable algorithms to interpret the temporal structure of longitudinal data.

  • features_group: A list of lists where each sublist contains indices of a longitudinal attribute's waves, ordered from oldest to most recent. This captures temporal dependencies.
  • non_longitudinal_features: A list of indices for static, non-temporal features excluded from the temporal matrix.

Proper setup of these attributes is critical for leveraging temporal patterns effectively.
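As a rough illustration of the two attributes described above, the sketch below lays them out for a small, hypothetical dataset. The column names (`smoke_w1`, `bmi_w1`, `age`, `sex`, and so on) are invented for this example and do not come from any real dataset; only the shape of `features_group` and `non_longitudinal_features` follows the description.

```python
# Hypothetical column layout (names are assumptions for illustration only):
# two longitudinal attributes measured over three waves, plus two static columns.
columns = [
    "smoke_w1", "smoke_w2", "smoke_w3",
    "bmi_w1", "bmi_w2", "bmi_w3",
    "age", "sex",
]

# One sublist per longitudinal attribute, ordered from oldest to most recent wave.
features_group = [
    [0, 1, 2],  # smoke: wave 1 -> wave 3
    [3, 4, 5],  # bmi: wave 1 -> wave 3
]

# Static columns that belong to no wave.
non_longitudinal_features = [6, 7]

# Resolve the indices back to names to check the layout by eye.
group_names = [[columns[i] for i in group] for group in features_group]
static_names = [columns[i] for i in non_longitudinal_features]

for names in group_names:
    print(names)
print(static_names)
```

Every column index should appear exactly once, either in one sublist of `features_group` or in `non_longitudinal_features`.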

See more in the Temporal Dependency Guide.

MerWavTimeMinus

Bases: DataPreparationMixin

MerWavTimeMinus stands for "Merge Waves, minus time": it merges the waves of a longitudinal dataset and discards their time indices.

The MerWavTimeMinus transforms longitudinal data by merging all features across waves into a single set, effectively discarding temporal information. This approach treats different values of the same original longitudinal feature as distinct features, simplifying the dataset for traditional machine learning algorithms but losing temporal dependencies.
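To make "merging waves and discarding time indices" concrete, here is a minimal sketch of the idea in plain Python. The helper `merged_feature_indices` is hypothetical and is not part of Scikit-longitudinal; it also assumes that `non_longitudinal_features` holds integer indices rather than names.

```python
def merged_feature_indices(features_group, non_longitudinal_features):
    """Hypothetical helper: flatten the temporal matrix into one flat feature set."""
    # Every wave of every longitudinal attribute becomes an ordinary column.
    wave_indices = [idx for group in features_group for idx in group]
    # Static columns are kept alongside them; the wave ordering is no longer used.
    return sorted(set(wave_indices) | set(non_longitudinal_features))


features_group = [[0, 1, 2], [3, 4, 5]]
non_longitudinal_features = [6, 7]

# One sample: three waves of two attributes, followed by two static values.
row = [1, 1, 0, 27.1, 27.9, 28.4, 64, 0]

merged = merged_feature_indices(features_group, non_longitudinal_features)
flat_row = [row[i] for i in merged]

print(merged)
print(flat_row == row)
```

In this sketch the flattened row is identical to the input row: the values are untouched, and only the grouping that ties each column to a wave is dropped.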

Parameters:

  • features_group (List[List[int]], default: None): A temporal matrix representing the temporal dependency of a longitudinal dataset. Each sublist contains indices of a longitudinal attribute's waves.
  • non_longitudinal_features (List[Union[int, str]], default: None): A list of indices or names of non-longitudinal features.
  • feature_list_names (List[str], default: None): A list of feature names in the dataset.

Attributes:

  • features_group (List[List[int]]): The temporal matrix of feature groups.
  • non_longitudinal_features (List[Union[int, str]]): The non-longitudinal features.
  • feature_list_names (List[str]): The feature names in the dataset.

Examples:

Below is an example demonstrating the usage of the MerWavTimeMinus class with a stroke dataset. Note that "./stroke_longitudinal.csv" is a placeholder and should be replaced with the actual path to your dataset.

Basic Usage

from scikit_longitudinal.data_preparation import LongitudinalDataset, MerWavTimeMinus

# Load the dataset and describe its temporal structure
dataset = LongitudinalDataset('./stroke_longitudinal.csv')
dataset.load_data()
dataset.load_target(target_column="stroke_w2")
dataset.setup_features_group("elsa")
dataset.load_train_test_split(test_size=0.2, random_state=42)

# Initialize MerWavTimeMinus
mer_wav = MerWavTimeMinus(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist()
)

# No transformation needs to be applied: MerWavTimeMinus takes the dataset as it is
# and ignores its temporal dependencies.

# The class is kept for compatibility and has little value on its own.
Source code in scikit_longitudinal/data_preparation/merwav_time_minus.py
class MerWavTimeMinus(DataPreparationMixin):
    """MerWavTimeMinus merges waves and discards time indices in longitudinal datasets.

    The `MerWavTimeMinus` transforms longitudinal data by merging all features across waves into a single set,
    effectively discarding temporal information. This approach treats different values of the same original longitudinal
    feature as distinct features, simplifying the dataset for traditional machine learning algorithms but losing temporal
    dependencies.

    Args:
        features_group (List[List[int]], optional): A temporal matrix representing the temporal dependency of a
            longitudinal dataset. Each sublist contains indices of a longitudinal attribute's waves. Defaults to None.
        non_longitudinal_features (List[Union[int, str]], optional): A list of indices or names of non-longitudinal
            features. Defaults to None.
        feature_list_names (List[str], optional): A list of feature names in the dataset. Defaults to None.

    Attributes:
        features_group (List[List[int]]): The temporal matrix of feature groups.
        non_longitudinal_features (List[Union[int, str]]): The non-longitudinal features.
        feature_list_names (List[str]): The feature names in the dataset.

    Examples:
        Below is an example demonstrating the usage of the `MerWavTimeMinus` class with a stroke dataset.
        Note that "./stroke_longitudinal.csv" is a placeholder and should be replaced with the actual path to your dataset.

        !!! example "Basic Usage"
            ```python
            from scikit_longitudinal.data_preparation import LongitudinalDataset, MerWavTimeMinus

            # Load the dataset and describe its temporal structure
            dataset = LongitudinalDataset('./stroke_longitudinal.csv')
            dataset.load_data()
            dataset.load_target(target_column="stroke_w2")
            dataset.setup_features_group("elsa")
            dataset.load_train_test_split(test_size=0.2, random_state=42)

            # Initialize MerWavTimeMinus
            mer_wav = MerWavTimeMinus(
                features_group=dataset.feature_groups(),
                non_longitudinal_features=dataset.non_longitudinal_features(),
                feature_list_names=dataset.data.columns.tolist()
            )

            # No transformation needs to be applied: MerWavTimeMinus takes the dataset as it is
            # and ignores its temporal dependencies.

            # The class is kept for compatibility and has little value on its own.
            ```
    """

    def __init__(
        self,
        features_group: List[List[int]] = None,
        non_longitudinal_features: List[Union[int, str]] = None,
        feature_list_names: List[str] = None,
    ):
        self.features_group = features_group
        self.non_longitudinal_features = non_longitudinal_features
        self.feature_list_names = feature_list_names

    def get_params(self, deep: bool = True):  # pylint: disable=W0613
        """Get the parameters of the MerWavTimeMinus instance.

        This method retrieves the configuration parameters of the `MerWavTimeMinus` instance, useful for inspection or
        hyperparameter tuning.

        Args:
            deep (bool, optional): Unused parameter but kept for consistency with the scikit-learn API.

        Returns:
            dict: The parameters of the MerWavTimeMinus instance.
        """
        return {}

    @override
    def _prepare_data(self, X: np.ndarray, y: np.ndarray = None) -> "MerWavTimeMinus":
        """Prepare the data for transformation.

        Overridden from `DataPreparationMixin`.

        Args:
            X (np.ndarray): The input data.
            y (np.ndarray, optional): The target data. Defaults to None.

        Returns:
            MerWavTimeMinus: The instance with prepared data.
        """
        return self

get_params(deep=True)

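Get the parameters of the MerWavTimeMinus instance.

This method exists for consistency with the scikit-learn API, where get_params is used for inspection and hyperparameter tuning. Note that the implementation listed below returns an empty dictionary, so the constructor arguments are not exposed through it.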

Parameters:

  • deep (bool, default: True): Unused parameter, kept for consistency with the scikit-learn API.

Returns:

  • dict: The parameters of the MerWavTimeMinus instance.

Source code in scikit_longitudinal/data_preparation/merwav_time_minus.py
def get_params(self, deep: bool = True):  # pylint: disable=W0613
    """Get the parameters of the MerWavTimeMinus instance.

    This method retrieves the configuration parameters of the `MerWavTimeMinus` instance, useful for inspection or
    hyperparameter tuning.

    Args:
        deep (bool, optional): Unused parameter but kept for consistency with the scikit-learn API.

    Returns:
        dict: The parameters of the MerWavTimeMinus instance.
    """
    return {}
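Since the listing above returns `{}`, the configuration has to be read from the instance attributes instead. The sketch below illustrates this with `MerWavTimeMinusSketch`, a hypothetical stand-in that copies only the constructor and `get_params` from the listing so that it runs without Scikit-longitudinal installed; it is not the library class itself.

```python
class MerWavTimeMinusSketch:
    """Hypothetical stand-in mirroring the constructor and get_params listed above."""

    def __init__(self, features_group=None, non_longitudinal_features=None, feature_list_names=None):
        self.features_group = features_group
        self.non_longitudinal_features = non_longitudinal_features
        self.feature_list_names = feature_list_names

    def get_params(self, deep=True):
        # Same behaviour as the listing: no parameters are reported.
        return {}


sketch = MerWavTimeMinusSketch(
    features_group=[[0, 1], [2, 3]],
    non_longitudinal_features=[4],
)

params = sketch.get_params()

# Read the configuration directly from the attributes set in __init__.
config = {
    "features_group": sketch.features_group,
    "non_longitudinal_features": sketch.non_longitudinal_features,
    "feature_list_names": sketch.feature_list_names,
}

print(params)
print(config)
```

One consequence is that tools relying on `get_params` to discover tunable settings would see none for this class.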

_prepare_data(X, y=None)

Prepare the data for transformation.

Overridden from DataPreparationMixin.

Parameters:

  • X (ndarray, required): The input data.
  • y (ndarray, default: None): The target data.

Returns:

  • MerWavTimeMinus: The instance with prepared data.

Source code in scikit_longitudinal/data_preparation/merwav_time_minus.py
@override
def _prepare_data(self, X: np.ndarray, y: np.ndarray = None) -> "MerWavTimeMinus":
    """Prepare the data for transformation.

    Overridden from `DataPreparationMixin`.

    Args:
        X (np.ndarray): The input data.
        y (np.ndarray, optional): The target data. Defaults to None.

    Returns:
        MerWavTimeMinus: The instance with prepared data.
    """
    return self