Merging Waves and Keeping Time Indices
What is the MerWavTimePlus module?
The MerWavTimePlus module transforms longitudinal data by merging all features across waves into a single set while preserving their time indices. This maintains the temporal structure, enabling longitudinal machine learning methods to leverage temporal dependencies and patterns. It provides methods for data preparation and transformation, including prepare_data and transform.
What are features_group and non_longitudinal_features?
Two key attributes, features_group and non_longitudinal_features, enable algorithms to interpret the temporal structure of longitudinal data.
- features_group: A list of lists where each sublist contains indices of a longitudinal attribute's waves, ordered from oldest to most recent. This captures temporal dependencies.
- non_longitudinal_features: A list of indices for static, non-temporal features excluded from the temporal matrix.
Both attributes must be set correctly: longitudinal estimators rely on them to tell which columns are successive waves of the same attribute and which columns are static.
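To make the shape of these two attributes concrete, here is a minimal sketch in plain Python. The column names, the `_w<number>` suffix convention, and the helper `build_groups` are assumptions made for illustration; they are not part of the library, which provides its own setup through LongitudinalDataset.

```python
# Hypothetical column layout: two longitudinal attributes measured over
# three waves each, followed by two static columns.
columns = [
    "smoke_w1", "smoke_w2", "smoke_w3",
    "bmi_w1", "bmi_w2", "bmi_w3",
    "age", "sex",
]


def build_groups(columns, sep="_w"):
    """Group column indices by attribute prefix, ordered from oldest to most recent wave."""
    waves = {}
    static = []
    for idx, name in enumerate(columns):
        prefix, found, wave = name.rpartition(sep)
        if found and wave.isdigit():
            # Longitudinal column: remember (wave number, column index).
            waves.setdefault(prefix, []).append((int(wave), idx))
        else:
            # No wave suffix: treat the column as static.
            static.append(idx)
    features_group = [[idx for _, idx in sorted(pairs)] for pairs in waves.values()]
    return features_group, static


features_group, non_longitudinal_features = build_groups(columns)
print(features_group)             # [[0, 1, 2], [3, 4, 5]]
print(non_longitudinal_features)  # [6, 7]
```

Each sublist holds the column indices of one attribute in wave order, and the static columns are listed separately.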
MerWavTimePlus
Bases: DataPreparationMixin
MerWavTimePlus stands for Merge waves while keeping time indices in longitudinal datasets.
The MerWavTimePlus class transforms longitudinal data by merging all features across waves into a single set
while preserving their time indices. This maintains the temporal structure, enabling longitudinal machine learning
methods to leverage temporal dependencies and patterns. See all
longitudinal-data-aware machine learning estimators.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| features_group | List[List[int]] | A temporal matrix representing the temporal dependency of a longitudinal dataset. Each sublist contains indices of a longitudinal attribute's waves. Defaults to None. | None |
| non_longitudinal_features | List[Union[int, str]] | A list of indices or names of non-longitudinal features. Defaults to None. | None |
| feature_list_names | List[str] | A list of feature names in the dataset. Defaults to None. | None |
Attributes:
| Name | Type | Description |
|---|---|---|
| features_group | List[List[int]] | The temporal matrix of feature groups. |
| non_longitudinal_features | List[Union[int, str]] | The non-longitudinal features. |
| feature_list_names | List[str] | The feature names in the dataset. |
Examples:
The example below demonstrates the MerWavTimePlus class on a stroke dataset.
Note that the file path used in the code is a placeholder and should be replaced with the actual path to your dataset.
Basic Usage
from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.data_preparation import MerWavTimePlus
# Load dataset
dataset = LongitudinalDataset('./stroke_longitudinal.csv')
dataset.load_data()
dataset.load_target(target_column="stroke_w2")
dataset.setup_features_group("elsa")
dataset.load_train_test_split(test_size=0.2, random_state=42)
# Initialize MerWavTimePlus
mer_wav_plus = MerWavTimePlus(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist()
)
# No transformation is needed: MerWavTimePlus takes the dataset as it is,
# which keeps the temporal dependency intact.
# Later on, primitives read this temporal dependency from the `features_group` attribute.
Source code in scikit_longitudinal/data_preparation/merwav_time_plus.py
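To illustrate how a downstream primitive might read the temporal structure through features_group, here is a minimal sketch in plain Python. The row values and the helper `split_row` are hypothetical and are not part of the library; the sketch only shows that the merged columns stay untouched while the index lists tell a consumer which columns form one attribute's trajectory.

```python
# Hypothetical merged row for one individual: all waves kept side by side.
# Columns 0-2 and 3-5 are two longitudinal attributes; 6-7 are static.
row = [0, 1, 1, 24.1, 25.3, 26.0, 67, 1]
features_group = [[0, 1, 2], [3, 4, 5]]
non_longitudinal_features = [6, 7]


def split_row(row, features_group, non_longitudinal_features):
    """Return per-attribute trajectories (oldest to newest wave) and static values."""
    trajectories = [[row[i] for i in group] for group in features_group]
    static = [row[i] for i in non_longitudinal_features]
    return trajectories, static


trajectories, static = split_row(row, features_group, non_longitudinal_features)
print(trajectories)  # [[0, 1, 1], [24.1, 25.3, 26.0]]
print(static)        # [67, 1]

# The most recent wave of each attribute is the last index in its group.
latest = [row[group[-1]] for group in features_group]
print(latest)        # [1, 26.0]
```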
get_params(deep=True)
Get the parameters of the MerWavTimePlus instance.
Retrieves the configuration parameters of the instance, useful for inspection or integration with scikit-learn pipelines.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| deep | bool | Unused parameter but kept for consistency with the scikit-learn API. | True |
Returns:
| Name | Type | Description |
|---|---|---|
| | dict | The parameters of the MerWavTimePlus instance. |
Source code in scikit_longitudinal/data_preparation/merwav_time_plus.py
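As a rough illustration of the scikit-learn get_params convention, the sketch below uses a hypothetical stand-in class, `MerWavSketch`, which is not the library class. It assumes the returned dict is keyed by the constructor parameter names; the exact keys returned by MerWavTimePlus are not listed on this page.

```python
class MerWavSketch:
    """Hypothetical stand-in used only for illustration; not the library class."""

    def __init__(self, features_group=None, non_longitudinal_features=None,
                 feature_list_names=None):
        self.features_group = features_group
        self.non_longitudinal_features = non_longitudinal_features
        self.feature_list_names = feature_list_names

    def get_params(self, deep=True):
        # `deep` is accepted for scikit-learn API consistency but is not used.
        return {
            "features_group": self.features_group,
            "non_longitudinal_features": self.non_longitudinal_features,
            "feature_list_names": self.feature_list_names,
        }


original = MerWavSketch(features_group=[[0, 1], [2, 3]], non_longitudinal_features=[4])
params = original.get_params()

# Rebuilding from the parameter dict yields an equivalently configured instance.
rebuilt = MerWavSketch(**params)
print(rebuilt.get_params() == params)  # True
```

Under that assumption, the dict can be passed straight back to the constructor, which is how scikit-learn utilities typically copy an object's configuration.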
_prepare_data(X, y=None)
Prepare the data for transformation.
Overridden from DataPreparationMixin.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | The input data. | required |
| y | ndarray | The target data, stored but not used in transformation. Defaults to None. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| MerWavTimePlus | MerWavTimePlus | The instance with prepared data. |