Merging Waves and Discarding Time Indices
What is the MerWavTimeMinus module?
The MerWavTimeMinus module transforms longitudinal data by merging all features across waves into a single set,
discarding temporal information. This simplifies the dataset for traditional machine learning algorithms but loses
temporal dependencies. It provides methods for data preparation and transformation, including prepare_data and
transform.
What are features_group and non_longitudinal_features?
Two key attributes, features_group and non_longitudinal_features, enable algorithms to interpret the
temporal structure of longitudinal data.
- features_group: A list of lists where each sublist contains indices of a longitudinal attribute's waves, ordered from oldest to most recent. This captures temporal dependencies.
- non_longitudinal_features: A list of indices for static, non-temporal features excluded from the temporal matrix.
Proper setup of these attributes is critical for leveraging temporal patterns effectively.
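To make the two attributes concrete, here is a small sketch in plain Python. The column names and layout are invented for illustration (two longitudinal attributes measured at two waves each, plus two static columns); only the shape of features_group and non_longitudinal_features follows the description above.

```python
# Hypothetical column layout; the names are invented for illustration.
columns = ["smoke_w1", "smoke_w2", "cholesterol_w1", "cholesterol_w2", "age", "gender"]

# One sublist per longitudinal attribute, waves ordered from oldest to most recent.
features_group = [[0, 1], [2, 3]]

# Static columns that do not belong to any wave.
non_longitudinal_features = [4, 5]

# Resolve indices back to names to check the setup by eye.
group_names = [[columns[i] for i in group] for group in features_group]
static_names = [columns[i] for i in non_longitudinal_features]

print(group_names)
print(static_names)
```

Resolving the indices back to column names like this is a cheap way to confirm that each sublist really holds the waves of a single attribute in chronological order.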
MerWavTimeMinus
Bases: DataPreparationMixin
MerWavTimeMinus stands for merging waves while discarding time indices in longitudinal datasets.
MerWavTimeMinus transforms longitudinal data by merging all features across waves into a single set,
effectively discarding temporal information. Each wave-specific value of the same original longitudinal
feature is treated as a distinct feature, which simplifies the dataset for traditional machine learning
algorithms but loses temporal dependencies.
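A minimal sketch of what merging waves amounts to, written in plain Python rather than with the library: the grouping is flattened into one list of column indices, so a downstream learner sees every wave as an independent feature. The helper merge_waves and the index layout are hypothetical and are not part of scikit-longitudinal.

```python
from typing import List


def merge_waves(
    features_group: List[List[int]],
    non_longitudinal_features: List[int],
) -> List[int]:
    """Flatten wave groups and static features into one sorted list of column indices.

    Hypothetical helper for illustration only; it is not the library's implementation.
    """
    merged = [index for group in features_group for index in group]
    merged.extend(non_longitudinal_features)
    return sorted(merged)


features_group = [[0, 1], [2, 3]]
non_longitudinal_features = [4, 5]

flat_features = merge_waves(features_group, non_longitudinal_features)
print(flat_features)  # [0, 1, 2, 3, 4, 5]
```

After flattening, nothing records that columns 0 and 1 are two measurements of the same attribute, which is the loss of temporal dependencies described above.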
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| features_group | List[List[int]] | A temporal matrix representing the temporal dependency of a longitudinal dataset. Each sublist contains indices of a longitudinal attribute's waves. Defaults to None. | None |
| non_longitudinal_features | List[Union[int, str]] | A list of indices or names of non-longitudinal features. Defaults to None. | None |
| feature_list_names | List[str] | A list of feature names in the dataset. Defaults to None. | None |
Attributes:
| Name | Type | Description |
|---|---|---|
| features_group | List[List[int]] | The temporal matrix of feature groups. |
| non_longitudinal_features | List[Union[int, str]] | The non-longitudinal features. |
| feature_list_names | List[str] | The feature names in the dataset. |
Examples:
Below is an example demonstrating the usage of the MerWavTimeMinus class with the "stroke_longitudinal.csv" dataset.
Note that "stroke_longitudinal.csv" is a placeholder and should be replaced with the actual path to your dataset.
Basic Usage
from scikit_longitudinal.data_preparation import LongitudinalDataset, MerWavTimeMinus

# Load the dataset
dataset = LongitudinalDataset('./stroke_longitudinal.csv')
dataset.load_data()
dataset.load_target(target_column="stroke_w2")
dataset.setup_features_group("elsa")
dataset.load_train_test_split(test_size=0.2, random_state=42)

# Initialize MerWavTimeMinus
mer_wav = MerWavTimeMinus(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist()
)

# No transformation is needed: MerWavTimeMinus takes the dataset as it is
# and ignores every temporal dependency. The class is kept for compatibility
# and has little value on its own.
Source code in scikit_longitudinal/data_preparation/merwav_time_minus.py
get_params(deep=True)
Get the parameters of the MerWavTimeMinus instance.
This method retrieves the configuration parameters of the MerWavTimeMinus instance, useful for inspection or
hyperparameter tuning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| deep | bool | Unused parameter but kept for consistency with the scikit-learn API. | True |
Returns:
| Type | Description |
|---|---|
| dict | The parameters of the MerWavTimeMinus instance. |
Source code in scikit_longitudinal/data_preparation/merwav_time_minus.py
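The scikit-learn get_params convention can be sketched with a stand-in class. MergeWavesStandIn is hypothetical, and the assumption that the returned dictionary holds exactly the three constructor arguments is ours; the source only states that a dict of the instance's parameters is returned.

```python
class MergeWavesStandIn:
    """Hypothetical stand-in that mimics the scikit-learn get_params convention."""

    def __init__(self, features_group=None, non_longitudinal_features=None,
                 feature_list_names=None):
        self.features_group = features_group
        self.non_longitudinal_features = non_longitudinal_features
        self.feature_list_names = feature_list_names

    def get_params(self, deep=True):
        # `deep` is accepted for API compatibility and ignored.
        # Assumption: the dict mirrors the constructor arguments.
        return {
            "features_group": self.features_group,
            "non_longitudinal_features": self.non_longitudinal_features,
            "feature_list_names": self.feature_list_names,
        }


stand_in = MergeWavesStandIn(features_group=[[0, 1]], non_longitudinal_features=[2])
params = stand_in.get_params()

# Rebuilding an equivalent object from the dict is what makes the
# convention useful for inspection and hyperparameter tuning.
clone = MergeWavesStandIn(**params)
print(clone.get_params() == params)
```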
_prepare_data(X, y=None)
Prepare the data for transformation.
Overridden from DataPreparationMixin.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | The input data. | required |
| y | ndarray | The target data. Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| MerWavTimeMinus | The instance with prepared data. |
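Because the method returns the instance itself, calls can be chained. The sketch below uses a hypothetical stand-in class to show that pattern; what the real method stores internally is not shown in the source, so keeping references to X and y under the names X_ and y_ is purely an assumption.

```python
class PrepareDataStandIn:
    """Hypothetical stand-in; the attribute names X_ and y_ are assumptions."""

    def __init__(self):
        self.X_ = None
        self.y_ = None

    def _prepare_data(self, X, y=None):
        # Assumption: keep references to the inputs without altering them.
        self.X_ = X
        self.y_ = y
        # Returning the instance allows call chaining.
        return self


X = [[0, 1, 5.2], [1, 1, 6.1]]
y = [0, 1]

prepared = PrepareDataStandIn()._prepare_data(X, y)
print(prepared.X_ is X)
```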