Developers¶
Welcome to the contributor and developer guide for Sklong. This page helps you get set up locally, understand the main platform considerations, and know how to contribute.
Project Status
Scikit-longitudinal is actively evolving. Expect changes, and if you encounter issues, please open a GitHub Issue.
Installation & Troubleshooting¶
- macOS: supported on Python 3.10 to 3.13.
- Linux: supported on Python 3.10 to 3.13 on stable distributions such as Ubuntu.
- Windows: use Google Colab or a Linux-based Docker image for now.
Windows support
Native Windows installation is not the primary supported workflow right now.
The project is developed and validated first on macOS and stable Linux environments, so the most reliable Windows path is to use Google Colab or a Linux-based Docker image.
Apple Silicon (x86_64 wheels)¶
Apple Silicon users should install and run Scikit-longitudinal with an x86_64 CPython via uv, because some dependencies only distribute Intel wheels. The steps below rely entirely on uv—no Rosetta shell juggling or conda environment needed.
- Locate an Intel CPython build. Use uv to list all macOS x86_64 interpreters (Python 3.10–3.13 shown below):
- Install your preferred version. Pick any listed cpython-<version>-macos-x86_64-none and install it:
- Pin the interpreter for this project. Tell uv to use that Intel build whenever you work in this repo:
- Install project dependencies.
- Verify the install.
After pinning the x86_64 interpreter, uv will automatically use it for future commands, providing a smooth Apple Silicon experience.
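One way to confirm that the pinned interpreter really is an Intel build is to ask Python itself. The sketch below is an illustration rather than part of the project's tooling; the helper name `is_intel_build` is hypothetical. It relies on the standard-library `platform.machine()`, which is expected to report `x86_64` for an Intel build on macOS and `arm64` for a native Apple Silicon build.

```python
import platform
from typing import Optional


def is_intel_build(machine: Optional[str] = None) -> bool:
    """Return True when the given (or current) machine string names an x86_64 CPU."""
    name = machine if machine is not None else platform.machine()
    # "AMD64" is how Windows spells the same architecture, so accept it too.
    return name.lower() in {"x86_64", "amd64"}


if __name__ == "__main__":
    print(f"machine={platform.machine()} intel={is_intel_build()}")
```

Run the script through uv so that the interpreter being inspected is the one pinned for the project, not the system default.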
Install Scikit-longitudinal From Source¶
Prerequisites¶
- Python 3.10–3.13: Download
- UV: Installation Guide
Environment Setup¶
- Clone the Repository:
- Install and Pin Python Version:
uv python install cpython-3.10.16 # or any other supported 3.10–3.13 build
uv python pin cpython-3.10.16 # must match the version installed above
- Create and Activate Virtual Environment:
- Install Dependencies:
Prefer pip or conda? You can adapt the setup, but UV is recommended for its speed and efficiency.
Linting and Formatting¶
We use Ruff (with default rules — no custom configuration) to keep code quality in check.
- Check Issues:
- Fix Formatting:
Pre-Commit Hooks¶
Enforce standards with pre-commit hooks:
1. Install:
2. Run Manually (optional):
Adding New Components¶
Scikit-longitudinal currently exposes shared extension templates for three component families: classifiers, transformers, and data-preparation tools. Regressors do not yet have an equivalent shared base template, so new regressor work should follow the existing lexicographical regressor implementations more closely.
Add new classifier primitives to estimators/.
- Location: Create the implementation in the most appropriate estimator package, such as estimators/ensemble/ or estimators/trees/.
- Class Definition: Inherit from CustomClassifierMixinEstimator from scikit_longitudinal.templates.
- Implementation: Implement _fit, _predict, and _predict_proba. The public fit, predict, and predict_proba methods are already provided by the template and perform input validation for you.
- Temporal Metadata: Accept and store features_group or any other longitudinal metadata your classifier needs.
- Exports: Update the relevant __init__.py when you want the class to be importable from the public package surface. Discovery scans modules automatically, so public exports and discovery are related but not the same thing.
Example:
import numpy as np
from overrides import override

from scikit_longitudinal.templates import CustomClassifierMixinEstimator


class MyClassifier(CustomClassifierMixinEstimator):
    """Toy classifier that always predicts the majority class seen during fit."""

    def __init__(self, features_group=None):
        self.features_group = features_group
        self._majority_class = None

    @override
    def _fit(self, X: np.ndarray, y: np.ndarray, sample_weight=None):
        _ = X, sample_weight  # unused by this toy model
        values, counts = np.unique(y, return_counts=True)
        self.classes_ = values
        self._majority_class = values[np.argmax(counts)]
        return self

    @override
    def _predict(self, X: np.ndarray) -> np.ndarray:
        # One majority-class label per row of X.
        return np.full(X.shape[0], self._majority_class)

    @override
    def _predict_proba(self, X: np.ndarray) -> np.ndarray:
        # Probability 1.0 for the majority class, 0.0 for every other class.
        proba = np.zeros((X.shape[0], len(self.classes_)))
        majority_index = np.where(self.classes_ == self._majority_class)[0][0]
        proba[:, majority_index] = 1.0
        return proba
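The split between the public methods and the underscore hooks can be pictured with a stand-in base class. The sketch below does not import Scikit-longitudinal and is not its implementation: `SketchClassifierTemplate`, `MajoritySketch`, and the validation rules shown are assumptions made purely for illustration, and the real template may validate inputs differently.

```python
import numpy as np


class SketchClassifierTemplate:
    """Hypothetical stand-in for a template base class (not the library's code)."""

    def fit(self, X, y):
        # Public entry point: validate once, then delegate to the subclass hook.
        X = np.asarray(X)
        y = np.asarray(y)
        if X.ndim != 2:
            raise ValueError("X must be 2-dimensional")
        if X.shape[0] != y.shape[0]:
            raise ValueError("X and y must have the same number of rows")
        return self._fit(X, y)

    def predict(self, X):
        X = np.asarray(X)
        if X.ndim != 2:
            raise ValueError("X must be 2-dimensional")
        return self._predict(X)

    def _fit(self, X, y):
        raise NotImplementedError

    def _predict(self, X):
        raise NotImplementedError


class MajoritySketch(SketchClassifierTemplate):
    """Subclass that supplies only the hooks, like the example above."""

    def _fit(self, X, y):
        values, counts = np.unique(y, return_counts=True)
        self.classes_ = values
        self._majority_class = values[np.argmax(counts)]
        return self

    def _predict(self, X):
        return np.full(X.shape[0], self._majority_class)


model = MajoritySketch().fit([[0, 1], [1, 0], [1, 1]], [1, 0, 1])
predictions = model.predict([[5, 5], [6, 6]])
```

Because validation lives in the public methods of the base class, the subclass hooks can assume well-formed arrays and stay short.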
Add data transformation tools to preprocessors/.
- Location: Create the implementation in the relevant preprocessing package, such as preprocessors/feature_selection/.
- Class Definition: Inherit from CustomTransformerMixinEstimator.
- Implementation: Implement _fit and _transform. The public fit and transform methods are already provided by the template and validate inputs before delegation.
- Exports: Update the relevant __init__.py when you want the preprocessor exposed as a stable public import.
Example:
import numpy as np
from overrides import override

from scikit_longitudinal.templates import CustomTransformerMixinEstimator


class MyPreprocessor(CustomTransformerMixinEstimator):
    """Toy preprocessor that keeps only the first keep_first_n columns."""

    def __init__(self, keep_first_n: int = 5):
        self.keep_first_n = keep_first_n

    @override
    def _fit(self, X: np.ndarray, y: np.ndarray = None):
        _ = y  # unused by this toy preprocessor
        # Never select more columns than X actually has.
        self.selected_indices_ = list(range(min(self.keep_first_n, X.shape[1])))
        return self

    @override
    def _transform(self, X: np.ndarray) -> np.ndarray:
        return X[:, self.selected_indices_]
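The effect of the selection logic in this example can be checked on a small array without installing the library. The snippet below reproduces only the indexing from `_fit` and `_transform`; the function name `keep_first_columns` is hypothetical, and it skips the validation that the template's public methods perform.

```python
import numpy as np


def keep_first_columns(X: np.ndarray, keep_first_n: int = 5) -> np.ndarray:
    """Keep the first keep_first_n columns, or every column if X has fewer."""
    selected_indices = list(range(min(keep_first_n, X.shape[1])))
    return X[:, selected_indices]


X = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
narrow = keep_first_columns(X, keep_first_n=2)
unchanged = keep_first_columns(X, keep_first_n=10)
```

Here `narrow` has shape (3, 2), while `unchanged` keeps all four columns because the requested count exceeds the width of `X`.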
Add utilities to data_preparation/.
- Location: Create a new module in data_preparation/, for example my_data_tool.py.
- Class Definition: Inherit from DataPreparationMixin.
- Required Method: Implement _prepare_data. That is the only method required by the mixin itself.
- Optional Transformation Stage: Add _transform() when your tool follows the same pattern as existing preparation components that first cache input state through prepare_data(...) and then expose a transformation step for downstream pipeline helpers.
- Exports: Update __init__.py when you want a stable public import path.
Example:
import numpy as np
import pandas as pd
from overrides import override

from scikit_longitudinal.templates import DataPreparationMixin


class MyDataTool(DataPreparationMixin):
    """Toy preparation tool that caches the inputs as a DataFrame."""

    def __init__(self, feature_list_names=None):
        self.feature_list_names = feature_list_names
        self.dataset_ = None
        self.target_ = None

    @override
    def _prepare_data(self, X: np.ndarray, y: np.ndarray = None):
        # Cache the inputs so that a later _transform() call can use them.
        self.dataset_ = pd.DataFrame(X, columns=self.feature_list_names)
        self.target_ = y
        return self

    def _transform(self):
        # Optional stage: runs after prepare_data(...) has cached the inputs.
        transformed = self.dataset_.copy()
        feature_list_names = transformed.columns.tolist()
        return transformed, None, None, feature_list_names
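The two-stage flow (cache first, transform later) can be sketched with a stand-in mixin. Nothing below comes from Scikit-longitudinal: `SketchPreparationTemplate` and `CenteringTool` are hypothetical, the public `prepare_data` wrapper is an assumed shape, and the four-item return value simply copies the layout of the example above without assigning meaning to the two `None` slots.

```python
import numpy as np


class SketchPreparationTemplate:
    """Hypothetical stand-in for a preparation mixin (not the library's code)."""

    def prepare_data(self, X, y=None):
        # Public entry point: coerce inputs, then let the subclass cache them.
        X = np.asarray(X)
        y = None if y is None else np.asarray(y)
        return self._prepare_data(X, y)

    def _prepare_data(self, X, y=None):
        raise NotImplementedError


class CenteringTool(SketchPreparationTemplate):
    """Caches the data, then subtracts each column's mean on request."""

    def __init__(self, feature_list_names=None):
        self.feature_list_names = feature_list_names

    def _prepare_data(self, X, y=None):
        self.dataset_ = X.astype(float)
        self.target_ = y
        return self

    def _transform(self):
        centered = self.dataset_ - self.dataset_.mean(axis=0)
        return centered, None, None, list(self.feature_list_names)


tool = CenteringTool(["a", "b"]).prepare_data([[1, 2], [3, 4]])
centered, _, _, names = tool._transform()
```

Keeping the cached state on the instance is what lets the transformation step run later without being handed the data again.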
Template Usage
Mirror the style of a nearby component, add focused tests under scikit_longitudinal/tests/, and update package exports when you want the new primitive to be part of the public import surface.
Running Tests¶
Validate your changes:
Generating Documentation¶
Update and preview docs locally:
- Install the docs dependencies:
- Preview the docs at http://127.0.0.1:8000.
Submitting Contributions¶
Follow this Git workflow:
- Create a Branch:
- Commit Changes:
- Rebase:
- Push and Open PR:
- Submit a pull request against main.
Commit Messages
Use the Conventional Commit-style prefixes from Karma (e.g., feat: add new estimator, docs: clarify installation).
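A prefix convention like this is easy to check mechanically. The sketch below is an illustration under stated assumptions, not a hook the project ships: the function name `has_conventional_prefix` is hypothetical, and the list of allowed types is a commonly used set that may differ from what the maintainers accept.

```python
import re

# Assumed set of commit types; the project may accept a different list.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# A type, an optional "(scope)", then ": " and a non-empty summary.
_PATTERN = re.compile(
    r"^(?:%s)(?:\([^()\s]+\))?: \S.*$" % "|".join(ALLOWED_TYPES)
)


def has_conventional_prefix(message: str) -> bool:
    """Check only the first line of a commit message against the pattern."""
    lines = message.splitlines()
    return bool(lines) and _PATTERN.match(lines[0]) is not None
```

With this pattern, the two messages given as examples above pass, while a message without a type prefix or without a space after the colon is rejected.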