Developers

Welcome to the contributor and developer guide for Sklong. This page helps you get set up locally, understand the main platform considerations, and know how to contribute.

Project Status

Scikit-longitudinal is actively evolving. Expect changes, and if you encounter issues, please open a GitHub Issue.


Installation & Troubleshooting

  • macOS: supported on Python 3.10 to 3.13.
  • Linux: supported on Python 3.10 to 3.13 on stable distributions such as Ubuntu.
  • Windows: use Google Colab or a Linux-based Docker image for now.
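
As a quick sanity check before installing, the supported range above can be tested from Python itself. This is a minimal sketch: the helper name `is_supported_python` and the two constants are illustrative and not part of Scikit-longitudinal.

```python
import sys

# Supported range taken from the list above (Python 3.10 to 3.13).
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 13)


def is_supported_python(version=None):
    """Return True when the (major, minor) version falls in the supported range.

    `version` defaults to the running interpreter; pass a tuple such as
    (3, 12) to check another version.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    print("supported interpreter:", is_supported_python())
```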

Windows support

Native Windows installation is not the primary supported workflow right now. The project is developed and validated first on macOS and stable Linux environments, so the most reliable Windows path is to use Google Colab or a Linux-based Docker image.

Apple Silicon (x86_64 wheels)

Apple Silicon users should install and run Scikit-longitudinal with an x86_64 CPython via uv, because some dependencies only distribute Intel wheels. The steps below rely entirely on uv—no Rosetta shell juggling or conda environment needed.

  1. Locate an Intel CPython build. Use uv to list all macOS x86_64 interpreters (Python 3.10–3.13 shown below):

    uv python list --all-versions --all-platforms \
    | grep 'macos-x86_64' \
    | grep -E '3\.10|3\.11|3\.12|3\.13'
    

  2. Install your preferred version. Pick any listed cpython-<version>-macos-x86_64-none and install it:

    uv python install cpython-3.12.10-macos-x86_64-none
    

  3. Pin the interpreter for this project. Tell uv to use that Intel build whenever you work in this repo:

    uv python pin cpython-3.12.10-macos-x86_64-none
    

  4. Install project dependencies

    uv sync
    

  5. Verify the install

    uv run python -c "import scikit_longitudinal"
    

After pinning the x86_64 interpreter, uv will automatically use it for future commands, providing a smooth Apple Silicon experience.
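
To confirm that the pinned interpreter really is the Intel build, you can inspect what Python reports about itself. The sketch below uses only the standard library; the helper names are illustrative, and the assumption is that an x86_64 CPython on Apple Silicon reports `x86_64` while a native build reports `arm64`.

```python
import platform
import sys


def interpreter_summary():
    """Return the CPU architecture and Python version of the running interpreter."""
    return {
        "machine": platform.machine(),
        "python": "{}.{}".format(*sys.version_info[:2]),
    }


def is_intel_build(machine):
    """True when the reported architecture string is 64-bit Intel."""
    return machine.lower() in {"x86_64", "amd64"}


if __name__ == "__main__":
    summary = interpreter_summary()
    print(summary, "intel build:", is_intel_build(summary["machine"]))
```

Saved under a file name of your choice, the script can be run with `uv run python <file>` from the repository root so that it uses the pinned interpreter.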


Install Scikit-longitudinal From Source

Prerequisites

Environment Setup

  1. Clone the Repository:
    git clone https://github.com/simonprovost/scikit-longitudinal.git
    cd scikit-longitudinal
  2. Install and Pin Python Version:
    uv python install cpython-3.10.16 # or any other supported 3.10+ build
    uv python pin cpython-3.10.16 # use the same version you installed above
  3. Create and Activate Virtual Environment:
    uv venv
    source .venv/bin/activate # On Windows: .venv\Scripts\activate
  4. Install Dependencies:
    uv sync --all-groups

Prefer pip or conda? You can adapt the setup, but uv is recommended for its speed and efficiency.


Linting and Formatting

We use Ruff (with default rules — no custom configuration) to keep code quality in check.

  • Check Issues:
    uv run ruff check
    
  • Apply Automatic Fixes:
    uv run ruff check --fix
    

Pre-Commit Hooks

Enforce standards with pre-commit hooks:

  1. Install:
    uv run pre-commit install

  2. Run Manually (optional):
    uv run pre-commit run --all-files


Adding New Components

Scikit-longitudinal currently exposes shared extension templates for three component families: classifiers, transformers, and data-preparation tools. Regressors do not yet have an equivalent shared base template, so new regressor work should follow the existing lexicographical regressor implementations more closely.

Add new classifier primitives to estimators/.

  1. Location: Create the implementation in the most appropriate estimator package, such as estimators/ensemble/ or estimators/trees/.
  2. Class Definition: Inherit from CustomClassifierMixinEstimator from scikit_longitudinal.templates.
  3. Implementation: Implement _fit, _predict, and _predict_proba. The public fit, predict, and predict_proba methods are already provided by the template and perform input validation for you.
  4. Temporal Metadata: Accept and store features_group or any other longitudinal metadata your classifier needs.
  5. Exports: Update the relevant __init__.py when you want the class to be importable from the public package surface. Discovery scans modules automatically, so public exports and discovery are related but not the same thing.

Example:

import numpy as np
from overrides import override

from scikit_longitudinal.templates import CustomClassifierMixinEstimator

class MyClassifier(CustomClassifierMixinEstimator):
    def __init__(self, features_group=None):
        self.features_group = features_group
        self._majority_class = None

    @override
    def _fit(self, X: np.ndarray, y: np.ndarray, sample_weight=None):
        _ = X, sample_weight
        values, counts = np.unique(y, return_counts=True)
        self.classes_ = values
        self._majority_class = values[np.argmax(counts)]
        return self

    @override
    def _predict(self, X: np.ndarray) -> np.ndarray:
        return np.full(X.shape[0], self._majority_class)

    @override
    def _predict_proba(self, X: np.ndarray) -> np.ndarray:
        proba = np.zeros((X.shape[0], len(self.classes_)))
        majority_index = np.where(self.classes_ == self._majority_class)[0][0]
        proba[:, majority_index] = 1.0
        return proba
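
The split described above, where public methods validate and underscore hooks do the work, is a template-method pattern. The sketch below illustrates that idea with the standard library only. `ValidatedClassifierSketch` and `MajoritySketch` are hypothetical stand-ins, not the actual `CustomClassifierMixinEstimator`, and the real template's validation is likely more thorough.

```python
from abc import ABC, abstractmethod
from collections import Counter


class ValidatedClassifierSketch(ABC):
    """Hypothetical stand-in for a template base class (not the Sklong API)."""

    def fit(self, X, y):
        # Public entry point: validate once, then delegate to the hook.
        if len(X) == 0:
            raise ValueError("X must contain at least one sample")
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of samples")
        return self._fit(X, y)

    def predict(self, X):
        # Refuse to predict until the hook has populated fitted state.
        if not hasattr(self, "classes_"):
            raise RuntimeError("call fit before predict")
        return self._predict(X)

    @abstractmethod
    def _fit(self, X, y): ...

    @abstractmethod
    def _predict(self, X): ...


class MajoritySketch(ValidatedClassifierSketch):
    """Subclass that only implements the hooks, like MyClassifier above."""

    def _fit(self, X, y):
        counts = Counter(y)
        self.classes_ = sorted(counts)
        self.majority_ = counts.most_common(1)[0][0]
        return self

    def _predict(self, X):
        return [self.majority_ for _ in X]
```

With this layout a subclass never repeats validation code, which is why the guide asks you to implement only `_fit`, `_predict`, and `_predict_proba`.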

Add data transformation tools to preprocessors/.

  1. Location: Create the implementation in the relevant preprocessing package, such as preprocessors/feature_selection/.
  2. Class Definition: Inherit from CustomTransformerMixinEstimator.
  3. Implementation: Implement _fit and _transform. The public fit and transform methods are already provided by the template and validate inputs before delegation.
  4. Exports: Update the relevant __init__.py when you want the preprocessor exposed as a stable public import.

Example:

import numpy as np
from overrides import override

from scikit_longitudinal.templates import CustomTransformerMixinEstimator

class MyPreprocessor(CustomTransformerMixinEstimator):
    def __init__(self, keep_first_n: int = 5):
        self.keep_first_n = keep_first_n

    @override
    def _fit(self, X: np.ndarray, y: np.ndarray = None):
        _ = y
        self.selected_indices_ = list(range(min(self.keep_first_n, X.shape[1])))
        return self

    @override
    def _transform(self, X: np.ndarray) -> np.ndarray:
        return X[:, self.selected_indices_]

Add utilities to data_preparation/.

  1. Location: Create a new module in data_preparation/, for example my_data_tool.py.
  2. Class Definition: Inherit from DataPreparationMixin.
  3. Required Method: Implement _prepare_data. That is the only method required by the mixin itself.
  4. Optional Transformation Stage: Add _transform() when your tool follows the same pattern as existing preparation components that first cache input state through prepare_data(...) and then expose a transformation step for downstream pipeline helpers.
  5. Exports: Update __init__.py when you want a stable public import path.

Example:

import numpy as np
import pandas as pd
from overrides import override

from scikit_longitudinal.templates import DataPreparationMixin

class MyDataTool(DataPreparationMixin):
    def __init__(self, feature_list_names=None):
        self.feature_list_names = feature_list_names
        self.dataset_ = None
        self.target_ = None

    @override
    def _prepare_data(self, X: np.ndarray, y: np.ndarray = None):
        self.dataset_ = pd.DataFrame(X, columns=self.feature_list_names)
        self.target_ = y
        return self

    def _transform(self):
        transformed = self.dataset_.copy()
        feature_list_names = transformed.columns.tolist()
        return transformed, None, None, feature_list_names

Template Usage

Mirror the style of a nearby component, add focused tests under scikit_longitudinal/tests/, and update package exports when you want the new primitive to be part of the public import surface.
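
A focused test can be very small. The sketch below shows the shape of two pytest-style tests. `FirstColumnsSketch` is a hypothetical plain-list stand-in that mirrors the `MyPreprocessor` example, so the block runs without the library; in a real test you would import your component from scikit_longitudinal instead.

```python
class FirstColumnsSketch:
    """Hypothetical stand-in mirroring the MyPreprocessor example with plain lists."""

    def __init__(self, keep_first_n=5):
        self.keep_first_n = keep_first_n

    def fit(self, X, y=None):
        # Remember which column indices to keep, capped at the available width.
        n_features = len(X[0])
        self.selected_indices_ = list(range(min(self.keep_first_n, n_features)))
        return self

    def transform(self, X):
        # Keep only the selected columns of every row.
        return [[row[i] for i in self.selected_indices_] for row in X]


def test_keeps_requested_number_of_columns():
    X = [[1, 2, 3], [4, 5, 6]]
    out = FirstColumnsSketch(keep_first_n=2).fit(X).transform(X)
    assert out == [[1, 2], [4, 5]]


def test_caps_at_available_columns():
    X = [[1, 2, 3]]
    out = FirstColumnsSketch(keep_first_n=10).fit(X).transform(X)
    assert out == [[1, 2, 3]]
```

Each test covers one behaviour (the normal case and the boundary case), which keeps failures easy to read.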


Running Tests

Validate your changes:

uv run pytest -sv scikit_longitudinal/tests/


Generating Documentation

Update and preview docs locally:

  1. Install Docs Dependencies:
    uv sync --dev

  2. Build Docs:
    uv run zensical build

  3. Serve Docs:
    uv run zensical serve

  4. View: Open http://127.0.0.1:8000.


Submitting Contributions

Follow this Git workflow:

  1. Create a Branch:
    git checkout -b feat/your-feature
    
  2. Commit Changes:
    git commit -m "feat: describe your change"
    
  3. Rebase:
    git fetch origin
    git rebase origin/main
    
  4. Push and Open PR:
    git push origin feat/your-feature
    
  5. Submit a pull request against main.

Commit Messages

Use the Conventional Commit-style prefixes from Karma (e.g., feat: add new estimator, docs: clarify installation).
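
If you want to check a message before committing, a small script can do it. This is a sketch only: the prefix list is an assumption based on the types commonly listed for the Karma convention, not one confirmed by this project, and `has_valid_prefix` is an illustrative name.

```python
import re

# Assumption: prefix types commonly listed for the Karma commit convention.
KARMA_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# "<type>: <subject>" or "<type>(<scope>): <subject>" on the first line.
PATTERN = re.compile(r"^(?:%s)(?:\([^)]+\))?: \S.*" % "|".join(KARMA_TYPES))


def has_valid_prefix(message):
    """Return True when the first line of `message` starts with a known prefix."""
    first_line = message.splitlines()[0] if message else ""
    return PATTERN.match(first_line) is not None


if __name__ == "__main__":
    for msg in ("feat: add new estimator", "added some stuff"):
        print(repr(msg), has_valid_prefix(msg))
```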