Pin setuptools version<82 to fix pkg_resources not found error (#1516)

* Pin setuptools version<82 to fix pkg_resources not found error
* Add quote
* Pin all setuptools

Bump webpack from 5.94.0 to 5.105.0 in /website (#1515)
2026-02-12 19:59:18 +08:00 · 2026-02-12 12:38:37 +08:00 · 2026-02-08 16:29:18 +08:00 · 2026-01-28 09:00:21 +08:00 · 2026-01-25 21:10:05 +08:00 · 2026-01-23 10:20:59 +08:00
112 changed files with 4471 additions and 50193 deletions
--- a/.coveragerc
+++ b/.coveragerc
@@ -1,5 +1,7 @@
 [run]
 branch = True
-source = flaml
+source =
+  flaml
 omit =
-  *test*
+  */test/*
+  */flaml/autogen/*
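The `omit` change above narrows the old catch-all `*test*` pattern to directory-scoped patterns. The sketch below illustrates the difference using Python's `fnmatch` as an approximation of coverage.py's shell-style matching; it is not coverage.py's actual matcher, and the file paths (in particular `latest_results.py`) are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns from the old and new .coveragerc `omit` sections.
OLD_PATTERNS = ["*test*"]
NEW_PATTERNS = ["*/test/*", "*/flaml/autogen/*"]


def omitted(path, patterns):
    """Return True if the path matches any omit pattern (fnmatch approximation)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical paths chosen to show where the two pattern sets disagree.
paths = [
    "/repo/test/automl/test_classification.py",  # a test file
    "/repo/flaml/autogen/oai/completion.py",     # legacy autogen code
    "/repo/flaml/tune/latest_results.py",        # library code whose name contains "test"
]

for path in paths:
    print(path, "old:", omitted(path, OLD_PATTERNS), "new:", omitted(path, NEW_PATTERNS))
```

Under this approximation the old pattern would drop any library file whose path merely contains the substring `test`, while the new patterns only drop files under a `test/` or `flaml/autogen/` directory.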
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,243 @@
+# GitHub Copilot Instructions for FLAML
+
+## Project Overview
+
+FLAML (Fast Library for Automated Machine Learning & Tuning) is a lightweight Python library for efficient automation of machine learning and AI operations. It automates workflows built on large language models, machine learning models, etc., and optimizes their performance.
+
+**Key Components:**
+
+- `flaml/automl/`: AutoML functionality for classification and regression
+- `flaml/tune/`: Generic hyperparameter tuning
+- `flaml/default/`: Zero-shot AutoML with default configurations
+- `flaml/autogen/`: Legacy autogen code (note: AutoGen has moved to a separate repository)
+- `flaml/fabric/`: Microsoft Fabric integration
+- `test/`: Comprehensive test suite
+
+## Build and Test Commands
+
+### Installation
+
+```bash
+# Basic installation
+pip install -e .
+
+# Install with test dependencies
+pip install -e .[test]
+
+# Install with automl dependencies
+pip install -e .[automl]
+
+# Install with forecast dependencies (Linux only)
+pip install -e .[forecast]
+```
+
+### Running Tests
+
+```bash
+# Run all tests (excluding autogen)
+pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
+
+# Run tests with coverage
+coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
+coverage xml
+
+# Check dependencies
+python test/check_dependency.py
+```
+
+### Linting and Formatting
+
+```bash
+# Run pre-commit hooks
+pre-commit run --all-files
+
+# Format with black (line length: 120)
+black . --line-length 120
+
+# Run ruff for linting and auto-fix
+ruff check . --fix
+```
+
+## Code Style and Formatting
+
+### Python Style
+
+- **Line length:** 120 characters (configured in both Black and Ruff)
+- **Formatter:** Black (v23.3.0+)
+- **Linter:** Ruff with Pyflakes and pycodestyle rules
+- **Import sorting:** Use isort (via Ruff)
+- **Python version:** Supports Python >= 3.10 (full support for 3.10, 3.11, 3.12 and 3.13)
+
+### Code Quality Rules
+
+- Follow Black formatting conventions
+- Keep imports sorted and organized
+- Avoid unused imports (F401) - these are flagged but not auto-fixed
+- Avoid wildcard imports (F403) where possible
+- Complexity: Max McCabe complexity of 10
+- Use type hints where appropriate
+- Write clear docstrings for public APIs
+
+### Pre-commit Hooks
+
+The repository uses pre-commit hooks for:
+
+- Checking for large files, AST syntax, YAML/TOML/JSON validity
+- Detecting merge conflicts and private keys
+- Trailing whitespace and end-of-file fixes
+- pyupgrade for Python 3.8+ syntax
+- Black formatting
+- Markdown formatting (mdformat with GFM and frontmatter support)
+- Ruff linting with auto-fix
+
+## Testing Strategy
+
+### Test Organization
+
+- Tests are in the `test/` directory, organized by module
+- `test/automl/`: AutoML feature tests
+- `test/tune/`: Hyperparameter tuning tests
+- `test/default/`: Zero-shot AutoML tests
+- `test/nlp/`: NLP-related tests
+- `test/spark/`: Spark integration tests
+
+### Test Requirements
+
+- Write tests for new functionality
+- Ensure tests pass on multiple Python versions (3.10, 3.11, 3.12 and 3.13)
+- Tests should work on both Ubuntu and Windows
+- Use pytest markers for platform-specific tests (e.g., `@pytest.mark.spark`)
+- Tests should be idempotent and not depend on external state
+- Use `--reruns 2 --reruns-delay 10` for flaky tests
+
+### Coverage
+
+- Aim for good test coverage on new code
+- Coverage reports are generated for Python 3.11 builds
+- Coverage reports are uploaded to Codecov
+
+## Git Workflow and Best Practices
+
+### Branching
+
+- Main branch: `main`
+- Create feature branches from `main`
+- PR reviews are required before merging
+
+### Commit Messages
+
+- Use clear, descriptive commit messages
+- Reference issue numbers when applicable
+- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
+
+### Pull Requests
+
+- Ensure all tests pass before requesting review
+- Update documentation if adding new features
+- Follow the PR template in `.github/PULL_REQUEST_TEMPLATE.md`
+- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
+
+## Project Structure
+
+```
+flaml/
+├── automl/         # AutoML functionality
+├── tune/           # Hyperparameter tuning
+├── default/        # Zero-shot AutoML
+├── autogen/        # Legacy autogen (deprecated, moved to separate repo)
+├── fabric/         # Microsoft Fabric integration
+├── onlineml/       # Online learning
+└── version.py      # Version information
+
+test/               # Test suite
+├── automl/
+├── tune/
+├── default/
+├── nlp/
+└── spark/
+
+notebook/           # Example notebooks
+website/            # Documentation website
+```
+
+## Dependencies and Package Management
+
+### Core Dependencies
+
+- NumPy >= 1.17
+- Python >= 3.10 (officially supported: 3.10, 3.11, 3.12 and 3.13)
+
+### Optional Dependencies
+
+- `[automl]`: lightgbm, xgboost, scipy, pandas, scikit-learn
+- `[test]`: Full test suite dependencies
+- `[spark]`: PySpark and joblib dependencies
+- `[forecast]`: holidays, prophet, statsmodels, hcrystalball, pytorch-forecasting, pytorch-lightning, tensorboardX
+- `[hf]`: Hugging Face transformers and datasets
+- See `setup.py` for complete list
+
+### Version Constraints
+
+- Be mindful of Python version-specific dependencies (check setup.py)
+- XGBoost versions differ based on Python version
+- NumPy 2.0+ only for Python >= 3.13
+- Some features (like vowpalwabbit) only work with older Python versions
+
+## Boundaries and Restrictions
+
+### Do NOT Modify
+
+- `.git/` directory and Git configuration
+- `LICENSE` file
+- Version information in `flaml/version.py` (unless explicitly updating version)
+- GitHub Actions workflows without careful consideration
+- Existing test files unless fixing bugs or adding coverage
+
+### Be Cautious With
+
+- `setup.py`: Changes to dependencies should be carefully reviewed
+- `pyproject.toml`: Linting and testing configuration
+- `.pre-commit-config.yaml`: Pre-commit hook configuration
+- Backward compatibility: FLAML is a library with external users
+
+### Security Considerations
+
+- Never commit secrets or API keys
+- Be careful with external data sources in tests
+- Validate user inputs in public APIs
+- Follow secure coding practices for ML operations
+
+## Special Notes
+
+### AutoGen Migration
+
+- AutoGen has moved to a separate repository: https://github.com/microsoft/autogen
+- The `flaml/autogen/` directory contains legacy code
+- Tests in `test/autogen/` are ignored in the main test suite
+- Direct users to the new AutoGen repository for AutoGen-related issues
+
+### Platform-Specific Considerations
+
+- Some tests only run on Linux (e.g., forecast tests with prophet)
+- Windows and Ubuntu are the primary supported platforms
+- macOS support exists but requires special libomp setup for lgbm/xgboost
+
+### Performance
+
+- FLAML focuses on efficient automation and tuning
+- Consider computational cost when adding new features
+- Optimize for low resource usage where possible
+
+## Documentation
+
+- Main documentation: https://microsoft.github.io/FLAML/
+- Update documentation when adding new features
+- Provide clear examples in docstrings
+- Add notebook examples for significant new features
+
+## Contributing
+
+- Follow the contributing guide: https://microsoft.github.io/FLAML/docs/Contribute
+- Sign the Microsoft CLA when making your first contribution
+- Be respectful and follow the Microsoft Open Source Code of Conduct
+- Join the Discord community for discussions: https://discord.gg/Cppx2vSPVP
--- a/.github/workflows/CD.yml
+++ b/.github/workflows/CD.yml
@@ -13,7 +13,7 @@ jobs:
    strategy:
      matrix:
        os: ["ubuntu-latest"]
-        python-version: ["3.10"]
+        python-version: ["3.12"]
    runs-on: ${{ matrix.os }}
    environment: package
    steps:
@@ -33,7 +33,7 @@ jobs:
      - name: Build
        shell: pwsh
        run: |
-          pip install twine wheel setuptools
+          pip install twine wheel "setuptools<82"
          python setup.py sdist bdist_wheel
      - name: Publish to PyPI
        env:
--- a/.github/workflows/deploy-website.yml
+++ b/.github/workflows/deploy-website.yml
@@ -37,11 +37,11 @@ jobs:
      - name: setup python
        uses: actions/setup-python@v4
        with:
-          python-version: "3.10"
+          python-version: "3.12"
      - name: pydoc-markdown install
        run: |
          python -m pip install --upgrade pip
-          pip install pydoc-markdown==4.7.0
+          pip install pydoc-markdown==4.7.0 "setuptools<82"
      - name: pydoc-markdown run
        run: |
          pydoc-markdown
@@ -73,11 +73,11 @@ jobs:
      - name: setup python
        uses: actions/setup-python@v4
        with:
-          python-version: "3.10"
+          python-version: "3.12"
      - name: pydoc-markdown install
        run: |
          python -m pip install --upgrade pip
-          pip install pydoc-markdown==4.7.0
+          pip install pydoc-markdown==4.7.0 "setuptools<82"
      - name: pydoc-markdown run
        run: |
          pydoc-markdown
--- a/.github/workflows/openai.yml
+++ b/.github/workflows/openai.yml
@@ -4,14 +4,15 @@
 name: OpenAI

 on:
-  pull_request:
-    branches: ['main']
-    paths:
-      - 'flaml/autogen/**'
-      - 'test/autogen/**'
-      - 'notebook/autogen_openai_completion.ipynb'
-      - 'notebook/autogen_chatgpt_gpt4.ipynb'
-      - '.github/workflows/openai.yml'
+  workflow_dispatch:
+#   pull_request:
+#     branches: ['main']
+#     paths:
+#       - 'flaml/autogen/**'
+#       - 'test/autogen/**'
+#       - 'notebook/autogen_openai_completion.ipynb'
+#       - 'notebook/autogen_chatgpt_gpt4.ipynb'
+#       - '.github/workflows/openai.yml'

 permissions: {}

--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -1,9 +1,7 @@
 name: Code formatting

 # see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
-on:  # Trigger the workflow on push or pull request, but only for the main branch
-  push:
-    branches: [main]
+on:
  pull_request: {}

 defaults:
--- a/.github/workflows/python-package.yml
+++ b/.github/workflows/python-package.yml
@@ -39,11 +39,8 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
-        os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: ["3.10", "3.11"]
-        exclude:
-          - os: macos-latest
-            python-version: "3.10"
+        os: [ubuntu-latest, windows-latest]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
@@ -63,15 +60,10 @@ jobs:
          export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/opt/libomp/lib -L/usr/local/opt/libomp/lib -lomp"
      - name: Install packages and dependencies
        run: |
-          python -m pip install --upgrade pip wheel setuptools
+          python -m pip install --upgrade pip wheel "setuptools<82"
          pip install -e .
          python -c "import flaml"
          pip install -e .[test]
-      - name: On Ubuntu python 3.10, install pyspark 3.4.1
-        if: matrix.python-version == '3.10' && matrix.os == 'ubuntu-latest'
-        run: |
-          pip install pyspark==3.4.1
-          pip list | grep "pyspark"
      - name: On Ubuntu python 3.11, install pyspark 3.5.1
        if: matrix.python-version == '3.11' && matrix.os == 'ubuntu-latest'
        run: |
@@ -82,6 +74,11 @@ jobs:
        run: |
          pip install pyspark==4.0.1
          pip list | grep "pyspark"
+      - name: On Ubuntu python 3.13, install pyspark 4.1.0
+        if: matrix.python-version == '3.13' && matrix.os == 'ubuntu-latest'
+        run: |
+          pip install pyspark==4.1.0
+          pip list | grep "pyspark"
      # # TODO: support ray
      # - name: If linux and python<3.11, install ray 2
      #   if: matrix.os == 'ubuntu-latest' && matrix.python-version < '3.11'
@@ -106,22 +103,25 @@ jobs:
        run: |
          pip cache purge
      - name: Test with pytest
-        if: matrix.python-version != '3.10'
+        timeout-minutes: 120
+        if: matrix.python-version != '3.11'
        run: |
          pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
      - name: Coverage
-        if: matrix.python-version == '3.10'
+        timeout-minutes: 120
+        if: matrix.python-version == '3.11'
        run: |
          pip install coverage
          coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
          coverage xml
      - name: Upload coverage to Codecov
-        if: matrix.python-version == '3.10'
+        if: matrix.python-version == '3.11'
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: unittests
      - name: Save dependencies
+        if: github.ref == 'refs/heads/main'
        shell: bash
        run: |
          git config --global user.name 'github-actions[bot]'
@@ -130,10 +130,7 @@ jobs:

          BRANCH=unit-tests-installed-dependencies
          git fetch origin
-          git checkout -B "$BRANCH"
-          if git show-ref --verify --quiet "refs/remotes/origin/$BRANCH"; then
-            git rebase "origin/$BRANCH"
-          fi
+          git checkout -B "$BRANCH" "origin/$BRANCH"

          pip freeze > installed_all_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
          python test/check_dependency.py > installed_first_tier_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
@@ -141,4 +138,4 @@ jobs:
          mv coverage.xml ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
          git add -f ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
          git commit -m "Update installed dependencies for Python ${{ matrix.python-version }} on ${{ matrix.os }}" || exit 0
-          git push origin "$BRANCH"
+          git push origin "$BRANCH" --force
--- a/.gitignore
+++ b/.gitignore
@@ -60,6 +60,7 @@ coverage.xml
 .hypothesis/
 .pytest_cache/
 cover/
+junit

 # Translations
 *.mo
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -36,7 +36,7 @@ repos:
    - id: black

  - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.17
+    rev: 0.7.22
    hooks:
      - id: mdformat
        additional_dependencies:
--- a/NOTICE.md
+++ b/NOTICE.md
@@ -4,8 +4,8 @@ This repository incorporates material as listed below or described in the code.

 ## Component. Ray.

-Code in tune/\[analysis.py, sample.py, trial.py, result.py\],
-searcher/\[suggestion.py, variant_generator.py\], and scheduler/trial_scheduler.py is adapted from
+Code in tune/[analysis.py, sample.py, trial.py, result.py],
+searcher/[suggestion.py, variant_generator.py], and scheduler/trial_scheduler.py is adapted from
 https://github.com/ray-project/ray/blob/master/python/ray/tune/

 ## Open Source License/Copyright Notice.
--- a/README.md
+++ b/README.md
@@ -34,7 +34,7 @@ FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source,

 ## Installation

-FLAML requires **Python version >= 3.9**. It can be installed from pip:
+The latest version of FLAML requires **Python >= 3.10 and < 3.14**. While other Python versions may work for core components, full model support is not guaranteed. FLAML can be installed via `pip`:

 ```bash
 pip install flaml
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -12,7 +12,7 @@ If you believe you have found a security vulnerability in any Microsoft-owned re

 Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).

-If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com).  If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
+If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).

 You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).

--- a/coverage_3.10_windows-latest.xml
+++ b/coverage_3.10_windows-latest.xml
--- a/coverage_3.11_macos-latest.xml
+++ b/coverage_3.11_macos-latest.xml
--- a/coverage_3.11_ubuntu-latest.xml
+++ b/coverage_3.11_ubuntu-latest.xml
--- a/coverage_3.11_windows-latest.xml
+++ b/coverage_3.11_windows-latest.xml
--- a/flaml/autogen/__init__.py
+++ b/flaml/autogen/__init__.py
@@ -1,3 +1,12 @@
+import warnings
+
 from .agentchat import *
 from .code_utils import DEFAULT_MODEL, FAST_MODEL
 from .oai import *
+
+warnings.warn(
+    "The `flaml.autogen` module is deprecated and will be removed in a future release. "
+    "Please refer to `https://github.com/microsoft/autogen` for latest usage.",
+    DeprecationWarning,
+    stacklevel=2,
+)
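A minimal sketch of how callers might observe the new deprecation warning. It uses only the standard `warnings` module; `import_legacy_module` is a hypothetical stand-in for `import flaml.autogen`, so the example runs without FLAML installed.

```python
import warnings


def import_legacy_module():
    """Hypothetical stand-in for importing `flaml.autogen`."""
    warnings.warn(
        "The `flaml.autogen` module is deprecated and will be removed in a future release.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )


with warnings.catch_warnings(record=True) as caught:
    # Make sure the warning is recorded rather than filtered out.
    warnings.simplefilter("always")
    import_legacy_module()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(len(deprecations), deprecations[0].message)
```

Test suites that still import the legacy module can use the same `catch_warnings` pattern, or `pytest.warns(DeprecationWarning)`, to assert on the warning instead of letting it surface as noise.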
--- a/flaml/automl/automl.py
+++ b/flaml/automl/automl.py
@@ -4,6 +4,7 @@
 #  * project root for license information.
 from __future__ import annotations

+import inspect
 import json
 import logging
 import os
@@ -117,6 +118,8 @@ class AutoML(BaseEstimator):
                e.g., 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_weighted',
                'roc_auc_ovo_weighted', 'roc_auc_ovr_weighted', 'f1', 'micro_f1', 'macro_f1',
                'log_loss', 'mae', 'mse', 'r2', 'mape'. Default is 'auto'.
+                For a full list of supported built-in metrics, please refer to
+                https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#optimization-metric
                If passing a customized metric function, the function needs to
                have the following input arguments:

@@ -153,6 +156,10 @@ class AutoML(BaseEstimator):
                "pred_time": pred_time,
            }
        ```
+                **Note:** When passing a custom metric function, pass the function itself
+                (e.g., `metric=custom_metric`), not the result of calling it
+                (e.g., `metric=custom_metric(...)`). FLAML will call your function
+                internally during the training process.
            task: A string of the task type, e.g.,
                'classification', 'regression', 'ts_forecast', 'rank',
                'seq-classification', 'seq-regression', 'summarization',
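The note added to the `metric` docstring distinguishes passing a function from passing the result of calling it. A standalone sketch of that check follows; `validate_metric` mirrors the logic of `_validate_metric_parameter` from this diff, while `custom_metric` and its return value are hypothetical placeholders rather than a real FLAML metric.

```python
def validate_metric(metric, allow_auto=True):
    """Accept a metric name or a callable; reject anything else."""
    if allow_auto and metric == "auto":
        return
    if not isinstance(metric, str) and not callable(metric):
        raise ValueError(
            f"The 'metric' parameter must be either a string or a callable function, "
            f"but got {type(metric).__name__}."
        )


def custom_metric(*args, **kwargs):
    """Hypothetical custom metric returning a loss and a dict of values to log."""
    return 0.5, {"val_loss": 0.5}


validate_metric("accuracy")     # a built-in metric name passes
validate_metric(custom_metric)  # the function itself passes

try:
    validate_metric(custom_metric())  # the call result is a tuple, so it is rejected
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```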
@@ -173,14 +180,20 @@ class AutoML(BaseEstimator):
                and 'final_estimator' to specify the passthrough and
                final_estimator in the stacker. The dict can also contain
                'n_jobs' as the key to specify the number of jobs for the stacker.
+                Note: The hyperparameters of a custom 'final_estimator' are NOT
+                automatically tuned. If you provide an estimator instance (e.g.,
+                CatBoostClassifier()), it will use the parameters you specified
+                or their defaults. If 'final_estimator' is not provided, the best
+                model found during the search will be used as the final estimator.
            eval_method: A string of resampling strategy, one of
                ['auto', 'cv', 'holdout'].
            split_ratio: A float of the validation data percentage for holdout.
            n_splits: An integer of the number of folds for cross-validation.
-            log_type: A string of the log type, one of
-                ['better', 'all'].
-                'better' only logs configs with better loss than previos iters
-                'all' logs all the tried configs.
+            log_type: Specifies which logs to save. One of ['better', 'all']. Default is 'better'.
+                - 'better': Logs configs and models (if `model_history` is True) only when the loss improves,
+                  to `log_file_name` and MLflow, respectively.
+                - 'all': Logs all configs and models (if `model_history` is True), regardless of performance.
+                Note: Configs are always logged to MLflow if MLflow logging is enabled.
            model_history: A boolean of whether to keep the best
                model per estimator. Make sure memory is large enough if setting to True. Default False.
            log_training_metric: A boolean of whether to log the training
@@ -330,6 +343,12 @@ class AutoML(BaseEstimator):
         }
        ```
            skip_transform: boolean, default=False | Whether to pre-process data prior to modeling.
+            allow_label_overlap: boolean, default=True | For classification tasks with holdout evaluation,
+                whether to allow label overlap between train and validation sets. When True (default),
+                uses a fast strategy that adds the first instance of missing labels to the set that is
+                missing them, which may create some overlap. When False, uses a precise but slower
+                strategy that intelligently re-splits instances to avoid overlap when possible.
+                Only affects classification tasks with holdout evaluation method.
            fit_kwargs_by_estimator: dict, default=None | The user specified keywords arguments, grouped by estimator name.
                e.g.,

@@ -360,7 +379,10 @@ class AutoML(BaseEstimator):
        settings["split_ratio"] = settings.get("split_ratio", SPLIT_RATIO)
        settings["n_splits"] = settings.get("n_splits", N_SPLITS)
        settings["auto_augment"] = settings.get("auto_augment", True)
+        settings["allow_label_overlap"] = settings.get("allow_label_overlap", True)
        settings["metric"] = settings.get("metric", "auto")
+        # Validate that custom metric is callable if not a string
+        self._validate_metric_parameter(settings["metric"], allow_auto=True)
        settings["estimator_list"] = settings.get("estimator_list", "auto")
        settings["log_file_name"] = settings.get("log_file_name", "")
        settings["max_iter"] = settings.get("max_iter")  # no budget by default
@@ -411,13 +433,69 @@ class AutoML(BaseEstimator):
        """

        state = self.__dict__.copy()
-        state.pop("mlflow_integration", None)
+        # Keep mlflow_integration for post-load visualization (e.g., infos), but
+        # strip non-picklable runtime-only members (thread futures, clients).
+        mlflow_integration = state.get("mlflow_integration", None)
+        if mlflow_integration is not None:
+            import copy
+
+            mi = copy.copy(mlflow_integration)
+            # These are runtime-only and often contain locks / threads.
+            if hasattr(mi, "futures"):
+                mi.futures = {}
+            if hasattr(mi, "futures_log_model"):
+                mi.futures_log_model = {}
+            if hasattr(mi, "train_func"):
+                mi.train_func = None
+            if hasattr(mi, "mlflow_client"):
+                mi.mlflow_client = None
+            state["mlflow_integration"] = mi
+        # MLflow signature objects may hold references to Spark/pandas-on-Spark
+        # inputs and can indirectly capture SparkContext, which is not picklable.
+        state.pop("estimator_signature", None)
+        state.pop("pipeline_signature", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
-        # Ensure attribute exists post-unpickle.
-        self.mlflow_integration = None
+        # Ensure mlflow_integration runtime members exist post-unpickle.
+        mi = getattr(self, "mlflow_integration", None)
+        if mi is not None:
+            if not hasattr(mi, "futures") or mi.futures is None:
+                mi.futures = {}
+            if not hasattr(mi, "futures_log_model") or mi.futures_log_model is None:
+                mi.futures_log_model = {}
+            if not hasattr(mi, "train_func"):
+                mi.train_func = None
+            if not hasattr(mi, "mlflow_client") or mi.mlflow_client is None:
+                try:
+                    import mlflow as _mlflow
+
+                    mi.mlflow_client = _mlflow.tracking.MlflowClient()
+                except Exception:
+                    mi.mlflow_client = None
+
+    @staticmethod
+    def _validate_metric_parameter(metric, allow_auto=True):
+        """Validate that the metric parameter is either a string or a callable function.
+
+        Args:
+            metric: The metric parameter to validate.
+            allow_auto: Whether to allow "auto" as a valid string value.
+
+        Raises:
+            ValueError: If metric is not a string or callable function.
+        """
+        if allow_auto and metric == "auto":
+            return
+        if not isinstance(metric, str) and not callable(metric):
+            raise ValueError(
+                f"The 'metric' parameter must be either a string or a callable function, "
+                f"but got {type(metric).__name__}. "
+                f"If you defined a custom_metric function, make sure to pass the function itself "
+                f"(e.g., metric=custom_metric) and not the result of calling it "
+                f"(e.g., metric=custom_metric(...))."
+            )

    def get_params(self, deep: bool = False) -> dict:
        return self._settings.copy()
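The `__getstate__` / `__setstate__` changes in this hunk keep the integration object in the pickled state while dropping members that cannot be pickled. The sketch below shows the same pattern on toy classes; `Integration` and `Model` are hypothetical stand-ins, and a `threading.Lock` plays the role of the runtime-only members (futures, clients).

```python
import copy
import pickle
import threading


class Integration:
    """Hypothetical stand-in for the MLflow integration object."""

    def __init__(self):
        self.infos = ["run-1"]                  # plain data worth keeping
        self.futures = {"f": threading.Lock()}  # runtime-only, not picklable
        self.client = threading.Lock()          # runtime-only, not picklable


class Model:
    def __init__(self):
        self.integration = Integration()

    def __getstate__(self):
        state = self.__dict__.copy()
        integration = state.get("integration")
        if integration is not None:
            # Shallow-copy so the live object keeps its runtime members.
            stripped = copy.copy(integration)
            stripped.futures = {}
            stripped.client = None
            state["integration"] = stripped
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


model = Model()
restored = pickle.loads(pickle.dumps(model))
print(restored.integration.infos, restored.integration.futures, restored.integration.client)
```

Copying before stripping matters: mutating the live integration object inside `__getstate__` would discard in-flight futures from the instance that is still in use.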
@@ -467,18 +545,135 @@ class AutoML(BaseEstimator):

    @property
    def best_config(self):
-        """A dictionary of the best configuration."""
+        """A dictionary of the best configuration.
+
+        The returned config dictionary can be used to:
+        1. Pass as `starting_points` to a new AutoML run.
+        2. Initialize the corresponding FLAML estimator directly.
+        3. Initialize the original model (e.g., LightGBM, XGBoost) after converting
+           FLAML-specific parameters.
+
+        Note:
+            The config contains FLAML's search space parameters, which may differ from
+            the original model's parameters. For example, FLAML uses `log_max_bin` for
+            LightGBM instead of `max_bin`. Use the FLAML estimator's `config2params()`
+            method to convert to the original model's parameters.
+
+        Example:
+
+        ```python
+        from flaml import AutoML
+        from flaml.automl.model import LGBMEstimator
+        from lightgbm import LGBMClassifier
+        from sklearn.datasets import load_iris
+
+        X, y = load_iris(return_X_y=True)
+
+        # Train with AutoML
+        automl = AutoML()
+        automl.fit(X, y, task="classification", time_budget=10)
+
+        # Get the best config
+        best_config = automl.best_config
+        print("Best config:", best_config)
+        # Example output: {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20,
+        #                  'learning_rate': 0.1, 'log_max_bin': 8, ...}
+
+        # Option 1: Use FLAML estimator directly (handles parameter conversion internally)
+        flaml_estimator = LGBMEstimator(task="classification", **best_config)
+        flaml_estimator.fit(X, y)
+
+        # Option 2: Convert to original model parameters using config2params()
+        # This converts FLAML-specific params (e.g., log_max_bin -> max_bin)
+        original_params = flaml_estimator.params  # or use flaml_estimator.config2params(best_config)
+        print("Original model params:", original_params)
+        # Example output: {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20,
+        #                  'learning_rate': 0.1, 'max_bin': 255, ...}  # log_max_bin converted to max_bin
+
+        # Now use with original LightGBM
+        lgbm_model = LGBMClassifier(**original_params)
+        lgbm_model.fit(X, y)
+        ```
+        """
        state = self._search_states.get(self._best_estimator)
        config = state and getattr(state, "best_config", None)
        return config and AutoMLState.sanitize(config)

    @property
    def best_config_per_estimator(self):
-        """A dictionary of all estimators' best configuration."""
-        return {
-            e: e_search_state.best_config and AutoMLState.sanitize(e_search_state.best_config)
-            for e, e_search_state in self._search_states.items()
-        }
+        """A dictionary of all estimators' best configuration.
+
+        Returns a dictionary where keys are estimator names (e.g., 'lgbm', 'xgboost')
+        and values are the best hyperparameter configurations found for each estimator.
+        The config may include `FLAML_sample_size` which indicates the sample size used
+        during training.
+
+        This is useful for:
+        1. Passing as `starting_points` to a new AutoML run for warm-starting.
+        2. Comparing the best configurations across different estimators.
+        3. Initializing the original models after converting FLAML-specific parameters.
+
+        Note:
+            The configs contain FLAML's search space parameters, which may differ from
+            the original models' parameters. Use each estimator's `config2params()` method
+            to convert to the original model's parameters.
+
+        Example:
+
+        ```python
+        from flaml import AutoML
+        from flaml.automl.model import LGBMEstimator, XGBoostEstimator
+        from lightgbm import LGBMClassifier
+        from xgboost import XGBClassifier
+        from sklearn.datasets import load_iris
+
+        X, y = load_iris(return_X_y=True)
+
+        # Train with AutoML
+        automl = AutoML()
+        automl.fit(X, y, task="classification", time_budget=30,
+                   estimator_list=['lgbm', 'xgboost'])
+
+        # Get best configs for all estimators
+        configs = automl.best_config_per_estimator
+        print(configs)
+        # Example output: {'lgbm': {'n_estimators': 4, 'num_leaves': 4, 'log_max_bin': 8, ...},
+        #                  'xgboost': {'n_estimators': 4, 'max_leaves': 4, ...}}
+
+        # Use as starting points for a new AutoML run (warm start)
+        new_automl = AutoML()
+        new_automl.fit(X, y, task="classification", time_budget=30,
+                       starting_points=configs)
+
+        # Or convert to original model parameters for direct use
+        if configs.get('lgbm'):
+            lgbm_config = configs['lgbm'].copy()
+            lgbm_config.pop('FLAML_sample_size', None)  # Remove FLAML internal param
+            flaml_lgbm = LGBMEstimator(task="classification", **lgbm_config)
+            original_lgbm_params = flaml_lgbm.params  # Converted params (log_max_bin -> max_bin), or use flaml_lgbm.config2params(lgbm_config)
+            lgbm_model = LGBMClassifier(**original_lgbm_params)
+            lgbm_model.fit(X, y)
+
+        if configs.get('xgboost'):
+            xgb_config = configs['xgboost'].copy()
+            xgb_config.pop('FLAML_sample_size', None)  # Remove FLAML internal param
+            flaml_xgb = XGBoostEstimator(task="classification", **xgb_config)
+            original_xgb_params = flaml_xgb.params  # Converted params
+            xgb_model = XGBClassifier(**original_xgb_params)
+            xgb_model.fit(X, y)
+        ```
+        """
+        result = {}
+        for e, e_search_state in self._search_states.items():
+            if e_search_state.best_config:
+                config = e_search_state.best_config.get("ml", e_search_state.best_config).copy()
+                # Remove internal keys that are not needed for starting_points, but keep FLAML_sample_size
+                config.pop("learner", None)
+                config.pop("_choice_", None)
+                result[e] = config
+            else:
+                result[e] = None
+        return result

    @property
    def best_loss_per_estimator(self):
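
Review note: the filtering done by `best_config_per_estimator` in the hunk above can be sketched in isolation. This is a minimal illustration, not FLAML code: `extract_best_configs` and the `raw` dictionary are hypothetical stand-ins for the per-estimator search states, and only the key handling (`"ml"`, `"learner"`, `"_choice_"`) is taken from the diff.

```python
def extract_best_configs(best_configs):
    """Hypothetical helper mirroring the key filtering shown in the diff."""
    result = {}
    for name, best in best_configs.items():
        if best:
            # Prefer the nested "ml" dict when present, else the config itself.
            config = best.get("ml", best).copy()
            # Drop internal keys; FLAML_sample_size is deliberately kept.
            config.pop("learner", None)
            config.pop("_choice_", None)
            result[name] = config
        else:
            result[name] = None
    return result


# Made-up inputs, shaped like the configs the docstring describes.
raw = {
    "lgbm": {"ml": {"learner": "lgbm", "n_estimators": 4, "FLAML_sample_size": 100}},
    "xgboost": {"_choice_": 1, "max_leaves": 4},
    "rf": None,
}
configs = extract_best_configs(raw)
print(configs)
```

Because the config is copied before keys are popped, the stored best config is left untouched, which matters if the search continues afterwards.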
@@ -594,7 +789,7 @@ class AutoML(BaseEstimator):

    def predict(
        self,
-        X: np.array | DataFrame | list[str] | list[list[str]] | psDataFrame,
+        X: np.ndarray | DataFrame | list[str] | list[list[str]] | psDataFrame,
        **pred_kwargs,
    ):
        """Predict label from features.
@@ -660,6 +855,50 @@ class AutoML(BaseEstimator):
        proba = self._trained_estimator.predict_proba(X, **pred_kwargs)
        return proba

+    def preprocess(
+        self,
+        X: np.ndarray | DataFrame | list[str] | list[list[str]] | psDataFrame,
+    ):
+        """Preprocess data using task-level preprocessing.
+
+        This method applies task-level preprocessing transformations to the input data,
+        including handling of data types, sparse matrices, and feature transformations
+        that were learned during the fit phase. This should be called before any
+        estimator-level preprocessing.
+
+        Args:
+            X: A numpy array or pandas dataframe or pyspark.pandas dataframe
+                of featurized instances, shape n * m,
+                or for time series forecast tasks:
+                    a pandas dataframe with the first column containing
+                    timestamp values (datetime type) or an integer n for
+                    the predict steps (only valid when the estimator is
+                    arima or sarimax). Other columns in the dataframe
+                    are assumed to be exogenous variables (categorical
+                    or numeric).
+
+        Returns:
+            Preprocessed data in the same format as input (numpy array, DataFrame, etc.).
+
+        Raises:
+            AttributeError: If the model has not been fitted yet.
+
+        Example:
+            ```python
+            automl = AutoML()
+            automl.fit(X_train, y_train, task="classification")
+
+            # Apply task-level preprocessing to new data
+            X_test_preprocessed = automl.preprocess(X_test)
+            ```
+        """
+        if not hasattr(self, "_state") or self._state is None:
+            raise AttributeError("AutoML instance has not been fitted yet. Please call fit() first.")
+        if not hasattr(self, "_transformer"):
+            raise AttributeError("Transformer not initialized. Please call fit() first.")
+
+        return self._state.task.preprocess(X, self._transformer)
+
    def add_learner(self, learner_name, learner_class):
        """Add a customized learner.

@@ -818,6 +1057,14 @@ class AutoML(BaseEstimator):
                the searched learners, such as sample_weight. Below are a few examples of
                estimator-specific parameters:
                    period: int | forecast horizon for all time series forecast tasks.
+                        This is the number of time steps ahead to forecast (e.g., period=12 means
+                        forecasting 12 steps into the future). This represents the forecast horizon
+                        used during model training. Note: during prediction, the output length
+                        equals the length of X_test. FLAML automatically handles feature
+                        engineering for you - sklearn-based models (lgbm, rf, xgboost, etc.) will have
+                        lagged features created automatically, while time series native models (prophet,
+                        arima, sarimax) use their built-in forecasting capabilities. You do NOT need
+                        to manually create lagged features of the target variable.
                    gpu_per_trial: float, default = 0 | A float of the number of gpus per trial,
                        only used by TransformersEstimator, XGBoostSklearnEstimator, and
                        TemporalFusionTransformerEstimator.
@@ -925,6 +1172,7 @@ class AutoML(BaseEstimator):
        eval_method = self._decide_eval_method(eval_method, time_budget)
        self.modelcount = 0
        self._auto_augment = auto_augment
+        self._allow_label_overlap = self._settings.get("allow_label_overlap", True)
        self._prepare_data(eval_method, split_ratio, n_splits)
        self._state.time_budget = -1
        self._state.free_mem_ratio = 0
@@ -1112,17 +1360,344 @@ class AutoML(BaseEstimator):
        return self._state.data_size[0] if self._sample else None

    def pickle(self, output_file_name):
+        """Serialize the AutoML instance to a pickle file.
+
+        Notes:
+            When the trained estimator(s) are Spark-based, they may hold references
+            to SparkContext/SparkSession via Spark ML objects. Such objects are not
+            safely picklable and can cause pickling/broadcast errors.
+
+            This method externalizes Spark ML models into an adjacent artifact
+            directory and stores only lightweight metadata in the pickle.
+        """
+
+        import os
+        import pickle
+        import re
+
+        def _safe_name(name: str) -> str:
+            return re.sub(r"[^A-Za-z0-9_.-]+", "_", name)
+
+        def _iter_trained_estimators():
+            trained = getattr(self, "_trained_estimator", None)
+            if trained is not None:
+                yield "_trained_estimator", trained
+            for est_name in getattr(self, "estimator_list", []) or []:
+                ss = getattr(self, "_search_states", {}).get(est_name)
+                te = ss and getattr(ss, "trained_estimator", None)
+                if te is not None:
+                    yield f"_search_states.{est_name}.trained_estimator", te
+
+        def _scrub_pyspark_refs(root_obj):
+            """Best-effort removal of pyspark objects prior to pickling.
+
+            SparkContext/SparkSession and Spark DataFrame objects are not picklable.
+            This function finds such objects within common containers and instance
+            attributes and replaces them with None, returning a restore mapping.
+            """
+
+            try:
+                import pyspark
+                from pyspark.broadcast import Broadcast
+                from pyspark.sql import DataFrame as SparkDataFrame
+                from pyspark.sql import SparkSession
+
+                try:
+                    import pyspark.pandas as ps
+
+                    psDataFrameType = getattr(ps, "DataFrame", None)
+                    psSeriesType = getattr(ps, "Series", None)
+                except Exception:
+                    psDataFrameType = None
+                    psSeriesType = None
+
+                bad_types = [
+                    pyspark.SparkContext,
+                    SparkSession,
+                    SparkDataFrame,
+                    Broadcast,
+                ]
+                if psDataFrameType is not None:
+                    bad_types.append(psDataFrameType)
+                if psSeriesType is not None:
+                    bad_types.append(psSeriesType)
+                bad_types = tuple(t for t in bad_types if t is not None)
+            except Exception:
+                return {}
+
+            restore = {}
+            visited = set()
+
+            def _mark(parent, key, value, path):
+                restore[(id(parent), key)] = (parent, key, value)
+                try:
+                    if isinstance(parent, dict):
+                        parent[key] = None
+                    elif isinstance(parent, list):
+                        parent[key] = None
+                    elif isinstance(parent, tuple):
+                        # tuples are immutable; we can't modify in-place
+                        pass
+                    else:
+                        setattr(parent, key, None)
+                except Exception:
+                    # Best-effort.
+                    pass
+
+            def _walk(obj, depth, parent=None, key=None, path="self"):
+                if obj is None:
+                    return
+                oid = id(obj)
+                if oid in visited:
+                    return
+                visited.add(oid)
+
+                if isinstance(obj, bad_types):
+                    if parent is not None:
+                        _mark(parent, key, obj, path)
+                    return
+                if depth <= 0:
+                    return
+
+                if isinstance(obj, dict):
+                    for k, v in list(obj.items()):
+                        _walk(v, depth - 1, parent=obj, key=k, path=f"{path}[{k!r}]")
+                    return
+                if isinstance(obj, list):
+                    for i, v in enumerate(list(obj)):
+                        _walk(v, depth - 1, parent=obj, key=i, path=f"{path}[{i}]")
+                    return
+                if isinstance(obj, tuple):
+                    # Tuple items can't be replaced in place; still descend so nested containers are scrubbed.
+                    for i, v in enumerate(obj):
+                        _walk(v, depth - 1, parent=None, key=None, path=f"{path}[{i}]")
+                    return
+                if isinstance(obj, set):
+                    for v in list(obj):
+                        _walk(v, depth - 1, parent=None, key=None, path=f"{path}{{...}}")
+                    return
+
+                d = getattr(obj, "__dict__", None)
+                if isinstance(d, dict):
+                    for attr, v in list(d.items()):
+                        _walk(v, depth - 1, parent=obj, key=attr, path=f"{path}.{attr}")
+
+            _walk(root_obj, depth=6)
+            return restore
+
+        # Temporarily remove non-picklable pieces (e.g., SparkContext-backed objects)
+        # and externalize spark models.
+        estimator_to_training_function = {}
+        spark_restore = []
+        artifact_dir = None
+        state_restore = {}
+        automl_restore = {}
+        scrub_restore = {}
+
+        try:
+            # Signatures are only used for MLflow logging; they are not required
+            # for inference and can capture SparkContext via pyspark objects.
+            for attr in ("estimator_signature", "pipeline_signature"):
+                if hasattr(self, attr):
+                    automl_restore[attr] = getattr(self, attr)
+                    setattr(self, attr, None)
+
+            for estimator in self.estimator_list:
+                search_state = self._search_states[estimator]
+                if hasattr(search_state, "training_function"):
+                    estimator_to_training_function[estimator] = search_state.training_function
+                    del search_state.training_function
+
+            # AutoMLState may keep Spark / pandas-on-Spark dataframes which are not picklable.
+            # They are not required for inference, so strip them for serialization.
+            state = getattr(self, "_state", None)
+            if state is not None:
+                for attr in (
+                    "X_train",
+                    "y_train",
+                    "X_train_all",
+                    "y_train_all",
+                    "X_val",
+                    "y_val",
+                    "weight_val",
+                    "groups_val",
+                    "sample_weight_all",
+                    "groups",
+                    "groups_all",
+                    "kf",
+                ):
+                    if hasattr(state, attr):
+                        state_restore[attr] = getattr(state, attr)
+                        setattr(state, attr, None)
+
+            for key, est in _iter_trained_estimators():
+                if getattr(est, "estimator_baseclass", None) != "spark":
+                    continue
+
+                # Drop training data reference (Spark DataFrame / pandas-on-Spark).
+                old_df_train = getattr(est, "df_train", None)
+                old_model = getattr(est, "_model", None)
+
+                model_meta = None
+                if old_model is not None:
+                    if artifact_dir is None:
+                        artifact_dir = output_file_name + ".flaml_artifacts"
+                        os.makedirs(artifact_dir, exist_ok=True)
+                        # store relative dirname so the pickle+folder can be moved together
+                        self._flaml_pickle_artifacts_dirname = os.path.basename(artifact_dir)
+
+                    model_dir = os.path.join(artifact_dir, _safe_name(key))
+                    # Spark ML models are saved as directories.
+                    try:
+                        writer = old_model.write()
+                        writer.overwrite().save(model_dir)
+                    except Exception as e:
+                        raise RuntimeError(
+                            "Failed to externalize Spark model for pickling. "
+                            "Please ensure the Spark ML model supports write().overwrite().save(path)."
+                        ) from e
+
+                    model_meta = {
+                        "path": os.path.relpath(model_dir, os.path.dirname(output_file_name) or "."),
+                        "class": old_model.__class__.__module__ + "." + old_model.__class__.__name__,
+                    }
+                    # Replace in-memory Spark model with metadata only.
+                    est._model = None
+                    est._flaml_spark_model_meta = model_meta
+
+                est.df_train = None
+                spark_restore.append((est, old_model, old_df_train, model_meta))
+
+            with open(output_file_name, "wb") as f:
+                try:
+                    pickle.dump(self, f, pickle.HIGHEST_PROTOCOL)
+                except Exception:
+                    # Some pyspark objects can still be captured indirectly.
+                    scrub_restore = _scrub_pyspark_refs(self)
+                    if scrub_restore:
+                        f.seek(0)
+                        f.truncate()
+                        pickle.dump(self, f, pickle.HIGHEST_PROTOCOL)
+                    else:
+                        raise
+        finally:
+            # Restore training_function and Spark models so current object remains usable.
+            for estimator, tf in estimator_to_training_function.items():
+                self._search_states[estimator].training_function = tf
+
+            for attr, val in automl_restore.items():
+                setattr(self, attr, val)
+
+            state = getattr(self, "_state", None)
+            if state is not None and state_restore:
+                for attr, val in state_restore.items():
+                    setattr(state, attr, val)
+
+            for est, old_model, old_df_train, model_meta in spark_restore:
+                est._model = old_model
+                est.df_train = old_df_train
+                if model_meta is not None and hasattr(est, "_flaml_spark_model_meta"):
+                    delattr(est, "_flaml_spark_model_meta")
+
+            if scrub_restore:
+                for _, (parent, key, value) in scrub_restore.items():
+                    try:
+                        if isinstance(parent, dict):
+                            parent[key] = value
+                        elif isinstance(parent, list):
+                            parent[key] = value
+                        else:
+                            setattr(parent, key, value)
+                    except Exception:
+                        pass
+
+    @classmethod
+    def load_pickle(cls, input_file_name: str, load_spark_models: bool = True):
+        """Load an AutoML instance saved by :meth:`pickle`.
+
+        Args:
+            input_file_name: Path to the pickle file created by :meth:`pickle`.
+            load_spark_models: Whether to load externalized Spark ML models back
+                into the estimator objects. If False, Spark estimators will remain
+                without their underlying Spark model and cannot be used for predict.
+
+        Returns:
+            The deserialized AutoML instance.
+        """
+        import importlib
+        import os
        import pickle

-        estimator_to_training_function = {}
-        for estimator in self.estimator_list:
-            search_state = self._search_states[estimator]
-            if hasattr(search_state, "training_function"):
-                estimator_to_training_function[estimator] = search_state.training_function
-                del search_state.training_function
+        with open(input_file_name, "rb") as f:
+            automl = pickle.load(f)

-        with open(output_file_name, "wb") as f:
-            pickle.dump(self, f, pickle.HIGHEST_PROTOCOL)
+        # Recreate per-estimator training_function if it was removed for pickling.
+        try:
+            for est_name, ss in getattr(automl, "_search_states", {}).items():
+                if not hasattr(ss, "training_function"):
+                    ss.training_function = partial(
+                        AutoMLState._compute_with_config_base,
+                        state=automl._state,
+                        estimator=est_name,
+                    )
+        except Exception:
+            # Best-effort; training_function is only needed for re-searching.
+            pass
+
+        if not load_spark_models:
+            return automl
+
+        base_dir = os.path.dirname(input_file_name) or "."
+
+        def _iter_trained_estimators_loaded():
+            trained = getattr(automl, "_trained_estimator", None)
+            if trained is not None:
+                yield trained
+            for ss in getattr(automl, "_search_states", {}).values():
+                te = ss and getattr(ss, "trained_estimator", None)
+                if te is not None:
+                    yield te
+
+        for est in _iter_trained_estimators_loaded():
+            meta = getattr(est, "_flaml_spark_model_meta", None)
+            if not meta:
+                continue
+            model_path = meta.get("path")
+            model_class = meta.get("class")
+            if not model_path or not model_class:
+                continue
+
+            abs_model_path = os.path.join(base_dir, model_path)
+
+            module_name, _, class_name = model_class.rpartition(".")
+            try:
+                module = importlib.import_module(module_name)
+                model_cls = getattr(module, class_name)
+            except Exception as e:
+                raise RuntimeError(f"Failed to import Spark model class '{model_class}'") from e
+
+            # Most Spark ML models support either Class.load(path) or Class.read().load(path).
+            if hasattr(model_cls, "load"):
+                est._model = model_cls.load(abs_model_path)
+            elif hasattr(model_cls, "read"):
+                est._model = model_cls.read().load(abs_model_path)
+            else:
+                try:
+                    from pyspark.ml.pipeline import PipelineModel
+
+                    loaded_model = PipelineModel.load(abs_model_path)
+                    if not isinstance(loaded_model, model_cls):
+                        raise RuntimeError(
+                            f"Loaded model type '{type(loaded_model).__name__}' does not match expected type '{model_class}'."
+                        )
+                    est._model = loaded_model
+                except Exception as e:
+                    raise RuntimeError(
+                        f"Spark model class '{model_class}' does not support load/read(). "
+                        "Unable to restore Spark model from artifacts."
+                    ) from e
+
+        return automl

    @property
    def trainable(self) -> Callable[[dict], float | None]:
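
Review note: the core pattern in the new `pickle` method is "strip, dump, restore in `finally`". The sketch below shows that pattern on its own, assuming nothing about Spark: `dump_without` is a hypothetical helper, and a `threading.Lock` stands in for an unpicklable object such as a SparkSession.

```python
import io
import pickle
import threading
import types


def dump_without(obj, attrs, fileobj):
    """Pickle obj with the named attributes set to None, then restore them."""
    saved = {}
    try:
        for attr in attrs:
            if hasattr(obj, attr):
                saved[attr] = getattr(obj, attr)
                setattr(obj, attr, None)
        pickle.dump(obj, fileobj, pickle.HIGHEST_PROTOCOL)
    finally:
        # Runs even if pickling fails, so the live object stays usable.
        for attr, value in saved.items():
            setattr(obj, attr, value)


# A lock cannot be pickled, much like a live Spark handle.
holder = types.SimpleNamespace(weights=[1, 2, 3], lock=threading.Lock())

buf = io.BytesIO()
dump_without(holder, ("lock",), buf)
loaded = pickle.loads(buf.getvalue())
print(loaded.weights, loaded.lock)
```

The restore step lives in `finally` for the same reason as in the diff: a failed dump should not leave the in-memory instance without its training data or models.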
@@ -1201,6 +1776,7 @@ class AutoML(BaseEstimator):
            n_splits,
            self._df,
            self._sample_weight_full,
+            self._allow_label_overlap,
        )
        self.data_size_full = self._state.data_size_full

@@ -1257,6 +1833,7 @@ class AutoML(BaseEstimator):
        time_col=None,
        cv_score_agg_func=None,
        skip_transform=None,
+        allow_label_overlap=True,
        mlflow_logging=None,
        fit_kwargs_by_estimator=None,
        mlflow_exp_name=None,
@@ -1285,6 +1862,8 @@ class AutoML(BaseEstimator):
                e.g., 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_weighted',
                'roc_auc_ovo_weighted', 'roc_auc_ovr_weighted', 'f1', 'micro_f1', 'macro_f1',
                'log_loss', 'mae', 'mse', 'r2', 'mape'. Default is 'auto'.
+                For a full list of supported built-in metrics, please refer to
+                https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#optimization-metric
                If passing a customized metric function, the function needs to
                have the following input arguments:

@@ -1321,6 +1900,10 @@ class AutoML(BaseEstimator):
                "pred_time": pred_time,
            }
        ```
+                **Note:** When passing a custom metric function, pass the function itself
+                (e.g., `metric=custom_metric`), not the result of calling it
+                (e.g., `metric=custom_metric(...)`). FLAML will call your function
+                internally during the training process.
            task: A string of the task type, e.g.,
                'classification', 'regression', 'ts_forecast_regression',
                'ts_forecast_classification', 'rank', 'seq-classification',
@@ -1343,6 +1926,11 @@ class AutoML(BaseEstimator):
                and 'final_estimator' to specify the passthrough and
                final_estimator in the stacker. The dict can also contain
                'n_jobs' as the key to specify the number of jobs for the stacker.
+                Note: The hyperparameters of a custom 'final_estimator' are NOT
+                automatically tuned. If you provide an estimator instance (e.g.,
+                CatBoostClassifier()), it will use the parameters you specified
+                or their defaults. If 'final_estimator' is not provided, the best
+                model found during the search will be used as the final estimator.
            eval_method: A string of resampling strategy, one of
                ['auto', 'cv', 'holdout'].
            split_ratio: A float of the validation data percentage for holdout.
@@ -1532,6 +2120,12 @@ class AutoML(BaseEstimator):
        ```

            skip_transform: boolean, default=False | Whether to pre-process data prior to modeling.
+            allow_label_overlap: boolean, default=True | For classification tasks with holdout evaluation,
+                whether to allow label overlap between train and validation sets. When True (default),
+                uses a fast strategy that adds the first instance of missing labels to the set that is
+                missing them, which may create some overlap. When False, uses a precise but slower
+                strategy that intelligently re-splits instances to avoid overlap when possible.
+                Only affects classification tasks with holdout evaluation method.
            mlflow_logging: boolean, default=None | Whether to log the training results to mlflow.
                Default value is None, which means the logging decision is made based on
                AutoML.__init__'s mlflow_logging argument. Not valid if mlflow is not installed.
@@ -1565,6 +2159,14 @@ class AutoML(BaseEstimator):
                the searched learners, such as sample_weight. Below are a few examples of
                estimator-specific parameters:
                    period: int | forecast horizon for all time series forecast tasks.
+                        This is the number of time steps ahead to forecast (e.g., period=12 means
+                        forecasting 12 steps into the future). This represents the forecast horizon
+                        used during model training. Note: during prediction, the output length
+                        equals the length of X_test. FLAML automatically handles feature
+                        engineering for you - sklearn-based models (lgbm, rf, xgboost, etc.) will have
+                        lagged features created automatically, while time series native models (prophet,
+                        arima, sarimax) use their built-in forecasting capabilities. You do NOT need
+                        to manually create lagged features of the target variable.
                    gpu_per_trial: float, default = 0 | A float of the number of gpus per trial,
                        only used by TransformersEstimator, XGBoostSklearnEstimator, and
                        TemporalFusionTransformerEstimator.
@@ -1601,7 +2203,10 @@ class AutoML(BaseEstimator):
        split_ratio = split_ratio or self._settings.get("split_ratio")
        n_splits = n_splits or self._settings.get("n_splits")
        auto_augment = self._settings.get("auto_augment") if auto_augment is None else auto_augment
-        metric = metric or self._settings.get("metric")
+        allow_label_overlap = (
+            self._settings.get("allow_label_overlap") if allow_label_overlap is None else allow_label_overlap
+        )
+        metric = self._settings.get("metric") if metric is None else metric
        estimator_list = estimator_list or self._settings.get("estimator_list")
        log_file_name = self._settings.get("log_file_name") if log_file_name is None else log_file_name
        max_iter = self._settings.get("max_iter") if max_iter is None else max_iter
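
Review note: the switch from `metric or ...` to an explicit `is None` check changes how falsy arguments are treated. The sketch below uses two hypothetical helpers to show the difference; the presumed motivation (letting an explicit but falsy argument reach validation instead of being silently replaced by the setting) is an assumption, not something the diff states.

```python
def resolve_with_or(value, default):
    # Any falsy value (None, "", 0, False) falls back to the default.
    return value or default


def resolve_if_none(value, default):
    # Only None falls back; other falsy values are passed through.
    return default if value is None else value


for candidate in (None, "", 0, False, "accuracy"):
    print(repr(candidate), resolve_with_or(candidate, "auto"), resolve_if_none(candidate, "auto"))
```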
@@ -1783,6 +2388,7 @@ class AutoML(BaseEstimator):

        self._retrain_in_budget = retrain_full == "budget" and (eval_method == "holdout" and self._state.X_val is None)
        self._auto_augment = auto_augment
+        self._allow_label_overlap = allow_label_overlap

        _sample_size_from_starting_points = {}
        if isinstance(starting_points, dict):
@@ -1840,6 +2446,9 @@ class AutoML(BaseEstimator):
                and (self._min_sample_size * SAMPLE_MULTIPLY_FACTOR < self._state.data_size[0])
            )

+        # Validate metric parameter before processing
+        self._validate_metric_parameter(metric, allow_auto=True)
+
        metric = task.default_metric(metric)
        self._state.metric = metric

@@ -2174,7 +2783,7 @@ class AutoML(BaseEstimator):
                use_spark=True,
                force_cancel=self._force_cancel,
                mlflow_exp_name=self._mlflow_exp_name,
-                automl_info=(mlflow_log_latency,),  # pass automl info to tune.run
+                automl_info=(mlflow_log_latency, self._log_type),  # pass automl info to tune.run
                extra_tag=self.autolog_extra_tag,
                # raise_on_failed_trial=False,
                # keep_checkpoints_num=1,
@@ -2237,7 +2846,9 @@ class AutoML(BaseEstimator):
                if better or self._log_type == "all":
                    self._log_trial(search_state, estimator)
                if self.mlflow_integration:
-                    self.mlflow_integration.record_state(self, search_state, estimator)
+                    self.mlflow_integration.record_state(
+                        self, search_state, estimator, better or self._log_type == "all"
+                    )

    def _log_trial(self, search_state, estimator):
        if self._training_log:
@@ -2479,10 +3090,12 @@ class AutoML(BaseEstimator):
                if better or self._log_type == "all":
                    self._log_trial(search_state, estimator)
                if self.mlflow_integration:
-                    self.mlflow_integration.record_state(self, search_state, estimator)
+                    self.mlflow_integration.record_state(
+                        self, search_state, estimator, better or self._log_type == "all"
+                    )

                logger.info(
-                    " at {:.1f}s,\testimator {}'s best error={:.4f},\tbest estimator {}'s best error={:.4f}".format(
+                    " at {:.1f}s,\testimator {}'s best error={:.4e},\tbest estimator {}'s best error={:.4e}".format(
                        self._state.time_from_start,
                        estimator,
                        search_state.best_loss,
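
Review note: the effect of moving the log format from `.4f` to `.4e` is easiest to see on a very small loss. The value below is made up for illustration.

```python
small_loss = 0.00001234  # illustrative value, not from a real run

fixed = "{:.4f}".format(small_loss)
sci = "{:.4e}".format(small_loss)

print(fixed)  # 0.0000
print(sci)    # 1.2340e-05
```

With fixed-point formatting, any loss below 0.00005 prints as `0.0000`, so successive improvements become indistinguishable in the log; scientific notation keeps four significant decimals regardless of magnitude.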
@@ -2659,6 +3272,10 @@ class AutoML(BaseEstimator):
                    # the total degree of parallelization = parallelization degree per estimator * parallelization degree of ensemble
                )
                if isinstance(self._ensemble, dict):
+                    # Note: If a custom final_estimator is provided, it is used as-is without
+                    # hyperparameter tuning. The user is responsible for setting appropriate
+                    # parameters or using defaults. If not provided, the best model found
+                    # during the search (self._trained_estimator) is used.
                    final_estimator = self._ensemble.get("final_estimator", self._trained_estimator)
                    passthrough = self._ensemble.get("passthrough", True)
                    ensemble_n_jobs = self._ensemble.get("n_jobs", ensemble_n_jobs)
--- a/flaml/automl/data.py
+++ b/flaml/automl/data.py
@@ -5,6 +5,7 @@
 import json
 import os
 import random
+import re
 import uuid
 from datetime import datetime, timedelta
 from decimal import ROUND_HALF_UP, Decimal
@@ -708,6 +709,14 @@ def auto_convert_dtypes_pandas(
    """
    if na_values is None:
        na_values = {"NA", "na", "NULL", "null", ""}
+    # Remove the empty string separately (handled by the regex `^\s*$`)
+    vals = [re.escape(v) for v in na_values if v != ""]
+    # Build inner alternation group
+    inner = "|".join(vals)  # joining an empty list already yields ""
+    if inner:
+        pattern = re.compile(rf"^\s*(?:{inner})?\s*$")
+    else:
+        pattern = re.compile(r"^\s*$")

    df_converted = df.convert_dtypes()
    schema = {}
@@ -721,7 +730,11 @@ def auto_convert_dtypes_pandas(
    for col in df.columns:
        series = df[col]
        # Replace NA-like values if string
-        series_cleaned = series.map(lambda x: np.nan if isinstance(x, str) and x.strip() in na_values else x)
+        if series.dtype == object:
+            mask = series.astype(str).str.match(pattern)
+            series_cleaned = series.where(~mask, np.nan)
+        else:
+            series_cleaned = series

        # Skip conversion if already non-object data type, except bool which can potentially be categorical
        if (
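
Review note: the NA-matching pattern built in `auto_convert_dtypes_pandas` can be exercised with the standard library alone. `build_na_pattern` is a hypothetical wrapper around the lines added in the diff, and the sample strings are made up.

```python
import re


def build_na_pattern(na_values):
    """Hypothetical wrapper around the pattern construction in the diff."""
    vals = [re.escape(v) for v in na_values if v != ""]
    inner = "|".join(vals)
    if inner:
        # The optional group means blank strings match as well.
        return re.compile(rf"^\s*(?:{inner})?\s*$")
    return re.compile(r"^\s*$")


pattern = build_na_pattern({"NA", "na", "NULL", "null", ""})
samples = ["NA", "  null ", "", "   ", "nan", "NA1", "N/A"]
matches = {s: bool(pattern.match(s)) for s in samples}
print(matches)
```

One behavioral difference from the removed `x.strip() in na_values` check is worth keeping in mind: whitespace-only strings now match even when `""` is not in `na_values`, since both branches of the pattern accept them.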
--- a/flaml/automl/ml.py
+++ b/flaml/automl/ml.py
@@ -311,14 +311,14 @@ def get_y_pred(estimator, X, eval_metric, task: Task):
    else:
        y_pred = estimator.predict(X)

-    if isinstance(y_pred, Series) or isinstance(y_pred, DataFrame):
+    if isinstance(y_pred, (Series, DataFrame)):
        y_pred = y_pred.values

    return y_pred


 def to_numpy(x):
-    if isinstance(x, Series or isinstance(x, DataFrame)):
+    if isinstance(x, (Series, DataFrame)):
        x = x.values
    else:
        x = np.ndarray(x)
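
Review note: the `to_numpy` change fixes a misplaced parenthesis, not just style. The sketch below reproduces the old and new conditions with empty stand-in classes (named `Series` and `DataFrame` only for readability; they are not the pandas types).

```python
class Series:
    """Stand-in for pandas.Series, for illustration only."""


class DataFrame:
    """Stand-in for pandas.DataFrame, for illustration only."""


def is_pandas_old(x):
    # `Series or <anything>` evaluates to `Series` because a class is truthy,
    # so the inner isinstance call is never evaluated.
    return isinstance(x, Series or isinstance(x, DataFrame))


def is_pandas_new(x):
    # A tuple of types checks against each member.
    return isinstance(x, (Series, DataFrame))


print(is_pandas_old(Series()), is_pandas_old(DataFrame()))
print(is_pandas_new(Series()), is_pandas_new(DataFrame()))
```

Under the old condition a DataFrame fell through to the `else` branch; the tuple form sends it down the `.values` path as intended.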
@@ -586,7 +586,7 @@ def _eval_estimator(

        # TODO: why are integer labels being cast to str in the first place?

-        if isinstance(val_pred_y, Series) or isinstance(val_pred_y, DataFrame) or isinstance(val_pred_y, np.ndarray):
+        if isinstance(val_pred_y, (Series, DataFrame, np.ndarray)):
            test = val_pred_y if isinstance(val_pred_y, np.ndarray) else val_pred_y.values
            if not np.issubdtype(test.dtype, np.number):
                # some NLP models return a list
@@ -616,7 +616,12 @@ def _eval_estimator(
            logger.warning(f"ValueError {e} happened in `metric_loss_score`, set `val_loss` to `np.inf`")
        metric_for_logging = {"pred_time": pred_time}
        if log_training_metric:
-            train_pred_y = get_y_pred(estimator, X_train, eval_metric, task)
+            # For time series forecasting, X_train may be a sampled dataset whose
+            # test partition can be empty. Use the training partition from X_val
+            # (which is the dataset used to define y_train above) to keep shapes
+            # aligned and avoid empty prediction inputs.
+            X_train_for_metric = X_val.X_train if isinstance(X_val, TimeSeriesDataset) else X_train
+            train_pred_y = get_y_pred(estimator, X_train_for_metric, eval_metric, task)
            metric_for_logging["train_loss"] = metric_loss_score(
                eval_metric,
                train_pred_y,
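Review note on the `to_numpy` change in this file: the old condition `isinstance(x, Series or isinstance(x, DataFrame))` never tested for `DataFrame`. The sketch below reproduces that with stand-in classes (hypothetical, used here so the example runs without pandas).

```python
class Series:
    """Stand-in for pandas.Series (hypothetical, for illustration only)."""


class DataFrame:
    """Stand-in for pandas.DataFrame (hypothetical, for illustration only)."""


def is_pandas_old(x):
    # `Series or <anything>` short-circuits to `Series` because a class object
    # is truthy, so the inner isinstance call is never evaluated and
    # DataFrame is never checked.
    return isinstance(x, Series or isinstance(x, DataFrame))


def is_pandas_new(x):
    # A tuple of types accepts an instance of either class.
    return isinstance(x, (Series, DataFrame))


print(is_pandas_old(Series()), is_pandas_old(DataFrame()))
print(is_pandas_new(Series()), is_pandas_new(DataFrame()))
```

With the old form a `DataFrame` falls through to the `else` branch; the tuple form sends both types down the `.values` path.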
--- a/flaml/automl/model.py
+++ b/flaml/automl/model.py
@@ -26,6 +26,13 @@ from sklearn.preprocessing import Normalizer
 from sklearn.svm import LinearSVC
 from xgboost import __version__ as xgboost_version

+try:
+    from sklearn.utils._tags import ClassifierTags, RegressorTags
+
+    SKLEARN_TAGS_AVAILABLE = True
+except ImportError:
+    SKLEARN_TAGS_AVAILABLE = False
+
 from flaml import tune
 from flaml.automl.data import group_counts
 from flaml.automl.spark import ERROR as SPARK_ERROR
@@ -135,6 +142,7 @@ class BaseEstimator(sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator):
        self._task = task if isinstance(task, Task) else task_factory(task, None, None)
        self.params = self.config2params(config)
        self.estimator_class = self._model = None
+        self.estimator_baseclass = "sklearn"
        if "_estimator_type" in self.params:
            self._estimator_type = self.params.pop("_estimator_type")
        else:
@@ -147,6 +155,25 @@ class BaseEstimator(sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator):
            params["_estimator_type"] = self._estimator_type
        return params

+    def __sklearn_tags__(self):
+        """Override sklearn tags to respect the _estimator_type attribute.
+
+        This is needed for sklearn 1.7+ which uses get_tags() instead of
+        checking _estimator_type directly. Since BaseEstimator inherits from
+        ClassifierMixin, it would otherwise always be tagged as a classifier.
+        """
+        tags = super().__sklearn_tags__()
+        if hasattr(self, "_estimator_type") and SKLEARN_TAGS_AVAILABLE:
+            if self._estimator_type == "regressor":
+                tags.estimator_type = "regressor"
+                tags.regressor_tags = RegressorTags()
+                tags.classifier_tags = None
+            elif self._estimator_type == "classifier":
+                tags.estimator_type = "classifier"
+                tags.classifier_tags = ClassifierTags()
+                tags.regressor_tags = None
+        return tags
+
    @property
    def classes_(self):
        return self._model.classes_
@@ -294,6 +321,35 @@ class BaseEstimator(sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator):
            train_time = self._fit(X_train, y_train, **kwargs)
        return train_time

+    def preprocess(self, X):
+        """Preprocess data using estimator-level preprocessing.
+
+        This method applies estimator-specific preprocessing transformations to the input data.
+        This is the second level of preprocessing that should be applied after task-level
+        preprocessing (automl.preprocess()). Different estimator types may apply different
+        preprocessing steps (e.g., sparse matrix conversion, dataframe handling).
+
+        Args:
+            X: A numpy array or a dataframe of featurized instances, shape n*m.
+
+        Returns:
+            Preprocessed data ready for the estimator's predict/fit methods.
+
+        Example:
+            ```python
+            automl = AutoML()
+            automl.fit(X_train, y_train, task="classification")
+
+            # First apply task-level preprocessing
+            X_test_task = automl.preprocess(X_test)
+
+            # Then apply estimator-level preprocessing
+            estimator = automl.model
+            X_test_estimator = estimator.preprocess(X_test_task)
+            ```
+        """
+        return self._preprocess(X)
+
    def predict(self, X, **kwargs):
        """Predict label from features.

@@ -439,6 +495,7 @@ class SparkEstimator(BaseEstimator):
            raise SPARK_ERROR
        super().__init__(task, **config)
        self.df_train = None
+        self.estimator_baseclass = "spark"

    def _preprocess(
        self,
@@ -974,7 +1031,7 @@ class TransformersEstimator(BaseEstimator):
        from .nlp.huggingface.utils import tokenize_text
        from .nlp.utils import is_a_list_of_str

-        is_str = str(X.dtypes[0]) in ("string", "str")
+        is_str = str(X.dtypes.iloc[0]) in ("string", "str")
        is_list_of_str = is_a_list_of_str(X[list(X.keys())[0]].to_list()[0])

        if is_str or is_list_of_str:
@@ -1139,16 +1196,31 @@ class TransformersEstimator(BaseEstimator):
                    control.should_save = True
                    control.should_evaluate = True

-        self._trainer = TrainerForAuto(
-            args=self._training_args,
-            model_init=self._model_init,
-            train_dataset=train_dataset,
-            eval_dataset=eval_dataset,
-            tokenizer=self.tokenizer,
-            data_collator=self.data_collator,
-            compute_metrics=self._compute_metrics_by_dataset_name,
-            callbacks=[EarlyStoppingCallbackForAuto],
-        )
+        # Use processing_class for transformers >= 4.46.0, tokenizer for older versions
+        trainer_kwargs = {
+            "args": self._training_args,
+            "model_init": self._model_init,
+            "train_dataset": train_dataset,
+            "eval_dataset": eval_dataset,
+            "data_collator": self.data_collator,
+            "compute_metrics": self._compute_metrics_by_dataset_name,
+            "callbacks": [EarlyStoppingCallbackForAuto],
+        }
+
+        # Check if processing_class parameter is supported (transformers >= 4.46.0)
+        try:
+            import transformers
+            from packaging import version
+
+            if version.parse(transformers.__version__) >= version.parse("4.46.0"):
+                trainer_kwargs["processing_class"] = self.tokenizer
+            else:
+                trainer_kwargs["tokenizer"] = self.tokenizer
+        except (ImportError, AttributeError, ValueError):
+            # Fallback to tokenizer if version check fails
+            trainer_kwargs["tokenizer"] = self.tokenizer
+
+        self._trainer = TrainerForAuto(**trainer_kwargs)

        if self._task in NLG_TASKS:
            setattr(self._trainer, "_is_seq2seq", True)
@@ -2347,8 +2419,11 @@ class SGDEstimator(SKLearnEstimator):
        params = super().config2params(config)
        params["tol"] = params.get("tol", 0.0001)
        params["loss"] = params.get("loss", None)
-        if params["loss"] is None and self._task.is_classification():
-            params["loss"] = "log_loss" if SKLEARN_VERSION >= "1.1" else "log"
+        if params["loss"] is None:
+            if self._task.is_classification():
+                params["loss"] = "log_loss" if SKLEARN_VERSION >= "1.1" else "log"
+            else:
+                params["loss"] = "squared_error"
        if not self._task.is_classification() and "n_jobs" in params:
            params.pop("n_jobs")

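Review note on the `TransformersEstimator` hunk in this file: the version comparison used to choose between `tokenizer` and `processing_class` could alternatively be written as a capability check. The sketch below is not what the diff does. It uses stand-in classes instead of the real `transformers.Trainer`, and `OldTrainer`, `NewTrainer`, `tokenizer_kwarg_name`, and `build_trainer` are hypothetical names chosen for illustration.

```python
import inspect


class OldTrainer:
    """Stand-in for a trainer whose constructor accepts `tokenizer`."""

    def __init__(self, args=None, tokenizer=None):
        self.tokenizer = tokenizer


class NewTrainer:
    """Stand-in for a trainer whose constructor accepts `processing_class`."""

    def __init__(self, args=None, processing_class=None):
        self.processing_class = processing_class


def tokenizer_kwarg_name(trainer_cls):
    """Return the keyword under which the tokenizer should be passed."""
    params = inspect.signature(trainer_cls.__init__).parameters
    return "processing_class" if "processing_class" in params else "tokenizer"


def build_trainer(trainer_cls, tokenizer, **kwargs):
    """Construct the trainer, routing the tokenizer to the supported keyword."""
    kwargs[tokenizer_kwarg_name(trainer_cls)] = tokenizer
    return trainer_cls(**kwargs)


print(tokenizer_kwarg_name(OldTrainer), tokenizer_kwarg_name(NewTrainer))
```

Signature inspection avoids hard-coding a release number, but it can mislead when a constructor hides its parameters behind `**kwargs` or an unwrapped decorator, so the version check in the diff remains a reasonable choice.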
--- a/flaml/automl/nlp/huggingface/training_args.py
+++ b/flaml/automl/nlp/huggingface/training_args.py
@@ -5,7 +5,7 @@ from typing import List, Optional
 from flaml.automl.task.task import NLG_TASKS

 try:
-    from transformers import TrainingArguments
+    from transformers import Seq2SeqTrainingArguments as TrainingArguments
 except ImportError:
    TrainingArguments = object

--- a/flaml/automl/nlp/huggingface/utils.py
+++ b/flaml/automl/nlp/huggingface/utils.py
@@ -211,29 +211,28 @@ def tokenize_onedataframe(
    hf_args=None,
    prefix_str=None,
 ):
-    with tokenizer.as_target_tokenizer():
-        _, tokenized_column_names = tokenize_row(
-            dict(X.iloc[0]),
+    _, tokenized_column_names = tokenize_row(
+        dict(X.iloc[0]),
+        tokenizer,
+        prefix=(prefix_str,) if task is SUMMARIZATION else None,
+        task=task,
+        hf_args=hf_args,
+        return_column_name=True,
+    )
+    d = X.apply(
+        lambda x: tokenize_row(
+            x,
            tokenizer,
            prefix=(prefix_str,) if task is SUMMARIZATION else None,
            task=task,
            hf_args=hf_args,
-            return_column_name=True,
-        )
-        d = X.apply(
-            lambda x: tokenize_row(
-                x,
-                tokenizer,
-                prefix=(prefix_str,) if task is SUMMARIZATION else None,
-                task=task,
-                hf_args=hf_args,
-            ),
-            axis=1,
-            result_type="expand",
-        )
-        X_tokenized = pd.DataFrame(columns=tokenized_column_names)
-        X_tokenized[tokenized_column_names] = d
-        return X_tokenized
+        ),
+        axis=1,
+        result_type="expand",
+    )
+    X_tokenized = pd.DataFrame(columns=tokenized_column_names)
+    X_tokenized[tokenized_column_names] = d
+    return X_tokenized


 def tokenize_row(
@@ -396,7 +395,7 @@ def load_model(checkpoint_path, task, num_labels=None):

        if task in (SEQCLASSIFICATION, SEQREGRESSION):
            return AutoModelForSequenceClassification.from_pretrained(
-                checkpoint_path, config=model_config, ignore_mismatched_sizes=True
+                checkpoint_path, config=model_config, ignore_mismatched_sizes=True, trust_remote_code=True
            )
        elif task == TOKENCLASSIFICATION:
            return AutoModelForTokenClassification.from_pretrained(checkpoint_path, config=model_config)
--- a/flaml/automl/nlp/utils.py
+++ b/flaml/automl/nlp/utils.py
@@ -25,9 +25,7 @@ def load_default_huggingface_metric_for_task(task):


 def is_a_list_of_str(this_obj):
-    return (isinstance(this_obj, list) or isinstance(this_obj, np.ndarray)) and all(
-        isinstance(x, str) for x in this_obj
-    )
+    return isinstance(this_obj, (list, np.ndarray)) and all(isinstance(x, str) for x in this_obj)


 def _clean_value(value: Any) -> str:
--- a/flaml/automl/spark/init.py
+++ b/flaml/automl/spark/init.py
@@ -1,3 +1,5 @@
+import atexit
+import logging
 import os

 os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
@@ -10,13 +12,14 @@ try:
    from pyspark.pandas import Series as psSeries
    from pyspark.pandas import set_option
    from pyspark.sql import DataFrame as sparkDataFrame
+    from pyspark.sql import SparkSession
    from pyspark.util import VersionUtils
 except ImportError:

    class psDataFrame:
        pass

-    F = T = ps = sparkDataFrame = psSeries = psDataFrame
+    F = T = ps = sparkDataFrame = SparkSession = psSeries = psDataFrame
    _spark_major_minor_version = set_option = None
    ERROR = ImportError(
        """Please run pip install flaml[spark]
@@ -32,3 +35,60 @@ try:
    from pandas import DataFrame, Series
 except ImportError:
    DataFrame = Series = pd = None
+
+
+logger = logging.getLogger(__name__)
+
+
+def disable_spark_ansi_mode():
+    """Disable Spark ANSI mode if it is enabled."""
+    spark = SparkSession.getActiveSession() if hasattr(SparkSession, "getActiveSession") else None
+    adjusted = False
+    try:
+        ps_conf = ps.get_option("compute.fail_on_ansi_mode")
+    except Exception:
+        ps_conf = None
+    ansi_conf = [None, ps_conf]  # original values: [spark.sql.ansi.enabled, compute.fail_on_ansi_mode]
+    # Spark may store the config as string 'true'/'false' (or boolean in some contexts)
+    if spark is not None:
+        ansi_conf[0] = spark.conf.get("spark.sql.ansi.enabled")
+        ansi_enabled = (
+            (isinstance(ansi_conf[0], str) and ansi_conf[0].lower() == "true")
+            or (isinstance(ansi_conf[0], bool) and ansi_conf[0] is True)
+            or ansi_conf[0] is None
+        )
+        try:
+            if ansi_enabled:
+                logger.debug("Adjusting spark.sql.ansi.enabled to false")
+                spark.conf.set("spark.sql.ansi.enabled", "false")
+                adjusted = True
+        except Exception:
+            # If setting the option fails for some reason, keep going and let
+            # pandas-on-Spark raise a meaningful error later.
+            logger.exception("Failed to set spark.sql.ansi.enabled")
+
+    if ansi_conf[1]:
+        logger.debug("Adjusting pandas-on-Spark compute.fail_on_ansi_mode to False")
+        ps.set_option("compute.fail_on_ansi_mode", False)
+        adjusted = True
+
+    return spark, ansi_conf, adjusted
+
+
+def restore_spark_ansi_mode(spark, ansi_conf, adjusted):
+    """Restore Spark ANSI mode to its original setting."""
+    # Restore the original spark.sql.ansi.enabled to avoid persistent side-effects.
+    if adjusted and spark and ansi_conf[0] is not None:
+        try:
+            logger.debug(f"Restoring spark.sql.ansi.enabled to {ansi_conf[0]}")
+            spark.conf.set("spark.sql.ansi.enabled", ansi_conf[0])
+        except Exception:
+            logger.exception("Failed to restore spark.sql.ansi.enabled")
+
+    if adjusted and ansi_conf[1]:
+        logger.debug(f"Restoring pandas-on-Spark compute.fail_on_ansi_mode to {ansi_conf[1]}")
+        ps.set_option("compute.fail_on_ansi_mode", ansi_conf[1])
+
+
+spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
--- a/flaml/automl/spark/utils.py
+++ b/flaml/automl/spark/utils.py
@@ -59,17 +59,29 @@ def to_pandas_on_spark(
    ```
    """
    set_option("compute.default_index_type", default_index_type)
-    if isinstance(df, (DataFrame, Series)):
-        return ps.from_pandas(df)
-    elif isinstance(df, sparkDataFrame):
-        if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
-            return df.to_pandas_on_spark(index_col=index_col)
+    try:
+        orig_ps_conf = ps.get_option("compute.fail_on_ansi_mode")
+    except Exception:
+        orig_ps_conf = None
+    if orig_ps_conf:
+        ps.set_option("compute.fail_on_ansi_mode", False)
+
+    try:
+        if isinstance(df, (DataFrame, Series)):
+            return ps.from_pandas(df)
+        elif isinstance(df, sparkDataFrame):
+            if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
+                return df.to_pandas_on_spark(index_col=index_col)
+            else:
+                return df.pandas_api(index_col=index_col)
+        elif isinstance(df, (psDataFrame, psSeries)):
+            return df
        else:
-            return df.pandas_api(index_col=index_col)
-    elif isinstance(df, (psDataFrame, psSeries)):
-        return df
-    else:
-        raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
+            raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
+    finally:
+        # Restore original config
+        if orig_ps_conf:
+            ps.set_option("compute.fail_on_ansi_mode", orig_ps_conf)


 def train_test_split_pyspark(
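Review note on the `to_pandas_on_spark` hunk above: the save, override, and restore-in-`finally` sequence can also be packaged as a context manager. This is a minimal sketch over a plain dictionary, not over the pandas-on-Spark option API; `temporarily_set` and the `options` dict are hypothetical stand-ins.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_set(options, key, value):
    """Set options[key] to value inside the block and restore the prior state on exit."""
    missing = object()  # sentinel distinguishing "absent" from a stored None
    original = options.get(key, missing)
    options[key] = value
    try:
        yield options
    finally:
        # Runs on normal exit and on exceptions, like the finally block in the diff.
        if original is missing:
            options.pop(key, None)
        else:
            options[key] = original


options = {"compute.fail_on_ansi_mode": True}
with temporarily_set(options, "compute.fail_on_ansi_mode", False):
    inside = options["compute.fail_on_ansi_mode"]
print(inside, options["compute.fail_on_ansi_mode"])
```

Unlike the hunk, which restores only when the original value was truthy, this sketch always restores the previous state; which behavior is preferable depends on whether other code may change the option in between.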
--- a/flaml/automl/state.py
+++ b/flaml/automl/state.py
@@ -37,10 +37,9 @@ class SearchState:
        if isinstance(domain_one_dim, sample.Domain):
            renamed_type = list(inspect.signature(domain_one_dim.is_valid).parameters.values())[0].annotation
            type_match = (
-                renamed_type == Any
+                renamed_type is Any
                or isinstance(value_one_dim, renamed_type)
-                or isinstance(value_one_dim, int)
-                and renamed_type is float
+                or (renamed_type is float and isinstance(value_one_dim, int))
            )
            if not (type_match and domain_one_dim.is_valid(value_one_dim)):
                return False
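Review note on the `type_match` change above: because `and` binds more tightly than `or`, the added parentheses make the existing grouping explicit rather than changing it. The sketch below isolates the rule in a hypothetical helper, `type_match`, which is not a FLAML function.

```python
from typing import Any


def type_match(value, annotated_type):
    """Return True if value is acceptable for a domain annotated with annotated_type."""
    return (
        # Checked first so isinstance is never called with typing.Any.
        annotated_type is Any
        or isinstance(value, annotated_type)
        # An int is accepted where a float is expected, but not the reverse.
        or (annotated_type is float and isinstance(value, int))
    )


print(type_match(3, float), type_match(3.0, int), type_match("x", Any))
```

The order of the first two operands matters: `isinstance(value, typing.Any)` raises `TypeError`, so the identity test has to short-circuit before it.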
--- a/flaml/automl/task/generic_task.py
+++ b/flaml/automl/task/generic_task.py
@@ -365,6 +365,465 @@ class GenericTask(Task):
            X_train, X_val, y_train, y_val = GenericTask._split_pyspark(state, X, y, split_ratio, stratify)
        return X_train, X_val, y_train, y_val

+    def _handle_missing_labels_fast(
+        self,
+        state,
+        X_train,
+        X_val,
+        y_train,
+        y_val,
+        X_train_all,
+        y_train_all,
+        is_spark_dataframe,
+        data_is_df,
+    ):
+        """Handle missing labels by adding the first instance of each missing label to the set that lacks it.
+
+        This is the faster version that may create some overlap but ensures all labels
+        are present in both sets. If a label is missing from train, it adds the first
+        instance to train. If a label is missing from val, it adds the first instance to val.
+        If no labels are missing, no instances are duplicated.
+
+        Args:
+            state: The state object containing fit parameters
+            X_train, X_val: Training and validation features
+            y_train, y_val: Training and validation labels
+            X_train_all, y_train_all: Complete dataset
+            is_spark_dataframe: Whether the data is a pandas-on-Spark object
+            data_is_df: Whether data is DataFrame/Series
+
+        Returns:
+            Tuple of (X_train, X_val, y_train, y_val) with missing labels added
+        """
+        # Check which labels are present in train and val sets
+        if is_spark_dataframe:
+            label_set_train, _ = unique_pandas_on_spark(y_train)
+            label_set_val, _ = unique_pandas_on_spark(y_val)
+            label_set_all, first = unique_value_first_index(y_train_all)
+        else:
+            label_set_all, first = unique_value_first_index(y_train_all)
+            label_set_train = np.unique(y_train)
+            label_set_val = np.unique(y_val)
+
+        # Find missing labels
+        missing_in_train = np.setdiff1d(label_set_all, label_set_train)
+        missing_in_val = np.setdiff1d(label_set_all, label_set_val)
+
+        # Add first instance of missing labels to train set
+        if len(missing_in_train) > 0:
+            missing_train_indices = []
+            for label in missing_in_train:
+                label_matches = np.where(label_set_all == label)[0]
+                if len(label_matches) > 0 and label_matches[0] < len(first):
+                    missing_train_indices.append(first[label_matches[0]])
+
+            if len(missing_train_indices) > 0:
+                X_missing_train = (
+                    iloc_pandas_on_spark(X_train_all, missing_train_indices)
+                    if is_spark_dataframe
+                    else X_train_all.iloc[missing_train_indices]
+                    if data_is_df
+                    else X_train_all[missing_train_indices]
+                )
+                y_missing_train = (
+                    iloc_pandas_on_spark(y_train_all, missing_train_indices)
+                    if is_spark_dataframe
+                    else y_train_all.iloc[missing_train_indices]
+                    if isinstance(y_train_all, (pd.Series, psSeries))
+                    else y_train_all[missing_train_indices]
+                )
+                X_train = concat(X_missing_train, X_train)
+                y_train = concat(y_missing_train, y_train) if data_is_df else np.concatenate([y_missing_train, y_train])
+
+                # Handle sample_weight if present
+                if "sample_weight" in state.fit_kwargs:
+                    sample_weight_source = (
+                        state.sample_weight_all
+                        if hasattr(state, "sample_weight_all")
+                        else state.fit_kwargs.get("sample_weight")
+                    )
+                    if sample_weight_source is not None and max(missing_train_indices) < len(sample_weight_source):
+                        missing_weights = (
+                            sample_weight_source[missing_train_indices]
+                            if isinstance(sample_weight_source, np.ndarray)
+                            else sample_weight_source.iloc[missing_train_indices]
+                        )
+                        state.fit_kwargs["sample_weight"] = concat(missing_weights, state.fit_kwargs["sample_weight"])
+
+        # Add first instance of missing labels to val set
+        if len(missing_in_val) > 0:
+            missing_val_indices = []
+            for label in missing_in_val:
+                label_matches = np.where(label_set_all == label)[0]
+                if len(label_matches) > 0 and label_matches[0] < len(first):
+                    missing_val_indices.append(first[label_matches[0]])
+
+            if len(missing_val_indices) > 0:
+                X_missing_val = (
+                    iloc_pandas_on_spark(X_train_all, missing_val_indices)
+                    if is_spark_dataframe
+                    else X_train_all.iloc[missing_val_indices]
+                    if data_is_df
+                    else X_train_all[missing_val_indices]
+                )
+                y_missing_val = (
+                    iloc_pandas_on_spark(y_train_all, missing_val_indices)
+                    if is_spark_dataframe
+                    else y_train_all.iloc[missing_val_indices]
+                    if isinstance(y_train_all, (pd.Series, psSeries))
+                    else y_train_all[missing_val_indices]
+                )
+                X_val = concat(X_missing_val, X_val)
+                y_val = concat(y_missing_val, y_val) if data_is_df else np.concatenate([y_missing_val, y_val])
+
+                # Handle sample_weight if present
+                if (
+                    "sample_weight" in state.fit_kwargs
+                    and hasattr(state, "weight_val")
+                    and state.weight_val is not None
+                ):
+                    sample_weight_source = (
+                        state.sample_weight_all
+                        if hasattr(state, "sample_weight_all")
+                        else state.fit_kwargs.get("sample_weight")
+                    )
+                    if sample_weight_source is not None and max(missing_val_indices) < len(sample_weight_source):
+                        missing_weights = (
+                            sample_weight_source[missing_val_indices]
+                            if isinstance(sample_weight_source, np.ndarray)
+                            else sample_weight_source.iloc[missing_val_indices]
+                        )
+                        state.weight_val = concat(missing_weights, state.weight_val)
+
+        return X_train, X_val, y_train, y_val
+
+    def _handle_missing_labels_no_overlap(
+        self,
+        state,
+        X_train,
+        X_val,
+        y_train,
+        y_val,
+        X_train_all,
+        y_train_all,
+        is_spark_dataframe,
+        data_is_df,
+        split_ratio,
+    ):
+        """Handle missing labels intelligently to avoid overlap when possible.
+
+        This is the slower but more precise version that:
+        - For single-instance classes: Adds to both sets (unavoidable overlap)
+        - For multi-instance classes: Re-splits them properly to avoid overlap
+
+        Args:
+            state: The state object containing fit parameters
+            X_train, X_val: Training and validation features
+            y_train, y_val: Training and validation labels
+            X_train_all, y_train_all: Complete dataset
+            is_spark_dataframe: Whether the data is a pandas-on-Spark object
+            data_is_df: Whether data is DataFrame/Series
+            split_ratio: The ratio for splitting
+
+        Returns:
+            Tuple of (X_train, X_val, y_train, y_val) with missing labels handled
+        """
+        # Check which labels are present in train and val sets
+        if is_spark_dataframe:
+            label_set_train, _ = unique_pandas_on_spark(y_train)
+            label_set_val, _ = unique_pandas_on_spark(y_val)
+            label_set_all, first = unique_value_first_index(y_train_all)
+        else:
+            label_set_all, first = unique_value_first_index(y_train_all)
+            label_set_train = np.unique(y_train)
+            label_set_val = np.unique(y_val)
+
+        # Find missing labels
+        missing_in_train = np.setdiff1d(label_set_all, label_set_train)
+        missing_in_val = np.setdiff1d(label_set_all, label_set_val)
+
+        # Handle missing labels intelligently
+        # For classes with only 1 instance: add to both sets (unavoidable overlap)
+        # For classes with multiple instances: move/split them properly to avoid overlap
+
+        if len(missing_in_train) > 0:
+            # Process missing labels in training set
+            for label in missing_in_train:
+                # Find all indices for this label in the original data
+                if is_spark_dataframe:
+                    label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
+                else:
+                    label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
+
+                num_instances = len(label_indices)
+
+                if num_instances == 1:
+                    # Single instance: must add to both train and val (unavoidable overlap)
+                    X_single = (
+                        iloc_pandas_on_spark(X_train_all, label_indices)
+                        if is_spark_dataframe
+                        else X_train_all.iloc[label_indices]
+                        if data_is_df
+                        else X_train_all[label_indices]
+                    )
+                    y_single = (
+                        iloc_pandas_on_spark(y_train_all, label_indices)
+                        if is_spark_dataframe
+                        else y_train_all.iloc[label_indices]
+                        if isinstance(y_train_all, (pd.Series, psSeries))
+                        else y_train_all[label_indices]
+                    )
+                    X_train = concat(X_single, X_train)
+                    y_train = concat(y_single, y_train) if data_is_df else np.concatenate([y_single, y_train])
+
+                    # Handle sample_weight
+                    if "sample_weight" in state.fit_kwargs:
+                        sample_weight_source = (
+                            state.sample_weight_all
+                            if hasattr(state, "sample_weight_all")
+                            else state.fit_kwargs.get("sample_weight")
+                        )
+                        if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
+                            single_weight = (
+                                sample_weight_source[label_indices]
+                                if isinstance(sample_weight_source, np.ndarray)
+                                else sample_weight_source.iloc[label_indices]
+                            )
+                            state.fit_kwargs["sample_weight"] = concat(single_weight, state.fit_kwargs["sample_weight"])
+                else:
+                    # Multiple instances: move some from val to train (no overlap needed)
+                    # Calculate how many to move to train (leave at least 1 in val)
+                    num_to_train = max(1, min(num_instances - 1, int(num_instances * (1 - split_ratio))))
+                    indices_to_move = label_indices[:num_to_train]
+
+                    X_to_move = (
+                        iloc_pandas_on_spark(X_train_all, indices_to_move)
+                        if is_spark_dataframe
+                        else X_train_all.iloc[indices_to_move]
+                        if data_is_df
+                        else X_train_all[indices_to_move]
+                    )
+                    y_to_move = (
+                        iloc_pandas_on_spark(y_train_all, indices_to_move)
+                        if is_spark_dataframe
+                        else y_train_all.iloc[indices_to_move]
+                        if isinstance(y_train_all, (pd.Series, psSeries))
+                        else y_train_all[indices_to_move]
+                    )
+
+                    # Add to train
+                    X_train = concat(X_to_move, X_train)
+                    y_train = concat(y_to_move, y_train) if data_is_df else np.concatenate([y_to_move, y_train])
+
+                    # Remove from val (they are currently all in val)
+                    if is_spark_dataframe:
+                        val_mask = ~y_val.isin([label])
+                        X_val = X_val[val_mask]
+                        y_val = y_val[val_mask]
+                    else:
+                        val_mask = np.asarray(y_val) != label
+                        if data_is_df:
+                            X_val = X_val[val_mask]
+                            y_val = y_val[val_mask]
+                        else:
+                            X_val = X_val[val_mask]
+                            y_val = y_val[val_mask]
+
+                    # Add remaining instances back to val
+                    remaining_indices = label_indices[num_to_train:]
+                    if len(remaining_indices) > 0:
+                        X_remaining = (
+                            iloc_pandas_on_spark(X_train_all, remaining_indices)
+                            if is_spark_dataframe
+                            else X_train_all.iloc[remaining_indices]
+                            if data_is_df
+                            else X_train_all[remaining_indices]
+                        )
+                        y_remaining = (
+                            iloc_pandas_on_spark(y_train_all, remaining_indices)
+                            if is_spark_dataframe
+                            else y_train_all.iloc[remaining_indices]
+                            if isinstance(y_train_all, (pd.Series, psSeries))
+                            else y_train_all[remaining_indices]
+                        )
+                        X_val = concat(X_remaining, X_val)
+                        y_val = concat(y_remaining, y_val) if data_is_df else np.concatenate([y_remaining, y_val])
+
+                    # Handle sample_weight
+                    if "sample_weight" in state.fit_kwargs:
+                        sample_weight_source = (
+                            state.sample_weight_all
+                            if hasattr(state, "sample_weight_all")
+                            else state.fit_kwargs.get("sample_weight")
+                        )
+                        if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
+                            weights_to_move = (
+                                sample_weight_source[indices_to_move]
+                                if isinstance(sample_weight_source, np.ndarray)
+                                else sample_weight_source.iloc[indices_to_move]
+                            )
+                            state.fit_kwargs["sample_weight"] = concat(
+                                weights_to_move, state.fit_kwargs["sample_weight"]
+                            )
+
+                            if (
+                                len(remaining_indices) > 0
+                                and hasattr(state, "weight_val")
+                                and state.weight_val is not None
+                            ):
+                                # Remove and re-add weights for val
+                                if isinstance(state.weight_val, np.ndarray):
+                                    state.weight_val = state.weight_val[val_mask]
+                                else:
+                                    state.weight_val = state.weight_val[val_mask]
+
+                                if max(remaining_indices) < len(sample_weight_source):
+                                    remaining_weights = (
+                                        sample_weight_source[remaining_indices]
+                                        if isinstance(sample_weight_source, np.ndarray)
+                                        else sample_weight_source.iloc[remaining_indices]
+                                    )
+                                    state.weight_val = concat(remaining_weights, state.weight_val)
+
+        if len(missing_in_val) > 0:
+            # Process missing labels in validation set
+            for label in missing_in_val:
+                # Find all indices for this label in the original data
+                if is_spark_dataframe:
+                    label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
+                else:
+                    label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
+
+                num_instances = len(label_indices)
+
+                if num_instances == 1:
+                    # Single instance: must add to both train and val (unavoidable overlap)
+                    X_single = (
+                        iloc_pandas_on_spark(X_train_all, label_indices)
+                        if is_spark_dataframe
+                        else X_train_all.iloc[label_indices]
+                        if data_is_df
+                        else X_train_all[label_indices]
+                    )
+                    y_single = (
+                        iloc_pandas_on_spark(y_train_all, label_indices)
+                        if is_spark_dataframe
+                        else y_train_all.iloc[label_indices]
+                        if isinstance(y_train_all, (pd.Series, psSeries))
+                        else y_train_all[label_indices]
+                    )
+                    X_val = concat(X_single, X_val)
+                    y_val = concat(y_single, y_val) if data_is_df else np.concatenate([y_single, y_val])
+
+                    # Handle sample_weight
+                    if "sample_weight" in state.fit_kwargs and hasattr(state, "weight_val"):
+                        sample_weight_source = (
+                            state.sample_weight_all
+                            if hasattr(state, "sample_weight_all")
+                            else state.fit_kwargs.get("sample_weight")
+                        )
+                        if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
+                            single_weight = (
+                                sample_weight_source[label_indices]
+                                if isinstance(sample_weight_source, np.ndarray)
+                                else sample_weight_source.iloc[label_indices]
+                            )
+                            if state.weight_val is not None:
+                                state.weight_val = concat(single_weight, state.weight_val)
+                else:
+                    # Multiple instances: move some from train to val (no overlap needed)
+                    # Calculate how many to move to val (leave at least 1 in train)
+                    num_to_val = max(1, min(num_instances - 1, int(num_instances * split_ratio)))
+                    indices_to_move = label_indices[:num_to_val]
+
+                    X_to_move = (
+                        iloc_pandas_on_spark(X_train_all, indices_to_move)
+                        if is_spark_dataframe
+                        else X_train_all.iloc[indices_to_move]
+                        if data_is_df
+                        else X_train_all[indices_to_move]
+                    )
+                    y_to_move = (
+                        iloc_pandas_on_spark(y_train_all, indices_to_move)
+                        if is_spark_dataframe
+                        else y_train_all.iloc[indices_to_move]
+                        if isinstance(y_train_all, (pd.Series, psSeries))
+                        else y_train_all[indices_to_move]
+                    )
+
+                    # Add to val
+                    X_val = concat(X_to_move, X_val)
+                    y_val = concat(y_to_move, y_val) if data_is_df else np.concatenate([y_to_move, y_val])
+
+                    # Remove from train (they are currently all in train)
+                    if is_spark_dataframe:
+                        train_mask = ~y_train.isin([label])
+                        X_train = X_train[train_mask]
+                        y_train = y_train[train_mask]
+                    else:
+                        train_mask = np.asarray(y_train) != label
+                        if data_is_df:
+                            X_train = X_train[train_mask]
+                            y_train = y_train[train_mask]
+                        else:
+                            X_train = X_train[train_mask]
+                            y_train = y_train[train_mask]
+
+                    # Add remaining instances back to train
+                    remaining_indices = label_indices[num_to_val:]
+                    if len(remaining_indices) > 0:
+                        X_remaining = (
+                            iloc_pandas_on_spark(X_train_all, remaining_indices)
+                            if is_spark_dataframe
+                            else X_train_all.iloc[remaining_indices]
+                            if data_is_df
+                            else X_train_all[remaining_indices]
+                        )
+                        y_remaining = (
+                            iloc_pandas_on_spark(y_train_all, remaining_indices)
+                            if is_spark_dataframe
+                            else y_train_all.iloc[remaining_indices]
+                            if isinstance(y_train_all, (pd.Series, psSeries))
+                            else y_train_all[remaining_indices]
+                        )
+                        X_train = concat(X_remaining, X_train)
+                        y_train = concat(y_remaining, y_train) if data_is_df else np.concatenate([y_remaining, y_train])
+
+                    # Handle sample_weight
+                    if "sample_weight" in state.fit_kwargs:
+                        sample_weight_source = (
+                            state.sample_weight_all
+                            if hasattr(state, "sample_weight_all")
+                            else state.fit_kwargs.get("sample_weight")
+                        )
+                        if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
+                            weights_to_move = (
+                                sample_weight_source[indices_to_move]
+                                if isinstance(sample_weight_source, np.ndarray)
+                                else sample_weight_source.iloc[indices_to_move]
+                            )
+                            if hasattr(state, "weight_val") and state.weight_val is not None:
+                                state.weight_val = concat(weights_to_move, state.weight_val)
+
+                            if len(remaining_indices) > 0:
+                                # Remove and re-add weights for train
+                                if isinstance(state.fit_kwargs["sample_weight"], np.ndarray):
+                                    state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
+                                else:
+                                    state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
+
+                                if max(remaining_indices) < len(sample_weight_source):
+                                    remaining_weights = (
+                                        sample_weight_source[remaining_indices]
+                                        if isinstance(sample_weight_source, np.ndarray)
+                                        else sample_weight_source.iloc[remaining_indices]
+                                    )
+                                    state.fit_kwargs["sample_weight"] = concat(
+                                        remaining_weights, state.fit_kwargs["sample_weight"]
+                                    )
+
+        return X_train, X_val, y_train, y_val
+
    def prepare_data(
        self,
        state,
@@ -377,6 +836,7 @@ class GenericTask(Task):
        n_splits,
        data_is_df,
        sample_weight_full,
+        allow_label_overlap=True,
    ) -> int:
        X_val, y_val = state.X_val, state.y_val
        if issparse(X_val):
@@ -505,59 +965,46 @@ class GenericTask(Task):
            elif self.is_classification():
                # for classification, make sure the labels are complete in both
                # training and validation data
-                label_set, first = unique_value_first_index(y_train_all)
-                rest = []
-                last = 0
-                first.sort()
-                for i in range(len(first)):
-                    rest.extend(range(last, first[i]))
-                    last = first[i] + 1
-                rest.extend(range(last, len(y_train_all)))
-                X_first = X_train_all.iloc[first] if data_is_df else X_train_all[first]
-                if len(first) < len(y_train_all) / 2:
-                    # Get X_rest and y_rest with drop, sparse matrix can't apply np.delete
-                    X_rest = (
-                        np.delete(X_train_all, first, axis=0)
-                        if isinstance(X_train_all, np.ndarray)
-                        else X_train_all.drop(first.tolist())
-                        if data_is_df
-                        else X_train_all[rest]
-                    )
-                    y_rest = (
-                        np.delete(y_train_all, first, axis=0)
-                        if isinstance(y_train_all, np.ndarray)
-                        else y_train_all.drop(first.tolist())
-                        if data_is_df
-                        else y_train_all[rest]
+                stratify = y_train_all if split_type == "stratified" else None
+                X_train, X_val, y_train, y_val = self._train_test_split(
+                    state, X_train_all, y_train_all, split_ratio=split_ratio, stratify=stratify
+                )
+
+                # Handle missing labels using the appropriate strategy
+                if allow_label_overlap:
+                    # Fast version: adds the first instance of a missing label to the set that lacks it (may create overlap)
+                    X_train, X_val, y_train, y_val = self._handle_missing_labels_fast(
+                        state,
+                        X_train,
+                        X_val,
+                        y_train,
+                        y_val,
+                        X_train_all,
+                        y_train_all,
+                        is_spark_dataframe,
+                        data_is_df,
                    )
                else:
-                    X_rest = (
-                        iloc_pandas_on_spark(X_train_all, rest)
-                        if is_spark_dataframe
-                        else X_train_all.iloc[rest]
-                        if data_is_df
-                        else X_train_all[rest]
+                    # Precise version: avoids overlap when possible (slower)
+                    X_train, X_val, y_train, y_val = self._handle_missing_labels_no_overlap(
+                        state,
+                        X_train,
+                        X_val,
+                        y_train,
+                        y_val,
+                        X_train_all,
+                        y_train_all,
+                        is_spark_dataframe,
+                        data_is_df,
+                        split_ratio,
                    )
-                    y_rest = (
-                        iloc_pandas_on_spark(y_train_all, rest)
-                        if is_spark_dataframe
-                        else y_train_all.iloc[rest]
-                        if data_is_df
-                        else y_train_all[rest]
-                    )
-                stratify = y_rest if split_type == "stratified" else None
-                X_train, X_val, y_train, y_val = self._train_test_split(
-                    state, X_rest, y_rest, first, rest, split_ratio, stratify
-                )
-                X_train = concat(X_first, X_train)
-                y_train = concat(label_set, y_train) if data_is_df else np.concatenate([label_set, y_train])
-                X_val = concat(X_first, X_val)
-                y_val = concat(label_set, y_val) if data_is_df else np.concatenate([label_set, y_val])

                if isinstance(y_train, (psDataFrame, pd.DataFrame)) and y_train.shape[1] == 1:
                    y_train = y_train[y_train.columns[0]]
                    y_val = y_val[y_val.columns[0]]
-                    y_train.name = y_val.name = y_rest.name
+                    # Only set name if y_train_all is a Series (not a DataFrame)
+                    if isinstance(y_train_all, (pd.Series, psSeries)):
+                        y_train.name = y_val.name = y_train_all.name

            elif self.is_regression():
                X_train, X_val, y_train, y_val = self._train_test_split(
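
Note on the no-overlap path above: when a label with several instances is missing from the validation split, the number of instances moved out of train is `max(1, min(num_instances - 1, int(num_instances * split_ratio)))`. The sketch below isolates only that arithmetic. The helper name `num_to_move_to_val` is hypothetical and is not part of FLAML; the single-instance case is handled separately in the diff and is rejected here.

```python
def num_to_move_to_val(num_instances: int, split_ratio: float) -> int:
    """Mirror the num_to_val expression from the diff for a label with >= 2 instances.

    Hypothetical helper for illustration only; not a FLAML function.
    """
    if num_instances < 2:
        raise ValueError("the diff handles the single-instance case separately")
    # Move at least one instance to val, but always leave at least one in train.
    return max(1, min(num_instances - 1, int(num_instances * split_ratio)))


# A few (num_instances, split_ratio) pairs and the resulting move counts.
examples = {
    (n, r): num_to_move_to_val(n, r)
    for n, r in [(2, 0.1), (10, 0.1), (10, 0.5), (3, 0.9)]
}
```

The outer `max(1, ...)` guarantees the validation set receives the label even when `split_ratio` is small, and the inner `min(num_instances - 1, ...)` keeps at least one instance in train.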
--- a/flaml/automl/task/time_series_task.py
+++ b/flaml/automl/task/time_series_task.py
@@ -151,7 +151,7 @@ class TimeSeriesTask(Task):
                raise ValueError("Must supply either X_train_all and y_train_all, or dataframe and label")

            try:
-                dataframe[self.time_col] = pd.to_datetime(dataframe[self.time_col])
+                dataframe.loc[:, self.time_col] = pd.to_datetime(dataframe[self.time_col])
            except Exception:
                raise ValueError(
                    f"For '{TS_FORECAST}' task, time column {self.time_col} must contain timestamp values."
@@ -386,9 +386,8 @@ class TimeSeriesTask(Task):
        return X

    def preprocess(self, X, transformer=None):
-        if isinstance(X, pd.DataFrame) or isinstance(X, np.ndarray) or isinstance(X, pd.Series):
-            X = X.copy()
-            X = normalize_ts_data(X, self.target_names, self.time_col)
+        if isinstance(X, (pd.DataFrame, np.ndarray, pd.Series)):
+            X = normalize_ts_data(X.copy(), self.target_names, self.time_col)
            return self._preprocess(X, transformer)
        elif isinstance(X, int):
            return X
--- a/flaml/automl/time_series/sklearn.py
+++ b/flaml/automl/time_series/sklearn.py
@@ -17,24 +17,30 @@ from sklearn.preprocessing import StandardScaler


 def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):
-    """Transform input data X, y into autoregressive form - shift
-    them appropriately based on horizon and create `lags` columns.
+    """Transform input data X, y into autoregressive form by creating `lags` columns.
+
+    This function is called automatically by FLAML during the training process
+    to convert time series data into a format suitable for sklearn-based regression
+    models (e.g., lgbm, rf, xgboost). Users do NOT need to manually call this function
+    or create lagged features themselves.

    Parameters
    ----------
    X : pandas.DataFrame
-        Input features.
+        Input feature DataFrame, which may contain temporal features and/or exogenous variables.

    y : array_like, (1d)
-        Target vector.
+        Target vector (time series values to forecast).

-    horizon : int
-        length of X for `predict` method
+    lags : int
+        Number of lagged time steps to use as features.

    Returns
    -------
    pandas.DataFrame
-        shifted dataframe with `lags` columns
+        Shifted dataframe with `lags` columns for each original feature.
+        The target variable y is also lagged to prevent data leakage
+        (i.e., we use y(t-1), y(t-2), ..., y(t-lags) to predict y(t)).
    """
    lag_features = []

@@ -55,6 +61,17 @@ def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):


 class SklearnWrapper:
+    """Wrapper class for using sklearn-based models for time series forecasting.
+
+    This wrapper automatically handles the transformation of time series data into
+    a supervised learning format by creating lagged features. It trains separate
+    models for each step in the forecast horizon.
+
+    Users typically don't interact with this class directly - it's used internally
+    by FLAML when sklearn-based estimators (lgbm, rf, xgboost, etc.) are selected
+    for time series forecasting tasks.
+    """
+
    def __init__(
        self,
        model_class: type,
@@ -76,6 +93,8 @@ class SklearnWrapper:
            self.pca = None

    def fit(self, X: pd.DataFrame, y: pd.Series, **kwargs):
+        if "is_retrain" in kwargs:
+            kwargs.pop("is_retrain")
        self._X = X
        self._y = y

@@ -92,7 +111,14 @@ class SklearnWrapper:

        for i, model in enumerate(self.models):
            offset = i + self.lags
-            model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
+            if len(X) - offset > 2:
+                # A series of length 2 fails with "All features are either constant or ignored."
+                # TODO: see why the non-constant features are ignored. Selector?
+                model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
+            elif len(X) > offset and "catboost" not in str(model).lower():
+                model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
+            else:
+                print("[INFO]: Length of data should be longer than period + lags.")
        return self

    def predict(self, X, X_train=None, y_train=None):
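
Note on the `make_lag_features` docstring above: the autoregressive form it describes uses y(t-1), ..., y(t-lags) to predict y(t). The following is a minimal, dependency-free sketch of that idea on a plain list. It is not FLAML's implementation, which works on DataFrames and also lags the columns of `X`; the function name `make_lagged_rows` is an assumption made for illustration.

```python
from typing import List, Tuple


def make_lagged_rows(y: List[float], lags: int) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) where each row holds the `lags` values preceding its target.

    Hypothetical helper for illustration only; not a FLAML function.
    """
    if lags < 1:
        raise ValueError("lags must be >= 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(lags, len(y)):
        # Row for time t: [y(t-lags), ..., y(t-1)]; the target is y(t).
        features.append(y[t - lags : t])
        targets.append(y[t])
    return features, targets


features, targets = make_lagged_rows([1.0, 2.0, 3.0, 4.0, 5.0], lags=2)
```

Because the first `lags` observations have no complete history, a series of length `n` yields only `n - lags` training rows, which is why very short series can leave a model with too little data to fit.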
--- a/flaml/automl/time_series/tcn.py
+++ b/flaml/automl/time_series/tcn.py
@@ -264,7 +264,8 @@ class TCNEstimator(TimeSeriesEstimator):
    def predict(self, X):
        X = self.enrich(X)
        if isinstance(X, TimeSeriesDataset):
-            df = X.X_val
+            # Use X_train if test_data is empty (e.g., when computing training metrics)
+            df = X.X_val if len(X.test_data) > 0 else X.X_train
        else:
            df = X
        dataset = DataframeDataset(
--- a/flaml/automl/time_series/tft.py
+++ b/flaml/automl/time_series/tft.py
@@ -197,7 +197,11 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
        last_data_cols = self.group_ids.copy()
        last_data_cols.append(self.target_names[0])
        last_data = self.data[lambda x: x.time_idx == x.time_idx.max()][last_data_cols]
-        decoder_data = X.X_val if isinstance(X, TimeSeriesDataset) else X
+        # Use X_train if test_data is empty (e.g., when computing training metrics)
+        if isinstance(X, TimeSeriesDataset):
+            decoder_data = X.X_val if len(X.test_data) > 0 else X.X_train
+        else:
+            decoder_data = X
        if "time_idx" not in decoder_data:
            decoder_data = add_time_idx_col(decoder_data)
        decoder_data["time_idx"] += encoder_data["time_idx"].max() + 1 - decoder_data["time_idx"].min()
--- a/flaml/automl/time_series/ts_data.py
+++ b/flaml/automl/time_series/ts_data.py
@@ -121,7 +121,12 @@ class TimeSeriesDataset:

    @property
    def X_all(self) -> pd.DataFrame:
-        return pd.concat([self.X_train, self.X_val], axis=0)
+        # Remove empty or all-NA columns before concatenation
+        X_train_filtered = self.X_train.dropna(axis=1, how="all")
+        X_val_filtered = self.X_val.dropna(axis=1, how="all")
+
+        # Concatenate the filtered DataFrames
+        return pd.concat([X_train_filtered, X_val_filtered], axis=0)

    @property
    def y_train(self) -> pd.DataFrame:
@@ -472,7 +477,7 @@ class DataTransformerTS:
                if "__NAN__" not in X[col].cat.categories:
                    X[col] = X[col].cat.add_categories("__NAN__").fillna("__NAN__")
            else:
-                X[col] = X[col].fillna("__NAN__")
+                X[col] = X[col].fillna("__NAN__").infer_objects(copy=False)
                X[col] = X[col].astype("category")

        for column in self.num_columns:
@@ -541,14 +546,12 @@ def normalize_ts_data(X_train_all, target_names, time_col, y_train_all=None):


 def validate_data_basic(X_train_all, y_train_all):
-    assert isinstance(X_train_all, np.ndarray) or issparse(X_train_all) or isinstance(X_train_all, pd.DataFrame), (
-        "X_train_all must be a numpy array, a pandas dataframe, " "or Scipy sparse matrix."
-    )
+    assert isinstance(X_train_all, (np.ndarray, DataFrame)) or issparse(
+        X_train_all
+    ), "X_train_all must be a numpy array, a pandas dataframe, or Scipy sparse matrix."

-    assert (
-        isinstance(y_train_all, np.ndarray)
-        or isinstance(y_train_all, pd.Series)
-        or isinstance(y_train_all, pd.DataFrame)
+    assert isinstance(
+        y_train_all, (np.ndarray, pd.Series, pd.DataFrame)
    ), "y_train_all must be a numpy array or a pandas series or DataFrame."

    assert X_train_all.size != 0 and y_train_all.size != 0, "Input data must not be empty, use None if no data"
--- a/flaml/automl/time_series/ts_model.py
+++ b/flaml/automl/time_series/ts_model.py
@@ -194,7 +194,13 @@ class Orbit(TimeSeriesEstimator):

        elif isinstance(X, TimeSeriesDataset):
            data = X
-            X = data.test_data[[self.time_col] + X.regressors]
+            # By default we predict on the dataset's test partition.
+            # Some internal call paths (e.g., training-metric logging) may pass a
+            # dataset whose test partition is empty; fall back to train partition.
+            if data.test_data is not None and len(data.test_data):
+                X = data.test_data[data.regressors + [data.time_col]]
+            else:
+                X = data.train_data[data.regressors + [data.time_col]]

        if self._model is not None:
            forecast = self._model.predict(X, **kwargs)
@@ -301,7 +307,13 @@ class Prophet(TimeSeriesEstimator):

        if isinstance(X, TimeSeriesDataset):
            data = X
-            X = data.test_data[data.regressors + [data.time_col]]
+            # By default we predict on the dataset's test partition.
+            # Some internal call paths (e.g., training-metric logging) may pass a
+            # dataset whose test partition is empty; fall back to train partition.
+            if data.test_data is not None and len(data.test_data):
+                X = data.test_data[data.regressors + [data.time_col]]
+            else:
+                X = data.train_data[data.regressors + [data.time_col]]

        X = X.rename(columns={self.time_col: "ds"})
        if self._model is not None:
@@ -327,11 +339,19 @@ class StatsModelsEstimator(TimeSeriesEstimator):

        if isinstance(X, TimeSeriesDataset):
            data = X
-            X = data.test_data[data.regressors + [data.time_col]]
+            # By default we predict on the dataset's test partition.
+            # Some internal call paths (e.g., training-metric logging) may pass a
+            # dataset whose test partition is empty; fall back to train partition.
+            if data.test_data is not None and len(data.test_data):
+                X = data.test_data[data.regressors + [data.time_col]]
+            else:
+                X = data.train_data[data.regressors + [data.time_col]]
        else:
            X = X[self.regressors + [self.time_col]]

        if isinstance(X, DataFrame):
+            if X.shape[0] == 0:
+                return pd.Series([], name=self.target_names[0], dtype=float)
            start = X[self.time_col].iloc[0]
            end = X[self.time_col].iloc[-1]
            if len(self.regressors):
@@ -829,6 +849,13 @@ class TS_SKLearn(TimeSeriesEstimator):
        if isinstance(X, TimeSeriesDataset):
            data = X
            X = data.test_data
+            # By default we predict on the dataset's test partition.
+            # Some internal call paths (e.g., training-metric logging) may pass a
+            # dataset whose test partition is empty; fall back to train partition.
+            if data.test_data is not None and len(data.test_data):
+                X = data.test_data
+            else:
+                X = data.train_data

        if self._model is not None:
            X = X[self.regressors]
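
Note on the repeated fallback in the `predict` methods above: each estimator now predicts on the test partition when it has rows and otherwise falls back to the train partition. The sketch below shows that selection rule in isolation. `ToyDataset` and `rows_to_predict` are hypothetical stand-ins, assumed here only to make the example self-contained; they are not FLAML classes or functions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset with train and test partitions."""

    train_data: List[dict]
    test_data: Optional[List[dict]] = field(default_factory=list)


def rows_to_predict(data: ToyDataset) -> List[dict]:
    """Prefer the test partition; fall back to train when it is None or empty."""
    if data.test_data is not None and len(data.test_data):
        return data.test_data
    return data.train_data


train = [{"ds": 1}, {"ds": 2}]
test = [{"ds": 3}]
with_test = rows_to_predict(ToyDataset(train_data=train, test_data=test))
without_test = rows_to_predict(ToyDataset(train_data=train))
```

Since the same check appears in several estimators, a shared helper of this shape might reduce duplication, though whether that fits the class hierarchy is a design decision for the maintainers.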
--- a/flaml/default/estimator.py
+++ b/flaml/default/estimator.py
@@ -95,6 +95,27 @@ def flamlize_estimator(super_class, name: str, task: str, alternatives=None):
        def fit(self, X, y, *args, **params):
            hyperparams, estimator_name, X, y_transformed = self.suggest_hyperparams(X, y)
            self.set_params(**hyperparams)
+
+            # Transform eval_set if present
+            if "eval_set" in params and params["eval_set"] is not None:
+                transformed_eval_set = []
+                for eval_X, eval_y in params["eval_set"]:
+                    # Transform features
+                    eval_X_transformed = self._feature_transformer.transform(eval_X)
+                    # Transform labels if applicable
+                    if self._label_transformer and estimator_name in [
+                        "rf",
+                        "extra_tree",
+                        "xgboost",
+                        "xgb_limitdepth",
+                        "choose_xgb",
+                    ]:
+                        eval_y_transformed = self._label_transformer.transform(eval_y)
+                        transformed_eval_set.append((eval_X_transformed, eval_y_transformed))
+                    else:
+                        transformed_eval_set.append((eval_X_transformed, eval_y))
+                params["eval_set"] = transformed_eval_set
+
            if self._label_transformer and estimator_name in [
                "rf",
                "extra_tree",
--- a/flaml/default/greedy.py
+++ b/flaml/default/greedy.py
@@ -32,6 +32,7 @@ def construct_portfolio(regret_matrix, meta_features, regret_bound):
    if meta_features is not None:
        scaler = RobustScaler()
        meta_features = meta_features.loc[tasks]
+        meta_features = meta_features.astype(float)
        meta_features.loc[:, :] = scaler.fit_transform(meta_features)
        nearest_task = {}
        for t in tasks:
--- a/flaml/default/portfolio.py
+++ b/flaml/default/portfolio.py
@@ -26,6 +26,7 @@ def config_predictor_tuple(tasks, configs, meta_features, regret_matrix):
    # pre-processing
    scaler = RobustScaler()
    meta_features_norm = meta_features.loc[tasks]  # this makes a copy
+    meta_features_norm = meta_features_norm.astype(float)
    meta_features_norm.loc[:, :] = scaler.fit_transform(meta_features_norm)

    proc = {
--- a/flaml/fabric/mlflow.py
+++ b/flaml/fabric/mlflow.py
@@ -567,7 +567,7 @@ class MLflowIntegration:
            try:
                with open(pickle_fpath, "wb") as f:
                    pickle.dump(obj, f)
-                mlflow.log_artifact(pickle_fpath, artifact_name, run_id)
+                self.mlflow_client.log_artifact(run_id, pickle_fpath, artifact_name)
                return True
            except Exception as e:
                logger.debug(f"Failed to pickle and log {artifact_name}, error: {e}")
@@ -652,7 +652,7 @@ class MLflowIntegration:
        return f"Successfully pickle_and_log_automl_artifacts {estimator} to run_id {run_id}"

    @time_it
-    def record_state(self, automl, search_state, estimator):
+    def record_state(self, automl, search_state, estimator, is_log_model=True):
        _st = time.time()
        automl_metric_name = (
            automl._state.metric if isinstance(automl._state.metric, str) else automl._state.error_metric
@@ -727,7 +727,7 @@ class MLflowIntegration:
                self.futures[future] = f"iter_{automl._track_iter}_log_info_to_run"
                future = executor.submit(lambda: self._log_automl_configurations(child_run.info.run_id))
                self.futures[future] = f"iter_{automl._track_iter}_log_automl_configurations"
-                if automl._state.model_history:
+                if automl._state.model_history and is_log_model:
                    if estimator.endswith("_spark"):
                        future = executor.submit(
                            lambda: self.log_model(
@@ -797,8 +797,10 @@ class MLflowIntegration:
                conf = automl._config_history[automl._best_iteration][1].copy()
                if "ml" in conf.keys():
                    conf = conf["ml"]
-
-                mlflow.log_params({**conf, "best_learner": automl._best_estimator}, run_id=self.parent_run_id)
+                params_arr = [
+                    Param(key, str(value)) for key, value in {**conf, "best_learner": automl._best_estimator}.items()
+                ]
+                self.mlflow_client.log_batch(run_id=self.parent_run_id, metrics=[], params=params_arr, tags=[])
                if not self.has_summary:
                    logger.info(f"logging best model {automl.best_estimator}")
                    future = executor.submit(lambda: self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id))
@@ -894,6 +896,7 @@ class MLflowIntegration:
                ),
            )
            self.child_counter = 0
+            num_infos = len(self.infos)

            # From latest to earliest, remove duplicate cross-validation runs
            _exist_child_run_params = []  # for deduplication of cross-validation child runs
@@ -958,22 +961,37 @@ class MLflowIntegration:
                            )
                        self.mlflow_client.set_tag(child_run_id, "flaml.child_counter", self.child_counter)

-                    # merge autolog child run and corresponding manual run
-                    flaml_info = self.infos[self.child_counter]
-                    child_run = self.mlflow_client.get_run(child_run_id)
-                    self._log_info_to_run(flaml_info, child_run_id, log_params=False)
+                    # Merge autolog child run and corresponding FLAML trial info (if available).
+                    # In nested scenarios (e.g., Tune -> AutoML -> MLflow autolog), MLflow can create
+                    # more child runs than the number of FLAML trials recorded in self.infos.
+                    # TODO: need more tests in nested scenarios.
+                    flaml_info = None
+                    child_run = None
+                    if self.child_counter < num_infos:
+                        flaml_info = self.infos[self.child_counter]
+                        child_run = self.mlflow_client.get_run(child_run_id)
+                        self._log_info_to_run(flaml_info, child_run_id, log_params=False)

-                    if self.experiment_type == "automl":
-                        if "learner" not in child_run.data.params:
-                            self.mlflow_client.log_param(child_run_id, "learner", flaml_info["params"]["learner"])
-                        if "sample_size" not in child_run.data.params:
-                            self.mlflow_client.log_param(
-                                child_run_id, "sample_size", flaml_info["params"]["sample_size"]
-                            )
+                        if self.experiment_type == "automl":
+                            if "learner" not in child_run.data.params:
+                                self.mlflow_client.log_param(child_run_id, "learner", flaml_info["params"]["learner"])
+                            if "sample_size" not in child_run.data.params:
+                                self.mlflow_client.log_param(
+                                    child_run_id, "sample_size", flaml_info["params"]["sample_size"]
+                                )
+                    else:
+                        logger.debug(
+                            "No corresponding FLAML info for MLflow child run %s (child_counter=%s, infos=%s); skipping merge.",
+                            child_run_id,
+                            self.child_counter,
+                            num_infos,
+                        )

-                    if self.child_counter == best_iteration:
+                    if flaml_info is not None and self.child_counter == best_iteration:
                        self.mlflow_client.set_tag(child_run_id, "flaml.best_run", True)
                        if result is not None:
+                            if child_run is None:
+                                child_run = self.mlflow_client.get_run(child_run_id)
                            result.best_run_id = child_run_id
                            result.best_run_name = child_run.info.run_name
                            self.best_run_id = child_run_id
@@ -997,7 +1015,7 @@ class MLflowIntegration:
        self.resume_mlflow()


-def register_automl_pipeline(automl, model_name=None, signature=None):
+def register_automl_pipeline(automl, model_name=None, signature=None, artifact_path="model"):
    pipeline = automl.automl_pipeline
    if pipeline is None:
        logger.warning("pipeline not found, cannot register it")
@@ -1007,7 +1025,7 @@ def register_automl_pipeline(automl, model_name=None, signature=None):
    if automl.best_run_id is None:
        mlflow.sklearn.log_model(
            pipeline,
-            "automl_pipeline",
+            artifact_path,
            registered_model_name=model_name,
            signature=automl.pipeline_signature if signature is None else signature,
        )
@@ -1017,5 +1035,5 @@ def register_automl_pipeline(automl, model_name=None, signature=None):
        return mvs[0]
    else:
        best_run = mlflow.get_run(automl.best_run_id)
-        model_uri = f"runs:/{best_run.info.run_id}/automl_pipeline"
+        model_uri = f"runs:/{best_run.info.run_id}/{artifact_path}"
        return mlflow.register_model(model_uri, model_name)
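
The MLflowIntegration hunk above guards against MLflow creating more child runs than there are FLAML trial infos. The following is a minimal sketch of that guard only, not the library's implementation: `pair_child_runs_with_infos` and `find_best_run_id` are hypothetical helper names, and plain lists and dicts stand in for the MLflow client and `self.infos`.

```python
from typing import Dict, List, Optional, Tuple


def pair_child_runs_with_infos(
    child_run_ids: List[str], infos: List[Dict]
) -> List[Tuple[str, Optional[Dict]]]:
    """Pair each child run ID with its trial info, or None when no info exists.

    Hypothetical helper: mirrors the `child_counter < num_infos` check in the diff.
    """
    pairs = []
    for child_counter, run_id in enumerate(child_run_ids):
        # Extra child runs (index beyond the recorded infos) get no info.
        info = infos[child_counter] if child_counter < len(infos) else None
        pairs.append((run_id, info))
    return pairs


def find_best_run_id(
    pairs: List[Tuple[str, Optional[Dict]]], best_iteration: int
) -> Optional[str]:
    """Return the run ID at best_iteration, but only if it has trial info."""
    for index, (run_id, info) in enumerate(pairs):
        if info is not None and index == best_iteration:
            return run_id
    return None


pairs = pair_child_runs_with_infos(
    ["run-a", "run-b", "run-c"],
    [{"params": {"learner": "lgbm"}}, {"params": {"learner": "xgboost"}}],
)
```

In this sketch the third run has no matching info, so it is skipped for merging and can never be tagged as the best run, which matches the `flaml_info is not None` condition added in the diff.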
--- a/flaml/onlineml/README.md
+++ b/flaml/onlineml/README.md
@@ -1,6 +1,6 @@
 # ChaCha for Online AutoML

-FLAML includes *ChaCha* which is an automatic hyperparameter tuning solution for online machine learning. Online machine learning has the following properties: (1) data comes in sequential order; and (2) the performance of the machine learning model is evaluated online, i.e., at every iteration. *ChaCha* performs online AutoML respecting the aforementioned properties of online learning, and at the same time respecting the following constraints: (1) only a small constant number of 'live' models are allowed to perform online learning at the same time;  and (2) no model persistence or offline training is allowed, which means that once we decide to replace a 'live' model with a new one, the replaced model can no longer be retrieved.
+FLAML includes *ChaCha* which is an automatic hyperparameter tuning solution for online machine learning. Online machine learning has the following properties: (1) data comes in sequential order; and (2) the performance of the machine learning model is evaluated online, i.e., at every iteration. *ChaCha* performs online AutoML respecting the aforementioned properties of online learning, and at the same time respecting the following constraints: (1) only a small constant number of 'live' models are allowed to perform online learning at the same time; and (2) no model persistence or offline training is allowed, which means that once we decide to replace a 'live' model with a new one, the replaced model can no longer be retrieved.

 For more technical details about *ChaCha*, please check our paper.

--- a/flaml/tune/searcher/blendsearch.py
+++ b/flaml/tune/searcher/blendsearch.py
@@ -217,7 +217,24 @@ class BlendSearch(Searcher):
        if global_search_alg is not None:
            self._gs = global_search_alg
        elif getattr(self, "__name__", None) != "CFO":
-            if space and self._ls.hierarchical:
+            # Use define-by-run for OptunaSearch when needed:
+            # - Hierarchical/conditional spaces are best supported via define-by-run.
+            # - Ray Tune domain/grid specs can trigger an "unresolved search space" warning
+            #   unless we switch to define-by-run.
+            use_define_by_run = bool(getattr(self._ls, "hierarchical", False))
+            if (not use_define_by_run) and isinstance(space, dict) and space:
+                try:
+                    from .variant_generator import parse_spec_vars
+
+                    _, domain_vars, grid_vars = parse_spec_vars(space)
+                    use_define_by_run = bool(domain_vars or grid_vars)
+                except Exception:
+                    # Be conservative: if we can't determine whether the space is
+                    # unresolved, fall back to the original behavior.
+                    use_define_by_run = False
+
+            self._use_define_by_run = use_define_by_run
+            if use_define_by_run:
                from functools import partial

                gs_space = partial(define_by_run_func, space=space)
@@ -487,7 +504,7 @@ class BlendSearch(Searcher):
                            self._ls_bound_max,
                            self._subspace.get(trial_id, self._ls.space),
                        )
-                    if self._gs is not None and self._experimental and (not self._ls.hierarchical):
+                    if self._gs is not None and self._experimental and (not getattr(self, "_use_define_by_run", False)):
                        self._gs.add_evaluated_point(flatten_dict(config), objective)
                        # TODO: recover when supported
                        # converted = convert_key(config, self._gs.space)
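
The decision logic added to BlendSearch above can be sketched in isolation. This is an illustration under stated assumptions: the `Domain` class and `has_unresolved_values` are hypothetical stand-ins for the real sampling domains and for `parse_spec_vars`, whose actual behavior is not shown in this diff. The `{"grid_search": [...]}` form is the Ray Tune convention for grid specs.

```python
class Domain:
    """Hypothetical stand-in for a sampling domain (e.g. a uniform range)."""


def has_unresolved_values(space: dict) -> bool:
    """Return True if any value is a Domain or a grid_search spec (stand-in logic)."""
    for value in space.values():
        if isinstance(value, Domain):
            return True
        if isinstance(value, dict):
            if "grid_search" in value or has_unresolved_values(value):
                return True
    return False


def should_use_define_by_run(space, hierarchical: bool) -> bool:
    """Mirror the order of checks in the diff: hierarchy first, then the space."""
    if hierarchical:
        return True
    if not (isinstance(space, dict) and space):
        return False
    try:
        return has_unresolved_values(space)
    except Exception:
        # Conservative fallback, as in the diff: keep the original behavior.
        return False
```

As in the diff, a failure while inspecting the space falls back to `False`, so an unexpected space layout cannot change which search path is taken.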
--- a/flaml/tune/searcher/flow2.py
+++ b/flaml/tune/searcher/flow2.py
@@ -641,8 +641,10 @@ class FLOW2(Searcher):
            else:
                # key must be in space
                domain = space[key]
-                if self.hierarchical and not (
-                    domain is None or type(domain) in (str, int, float) or isinstance(domain, sample.Domain)
+                if (
+                    self.hierarchical
+                    and domain is not None
+                    and not isinstance(domain, (str, int, float, sample.Domain))
                ):
                    # not domain or hashable
                    # get rid of list type for hierarchical search space.
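
The FLOW2 change above is not purely cosmetic: `type(x) in (str, int, float)` matches exact types only, while `isinstance` also accepts subclasses. A small sketch of the difference follows; the two function names and the `Ratio` class are hypothetical, and the `sample.Domain` case from the diff is left out to keep the example self-contained.

```python
def is_leaf_by_exact_type(domain) -> bool:
    """Old-style check: only exact str, int, or float count as leaf values."""
    return domain is None or type(domain) in (str, int, float)


def is_leaf_by_isinstance(domain) -> bool:
    """New-style check: subclasses of str, int, or float count as well."""
    return domain is None or isinstance(domain, (str, int, float))


class Ratio(float):
    """Hypothetical float subclass, standing in for types like NumPy scalars."""


# bool is a subclass of int, so the two checks disagree on True.
bool_results = (is_leaf_by_exact_type(True), is_leaf_by_isinstance(True))
# A float subclass is likewise only recognized by isinstance.
ratio_results = (is_leaf_by_exact_type(Ratio(0.5)), is_leaf_by_isinstance(Ratio(0.5)))
# Lists are treated as non-leaf by both checks.
list_results = (is_leaf_by_exact_type([1, 2]), is_leaf_by_isinstance([1, 2]))
```

Under the new check, booleans and subclassed numeric values in a hierarchical space are treated as constants rather than being processed as nested structures.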
--- a/flaml/tune/searcher/online_searcher.py
+++ b/flaml/tune/searcher/online_searcher.py
@@ -207,7 +207,7 @@ class ChampionFrontierSearcher(BaseSearcher):
                    hyperparameter_config_groups.append(partial_new_configs)
                    # does not have searcher_trial_ids
                    searcher_trial_ids_groups.append([])
-            elif isinstance(config_domain, Float) or isinstance(config_domain, Categorical):
+            elif isinstance(config_domain, (Float, Categorical)):
                # otherwise we need to deal with them in group
                nonpoly_config[k] = v
                if k not in self._space_of_nonpoly_hp:
--- a/flaml/tune/searcher/search_thread.py
+++ b/flaml/tune/searcher/search_thread.py
@@ -25,6 +25,31 @@ from .flow2 import FLOW2
 logger = logging.getLogger(__name__)


+def _recursive_dict_update(target: Dict, source: Dict) -> None:
+    """Recursively update target dictionary with source dictionary.
+
+    Unlike dict.update(), this function merges nested dictionaries instead of
+    replacing them entirely. This is crucial for configurations with nested
+    structures (e.g., XGBoost params).
+
+    Args:
+        target: The dictionary to be updated (modified in place).
+        source: The dictionary containing values to merge into target.
+
+    Example:
+        >>> target = {'params': {'eta': 0.1, 'max_depth': 3}}
+        >>> source = {'params': {'verbosity': 0}}
+        >>> _recursive_dict_update(target, source)
+        >>> target
+        {'params': {'eta': 0.1, 'max_depth': 3, 'verbosity': 0}}
+    """
+    for key, value in source.items():
+        if isinstance(value, dict) and key in target and isinstance(target[key], dict):
+            _recursive_dict_update(target[key], value)
+        else:
+            target[key] = value
+
+
 class SearchThread:
    """Class of global or local search thread."""

@@ -65,7 +90,7 @@ class SearchThread:
            try:
                config = self._search_alg.suggest(trial_id)
                if isinstance(self._search_alg._space, dict):
-                    config.update(self._const)
+                    _recursive_dict_update(config, self._const)
                else:
                    # define by run
                    config, self.space = unflatten_hierarchical(config, self._space)
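
To see why `config.update(self._const)` was replaced in SearchThread, the sketch below contrasts the two merge strategies on a nested config. `recursive_update` reproduces the logic of `_recursive_dict_update` from the diff; the sample values are made up for illustration.

```python
import copy


def recursive_update(target: dict, source: dict) -> None:
    """Merge source into target in place, descending into nested dicts."""
    for key, value in source.items():
        if isinstance(value, dict) and key in target and isinstance(target[key], dict):
            recursive_update(target[key], value)
        else:
            target[key] = value


suggested = {"params": {"eta": 0.1, "max_depth": 3}, "n_estimators": 50}
constants = {"params": {"verbosity": 0}}

# dict.update replaces the whole nested "params" dict, losing eta and max_depth.
shallow = copy.deepcopy(suggested)
shallow.update(constants)

# The recursive merge keeps the suggested values and adds the constant.
merged = copy.deepcopy(suggested)
recursive_update(merged, constants)
```

One detail to keep in mind: when a key is missing from the target, the nested dict from the source is stored by reference, not copied, so later mutation of the merged config can also affect the constants.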
--- a/flaml/tune/space.py
+++ b/flaml/tune/space.py
@@ -261,7 +261,7 @@ def add_cost_to_space(space: Dict, low_cost_point: Dict, choice_cost: Dict):
                        low_cost[i] = point
                if len(low_cost) > len(domain.categories):
                    if domain.ordered:
-                        low_cost[-1] = int(np.where(ind == low_cost[-1])[0])
+                        low_cost[-1] = int(np.where(ind == low_cost[-1])[0].item())
                    domain.low_cost_point = low_cost[-1]
                return
        if low_cost:
--- a/flaml/tune/tune.py
+++ b/flaml/tune/tune.py
@@ -776,7 +776,7 @@ def run(
                        and (num_samples < 0 or num_trials < num_samples)
                        and num_failures < upperbound_num_failures
                    ):
-                        if automl_info and automl_info[0] > 0 and time_budget_s < np.inf:
+                        if automl_info and automl_info[1] == "all" and automl_info[0] > 0 and time_budget_s < np.inf:
                            time_budget_s -= automl_info[0] * n_concurrent_trials
                            logger.debug(f"Remaining time budget with mlflow log latency: {time_budget_s} seconds.")
                        while len(_runner.running_trials) < n_concurrent_trials:
@@ -802,9 +802,17 @@ def run(
                        )
                        results = None
                        with PySparkOvertimeMonitor(time_start, time_budget_s, force_cancel, parallel=parallel):
-                            results = parallel(
-                                delayed(evaluation_function)(trial_to_run.config) for trial_to_run in trials_to_run
-                            )
+                            try:
+                                results = parallel(
+                                    delayed(evaluation_function)(trial_to_run.config) for trial_to_run in trials_to_run
+                                )
+                            except RuntimeError as e:
+                                logger.warning(f"RuntimeError: {e}")
+                                results = None
+                                logger.info(
+                                    "Encountered RuntimeError. Waiting 10 seconds for Spark cluster to recover before retrying."
+                                )
+                                time.sleep(10)
                        # results = [evaluation_function(trial_to_run.config) for trial_to_run in trials_to_run]
                        while results:
                            result = results.pop(0)
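
The error handling added around the parallel Spark evaluation can be sketched as a small wrapper. This is a simplified illustration, not the code in `tune.run`: `run_batch_with_recovery` is a hypothetical name, the batch is any zero-argument callable, and the `sleep` parameter exists only so the pause can be replaced in tests.

```python
import logging
import time
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)


def run_batch_with_recovery(
    run_batch: Callable[[], List],
    recovery_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[List]:
    """Run one batch; on RuntimeError, log, pause, and return None.

    Returning None lets the caller's loop skip result handling for this
    batch and carry on with the next iteration.
    """
    try:
        return run_batch()
    except RuntimeError as exc:
        logger.warning("RuntimeError: %s", exc)
        logger.info("Waiting %s seconds before continuing.", recovery_delay_s)
        sleep(recovery_delay_s)
        return None


def failing_batch() -> List:
    """Simulate a batch that fails the way a lost executor might."""
    raise RuntimeError("executor lost")


recorded_sleeps: List[float] = []
failed = run_batch_with_recovery(failing_batch, 10.0, recorded_sleeps.append)
succeeded = run_batch_with_recovery(lambda: [0.1, 0.2], 10.0, recorded_sleeps.append)
```

Only `RuntimeError` is caught, as in the diff, so other exception types still propagate to the caller.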
--- a/flaml/version.py
+++ b/flaml/version.py
@@ -1 +1 @@
-__version__ = "2.4.0"
+__version__ = "2.5.0"
--- a/installed_all_dependencies_3.10_ubuntu-latest.txt
+++ b/installed_all_dependencies_3.10_ubuntu-latest.txt
@@ -1,259 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-async-timeout==5.0.1
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-backports.strenum==1.3.1
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-cmdstanpy==1.3.0
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.2
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cuda-bindings==12.9.4
-cuda-pathfinder==1.3.4
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-exceptiongroup==1.3.1
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
--e git+https://github.com/microsoft/FLAML@0b4d76f509972c51050aff4f9f89be02de7b9aee#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-holidays==0.90
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-importlib_resources==6.5.2
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==8.38.0
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.4.2
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-nvidia-cublas-cu12==12.8.4.1
-nvidia-cuda-cupti-cu12==12.8.90
-nvidia-cuda-nvrtc-cu12==12.8.93
-nvidia-cuda-runtime-cu12==12.8.90
-nvidia-cudnn-cu12==9.10.2.21
-nvidia-cufft-cu12==11.3.3.83
-nvidia-cufile-cu12==1.13.1.3
-nvidia-curand-cu12==10.3.9.90
-nvidia-cusolver-cu12==11.7.3.90
-nvidia-cusparse-cu12==12.5.8.93
-nvidia-cusparselt-cu12==0.7.1
-nvidia-nccl-cu12==2.27.5
-nvidia-nvjitlink-cu12==12.8.93
-nvidia-nvshmem-cu12==3.4.5
-nvidia-nvtx-cu12==12.8.90
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-overrides==7.7.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pexpect==4.9.0
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-prophet==1.3.0
-protobuf==6.33.5
-psutil==7.2.2
-ptyprocess==0.7.0
-pure_eval==0.2.3
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.7.2
-scipy==1.15.3
-Send2Trash==2.1.0
-seqeval==1.2.2
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-stanio==0.5.1
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-tomli==2.4.0
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-triton==3.6.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==1.7.6
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.10_windows-latest.txt
+++ b/installed_all_dependencies_3.10_windows-latest.txt
@@ -1,237 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-async-timeout==5.0.1
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-backports.strenum==1.3.1
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-colorama==0.4.6
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.2
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-exceptiongroup==1.3.1
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
--e git+https://github.com/microsoft/FLAML@61742144cb2fd46c68459941ed3f235c7ee90873#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==8.38.0
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.4.2
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-overrides==7.7.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-protobuf==6.33.5
-psutil==7.2.2
-pure_eval==0.2.3
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyreadline3==3.5.4
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-pywinpty==3.0.3
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.7.2
-scipy==1.15.3
-Send2Trash==2.1.0
-seqeval==1.2.2
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-tomli==2.4.0
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==1.7.6
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.11_macos-latest.txt
+++ b/installed_all_dependencies_3.11_macos-latest.txt
@@ -1,217 +0,0 @@
-absl-py==2.3.1
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.0
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-appnope==0.1.4
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.0.5
-attrs==25.4.0
-babel==2.17.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cloudpickle==3.1.2
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.0
-coverage==7.13.1
-cycler==0.12.1
-databricks-sdk==0.77.0
-dataclasses==0.6
-datasets==4.4.2
-debugpy==1.8.19
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.0
-fastjsonschema==2.21.2
-filelock==3.20.3
--e git+https://github.com/microsoft/FLAML@3ab9ce3cda330a54210c591e89b7f8674948d607#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.47.0
-graphviz==0.21
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-httpcore==1.0.9
-httpx==0.28.1
-huggingface-hub==0.36.0
-identify==2.6.15
-idna==3.11
-importlib_metadata==8.7.1
-iniconfig==2.3.0
-ipykernel==7.1.0
-ipython==9.9.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.3
-jupyterlab==4.5.1
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.0
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.0
-multiprocess==0.70.18
-narwhals==2.15.0
-nbclient==0.10.4
-nbconvert==7.16.6
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.1
-notebook_shim==0.2.4
-numpy==1.26.4
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==3.6.1
-overrides==7.7.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.5
-patsy==1.0.2
-pexpect==4.9.0
-pillow==12.1.0
-platformdirs==4.5.1
-plotly==6.5.1
-pluggy==1.6.0
-pre_commit==4.5.1
-prometheus_client==0.23.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-protobuf==6.33.3
-psutil==7.2.1
-ptyprocess==0.7.0
-pure_eval==0.2.3
-pyarrow==22.0.0
-pyasn1==0.6.1
-pyasn1_modules==0.4.2
-pycparser==2.23
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.1
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.5.0
-pytorch-lightning==2.6.0
-pytz==2025.2
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2025.11.3
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf_python==3.12.0
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-learn==1.8.0
-scipy==1.16.3
-Send2Trash==2.0.0
-seqeval==1.2.2
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.1
-SQLAlchemy==2.0.45
-sqlparse==0.5.5
-stack-data==0.6.3
-starlette==0.50.0
-statsmodels==0.14.6
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.9.1
-torchmetrics==1.8.2
-torchvision==0.24.1
-tornado==6.5.4
-tqdm==4.67.1
-traitlets==5.14.3
-transformers==4.57.3
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.2.14
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.1.3
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.11_ubuntu-latest.txt
+++ b/installed_all_dependencies_3.11_ubuntu-latest.txt
@@ -1,258 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-cmdstanpy==1.3.0
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cuda-bindings==12.9.4
-cuda-pathfinder==1.3.4
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
--e git+https://github.com/microsoft/FLAML@41016d6087aa546653ed5aef274597782594bcf3#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-holidays==0.90
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-importlib_resources==6.5.2
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-nvidia-cublas-cu12==12.8.4.1
-nvidia-cuda-cupti-cu12==12.8.90
-nvidia-cuda-nvrtc-cu12==12.8.93
-nvidia-cuda-runtime-cu12==12.8.90
-nvidia-cudnn-cu12==9.10.2.21
-nvidia-cufft-cu12==11.3.3.83
-nvidia-cufile-cu12==1.13.1.3
-nvidia-curand-cu12==10.3.9.90
-nvidia-cusolver-cu12==11.7.3.90
-nvidia-cusparse-cu12==12.5.8.93
-nvidia-cusparselt-cu12==0.7.1
-nvidia-nccl-cu12==2.27.5
-nvidia-nvjitlink-cu12==12.8.93
-nvidia-nvshmem-cu12==3.4.5
-nvidia-nvtx-cu12==12.8.90
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-overrides==7.7.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pexpect==4.9.0
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-prophet==1.3.0
-protobuf==6.33.5
-psutil==7.2.2
-ptyprocess==0.7.0
-pure_eval==0.2.3
-py4j==0.10.9.7
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyspark==3.5.1
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-stanio==0.5.1
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-triton==3.6.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.11_windows-latest.txt
+++ b/installed_all_dependencies_3.11_windows-latest.txt
@@ -1,234 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-colorama==0.4.6
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
-e git+https://github.com/microsoft/FLAML@2c0f95df98bed3fffa97dfba74395e751b5f136c#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-overrides==7.7.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-protobuf==6.33.5
-psutil==7.2.2
-pure_eval==0.2.3
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyreadline3==3.5.4
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-pywinpty==3.0.3
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.12_ubuntu-latest.txt
+++ b/installed_all_dependencies_3.12_ubuntu-latest.txt
@@ -1,259 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-cmdstanpy==1.3.0
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cuda-bindings==12.9.4
-cuda-pathfinder==1.3.4
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
-e git+https://github.com/microsoft/FLAML@ec25d5bce7fbcd9dd460c4b6fb659bf9d665ab86#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-holidays==0.90
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-importlib_resources==6.5.2
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-nvidia-cublas-cu12==12.8.4.1
-nvidia-cuda-cupti-cu12==12.8.90
-nvidia-cuda-nvrtc-cu12==12.8.93
-nvidia-cuda-runtime-cu12==12.8.90
-nvidia-cudnn-cu12==9.10.2.21
-nvidia-cufft-cu12==11.3.3.83
-nvidia-cufile-cu12==1.13.1.3
-nvidia-curand-cu12==10.3.9.90
-nvidia-cusolver-cu12==11.7.3.90
-nvidia-cusparse-cu12==12.5.8.93
-nvidia-cusparselt-cu12==0.7.1
-nvidia-nccl-cu12==2.27.5
-nvidia-nvjitlink-cu12==12.8.93
-nvidia-nvshmem-cu12==3.4.5
-nvidia-nvtx-cu12==12.8.90
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pexpect==4.9.0
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-prophet==1.3.0
-protobuf==6.33.5
-psutil==7.2.2
-ptyprocess==0.7.0
-pure_eval==0.2.3
-py4j==0.10.9.9
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyspark==4.0.1
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-setuptools==81.0.0
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-stanio==0.5.1
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-triton==3.6.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-wheel==0.46.3
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.12_windows-latest.txt
+++ b/installed_all_dependencies_3.12_windows-latest.txt
@@ -1,238 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argcomplete==3.6.3
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-colorama==0.4.6
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
-e git+https://github.com/microsoft/FLAML@02f8ca32dea0605aaa4989c9f564299746adacb1#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==1.26.4
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pillow==12.1.1
-pipx==1.8.0
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-protobuf==6.33.5
-psutil==7.2.2
-pure_eval==0.2.3
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyreadline3==3.5.4
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-pywinpty==3.0.3
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-setuptools==81.0.0
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-userpath==1.9.2
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-wheel==0.46.3
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.13_ubuntu-latest.txt
+++ b/installed_all_dependencies_3.13_ubuntu-latest.txt
@@ -1,259 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-cmdstanpy==1.3.0
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cuda-bindings==12.9.4
-cuda-pathfinder==1.3.4
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
-e git+https://github.com/microsoft/FLAML@3eb01a57781be209e1dd01690796796e903ef306#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-holidays==0.90
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-importlib_resources==6.5.2
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==2.4.2
-nvidia-cublas-cu12==12.8.4.1
-nvidia-cuda-cupti-cu12==12.8.90
-nvidia-cuda-nvrtc-cu12==12.8.93
-nvidia-cuda-runtime-cu12==12.8.90
-nvidia-cudnn-cu12==9.10.2.21
-nvidia-cufft-cu12==11.3.3.83
-nvidia-cufile-cu12==1.13.1.3
-nvidia-curand-cu12==10.3.9.90
-nvidia-cusolver-cu12==11.7.3.90
-nvidia-cusparse-cu12==12.5.8.93
-nvidia-cusparselt-cu12==0.7.1
-nvidia-nccl-cu12==2.27.5
-nvidia-nvjitlink-cu12==12.8.93
-nvidia-nvshmem-cu12==3.4.5
-nvidia-nvtx-cu12==12.8.90
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pexpect==4.9.0
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-prophet==1.3.0
-protobuf==6.33.5
-psutil==7.2.2
-ptyprocess==0.7.0
-pure_eval==0.2.3
-py4j==0.10.9.9
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyspark==4.1.0
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-setuptools==81.0.0
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-stanio==0.5.1
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-triton==3.6.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-wheel==0.46.3
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_all_dependencies_3.13_windows-latest.txt
+++ b/installed_all_dependencies_3.13_windows-latest.txt
@@ -1,235 +0,0 @@
-absl-py==2.4.0
-accelerate==1.12.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.13.3
-aiosignal==1.4.0
-alembic==1.18.4
-annotated-doc==0.0.4
-annotated-types==0.7.0
-anyio==4.12.1
-argon2-cffi==25.1.0
-argon2-cffi-bindings==25.1.0
-arrow==1.4.0
-asttokens==3.0.1
-async-lru==2.1.0
-attrs==25.4.0
-autopage==0.6.0
-babel==2.18.0
-beautifulsoup4==4.14.3
-bleach==6.3.0
-cachetools==5.5.2
-catboost==1.2.8
-certifi==2026.1.4
-cffi==2.0.0
-cfgv==3.5.0
-charset-normalizer==3.4.4
-click==8.3.1
-cliff==4.13.1
-cloudpickle==3.1.2
-cmaes==0.12.0
-cmd2==3.2.0
-colorama==0.4.6
-colorlog==6.10.1
-comm==0.2.3
-contourpy==1.3.3
-convertdate==2.4.1
-coverage==7.13.4
-cryptography==46.0.5
-cycler==0.12.1
-databricks-sdk==0.87.0
-dataclasses==0.6
-datasets==4.5.0
-debugpy==1.8.20
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.4.0
-distlib==0.4.0
-evaluate==0.4.6
-executing==2.2.1
-fastapi==0.128.8
-fastjsonschema==2.21.2
-filelock==3.20.3
-e git+https://github.com/microsoft/FLAML@7fea33db97001c8a2b56ad0b0b81cb8a38cb751e#egg=FLAML
-fonttools==4.61.1
-fqdn==1.5.1
-frozenlist==1.8.0
-fsspec==2025.10.0
-gitdb==4.0.12
-GitPython==3.1.46
-google-auth==2.48.0
-graphviz==0.21
-greenlet==3.3.1
-h11==0.16.0
-hcrystalball==0.1.12
-hf-xet==1.2.0
-httpcore==1.0.9
-httpx==0.28.1
-huggingface_hub==1.4.1
-identify==2.6.16
-idna==3.11
-importlib_metadata==8.7.1
-iniconfig==2.3.0
-ipykernel==7.2.0
-ipython==9.10.0
-ipython_pygments_lexers==1.1.1
-ipywidgets==8.1.8
-isoduration==20.11.0
-jedi==0.19.2
-Jinja2==3.1.6
-joblib==1.3.2
-joblibspark==0.6.0
-json5==0.13.0
-jsonpointer==3.0.0
-jsonschema==4.26.0
-jsonschema-specifications==2025.9.1
-jupyter==1.1.1
-jupyter-console==6.6.3
-jupyter-events==0.12.0
-jupyter-lsp==2.3.0
-jupyter_client==8.8.0
-jupyter_core==5.9.1
-jupyter_server==2.17.0
-jupyter_server_terminals==0.5.4
-jupyterlab==4.5.4
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.28.0
-jupyterlab_widgets==3.0.16
-kiwisolver==1.4.9
-lark==1.3.1
-liac-arff==2.5.0
-lightgbm==4.6.0
-lightning==2.6.1
-lightning-utilities==0.15.2
-lunardate==0.2.2
-Mako==1.3.10
-markdown-it-py==4.0.0
-MarkupSafe==3.0.3
-matplotlib==3.10.8
-matplotlib-inline==0.2.1
-mdurl==0.1.2
-minio==7.2.20
-mistune==3.2.0
-mlflow-skinny==2.22.1
-mpmath==1.3.0
-multidict==6.7.1
-multiprocess==0.70.18
-narwhals==2.16.0
-nbclient==0.10.4
-nbconvert==7.17.0
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.6.1
-nltk==3.9.2
-nodeenv==1.10.0
-notebook==7.5.3
-notebook_shim==0.2.4
-numpy==2.4.2
-openml==0.15.1
-opentelemetry-api==1.39.1
-opentelemetry-sdk==1.39.1
-opentelemetry-semantic-conventions==0.60b1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-pandocfilters==1.5.1
-parso==0.8.6
-patsy==1.0.2
-pillow==12.1.1
-platformdirs==4.5.1
-plotly==6.5.2
-pluggy==1.6.0
-pre_commit==4.5.1
-prettytable==3.17.0
-prometheus_client==0.24.1
-prompt_toolkit==3.0.52
-propcache==0.4.1
-protobuf==6.33.5
-psutil==7.2.2
-pure_eval==0.2.3
-pyarrow==23.0.0
-pyasn1==0.6.2
-pyasn1_modules==0.4.2
-pycparser==3.0
-pycryptodome==3.23.0
-pydantic==2.12.5
-pydantic_core==2.41.5
-Pygments==2.19.2
-pyluach==2.3.0
-PyMeeus==0.5.12
-pyparsing==3.3.2
-pyperclip==1.11.0
-pyreadline3==3.5.4
-pytest==9.0.2
-pytest-rerunfailures==16.1
-python-dateutil==2.9.0.post0
-python-json-logger==4.0.0
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-pytz==2025.2
-pywinpty==3.0.3
-PyYAML==6.0.3
-pyzmq==27.1.0
-referencing==0.37.0
-regex==2026.1.15
-requests==2.32.5
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rfc3987-syntax==1.1.0
-rgf-python==3.12.0
-rich==14.3.2
-rich-argparse==1.7.2
-rouge_score==0.1.2
-rpds-py==0.30.0
-rsa==4.9.1
-safetensors==0.7.0
-scikit-base==0.13.1
-scikit-learn==1.8.0
-scipy==1.17.0
-Send2Trash==2.1.0
-seqeval==1.2.2
-setuptools==81.0.0
-shellingham==1.5.4
-six==1.17.0
-smmap==5.0.2
-soupsieve==2.8.3
-SQLAlchemy==2.0.46
-sqlparse==0.5.5
-stack-data==0.6.3
-starlette==0.52.1
-statsmodels==0.14.6
-stevedore==5.6.0
-sympy==1.14.0
-tensorboardX==2.6.4
-terminado==0.18.1
-thop==0.1.1.post2209072238
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.22.2
-torch==2.10.0
-torchmetrics==1.8.2
-torchvision==0.25.0
-tornado==6.5.4
-tqdm==4.67.3
-traitlets==5.14.3
-transformers==5.1.0
-typer==0.23.0
-typer-slim==0.23.0
-typing-inspection==0.4.2
-typing_extensions==4.15.0
-tzdata==2025.3
-uri-template==1.3.0
-urllib3==2.6.3
-uvicorn==0.40.0
-virtualenv==20.36.1
-wcwidth==0.6.0
-webcolors==25.10.0
-webencodings==0.5.1
-websocket-client==1.9.0
-wheel==0.46.3
-widgetsnbextension==4.0.15
-workalendar==17.0.0
-xgboost==3.2.0
-xmltodict==1.0.2
-xxhash==3.6.0
-yarl==1.22.0
-zipp==3.23.0
--- a/installed_first_tier_dependencies_3.10_ubuntu-latest.txt
+++ b/installed_first_tier_dependencies_3.10_ubuntu-latest.txt
@@ -1,42 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-prophet==1.3.0
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.7.2
-scipy==1.15.3
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==1.7.6
-xgboost==1.7.6
-Current commit hash: 0b4d76f509972c51050aff4f9f89be02de7b9aee
--- a/installed_first_tier_dependencies_3.10_windows-latest.txt
+++ b/installed_first_tier_dependencies_3.10_windows-latest.txt
@@ -1,41 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.7.2
-scipy==1.15.3
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==1.7.6
-xgboost==1.7.6
-Current commit hash: 61742144cb2fd46c68459941ed3f235c7ee90873
--- a/installed_first_tier_dependencies_3.11_macos-latest.txt
+++ b/installed_first_tier_dependencies_3.11_macos-latest.txt
@@ -1,39 +0,0 @@
-catboost==1.2.8
-coverage==7.13.1
-dataclasses==0.6
-datasets==4.4.2
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.1.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.16.6
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==3.6.1
-packaging==24.2
-pandas==2.3.3
-psutil==7.2.1
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.5.0
-pytorch-lightning==2.6.0
-requests==2.32.5
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.16.3
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.9.1
-torchvision==0.24.1
-transformers==4.57.3
-xgboost==3.1.3
-Current commit hash: 3ab9ce3cda330a54210c591e89b7f8674948d607
--- a/installed_first_tier_dependencies_3.11_ubuntu-latest.txt
+++ b/installed_first_tier_dependencies_3.11_ubuntu-latest.txt
@@ -1,42 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-prophet==1.3.0
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: 41016d6087aa546653ed5aef274597782594bcf3
--- a/installed_first_tier_dependencies_3.11_windows-latest.txt
+++ b/installed_first_tier_dependencies_3.11_windows-latest.txt
@@ -1,41 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: 2c0f95df98bed3fffa97dfba74395e751b5f136c
--- a/installed_first_tier_dependencies_3.12_ubuntu-latest.txt
+++ b/installed_first_tier_dependencies_3.12_ubuntu-latest.txt
@@ -1,42 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-prophet==1.3.0
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: ec25d5bce7fbcd9dd460c4b6fb659bf9d665ab86
--- a/installed_first_tier_dependencies_3.12_windows-latest.txt
+++ b/installed_first_tier_dependencies_3.12_windows-latest.txt
@@ -1,41 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==1.26.4
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: 02f8ca32dea0605aaa4989c9f564299746adacb1
--- a/installed_first_tier_dependencies_3.13_ubuntu-latest.txt
+++ b/installed_first_tier_dependencies_3.13_ubuntu-latest.txt
@@ -1,42 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==2.4.2
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-prophet==1.3.0
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: 3eb01a57781be209e1dd01690796796e903ef306
--- a/installed_first_tier_dependencies_3.13_windows-latest.txt
+++ b/installed_first_tier_dependencies_3.13_windows-latest.txt
@@ -1,41 +0,0 @@
-catboost==1.2.8
-coverage==7.13.4
-dataclasses==0.6
-datasets==4.5.0
-dill==0.4.0
-evaluate==0.4.6
-hcrystalball==0.1.12
-ipykernel==7.2.0
-joblib==1.3.2
-joblibspark==0.6.0
-jupyter==1.1.1
-lightgbm==4.6.0
-mlflow-skinny==2.22.1
-nbconvert==7.17.0
-nbformat==5.10.4
-nltk==3.9.2
-numpy==2.4.2
-openml==0.15.1
-optuna==2.8.0
-packaging==24.2
-pandas==2.3.3
-psutil==7.2.2
-pytest-rerunfailures==16.1
-pytest==9.0.2
-pytorch-forecasting==1.6.1
-pytorch-lightning==2.6.1
-requests==2.32.5
-rgf-python==3.12.0
-rouge_score==0.1.2
-scikit-learn==1.8.0
-scipy==1.17.0
-seqeval==1.2.2
-statsmodels==0.14.6
-tensorboardX==2.6.4
-thop==0.1.1-2209072238
-torch==2.10.0
-torchvision==0.25.0
-transformers==5.1.0
-xgboost==3.2.0
-xgboost==3.2.0
-Current commit hash: 7fea33db97001c8a2b56ad0b0b81cb8a38cb751e
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -2,7 +2,6 @@
 license_file = "LICENSE"
 description-file = "README.md"

-
 [tool.pytest.ini_options]
 addopts = '-m "not conda"'
 markers = [
--- a/setup.py
+++ b/setup.py
@@ -52,8 +52,8 @@ setuptools.setup(
        ],
        "test": [
            "numpy>=1.17,<2.0.0; python_version<'3.13'",
-            "numpy>2.0.0; python_version>='3.13'",
-            "jupyter; python_version<'3.13'",
+            "numpy>=1.17; python_version>='3.13'",
+            "jupyter",
            "lightgbm>=2.3.1",
            "xgboost>=0.90,<2.0.0; python_version<'3.11'",
            "xgboost>=2.0.0; python_version>='3.11'",
@@ -68,10 +68,10 @@ setuptools.setup(
            "pre-commit",
            "torch",
            "torchvision",
-            "catboost>=0.26; python_version<'3.13'",
+            "catboost>=0.26",
            "rgf-python",
            "optuna>=2.8.0,<=3.6.1",
-            "openml; python_version<'3.13'",
+            "openml",
            "statsmodels>=0.12.2",
            "psutil",
            "dataclasses",
@@ -82,7 +82,7 @@ setuptools.setup(
            "rouge_score",
            "hcrystalball",
            "seqeval",
-            "pytorch-forecasting; python_version<'3.13'",
+            "pytorch-forecasting",
            "mlflow-skinny<=2.22.1",  # Refer to https://mvnrepository.com/artifact/org.mlflow/mlflow-spark
            "joblibspark>=0.5.0",
            "joblib<=1.3.2",
@@ -116,14 +116,14 @@ setuptools.setup(
            "scikit-learn",
        ],
        "hf": [
-            "transformers[torch]==4.26",
+            "transformers[torch]>=4.26",
            "datasets",
            "nltk<=3.8.1",
            "rouge_score",
            "seqeval",
        ],
        "nlp": [  # for backward compatibility; hf is the new option name
-            "transformers[torch]==4.26",
+            "transformers[torch]>=4.26",
            "datasets",
            "nltk<=3.8.1",
            "rouge_score",
@@ -140,7 +140,7 @@ setuptools.setup(
            "prophet>=1.1.5",
            "statsmodels>=0.12.2",
            "hcrystalball>=0.1.10",
-            "pytorch-forecasting>=0.10.4; python_version<'3.13'",
+            "pytorch-forecasting>=0.10.4",
            "pytorch-lightning>=1.9.0",
            "tensorboardX>=2.6",
        ],
--- a/test/automl/test_custom_hp.py
+++ b/test/automl/test_custom_hp.py
@@ -4,8 +4,17 @@ import pytest

 from flaml import AutoML, tune

+try:
+    import transformers

-@pytest.mark.skipif(sys.platform == "darwin", reason="do not run on mac os")
+    _transformers_installed = True
+except ImportError:
+    _transformers_installed = False
+
+
+@pytest.mark.skipif(
+    sys.platform == "darwin" or not _transformers_installed, reason="do not run on mac os or transformers not installed"
+)
 def test_custom_hp_nlp():
    from test.nlp.utils import get_automl_settings, get_toy_data_seqclassification

@@ -63,5 +72,39 @@ def test_custom_hp():
    print(automl.best_config_per_estimator)


+def test_lgbm_objective():
+    """Test that objective parameter can be set via custom_hp for LGBMEstimator"""
+    import numpy as np
+
+    # Create a simple regression dataset
+    np.random.seed(42)
+    X_train = np.random.rand(100, 5)
+    y_train = np.random.rand(100) * 100  # Scale to avoid division issues with MAPE
+
+    automl = AutoML()
+    settings = {
+        "time_budget": 3,
+        "metric": "mape",
+        "task": "regression",
+        "estimator_list": ["lgbm"],
+        "verbose": 0,
+        "custom_hp": {"lgbm": {"objective": {"domain": "mape"}}},  # Fixed value, not tuned
+    }
+
+    automl.fit(X_train, y_train, **settings)
+
+    # Verify that objective was set correctly
+    assert "objective" in automl.best_config, "objective should be in best_config"
+    assert automl.best_config["objective"] == "mape", "objective should be 'mape'"
+
+    # Verify the model has the correct objective
+    if hasattr(automl.model, "estimator") and hasattr(automl.model.estimator, "get_params"):
+        model_params = automl.model.estimator.get_params()
+        assert model_params.get("objective") == "mape", "Model should use 'mape' objective"
+
+    print("Test passed: objective parameter works correctly with LGBMEstimator")
+
+
 if __name__ == "__main__":
    test_custom_hp()
+    test_lgbm_objective()
--- a/test/automl/test_extra_models.py
+++ b/test/automl/test_extra_models.py
@@ -1,3 +1,4 @@
+import atexit
 import os
 import sys
 import unittest
@@ -15,8 +16,16 @@ from sklearn.model_selection import train_test_split

 from flaml import AutoML
 from flaml.automl.ml import sklearn_metric_loss_score
+from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
 from flaml.tune.spark.utils import check_spark

+try:
+    import pytorch_lightning
+
+    _pl_installed = True
+except ImportError:
+    _pl_installed = False
+
 pytestmark = pytest.mark.spark

 leaderboard = defaultdict(dict)
@@ -39,7 +48,7 @@ else:
            .config(
                "spark.jars.packages",
                (
-                    "com.microsoft.azure:synapseml_2.12:1.0.2,"
+                    "com.microsoft.azure:synapseml_2.12:1.1.0,"
                    "org.apache.hadoop:hadoop-azure:3.3.5,"
                    "com.microsoft.azure:azure-storage:8.6.6,"
                    f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
@@ -63,6 +72,9 @@ else:
    except ImportError:
        skip_spark = True

+spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
+

 def _test_regular_models(estimator_list, task):
    if isinstance(estimator_list, str):
@@ -176,7 +188,11 @@ def _test_sparse_matrix_classification(estimator):
        "n_jobs": 1,
        "model_history": True,
    }
-    X_train = scipy.sparse.random(1554, 21, dtype=int)
+    # NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
+    # Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
+    # integer sampling paths which raise "low is out of bounds for int32".
+    # A float sparse matrix is sufficient to validate sparse-input support.
+    X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
    y_train = np.random.randint(3, size=1554)
    automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)

@@ -271,7 +287,11 @@ class TestExtraModel(unittest.TestCase):

    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
    def test_default_spark(self):
-        _test_spark_models(None, "classification")
+        # TODO: remove the estimator assignment once SynapseML supports spark 4+.
+        from flaml.automl.spark.utils import _spark_major_minor_version
+
+        estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
+        _test_spark_models(estimator_list, "classification")

    def test_svc(self):
        _test_regular_models("svc", "classification")
@@ -302,7 +322,7 @@ class TestExtraModel(unittest.TestCase):
    def test_avg(self):
        _test_forecast("avg")

-    @unittest.skipIf(skip_spark, reason="Skip on Mac or Windows")
+    @unittest.skipIf(skip_spark or not _pl_installed, reason="Skip on Mac or Windows, or if pytorch_lightning is missing.")
    def test_tcn(self):
        _test_forecast("tcn")

--- a/test/automl/test_forecast.py
+++ b/test/automl/test_forecast.py
@@ -10,7 +10,7 @@ from flaml import AutoML
 from flaml.automl.task.time_series_task import TimeSeriesTask


-def test_forecast_automl(budget=10, estimators_when_no_prophet=["arima", "sarimax", "holt-winters"]):
+def test_forecast_automl(budget=20, estimators_when_no_prophet=["arima", "sarimax", "holt-winters"]):
    # using dataframe
    import statsmodels.api as sm

@@ -510,8 +510,12 @@ def get_stalliion_data():
    "3.11" in sys.version,
    reason="do not run on py 3.11",
 )
-def test_forecast_panel(budget=5):
-    data, special_days = get_stalliion_data()
+def test_forecast_panel(budget=30):
+    try:
+        data, special_days = get_stalliion_data()
+    except ImportError:
+        print("pytorch_forecasting not installed")
+        return
    time_horizon = 6  # predict six months
    training_cutoff = data["time_idx"].max() - time_horizon
    data["time_idx"] = data["time_idx"].astype("int")
@@ -677,11 +681,55 @@ def test_cv_step():
    print("yahoo!")


+def test_log_training_metric_ts_models():
+    """Test that log_training_metric=True works with each available time series estimator."""
+    import statsmodels.api as sm
+
+    from flaml.automl.task.time_series_task import TimeSeriesTask
+
+    estimators_all = TimeSeriesTask("forecast").estimators.keys()
+    estimators_to_test = ["xgboost", "arima", "lassolars", "tcn", "snaive", "prophet", "orbit"]
+    estimators = [
+        est for est in estimators_to_test if est in estimators_all
+    ]  # not all estimators available in current python env
+    print(f"Testing estimators: {estimators}")
+
+    # Prepare data
+    data = sm.datasets.co2.load_pandas().data["co2"]
+    data = data.resample("MS").mean()
+    data = data.bfill().ffill()
+    data = data.to_frame().reset_index()
+    data = data.rename(columns={"index": "ds", "co2": "y"})
+    num_samples = data.shape[0]
+    time_horizon = 12
+    split_idx = num_samples - time_horizon
+    df = data[:split_idx]
+
+    # Test each time series model with log_training_metric=True
+    for estimator in estimators:
+        print(f"\nTesting {estimator} with log_training_metric=True")
+        automl = AutoML()
+        settings = {
+            "time_budget": 3,
+            "metric": "mape",
+            "task": "forecast",
+            "eval_method": "holdout",
+            "label": "y",
+            "log_training_metric": True,  # This should not cause errors
+            "estimator_list": [estimator],
+        }
+        automl.fit(dataframe=df, **settings, period=time_horizon, force_cancel=True)
+        print(f"  ✅ {estimator} SUCCESS with log_training_metric=True")
+        if automl.best_estimator:
+            assert automl.best_estimator == estimator
+
+
 if __name__ == "__main__":
    # test_forecast_automl(60)
    # test_multivariate_forecast_num(5)
    # test_multivariate_forecast_cat(5)
-    test_numpy()
+    # test_numpy()
    # test_forecast_classification(5)
    # test_forecast_panel(5)
    # test_cv_step()
+    test_log_training_metric_ts_models()
--- a/test/automl/test_multiclass.py
+++ b/test/automl/test_multiclass.py
@@ -181,6 +181,49 @@ class TestMultiClass(unittest.TestCase):
        }
        automl.fit(X_train=X_train, y_train=y_train, **settings)

+    def test_ensemble_final_estimator_params_not_tuned(self):
+        """Test that final_estimator parameters in ensemble are not automatically tuned.
+
+        This test verifies that when a custom final_estimator is provided with specific
+        parameters, those parameters are used as-is without any hyperparameter tuning.
+        """
+        from sklearn.linear_model import LogisticRegression
+
+        automl = AutoML()
+        X_train, y_train = load_wine(return_X_y=True)
+
+        # Create a LogisticRegression with specific non-default parameters
+        custom_params = {
+            "C": 0.5,  # Non-default value
+            "max_iter": 50,  # Non-default value
+            "random_state": 42,
+        }
+        final_est = LogisticRegression(**custom_params)
+
+        settings = {
+            "time_budget": 5,
+            "estimator_list": ["rf", "lgbm"],
+            "task": "classification",
+            "ensemble": {
+                "final_estimator": final_est,
+                "passthrough": False,
+            },
+            "n_jobs": 1,
+        }
+        automl.fit(X_train=X_train, y_train=y_train, **settings)
+
+        # Verify that the final estimator in the stacker uses the exact parameters we specified
+        if hasattr(automl.model, "final_estimator_"):
+            # The model is a StackingClassifier
+            fitted_final_estimator = automl.model.final_estimator_
+            assert (
+                abs(fitted_final_estimator.C - custom_params["C"]) < 1e-9
+            ), f"Expected C={custom_params['C']}, but got {fitted_final_estimator.C}"
+            assert (
+                fitted_final_estimator.max_iter == custom_params["max_iter"]
+            ), f"Expected max_iter={custom_params['max_iter']}, but got {fitted_final_estimator.max_iter}"
+            print("✓ Final estimator parameters were preserved (not tuned)")
+
    def test_dataframe(self):
        self.test_classification(True)

@@ -235,6 +278,34 @@ class TestMultiClass(unittest.TestCase):
        except ImportError:
            pass

+    def test_invalid_custom_metric(self):
+        """Test that a proper error is raised when a custom metric is called instead of passed."""
+        from sklearn.datasets import load_iris
+
+        X_train, y_train = load_iris(return_X_y=True)
+
+        # Test with non-callable metric in __init__
+        with self.assertRaises(ValueError) as context:
+            automl = AutoML(metric=123)  # passing an int instead of function
+        self.assertIn("must be either a string or a callable function", str(context.exception))
+        self.assertIn("but got int", str(context.exception))
+
+        # Test with non-callable metric in fit
+        automl = AutoML()
+        with self.assertRaises(ValueError) as context:
+            automl.fit(X_train=X_train, y_train=y_train, metric=[], task="classification", time_budget=1)
+        self.assertIn("must be either a string or a callable function", str(context.exception))
+        self.assertIn("but got list", str(context.exception))
+
+        # Test with tuple (simulating result of calling a function that returns tuple)
+        with self.assertRaises(ValueError) as context:
+            automl = AutoML()
+            automl.fit(
+                X_train=X_train, y_train=y_train, metric=(0.5, {"loss": 0.5}), task="classification", time_budget=1
+            )
+        self.assertIn("must be either a string or a callable function", str(context.exception))
+        self.assertIn("but got tuple", str(context.exception))
+
    def test_classification(self, as_frame=False):
        automl_experiment = AutoML()
        automl_settings = {
@@ -368,7 +439,11 @@ class TestMultiClass(unittest.TestCase):
            "n_jobs": 1,
            "model_history": True,
        }
-        X_train = scipy.sparse.random(1554, 21, dtype=int)
+        # NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
+        # Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
+        # integer sampling paths which raise "low is out of bounds for int32".
+        # A float sparse matrix is sufficient to validate sparse-input support.
+        X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
        y_train = np.random.randint(3, size=1554)
        automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
        print(automl_experiment.classes_)
@@ -531,6 +606,32 @@ class TestMultiClass(unittest.TestCase):
        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
        # print('Training duration of best run: {0:.4g} s'.format(new_automl_experiment.best_config_train_time))

+    def test_starting_points_should_improve_performance(self):
+        N = 10000  # a large N is needed to see the improvement
+        X_train, y_train = load_iris(return_X_y=True)
+        X_train = np.concatenate([X_train + 0.1 * i for i in range(N)], axis=0)
+        y_train = np.concatenate([y_train] * N, axis=0)
+
+        am1 = AutoML()
+        am1.fit(X_train, y_train, estimator_list=["lgbm"], time_budget=3, seed=11)
+
+        am2 = AutoML()
+        am2.fit(
+            X_train,
+            y_train,
+            estimator_list=["lgbm"],
+            time_budget=2,
+            seed=11,
+            starting_points=am1.best_config_per_estimator,
+        )
+
+        print(f"am1.best_loss: {am1.best_loss:.4f}")
+        print(f"am2.best_loss: {am2.best_loss:.4f}")
+
+        assert np.round(am2.best_loss, 4) <= np.round(
+            am1.best_loss, 4
+        ), "Starting points should help improve the performance!"
+

 if __name__ == "__main__":
    unittest.main()
--- a/test/automl/test_no_overlap.py
+++ b/test/automl/test_no_overlap.py
@@ -0,0 +1,272 @@
+"""Test to ensure correct label overlap handling for classification tasks"""
+import numpy as np
+import pandas as pd
+from sklearn.datasets import load_iris, make_classification
+
+from flaml import AutoML
+
+
+def test_allow_label_overlap_true():
+    """Test with allow_label_overlap=True (fast mode, default)"""
+    # Load iris dataset
+    dic_data = load_iris(as_frame=True)
+    iris_data = dic_data["frame"]
+
+    # Prepare data
+    x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
+    y_train = iris_data["target"]
+
+    # Train with fast mode (default)
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "metric": "accuracy",
+        "task": "classification",
+        "estimator_list": ["lgbm"],
+        "eval_method": "holdout",
+        "split_type": "stratified",
+        "keep_search_state": True,
+        "retrain_full": False,
+        "auto_augment": False,
+        "verbose": 0,
+        "allow_label_overlap": True,  # Fast mode
+    }
+    automl.fit(x_train, y_train, **automl_settings)
+
+    # Check results
+    input_size = len(x_train)
+    train_size = len(automl._state.X_train)
+    val_size = len(automl._state.X_val)
+
+    # With a stratified split on balanced data, fast mode may add no overlap, hence ">=" rather than ">"
+    assert (
+        train_size + val_size >= input_size
+    ), f"Inconsistent sizes. Input: {input_size}, Train: {train_size}, Val: {val_size}"
+
+    # Verify all classes are represented in both sets
+    train_labels = set(np.unique(automl._state.y_train))
+    val_labels = set(np.unique(automl._state.y_val))
+    all_labels = set(np.unique(y_train))
+
+    assert train_labels == all_labels, f"Not all labels in train. All: {all_labels}, Train: {train_labels}"
+    assert val_labels == all_labels, f"Not all labels in val. All: {all_labels}, Val: {val_labels}"
+
+    print(
+        f"✓ Test passed (fast mode): Input: {input_size}, Train: {train_size}, Val: {val_size}, "
+        f"Overlap: {train_size + val_size - input_size}"
+    )
+
+
+def test_allow_label_overlap_false():
+    """Test with allow_label_overlap=False (precise mode)"""
+    # Load iris dataset
+    dic_data = load_iris(as_frame=True)
+    iris_data = dic_data["frame"]
+
+    # Prepare data
+    x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
+    y_train = iris_data["target"]
+
+    # Train with precise mode
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "metric": "accuracy",
+        "task": "classification",
+        "estimator_list": ["lgbm"],
+        "eval_method": "holdout",
+        "split_type": "stratified",
+        "keep_search_state": True,
+        "retrain_full": False,
+        "auto_augment": False,
+        "verbose": 0,
+        "allow_label_overlap": False,  # Precise mode
+    }
+    automl.fit(x_train, y_train, **automl_settings)
+
+    # Check that there's no overlap (or minimal overlap for single-instance classes)
+    input_size = len(x_train)
+    train_size = len(automl._state.X_train)
+    val_size = len(automl._state.X_val)
+
+    # Collect the full set of labels in the input
+    all_labels = set(np.unique(y_train))
+
+    # Should have no overlap or minimal overlap
+    overlap = train_size + val_size - input_size
+    assert overlap <= len(all_labels), f"Excessive overlap: {overlap}"
+
+    # Verify all classes are represented
+    train_labels = set(np.unique(automl._state.y_train))
+    val_labels = set(np.unique(automl._state.y_val))
+
+    combined_labels = train_labels.union(val_labels)
+    assert combined_labels == all_labels, f"Not all labels present. All: {all_labels}, Combined: {combined_labels}"
+
+    print(
+        f"✓ Test passed (precise mode): Input: {input_size}, Train: {train_size}, Val: {val_size}, "
+        f"Overlap: {overlap}"
+    )
+
+
+def test_uniform_split_with_overlap_control():
+    """Test uniform split in precise mode (allow_label_overlap=False)"""
+    # Load iris dataset
+    dic_data = load_iris(as_frame=True)
+    iris_data = dic_data["frame"]
+
+    # Prepare data
+    x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
+    y_train = iris_data["target"]
+
+    # Test precise mode with uniform split
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "metric": "accuracy",
+        "task": "classification",
+        "estimator_list": ["lgbm"],
+        "eval_method": "holdout",
+        "split_type": "uniform",
+        "keep_search_state": True,
+        "retrain_full": False,
+        "auto_augment": False,
+        "verbose": 0,
+        "allow_label_overlap": False,  # Precise mode
+    }
+    automl.fit(x_train, y_train, **automl_settings)
+
+    input_size = len(x_train)
+    train_size = len(automl._state.X_train)
+    val_size = len(automl._state.X_val)
+
+    # Verify all classes are represented
+    train_labels = set(np.unique(automl._state.y_train))
+    val_labels = set(np.unique(automl._state.y_val))
+    all_labels = set(np.unique(y_train))
+
+    combined_labels = train_labels.union(val_labels)
+    assert combined_labels == all_labels, "Not all labels present with uniform split"
+
+    print(f"✓ Test passed (uniform split): Input: {input_size}, Train: {train_size}, Val: {val_size}")
+
+
+def test_with_sample_weights():
+    """Test label overlap handling with sample weights"""
+    # Create a simple dataset
+    X, y = make_classification(
+        n_samples=200,
+        n_features=10,
+        n_informative=5,
+        n_redundant=2,
+        n_classes=3,
+        n_clusters_per_class=1,
+        random_state=42,
+    )
+
+    # Create sample weights (giving more weight to some samples)
+    sample_weight = np.random.uniform(0.5, 2.0, size=len(y))
+
+    # Test fast mode with sample weights
+    automl_fast = AutoML()
+    automl_fast.fit(
+        X,
+        y,
+        task="classification",
+        metric="accuracy",
+        estimator_list=["lgbm"],
+        eval_method="holdout",
+        split_type="stratified",
+        max_iter=3,
+        keep_search_state=True,
+        retrain_full=False,
+        auto_augment=False,
+        verbose=0,
+        allow_label_overlap=True,  # Fast mode
+        sample_weight=sample_weight,
+    )
+
+    # Verify all labels present
+    train_labels_fast = set(np.unique(automl_fast._state.y_train))
+    val_labels_fast = set(np.unique(automl_fast._state.y_val))
+    all_labels = set(np.unique(y))
+
+    assert train_labels_fast == all_labels, "Not all labels in train (fast mode with weights)"
+    assert val_labels_fast == all_labels, "Not all labels in val (fast mode with weights)"
+
+    # Test precise mode with sample weights
+    automl_precise = AutoML()
+    automl_precise.fit(
+        X,
+        y,
+        task="classification",
+        metric="accuracy",
+        estimator_list=["lgbm"],
+        eval_method="holdout",
+        split_type="stratified",
+        max_iter=3,
+        keep_search_state=True,
+        retrain_full=False,
+        auto_augment=False,
+        verbose=0,
+        allow_label_overlap=False,  # Precise mode
+        sample_weight=sample_weight,
+    )
+
+    # Verify all labels present
+    train_labels_precise = set(np.unique(automl_precise._state.y_train))
+    val_labels_precise = set(np.unique(automl_precise._state.y_val))
+
+    combined_labels = train_labels_precise.union(val_labels_precise)
+    assert combined_labels == all_labels, "Not all labels present (precise mode with weights)"
+
+    print("✓ Test passed with sample weights (fast and precise modes)")
+
+
+def test_single_instance_class():
+    """Test handling of single-instance classes"""
+    # Create imbalanced dataset where one class has only 1 instance
+    X = np.random.randn(50, 4)
+    y = np.array([0] * 40 + [1] * 9 + [2] * 1)  # Class 2 has only 1 instance
+
+    # Test precise mode - should add single instance to both sets
+    automl = AutoML()
+    automl.fit(
+        X,
+        y,
+        task="classification",
+        metric="accuracy",
+        estimator_list=["lgbm"],
+        eval_method="holdout",
+        split_type="uniform",
+        max_iter=3,
+        keep_search_state=True,
+        retrain_full=False,
+        auto_augment=False,
+        verbose=0,
+        allow_label_overlap=False,  # Precise mode
+    )
+
+    # Verify all labels present
+    train_labels = set(np.unique(automl._state.y_train))
+    val_labels = set(np.unique(automl._state.y_val))
+    all_labels = set(np.unique(y))
+
+    # Single-instance class should be in both sets
+    combined_labels = train_labels.union(val_labels)
+    assert combined_labels == all_labels, "Not all labels present with single-instance class"
+
+    # Check that single-instance class (label 2) is in both sets
+    assert 2 in train_labels, "Single-instance class not in train"
+    assert 2 in val_labels, "Single-instance class not in val"
+
+    print("✓ Test passed with single-instance class")
+
+
+if __name__ == "__main__":
+    test_allow_label_overlap_true()
+    test_allow_label_overlap_false()
+    test_uniform_split_with_overlap_control()
+    test_with_sample_weights()
+    test_single_instance_class()
+    print("\n✓ All tests passed!")
--- a/test/automl/test_notebook_example.py
+++ b/test/automl/test_notebook_example.py
@@ -79,6 +79,9 @@ def test_automl(budget=5, dataset_format="dataframe", hpo_method=None):
    automl.fit(X_train=X_train, y_train=y_train, **settings)
    """ retrieve best config and best learner """
    print("Best ML leaner:", automl.best_estimator)
+    if not automl.best_estimator:
+        print("Training budget is not sufficient")
+        return
    print("Best hyperparmeter config:", automl.best_config)
    print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
    print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
--- a/test/automl/test_preprocess_api.py
+++ b/test/automl/test_preprocess_api.py
@@ -0,0 +1,236 @@
+"""Tests for the public preprocessor APIs."""
+import unittest
+
+import numpy as np
+import pandas as pd
+from sklearn.datasets import load_breast_cancer, load_diabetes
+
+from flaml import AutoML
+
+
+class TestPreprocessAPI(unittest.TestCase):
+    """Test cases for the public preprocess() API methods."""
+
+    def test_automl_preprocess_before_fit(self):
+        """Test that calling preprocess before fit raises an error."""
+        automl = AutoML()
+        X_test = np.array([[1, 2, 3], [4, 5, 6]])
+
+        with self.assertRaises(AttributeError) as context:
+            automl.preprocess(X_test)
+        # Check that an error is raised about not being fitted
+        self.assertIn("fit()", str(context.exception))
+
+    def test_automl_preprocess_classification(self):
+        """Test task-level preprocessing for classification."""
+        # Load dataset
+        X, y = load_breast_cancer(return_X_y=True)
+        X_train, y_train = X[:400], y[:400]
+        X_test = X[400:450]
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "classification",
+            "metric": "accuracy",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Test task-level preprocessing
+        X_preprocessed = automl.preprocess(X_test)
+
+        # Verify the output is not None and has the right shape
+        self.assertIsNotNone(X_preprocessed)
+        self.assertEqual(X_preprocessed.shape[0], X_test.shape[0])
+
+    def test_automl_preprocess_regression(self):
+        """Test task-level preprocessing for regression."""
+        # Load dataset
+        X, y = load_diabetes(return_X_y=True)
+        X_train, y_train = X[:300], y[:300]
+        X_test = X[300:350]
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "regression",
+            "metric": "r2",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Test task-level preprocessing
+        X_preprocessed = automl.preprocess(X_test)
+
+        # Verify the output
+        self.assertIsNotNone(X_preprocessed)
+        self.assertEqual(X_preprocessed.shape[0], X_test.shape[0])
+
+    def test_automl_preprocess_with_dataframe(self):
+        """Test task-level preprocessing with pandas DataFrame."""
+        # Create a simple dataset
+        X_train = pd.DataFrame(
+            {
+                "feature1": [1, 2, 3, 4, 5] * 20,
+                "feature2": [5, 4, 3, 2, 1] * 20,
+                "category": ["a", "b", "a", "b", "a"] * 20,
+            }
+        )
+        y_train = pd.Series([0, 1, 0, 1, 0] * 20)
+
+        X_test = pd.DataFrame(
+            {
+                "feature1": [6, 7, 8],
+                "feature2": [1, 2, 3],
+                "category": ["a", "b", "a"],
+            }
+        )
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "classification",
+            "metric": "accuracy",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Test preprocessing
+        X_preprocessed = automl.preprocess(X_test)
+
+        # Verify the output - check the number of rows matches
+        self.assertIsNotNone(X_preprocessed)
+        preprocessed_len = len(X_preprocessed) if hasattr(X_preprocessed, "__len__") else X_preprocessed.shape[0]
+        self.assertEqual(preprocessed_len, len(X_test))
+
+    def test_estimator_preprocess(self):
+        """Test estimator-level preprocessing."""
+        # Load dataset
+        X, y = load_breast_cancer(return_X_y=True)
+        X_train, y_train = X[:400], y[:400]
+        X_test = X[400:450]
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "classification",
+            "metric": "accuracy",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Get the trained estimator
+        estimator = automl.model
+        self.assertIsNotNone(estimator)
+
+        # First apply task-level preprocessing
+        X_task_preprocessed = automl.preprocess(X_test)
+
+        # Then apply estimator-level preprocessing
+        X_estimator_preprocessed = estimator.preprocess(X_task_preprocessed)
+
+        # Verify the output
+        self.assertIsNotNone(X_estimator_preprocessed)
+        self.assertEqual(X_estimator_preprocessed.shape[0], X_test.shape[0])
+
+    def test_preprocess_pipeline(self):
+        """Test the complete preprocessing pipeline (task-level then estimator-level)."""
+        # Load dataset
+        X, y = load_breast_cancer(return_X_y=True)
+        X_train, y_train = X[:400], y[:400]
+        X_test = X[400:450]
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "classification",
+            "metric": "accuracy",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Apply the complete preprocessing pipeline
+        X_task_preprocessed = automl.preprocess(X_test)
+        X_final = automl.model.preprocess(X_task_preprocessed)
+
+        # Verify predictions work with preprocessed data
+        # The internal predict already does this preprocessing,
+        # but we verify our manual preprocessing gives consistent results
+        y_pred_manual = automl.model._model.predict(X_final)
+        y_pred_auto = automl.predict(X_test)
+
+        # Both should give the same predictions
+        np.testing.assert_array_equal(y_pred_manual, y_pred_auto)
+
+    def test_preprocess_with_mixed_types(self):
+        """Test preprocessing with mixed data types."""
+        # Create dataset with mixed types
+        X_train = pd.DataFrame(
+            {
+                "numeric1": np.random.rand(100),
+                "numeric2": np.random.randint(0, 100, 100),
+                "categorical": np.random.choice(["cat", "dog", "bird"], 100),
+                "boolean": np.random.choice([True, False], 100),
+            }
+        )
+        y_train = pd.Series(np.random.randint(0, 2, 100))
+
+        X_test = pd.DataFrame(
+            {
+                "numeric1": np.random.rand(10),
+                "numeric2": np.random.randint(0, 100, 10),
+                "categorical": np.random.choice(["cat", "dog", "bird"], 10),
+                "boolean": np.random.choice([True, False], 10),
+            }
+        )
+
+        # Train AutoML
+        automl = AutoML()
+        automl_settings = {
+            "max_iter": 5,
+            "task": "classification",
+            "metric": "accuracy",
+            "estimator_list": ["lgbm"],
+            "verbose": 0,
+        }
+        automl.fit(X_train, y_train, **automl_settings)
+
+        # Test preprocessing
+        X_preprocessed = automl.preprocess(X_test)
+
+        # Verify the output
+        self.assertIsNotNone(X_preprocessed)
+
+    def test_estimator_preprocess_without_automl(self):
+        """Test that estimator.preprocess() can be used independently."""
+        from flaml.automl.model import LGBMEstimator
+
+        # Create a simple estimator
+        X_train = np.random.rand(100, 5)
+        y_train = np.random.randint(0, 2, 100)
+
+        estimator = LGBMEstimator(task="classification")
+        estimator.fit(X_train, y_train)
+
+        # Test preprocessing
+        X_test = np.random.rand(10, 5)
+        X_preprocessed = estimator.preprocess(X_test)
+
+        # Verify the output
+        self.assertIsNotNone(X_preprocessed)
+        self.assertEqual(X_preprocessed.shape, X_test.shape)
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/test/automl/test_regression.py
+++ b/test/automl/test_regression.py
@@ -130,7 +130,7 @@ class TestRegression(unittest.TestCase):
        )
        automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **settings)

-    def test_parallel(self, hpo_method=None):
+    def test_parallel_and_pickle(self, hpo_method=None):
        automl_experiment = AutoML()
        automl_settings = {
            "time_budget": 10,
@@ -153,6 +153,18 @@ class TestRegression(unittest.TestCase):
        except ImportError:
            return

+        # test pickle and load_pickle, should work for prediction
+        automl_experiment.pickle("automl_xgboost_spark.pkl")
+        automl_loaded = AutoML().load_pickle("automl_xgboost_spark.pkl")
+        assert automl_loaded.best_estimator == automl_experiment.best_estimator
+        assert automl_loaded.best_loss == automl_experiment.best_loss
+        automl_loaded.predict(X_train)
+
+        import shutil
+
+        shutil.rmtree("automl_xgboost_spark.pkl", ignore_errors=True)
+        shutil.rmtree("automl_xgboost_spark.pkl.flaml_artifacts", ignore_errors=True)
+
    def test_sparse_matrix_regression_holdout(self):
        X_train = scipy.sparse.random(8, 100)
        y_train = np.random.uniform(size=8)
--- a/test/automl/test_sklearn_17_compat.py
+++ b/test/automl/test_sklearn_17_compat.py
@@ -0,0 +1,89 @@
+"""Test sklearn 1.7+ compatibility for estimator type detection.
+
+This test ensures that FLAML estimators are properly recognized as
+regressors or classifiers by sklearn's is_regressor() and is_classifier()
+functions, which is required for sklearn 1.7+ ensemble methods.
+"""
+
+import pytest
+from sklearn.base import is_classifier, is_regressor
+
+from flaml.automl.model import (
+    ExtraTreesEstimator,
+    LGBMEstimator,
+    RandomForestEstimator,
+    XGBoostSklearnEstimator,
+)
+
+
+def test_extra_trees_regressor_type():
+    """Test that ExtraTreesEstimator with regression task is recognized as regressor."""
+    est = ExtraTreesEstimator(task="regression")
+    assert is_regressor(est), "ExtraTreesEstimator(task='regression') should be recognized as a regressor"
+    assert not is_classifier(est), "ExtraTreesEstimator(task='regression') should not be recognized as a classifier"
+
+
+def test_extra_trees_classifier_type():
+    """Test that ExtraTreesEstimator with classification task is recognized as classifier."""
+    est = ExtraTreesEstimator(task="binary")
+    assert is_classifier(est), "ExtraTreesEstimator(task='binary') should be recognized as a classifier"
+    assert not is_regressor(est), "ExtraTreesEstimator(task='binary') should not be recognized as a regressor"
+
+    est = ExtraTreesEstimator(task="multiclass")
+    assert is_classifier(est), "ExtraTreesEstimator(task='multiclass') should be recognized as a classifier"
+    assert not is_regressor(est), "ExtraTreesEstimator(task='multiclass') should not be recognized as a regressor"
+
+
+def test_random_forest_regressor_type():
+    """Test that RandomForestEstimator with regression task is recognized as regressor."""
+    est = RandomForestEstimator(task="regression")
+    assert is_regressor(est), "RandomForestEstimator(task='regression') should be recognized as a regressor"
+    assert not is_classifier(est), "RandomForestEstimator(task='regression') should not be recognized as a classifier"
+
+
+def test_random_forest_classifier_type():
+    """Test that RandomForestEstimator with classification task is recognized as classifier."""
+    est = RandomForestEstimator(task="binary")
+    assert is_classifier(est), "RandomForestEstimator(task='binary') should be recognized as a classifier"
+    assert not is_regressor(est), "RandomForestEstimator(task='binary') should not be recognized as a regressor"
+
+
+def test_lgbm_regressor_type():
+    """Test that LGBMEstimator with regression task is recognized as regressor."""
+    est = LGBMEstimator(task="regression")
+    assert is_regressor(est), "LGBMEstimator(task='regression') should be recognized as a regressor"
+    assert not is_classifier(est), "LGBMEstimator(task='regression') should not be recognized as a classifier"
+
+
+def test_lgbm_classifier_type():
+    """Test that LGBMEstimator with classification task is recognized as classifier."""
+    est = LGBMEstimator(task="binary")
+    assert is_classifier(est), "LGBMEstimator(task='binary') should be recognized as a classifier"
+    assert not is_regressor(est), "LGBMEstimator(task='binary') should not be recognized as a regressor"
+
+
+def test_xgboost_regressor_type():
+    """Test that XGBoostSklearnEstimator with regression task is recognized as regressor."""
+    est = XGBoostSklearnEstimator(task="regression")
+    assert is_regressor(est), "XGBoostSklearnEstimator(task='regression') should be recognized as a regressor"
+    assert not is_classifier(est), "XGBoostSklearnEstimator(task='regression') should not be recognized as a classifier"
+
+
+def test_xgboost_classifier_type():
+    """Test that XGBoostSklearnEstimator with classification task is recognized as classifier."""
+    est = XGBoostSklearnEstimator(task="binary")
+    assert is_classifier(est), "XGBoostSklearnEstimator(task='binary') should be recognized as a classifier"
+    assert not is_regressor(est), "XGBoostSklearnEstimator(task='binary') should not be recognized as a regressor"
+
+
+if __name__ == "__main__":
+    # Run all tests
+    test_extra_trees_regressor_type()
+    test_extra_trees_classifier_type()
+    test_random_forest_regressor_type()
+    test_random_forest_classifier_type()
+    test_lgbm_regressor_type()
+    test_lgbm_classifier_type()
+    test_xgboost_regressor_type()
+    test_xgboost_classifier_type()
+    print("All sklearn 1.7+ compatibility tests passed!")
--- a/test/default/test_defaults.py
+++ b/test/default/test_defaults.py
@@ -183,6 +183,8 @@ def test_lgbm():


 def test_xgboost():
+    import numpy as np
+
    from flaml.default import XGBClassifier, XGBRegressor

    X_train, y_train = load_breast_cancer(return_X_y=True, as_frame=True)
@@ -200,6 +202,65 @@ def test_xgboost():
    regressor.predict(X_train)
    print(regressor)

+    # Test eval_set with categorical features (Issue: eval_set not preprocessed)
+    np.random.seed(42)
+    n = 500
+    df = pd.DataFrame(
+        {
+            "num1": np.random.randn(n),
+            "num2": np.random.rand(n) * 10,
+            "cat1": np.random.choice(["A", "B", "C"], size=n),
+            "cat2": np.random.choice(["X", "Y"], size=n),
+            "target": np.random.choice([0, 1], size=n),
+        }
+    )
+
+    X = df.drop(columns="target")
+    y = df["target"]
+
+    X_train_cat, X_valid_cat, y_train_cat, y_valid_cat = train_test_split(X, y, test_size=0.2, random_state=0)
+
+    # Convert categorical columns to pandas 'category' dtype
+    for col in X_train_cat.select_dtypes(include="object").columns:
+        X_train_cat[col] = X_train_cat[col].astype("category")
+        X_valid_cat[col] = X_valid_cat[col].astype("category")
+
+    # Test XGBClassifier with eval_set
+    classifier_eval = XGBClassifier(
+        tree_method="hist",
+        enable_categorical=True,
+        eval_metric="logloss",
+        use_label_encoder=False,
+        early_stopping_rounds=10,
+        random_state=0,
+        n_estimators=10,
+    )
+    classifier_eval.fit(X_train_cat, y_train_cat, eval_set=[(X_valid_cat, y_valid_cat)], verbose=False)
+    y_pred = classifier_eval.predict(X_valid_cat)
+    assert len(y_pred) == len(y_valid_cat)
+
+    # Test XGBRegressor with eval_set
+    y_reg = df["num1"]  # Use num1 as target for regression
+    X_reg = df.drop(columns=["num1", "target"])
+
+    X_train_reg, X_valid_reg, y_train_reg, y_valid_reg = train_test_split(X_reg, y_reg, test_size=0.2, random_state=0)
+
+    for col in X_train_reg.select_dtypes(include="object").columns:
+        X_train_reg[col] = X_train_reg[col].astype("category")
+        X_valid_reg[col] = X_valid_reg[col].astype("category")
+
+    regressor_eval = XGBRegressor(
+        tree_method="hist",
+        enable_categorical=True,
+        eval_metric="rmse",
+        early_stopping_rounds=10,
+        random_state=0,
+        n_estimators=10,
+    )
+    regressor_eval.fit(X_train_reg, y_train_reg, eval_set=[(X_valid_reg, y_valid_reg)], verbose=False)
+    y_pred = regressor_eval.predict(X_valid_reg)
+    assert len(y_pred) == len(y_valid_reg)
+

 def test_nobudget():
    X_train, y_train = load_breast_cancer(return_X_y=True, as_frame=True)
--- a/test/nlp/test_autohf_classificationhead.py
+++ b/test/nlp/test_autohf_classificationhead.py
@@ -3,6 +3,12 @@ import shutil
 import sys

 import pytest
+
+try:
+    import transformers
+except ImportError:
+    pytest.skip("transformers not installed", allow_module_level=True)
+
 from utils import (
    get_automl_settings,
    get_toy_data_binclassification,
--- a/test/nlp/test_autohf_cv.py
+++ b/test/nlp/test_autohf_cv.py
@@ -5,10 +5,20 @@ import sys
 import pytest
 from utils import get_automl_settings, get_toy_data_seqclassification

+try:
+    import transformers
+
+    _transformers_installed = True
+except ImportError:
+    _transformers_installed = False
+
 pytestmark = pytest.mark.spark  # set to spark as parallel testing raised MlflowException of changing parameter


-@pytest.mark.skipif(sys.platform in ["darwin", "win32"], reason="do not run on mac os or windows")
+@pytest.mark.skipif(
+    sys.platform in ["darwin", "win32"] or not _transformers_installed,
+    reason="do not run on macOS or Windows, or when transformers is not installed",
+)
 def test_cv():
    import requests

--- a/test/nlp/test_autohf_multichoice_classification.py
+++ b/test/nlp/test_autohf_multichoice_classification.py
@@ -5,8 +5,18 @@ import sys
 import pytest
 from utils import get_automl_settings, get_toy_data_multiplechoiceclassification

+try:
+    import transformers

-@pytest.mark.skipif(sys.platform in ["darwin", "win32"], reason="do not run on mac os or windows")
+    _transformers_installed = True
+except ImportError:
+    _transformers_installed = False
+
+
+@pytest.mark.skipif(
+    sys.platform in ["darwin", "win32"] or not _transformers_installed,
+    reason="do not run on macOS or Windows, or when transformers is not installed",
+)
 def test_mcc():
    import requests

--- a/test/nlp/test_default.py
+++ b/test/nlp/test_default.py
@@ -7,8 +7,20 @@ from utils import get_automl_settings, get_toy_data_seqclassification

 from flaml.default import portfolio

-if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
-    pytest.skip("skipping Python 3.11 on MacOS", allow_module_level=True)
+try:
+    import transformers
+
+    _transformers_installed = True
+except ImportError:
+    _transformers_installed = False
+
+if (
+    sys.platform.startswith("darwin")
+    and sys.version_info >= (3, 11)
+    or not _transformers_installed
+    or sys.platform == "win32"
+):
+    pytest.skip("skipping Python 3.11+ on macOS, on Windows, or when transformers is not installed", allow_module_level=True)

 pytestmark = (
    pytest.mark.spark
@@ -28,7 +40,6 @@ def test_build_portfolio(path="./test/nlp/default", strategy="greedy"):
    portfolio.main()


-@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
 def test_starting_point_not_in_search_space():
    """Regression test for invalid starting points and custom_hp.

@@ -126,7 +137,6 @@ def test_starting_point_not_in_search_space():
            print("PermissionError when deleting test/data/output/")


-@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
 def test_points_to_evaluate():
    from flaml import AutoML

@@ -155,7 +165,6 @@ def test_points_to_evaluate():


 # TODO: implement _test_zero_shot_model
-@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
 def test_zero_shot_nomodel():
    from flaml.default import preprocess_and_suggest_hyperparams

--- a/test/spark/test_0sparkml.py
+++ b/test/spark/test_0sparkml.py
@@ -1,3 +1,4 @@
+import atexit
 import os
 import sys
 import warnings
@@ -10,6 +11,7 @@ from packaging.version import Version

 from flaml import AutoML
 from flaml.automl.data import auto_convert_dtypes_pandas, auto_convert_dtypes_spark, get_random_dataframe
+from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
 from flaml.tune.spark.utils import check_spark

 warnings.simplefilter(action="ignore")
@@ -29,7 +31,7 @@ else:
            .config(
                "spark.jars.packages",
                (
-                    "com.microsoft.azure:synapseml_2.12:1.0.4,"
+                    "com.microsoft.azure:synapseml_2.12:1.1.0,"
                    "org.apache.hadoop:hadoop-azure:3.3.5,"
                    "com.microsoft.azure:azure-storage:8.6.6,"
                    f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
@@ -55,6 +57,9 @@ else:
    except ImportError:
        skip_spark = True

+spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
+
 if sys.version_info >= (3, 11):
    skip_py311 = True
 else:
@@ -64,6 +69,13 @@ pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Ski


 def _test_spark_synapseml_lightgbm(spark=None, task="classification"):
+    # TODO: remove this early return once SynapseML supports spark 4+.
+    from flaml.automl.spark.utils import _spark_major_minor_version
+
+    if _spark_major_minor_version[0] >= 4:
+        # skip synapseml lightgbm test for spark 4+
+        return
+
    if task == "classification":
        metric = "accuracy"
        X_train, y_train = skds.load_iris(return_X_y=True, as_frame=True)
@@ -153,27 +165,32 @@ def test_spark_synapseml_rank():
    _test_spark_synapseml_lightgbm(spark, "rank")


-def test_spark_input_df():
-    df = (
-        spark.read.format("csv")
-        .option("header", True)
-        .option("inferSchema", True)
-        .load("wasbs://publicwasb@mmlspark.blob.core.windows.net/company_bankruptcy_prediction_data.csv")
-    )
+def test_spark_input_df_and_pickle():
+    import pandas as pd
+
+    file_url = "https://mmlspark.blob.core.windows.net/publicwasb/company_bankruptcy_prediction_data.csv"
+    df = pd.read_csv(file_url)
+    df = spark.createDataFrame(df)
    train, test = df.randomSplit([0.8, 0.2], seed=1)
    feature_cols = df.columns[1:]
    featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
    train_data = featurizer.transform(train)["Bankrupt?", "features"]
    test_data = featurizer.transform(test)["Bankrupt?", "features"]
    automl = AutoML()
+
+    # TODO: remove the estimator assignment once SynapseML supports spark 4+.
+    from flaml.automl.spark.utils import _spark_major_minor_version
+
+    estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
+
    settings = {
        "time_budget": 30,  # total running time in seconds
        "metric": "roc_auc",
-        # "estimator_list": ["lgbm_spark"],  # list of ML learners; we tune lightgbm in this example
        "task": "classification",  # task type
        "log_file_name": "flaml_experiment.log",  # flaml log file
        "seed": 7654321,  # random seed
        "eval_method": "holdout",
+        "estimator_list": estimator_list,  # TODO: remove once SynapseML supports spark 4+
    }
    df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))

@@ -184,6 +201,22 @@ def test_spark_input_df():
        **settings,
    )

+    # test pickle and load_pickle, should work for prediction
+    automl.pickle("automl_spark.pkl")
+    automl_loaded = AutoML().load_pickle("automl_spark.pkl")
+    assert automl_loaded.best_estimator == automl.best_estimator
+    assert automl_loaded.best_loss == automl.best_loss
+    automl_loaded.predict(df)
+    automl_loaded.model.estimator.transform(test_data)
+
+    import shutil
+
+    shutil.rmtree("automl_spark.pkl", ignore_errors=True)
+    shutil.rmtree("automl_spark.pkl.flaml_artifacts", ignore_errors=True)
+
+    if estimator_list == ["rf_spark"]:
+        return
+
    try:
        model = automl.model.estimator
        predictions = model.transform(test_data)
@@ -373,13 +406,13 @@ def test_auto_convert_dtypes_spark():


 if __name__ == "__main__":
-    test_spark_synapseml_classification()
-    test_spark_synapseml_regression()
-    test_spark_synapseml_rank()
-    test_spark_input_df()
-    test_get_random_dataframe()
-    test_auto_convert_dtypes_pandas()
-    test_auto_convert_dtypes_spark()
+    # test_spark_synapseml_classification()
+    # test_spark_synapseml_regression()
+    # test_spark_synapseml_rank()
+    test_spark_input_df_and_pickle()
+    # test_get_random_dataframe()
+    # test_auto_convert_dtypes_pandas()
+    # test_auto_convert_dtypes_spark()

    # import cProfile
    # import pstats
--- a/test/spark/test_automl.py
+++ b/test/spark/test_automl.py
@@ -28,10 +28,10 @@ skip_spark = not spark_available
 pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]


-def test_parallel_xgboost(hpo_method=None, data_size=1000):
+def test_parallel_xgboost_and_pickle(hpo_method=None, data_size=1000):
    automl_experiment = AutoML()
    automl_settings = {
-        "time_budget": 10,
+        "time_budget": 30,
        "metric": "ap",
        "task": "classification",
        "log_file_name": "test/sparse_classification.log",
@@ -53,15 +53,27 @@ def test_parallel_xgboost(hpo_method=None, data_size=1000):
    print(automl_experiment.best_iteration)
    print(automl_experiment.best_estimator)

+    # test pickle and load_pickle, should work for prediction
+    automl_experiment.pickle("automl_xgboost_spark.pkl")
+    automl_loaded = AutoML().load_pickle("automl_xgboost_spark.pkl")
+    assert automl_loaded.best_estimator == automl_experiment.best_estimator
+    assert automl_loaded.best_loss == automl_experiment.best_loss
+    automl_loaded.predict(X_train)
+
+    import shutil
+
+    shutil.rmtree("automl_xgboost_spark.pkl", ignore_errors=True)
+    shutil.rmtree("automl_xgboost_spark.pkl.flaml_artifacts", ignore_errors=True)
+

 def test_parallel_xgboost_others():
    # use random search as the hpo_method
-    test_parallel_xgboost(hpo_method="random")
+    test_parallel_xgboost_and_pickle(hpo_method="random")


 @pytest.mark.skip(reason="currently not supporting too large data, will support spark dataframe in the future")
 def test_large_dataset():
-    test_parallel_xgboost(data_size=90000000)
+    test_parallel_xgboost_and_pickle(data_size=90000000)


 @pytest.mark.skipif(
@@ -95,10 +107,10 @@ def test_custom_learner(data_size=1000):


 if __name__ == "__main__":
-    test_parallel_xgboost()
-    test_parallel_xgboost_others()
-    # test_large_dataset()
-    if skip_my_learner:
-        print("please run pytest in the root directory of FLAML, i.e., the directory that contains the setup.py file")
-    else:
-        test_custom_learner()
+    test_parallel_xgboost_and_pickle()
+    # test_parallel_xgboost_others()
+    # # test_large_dataset()
+    # if skip_my_learner:
+    #     print("please run pytest in the root directory of FLAML, i.e., the directory that contains the setup.py file")
+    # else:
+    #     test_custom_learner()
--- a/test/spark/test_mlflow.py
+++ b/test/spark/test_mlflow.py
@@ -1,3 +1,4 @@
+import atexit
 import importlib
 import os
 import sys
@@ -13,6 +14,7 @@ from sklearn.metrics import r2_score
 from sklearn.model_selection import train_test_split

 import flaml
+from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
 from flaml.automl.spark.utils import to_pandas_on_spark

 try:
@@ -120,6 +122,29 @@ def _check_mlflow_logging(possible_num_runs, metric, is_parent_run, experiment_i
    # mlflow.delete_experiment(experiment_id)


+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_nonsparkdata_noautolog_noparentrun():
+    experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=False)
+    _check_mlflow_logging(0, "r2", False, experiment_id, is_automl=True)  # no logging
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_sparkdata_noautolog_noparentrun():
+    experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=False)
+    _check_mlflow_logging(0, "mse", False, experiment_id, is_automl=True)  # no logging
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_tune_noautolog_noparentrun_parallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=True)
+    _check_mlflow_logging(0, "r2", False, experiment_id)
+
+
+def test_tune_noautolog_noparentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=False)
+    _check_mlflow_logging(3, "r2", False, experiment_id, skip_tags=True)
+
+
 @pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
 def test_tune_autolog_parentrun_parallel():
    experiment_id = _test_tune(is_autolog=True, is_parent_run=True, is_parallel=True)
@@ -131,6 +156,16 @@ def test_tune_autolog_parentrun_nonparallel():
    _check_mlflow_logging(3, "r2", True, experiment_id)


+def test_tune_autolog_noparentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=False)
+    _check_mlflow_logging(3, "r2", False, experiment_id)
+
+
+def test_tune_noautolog_parentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=False)
+    _check_mlflow_logging(3, "r2", True, experiment_id)
+
+
 @pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
 def test_tune_autolog_noparentrun_parallel():
    experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=True)
@@ -143,28 +178,12 @@ def test_tune_noautolog_parentrun_parallel():
    _check_mlflow_logging([4, 3], "r2", True, experiment_id)


-def test_tune_autolog_noparentrun_nonparallel():
-    experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=False)
-    _check_mlflow_logging(3, "r2", False, experiment_id)
-
-
-def test_tune_noautolog_parentrun_nonparallel():
-    experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=False)
-    _check_mlflow_logging(3, "r2", True, experiment_id)
-
-
-@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
-def test_tune_noautolog_noparentrun_parallel():
-    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=True)
-    _check_mlflow_logging(0, "r2", False, experiment_id)
-
-
-def test_tune_noautolog_noparentrun_nonparallel():
-    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=False)
-    _check_mlflow_logging(3, "r2", False, experiment_id, skip_tags=True)
-
-
 def _test_automl_sparkdata(is_autolog, is_parent_run):
+    # TODO: remove the estimator assignment once SynapseML supports spark 4+.
+    from flaml.automl.spark.utils import _spark_major_minor_version
+
+    estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
+
    mlflow.end_run()
    mlflow_exp_name = f"test_mlflow_integration_{int(time.time())}"
    mlflow_experiment = mlflow.set_experiment(mlflow_exp_name)
@@ -175,6 +194,9 @@ def _test_automl_sparkdata(is_autolog, is_parent_run):
    if is_parent_run:
        mlflow.start_run(run_name=f"automl_sparkdata_autolog_{is_autolog}")
    spark = pyspark.sql.SparkSession.builder.getOrCreate()
+    spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+    atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
+
    pd_df = load_diabetes(as_frame=True).frame
    df = spark.createDataFrame(pd_df)
    df = df.repartition(4).cache()
@@ -193,6 +215,7 @@ def _test_automl_sparkdata(is_autolog, is_parent_run):
        "log_type": "all",
        "n_splits": 2,
        "model_history": True,
+        "estimator_list": estimator_list,
    }
    df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
    automl.fit(
@@ -252,12 +275,6 @@ def test_automl_sparkdata_noautolog_parentrun():
    _check_mlflow_logging(3, "mse", True, experiment_id, is_automl=True)


-@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
-def test_automl_sparkdata_noautolog_noparentrun():
-    experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=False)
-    _check_mlflow_logging(0, "mse", False, experiment_id, is_automl=True)  # no logging
-
-
 @pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
 def test_automl_nonsparkdata_autolog_parentrun():
    experiment_id = _test_automl_nonsparkdata(is_autolog=True, is_parent_run=True)
@@ -276,12 +293,6 @@ def test_automl_nonsparkdata_noautolog_parentrun():
    _check_mlflow_logging([4, 3], "r2", True, experiment_id, is_automl=True)


-@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
-def test_automl_nonsparkdata_noautolog_noparentrun():
-    experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=False)
-    _check_mlflow_logging(0, "r2", False, experiment_id, is_automl=True)  # no logging
-
-
 @pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
 def test_exit_pyspark_autolog():
    import pyspark
@@ -319,6 +330,9 @@ def _init_spark_for_main():
        "https://mmlspark.blob.core.windows.net/publicwasb/log_model_allowlist.txt",
    )

+    spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+    atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
+

 if __name__ == "__main__":
    _init_spark_for_main()
--- a/test/spark/test_multiclass.py
+++ b/test/spark/test_multiclass.py
@@ -262,7 +262,11 @@ class TestMultiClass(unittest.TestCase):
            "n_concurrent_trials": 2,
            "use_spark": True,
        }
-        X_train = scipy.sparse.random(1554, 21, dtype=int)
+        # NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
+        # Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
+        # integer sampling paths which raise "low is out of bounds for int32".
+        # A float sparse matrix is sufficient to validate sparse-input support.
+        X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
        y_train = np.random.randint(3, size=1554)
        automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
        print(automl_experiment.classes_)
--- a/test/spark/test_performance.py
+++ b/test/spark/test_performance.py
@@ -31,14 +31,14 @@ pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Ski
 os.environ["FLAML_MAX_CONCURRENT"] = "2"


-def run_automl(budget=3, dataset_format="dataframe", hpo_method=None):
+def run_automl(budget=30, dataset_format="dataframe", hpo_method=None):
    import urllib3

    from flaml.automl.data import load_openml_dataset

    performance_check_budget = 3600
    if sys.platform == "darwin" or "nt" in os.name or "3.10" not in sys.version:
-        budget = 3  # revise the buget if the platform is not linux + python 3.10
+        budget = 30  # revise the budget if the platform is not linux + python 3.10
    if budget >= performance_check_budget:
        max_iter = 60
        performance_check_budget = None
@@ -91,6 +91,11 @@ def run_automl(budget=3, dataset_format="dataframe", hpo_method=None):
    print("Best ML leaner:", automl.best_estimator)
    print("Best hyperparmeter config:", automl.best_config)
    print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
+    if performance_check_budget is not None and automl.best_estimator is None:
+        # skip the performance check if no model is trained
+        # this happens sometimes in github actions ubuntu python 3.12 environment
+        print("Warning: no model is trained, skip performance check")
+        return
    print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
    print(automl.model.estimator)
    print(automl.best_config_per_estimator)
--- a/test/spark/test_utils.py
+++ b/test/spark/test_utils.py
@@ -1,3 +1,4 @@
+import atexit
 import os
 from functools import partial
 from timeit import timeit
@@ -14,6 +15,7 @@ try:
    from pyspark.sql import SparkSession

    from flaml.automl.ml import sklearn_metric_loss_score
+    from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
    from flaml.automl.spark.metrics import spark_metric_loss_score
    from flaml.automl.spark.utils import (
        iloc_pandas_on_spark,
@@ -24,6 +26,7 @@ try:
        unique_value_first_index,
    )
    from flaml.tune.spark.utils import (
+        _spark_major_minor_version,
        check_spark,
        get_broadcast_data,
        get_n_cpus,
@@ -35,10 +38,41 @@ try:
 except ImportError:
    print("Spark is not installed. Skip all spark tests.")
    skip_spark = True
+    _spark_major_minor_version = (0, 0)
+

 pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]


+@pytest.mark.skipif(_spark_major_minor_version[0] < 4, reason="Requires Spark 4.0+")
+def test_to_pandas_on_spark_temp_override():
+    import pyspark.pandas as ps
+    from pyspark.sql import Row
+
+    from flaml.automl.spark.utils import to_pandas_on_spark
+
+    spark_session = SparkSession.builder.getOrCreate()
+    spark, ansi_conf, adjusted = disable_spark_ansi_mode()
+    atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
+
+    # Ensure we can toggle options
+    orig = ps.get_option("compute.fail_on_ansi_mode")
+
+    try:
+        spark_session.conf.set("spark.sql.ansi.enabled", "true")
+        ps.set_option("compute.fail_on_ansi_mode", True)
+
+        # create tiny spark df
+        sdf = spark_session.createDataFrame([Row(a=1, b=2)])
+        # Should not raise as our function temporarily disables fail_on_ansi_mode
+        pds = to_pandas_on_spark(sdf)
+        assert "a" in pds.columns
+    finally:
+        # restore test environment
+        ps.set_option("compute.fail_on_ansi_mode", orig)
+        spark_session.conf.set("spark.sql.ansi.enabled", "false")
+
+
 def test_with_parameters_spark():
    def train(config, data=None):
        if isinstance(data, pyspark.broadcast.Broadcast):
--- a/test/tune/test_lexiflow.py
+++ b/test/tune/test_lexiflow.py
@@ -4,10 +4,17 @@ from collections import defaultdict

 import numpy as np
 import pytest
-import thop
-import torch
-import torch.nn as nn
-import torch.nn.functional as F
+
+try:
+    import thop
+    import torch
+    import torch.nn as nn
+    import torch.nn.functional as F
+except ImportError:
+    thop = None
+    torch = None
+    nn = None
+    F = None

 try:
    import torchvision
@@ -16,6 +23,11 @@ except ImportError:

 from flaml import tune

+if thop is None or torch is None or nn is None or F is None or torchvision is None:
+    pytest.skip(
+        "skipping test_lexiflow.py because torch, torchvision or thop is not installed.", allow_module_level=True
+    )
+
 DEVICE = torch.device("cpu")
 BATCHSIZE = 128
 N_TRAIN_EXAMPLES = BATCHSIZE * 30
--- a/test/tune/test_search_thread.py
+++ b/test/tune/test_search_thread.py
@@ -0,0 +1,99 @@
+"""Tests for SearchThread nested dictionary update fix."""
+
+import pytest
+
+from flaml.tune.searcher.search_thread import _recursive_dict_update
+
+
+def test_recursive_dict_update_simple():
+    """Test simple non-nested dictionary update."""
+    target = {"a": 1, "b": 2}
+    source = {"c": 3}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 1, "b": 2, "c": 3}
+
+
+def test_recursive_dict_update_override():
+    """Test that source values override target values for non-dict values."""
+    target = {"a": 1, "b": 2}
+    source = {"b": 3}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 1, "b": 3}
+
+
+def test_recursive_dict_update_nested():
+    """Test nested dictionary merge (the main use case for XGBoost params)."""
+    target = {
+        "num_boost_round": 10,
+        "params": {
+            "max_depth": 12,
+            "eta": 0.020168455186106736,
+            "min_child_weight": 1.4504723523894132,
+            "scale_pos_weight": 3.794258636185337,
+            "gamma": 0.4985070123025904,
+        },
+    }
+    source = {
+        "params": {
+            "verbosity": 3,
+            "booster": "gbtree",
+            "eval_metric": "auc",
+            "tree_method": "hist",
+            "objective": "binary:logistic",
+        }
+    }
+    _recursive_dict_update(target, source)
+
+    # Check that sampled params are preserved
+    assert target["params"]["max_depth"] == 12
+    assert target["params"]["eta"] == 0.020168455186106736
+    assert target["params"]["min_child_weight"] == 1.4504723523894132
+    assert target["params"]["scale_pos_weight"] == 3.794258636185337
+    assert target["params"]["gamma"] == 0.4985070123025904
+
+    # Check that const params are added
+    assert target["params"]["verbosity"] == 3
+    assert target["params"]["booster"] == "gbtree"
+    assert target["params"]["eval_metric"] == "auc"
+    assert target["params"]["tree_method"] == "hist"
+    assert target["params"]["objective"] == "binary:logistic"
+
+    # Check top-level param is preserved
+    assert target["num_boost_round"] == 10
+
+
+def test_recursive_dict_update_deeply_nested():
+    """Test deeply nested dictionary merge."""
+    target = {"a": {"b": {"c": 1, "d": 2}}}
+    source = {"a": {"b": {"e": 3}}}
+    _recursive_dict_update(target, source)
+    assert target == {"a": {"b": {"c": 1, "d": 2, "e": 3}}}
+
+
+def test_recursive_dict_update_mixed_types():
+    """Test that non-dict values in source replace dict values in target."""
+    target = {"a": {"b": 1}}
+    source = {"a": 2}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 2}
+
+
+def test_recursive_dict_update_empty_dicts():
+    """Test with empty dictionaries."""
+    target = {}
+    source = {"a": 1}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 1}
+
+    target = {"a": 1}
+    source = {}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 1}
+
+
+def test_recursive_dict_update_none_values():
+    """Test that None values are properly handled."""
+    target = {"a": 1, "b": None}
+    source = {"b": 2, "c": None}
+    _recursive_dict_update(target, source)
+    assert target == {"a": 1, "b": 2, "c": None}
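Note on `test/tune/test_search_thread.py`: the tests above pin down the expected behavior of `_recursive_dict_update` but the diff shown here does not include its body. The following is a minimal sketch of a merge that would satisfy those tests. It is an assumption for illustration, not FLAML's actual implementation, and the name `recursive_dict_update_sketch` is hypothetical.

```python
def recursive_dict_update_sketch(target: dict, source: dict) -> None:
    """Merge `source` into `target` in place (hypothetical helper).

    Nested dicts are merged key by key; any other value in `source`
    replaces the corresponding value in `target`.
    """
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: descend instead of overwriting.
            recursive_dict_update_sketch(target[key], value)
        else:
            # Scalars, None, or a type mismatch: source wins.
            target[key] = value


# Sampled params survive while constant params are added alongside them.
config = {"num_boost_round": 10, "params": {"max_depth": 12}}
recursive_dict_update_sketch(config, {"params": {"booster": "gbtree"}})
assert config == {
    "num_boost_round": 10,
    "params": {"max_depth": 12, "booster": "gbtree"},
}
```

One design point worth noting: when a key holds a dict in `source` but is missing from `target`, this sketch stores the same dict object rather than a copy, so later mutation of `source` would be visible through `target`.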
--- a/test/tune/test_searcher.py
+++ b/test/tune/test_searcher.py
@@ -324,3 +324,26 @@ def test_no_optuna():
    import flaml.tune.searcher.suggestion

    subprocess.check_call([sys.executable, "-m", "pip", "install", "optuna==2.8.0"])
+
+
+def test_unresolved_search_space(caplog):
+    import logging
+
+    from flaml import tune
+    from flaml.tune.searcher.blendsearch import BlendSearch
+
+    if caplog is not None:
+        caplog.set_level(logging.INFO)
+
+    BlendSearch(metric="loss", mode="min", space={"lr": tune.uniform(0.001, 0.1), "depth": tune.randint(1, 10)})
+    try:
+        text = caplog.text
+    except AttributeError:
+        text = ""
+    assert (
+        "unresolved search space" not in text and text
+    ), "BlendSearch should not produce warning about unresolved search space"
+
+
+if __name__ == "__main__":
+    test_unresolved_search_space(None)
--- a/test/tune/test_tune.py
+++ b/test/tune/test_tune.py
@@ -53,6 +53,11 @@ def _easy_objective(config):


 def test_nested_run():
+    """
+    Nested tuning example: Tune -> AutoML -> MLflow autolog.
+    MLflow logging is complicated in nested tuning, so it is better to turn off MLflow autologging to avoid
+    potential issues in FLAML's mlflow_integration.adopt_children() function.
+    """
    from flaml import AutoML, tune

    data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)
--- a/tutorials/flaml-tutorial-automl-24.md
+++ b/tutorials/flaml-tutorial-automl-24.md
@@ -4,7 +4,7 @@

 **Date and Time**: 09.09.2024, 15:30-17:00

-Location:  Sorbonne University, 4 place Jussieu, 75005 Paris
+Location: Sorbonne University, 4 place Jussieu, 75005 Paris

 Duration: 1.5 hours

--- a/tutorials/flaml-tutorial-pydata-23.md
+++ b/tutorials/flaml-tutorial-pydata-23.md
@@ -4,7 +4,7 @@

 **Date and Time**: 04-26, 09:00–10:30 PT.

-Location:  Microsoft Conference Center, Seattle, WA.
+Location: Microsoft Conference Center, Seattle, WA.

 Duration: 1.5 hours

--- a/website/docs/Best-Practices.md
+++ b/website/docs/Best-Practices.md
@@ -0,0 +1,159 @@
+# Best Practices
+
+This page collects practical guidance for using FLAML effectively across common tasks.
+
+## General tips
+
+- Start simple: set `task`, `time_budget`, and keep `metric="auto"` unless you have a strong reason to override.
+- Prefer correct splits: ensure your evaluation strategy matches your data (time series vs i.i.d., grouped data, etc.).
+- Keep estimator lists explicit when debugging: start with a small `estimator_list` and expand.
+- Use built-in discovery helpers to avoid stale hardcoded lists:
+
+```python
+from flaml import AutoML
+from flaml.automl.task.factory import task_factory
+
+automl = AutoML()
+print("Built-in sklearn metrics:", sorted(automl.supported_metrics[0]))
+print(
+    "classification estimators:",
+    sorted(task_factory("classification").estimators.keys()),
+)
+```
+
+## Classification
+
+- **Metric**: for binary classification, `metric="roc_auc"` is common; for multiclass, `metric="log_loss"` is often robust.
+- **Imbalanced data**:
+  - pass `sample_weight` to `AutoML.fit()`;
+  - consider setting class weights via `custom_hp` / `fit_kwargs_by_estimator` for specific estimators (see [FAQ](FAQ)).
+- **Probability vs label metrics**: use `roc_auc` when you care about how well the predicted scores rank examples, and `log_loss` when you care about calibrated probabilities.
+- **Label overlap control** (holdout evaluation only):
+  - By default, FLAML uses a fast strategy (`allow_label_overlap=True`) that ensures all labels are present in both training and validation sets by adding missing labels' first instances to both sets. This is efficient but may create minor overlap.
+  - For strict no-overlap validation, use `allow_label_overlap=False`. This slower but more precise strategy intelligently re-splits multi-instance classes to avoid overlap while maintaining label completeness.
+
+```python
+from flaml import AutoML
+
+# Fast version (default): allows overlap for efficiency
+automl_fast = AutoML()
+automl_fast.fit(
+    X_train,
+    y_train,
+    task="classification",
+    eval_method="holdout",
+    allow_label_overlap=True,
+)  # default
+
+# Precise version: avoids overlap when possible
+automl_precise = AutoML()
+automl_precise.fit(
+    X_train,
+    y_train,
+    task="classification",
+    eval_method="holdout",
+    allow_label_overlap=False,
+)  # slower but more precise
+```
+
+Note: This only affects holdout evaluation. CV and custom validation sets are unaffected.
+
+## Regression
+
+- **Default metric**: `metric="r2"` (minimizes `1 - r2`).
+- If your target scale matters (e.g., dollar error), consider `mae`/`rmse`.
+
+## Learning to rank
+
+- Use `task="rank"` with group information (`groups` / `groups_val`) so metrics like `ndcg` and `ndcg@k` are meaningful.
+- If you pass `metric="ndcg@10"`, also pass `groups` so FLAML can compute group-aware NDCG.
+
+## Time series forecasting
+
+- Use time-aware splitting. For holdout validation, set `eval_method="holdout"` and use a time-ordered dataset.
+- Prefer supplying a DataFrame with a clear time column when possible.
+- Optional time-series estimators depend on optional dependencies. To list what is available in your environment:
+
+```python
+from flaml.automl.task.factory import task_factory
+
+print("forecast:", sorted(task_factory("forecast").estimators.keys()))
+```
+
+## NLP (Transformers)
+
+- Install the optional dependency: `pip install "flaml[hf]"`.
+- When you provide a custom metric, ensure it returns `(metric_to_minimize, metrics_to_log)` with stable keys.
+
+## Speed, stability, and tricky settings
+
+- **Time budget vs convergence**: if you see warnings about not all estimators converging, increase `time_budget` or reduce `estimator_list`.
+- **Memory pressure / OOM**:
+  - set `free_mem_ratio` (e.g., `0.2`) to keep free memory above a threshold;
+  - set `model_history=False` to reduce stored artifacts;
+- **Reproducibility**: set `seed` and keep `n_jobs` fixed; expect some runtime variance.
+
+## Persisting models
+
+FLAML supports **both** MLflow logging and pickle-based persistence. For production deployment, MLflow logging is typically the most important option because it plugs into the MLflow ecosystem (tracking, model registry, serving, governance). For quick local reuse, persisting the whole `AutoML` object via pickle is often the most convenient.
+
+### Option 1: MLflow logging (recommended for production)
+
+When you run `AutoML.fit()` inside an MLflow run, FLAML can log metrics/params automatically (disable via `mlflow_logging=False` if needed). To persist the trained `AutoML` object as a model artifact and reuse MLflow tooling end-to-end:
+
+```python
+import mlflow
+import numpy as np
+from sklearn.datasets import load_iris
+from sklearn.model_selection import train_test_split
+from flaml import AutoML
+
+X, y = load_iris(return_X_y=True, as_frame=True)
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, random_state=42
+)
+
+automl = AutoML()
+mlflow.set_experiment("flaml")
+with mlflow.start_run(run_name="flaml_run") as run:
+    automl.fit(X_train, y_train, task="classification", time_budget=3)
+
+run_id = run.info.run_id
+
+# Later (or in a different process)
+automl2 = mlflow.sklearn.load_model(f"runs:/{run_id}/model")
+assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
+```
+
+### Option 2: Pickle the full `AutoML` instance (convenient)
+
+Pickling stores the *entire* `AutoML` instance (not just the best estimator). This is useful when you prefer not to rely on MLflow or when you want to reuse additional attributes of the AutoML object without retraining.
+
+In Microsoft Fabric scenarios, these additional attributes are particularly important for re-plotting visualization figures without retraining the model.
+
+```python
+import mlflow
+import numpy as np
+from sklearn.datasets import load_iris
+from sklearn.model_selection import train_test_split
+from flaml import AutoML
+
+X, y = load_iris(return_X_y=True, as_frame=True)
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, random_state=42
+)
+
+automl = AutoML()
+mlflow.set_experiment("flaml")
+with mlflow.start_run(run_name="flaml_run") as run:
+    automl.fit(X_train, y_train, task="classification", time_budget=3)
+
+automl.pickle("automl.pkl")
+automl2 = AutoML.load_pickle("automl.pkl")
+assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
+assert automl.best_config == automl2.best_config
+assert automl.best_loss == automl2.best_loss
+assert automl.mlflow_integration.infos == automl2.mlflow_integration.infos
+```
+
+See also: [Task-Oriented AutoML](Use-Cases/Task-Oriented-AutoML) and [FAQ](FAQ).
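Note on the Regression guidance in `Best-Practices.md`: the page says the default `r2` metric is minimized as `1 - r2` and suggests `mae`/`rmse` when the target scale matters. The sketch below works through that arithmetic in plain Python. The helper names (`r2`, `r2_loss`, `mae`) and the sample numbers are illustrative assumptions, not FLAML APIs or results.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def r2_loss(y_true, y_pred):
    """The quantity to minimize when the metric is r2 (hypothetical helper)."""
    return 1.0 - r2(y_true, y_pred)


def mae(y_true, y_pred):
    """Mean absolute error, expressed in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 2.5, 3.5]

# Rescale the target (for example, dollars to cents).
y_true_scaled = [100 * v for v in y_true]
y_pred_scaled = [100 * v for v in y_pred]

# The r2-based loss is scale-free, while MAE grows with the target's units.
assert math.isclose(r2_loss(y_true, y_pred), r2_loss(y_true_scaled, y_pred_scaled))
assert math.isclose(mae(y_true_scaled, y_pred_scaled), 100 * mae(y_true, y_pred))
```

Because `1 - r2` does not change under rescaling, it says nothing about the size of the error in real units, which is why an absolute-error metric is the better choice when that size matters.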
--- a/website/docs/Contribute.md
+++ b/website/docs/Contribute.md
@@ -49,7 +49,7 @@ print(flaml.__version__)
 ```

 - Please ensure all **code snippets and error messages are formatted in
-  appropriate code blocks**.  See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks)
+  appropriate code blocks**. See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks)
  for more details.

 ## Becoming a Reviewer
@@ -62,10 +62,10 @@ There is currently no formal reviewer solicitation process. Current reviewers id

 ```bash
 git clone https://github.com/microsoft/FLAML.git
-pip install -e FLAML[notebook,autogen]
+pip install -e ".[notebook]"
 ```

-In case the `pip install` command fails, try escaping the brackets such as `pip install -e FLAML\[notebook,autogen\]`.
+In case the `pip install` command fails, try escaping the brackets such as `pip install -e .\[notebook\]`.

 ### Docker

@@ -88,7 +88,7 @@ Run `pre-commit install` to install pre-commit into your git hooks. Before you c

 ### Coverage

-Any code you commit should not decrease coverage. To run all unit tests, install the \[test\] option under FLAML/:
+Any code you commit should not decrease coverage. To run all unit tests, install the [test] option under FLAML/:

 ```bash
 pip install -e."[test]"
--- a/website/docs/Examples/AutoML-Classification.md
+++ b/website/docs/Examples/AutoML-Classification.md
@@ -2,7 +2,7 @@

 ### Prerequisites

-Install the \[automl\] option.
+Install the [automl] option.

 ```bash
 pip install "flaml[automl]"
--- a/website/docs/Examples/AutoML-NLP.md
+++ b/website/docs/Examples/AutoML-NLP.md
@@ -2,7 +2,7 @@

 ### Requirements

-This example requires GPU. Install the \[automl,hf\] option:
+This example requires GPU. Install the [automl,hf] option:

 ```python
 pip install "flaml[automl,hf]"
--- a/website/docs/Examples/AutoML-Rank.md
+++ b/website/docs/Examples/AutoML-Rank.md
@@ -2,7 +2,7 @@

 ### Prerequisites

-Install the \[automl\] option.
+Install the [automl] option.

 ```bash
 pip install "flaml[automl]"
--- a/website/docs/Examples/AutoML-Regression.md
+++ b/website/docs/Examples/AutoML-Regression.md
@@ -2,7 +2,7 @@

 ### Prerequisites

-Install the \[automl\] option.
+Install the [automl] option.

 ```bash
 pip install "flaml[automl]"
--- a/website/docs/Examples/AutoML-Time
+++ b/website/docs/Examples/AutoML-Time
@@ -2,12 +2,31 @@

 ### Prerequisites

-Install the \[automl,ts_forecast\] option.
+Install the [automl,ts_forecast] option.

 ```bash
 pip install "flaml[automl,ts_forecast]"
 ```

+### Understanding the `period` Parameter
+
+The `period` parameter (also called **horizon** in the code) specifies the **forecast horizon** - the number of future time steps the model is trained to predict. For example:
+
+- `period=12` means you want to forecast 12 time steps ahead (e.g., 12 months, 12 days)
+- `period=7` means you want to forecast 7 time steps ahead
+
+**Important Note on Prediction**: During the prediction stage, the output length equals the length of `X_test`. This means you can generate predictions for any number of time steps by providing the corresponding timestamps in `X_test`, regardless of the `period` value used during training.
+
+#### Automatic Feature Engineering
+
+**Important**: You do NOT need to manually lag the target variable before training. FLAML handles this automatically:
+
+- **For sklearn-based models** (lgbm, rf, xgboost, extra_tree, catboost): FLAML automatically creates lagged features of both the target variable and any exogenous variables. This transforms the time series forecasting problem into a supervised learning regression problem.
+
+- **For time series native models** (prophet, arima, sarimax, holt-winters): These models have built-in time series forecasting capabilities and handle temporal dependencies natively.
+
+The automatic lagging is implemented internally when you call `automl.fit()` with `task="ts_forecast"` or `task="ts_forecast_classification"`, so you can focus on providing clean input data without worrying about feature engineering.
+
 ### Simple NumPy Example

 ```python
--- a/website/docs/Examples/AutoML-for-LightGBM.md
+++ b/website/docs/Examples/AutoML-for-LightGBM.md
@@ -2,7 +2,7 @@

 ### Prerequisites for this example

-Install the \[automl\] option.
+Install the [automl] option.

 ```bash
 pip install "flaml[automl] matplotlib openml"