Compare commits


141 Commits
v2.0.3 ... main

Author SHA1 Message Date
dependabot[bot]
bc1e4dc5ea Bump webpack from 5.94.0 to 5.105.0 in /website (#1515) 2026-02-08 16:29:18 +08:00
Copilot
158ff7d99e Fix transformers API compatibility: support v4.26+ and v5.0+ with version-aware parameter selection (#1514)
* Initial plan

* Fix transformers API compatibility issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add backward compatibility for transformers v4.26+ by version check

Support both the tokenizer (v4.26-4.43) and processing_class (v4.44+) parameters based on the installed transformers version. Fall back to tokenizer if the version check fails.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve exception handling specificity

Use specific exception types (ImportError, AttributeError, ValueError) instead of broad Exception catch for better error handling.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit formatting on all files

Applied black formatting to fix code style across the repository.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
2026-01-28 09:00:21 +08:00
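The version-aware parameter selection described in #1514 can be sketched with a small stdlib-only helper. This is illustrative only — `pick_trainer_kwarg` is a hypothetical name, and `installed_version` stands in for `transformers.__version__`; the cutoffs (v4.44+ for `processing_class`, fallback to `tokenizer` on a failed check) come from the commit message above.

```python
def pick_trainer_kwarg(installed_version: str) -> str:
    """Return the Trainer keyword to use for the tokenizer/processor object.

    Per the commit above: transformers v4.44+ accepts `processing_class`,
    v4.26-4.43 only accepts `tokenizer`, and we fall back to `tokenizer`
    whenever the version string cannot be parsed.
    """
    try:
        major, minor = (int(p) for p in installed_version.split(".")[:2])
    except (ValueError, AttributeError, TypeError):
        return "tokenizer"  # fallback when the version check fails
    if (major, minor) >= (4, 44):
        return "processing_class"
    return "tokenizer"
```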
Li Jiang
a5021152d2 ci: skip pre-commit workflow on main (#1513)
* ci: skip pre-commit workflow on main

* ci: run pre-commit only on pull requests
2026-01-25 21:10:05 +08:00
Copilot
fc4efe3510 Fix sklearn 1.7+ compatibility: BaseEstimator type detection for ensemble (#1512)
* Initial plan

* Fix ExtraTreesEstimator regression ensemble error with sklearn 1.7+

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: improve __sklearn_tags__ implementation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Emphasize pre-commit

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-23 10:20:59 +08:00
Li Jiang
cd0e9fb0d2 Only run save dependencies on main branch (#1510) 2026-01-22 11:07:40 +08:00
dependabot[bot]
a9c0a9e30a Bump lodash from 4.17.21 to 4.17.23 in /website (#1509)
Bumps [lodash](https://github.com/lodash/lodash) from 4.17.21 to 4.17.23.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.21...4.17.23)

---
updated-dependencies:
- dependency-name: lodash
  dependency-version: 4.17.23
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-22 08:47:33 +08:00
Li Jiang
a05b669de3 Update Python version support and pre-commit in documentation (#1505) 2026-01-21 16:39:54 +08:00
Copilot
6e59103e86 Add hierarchical search space documentation (#1496)
* Initial plan

* Add hierarchical search space documentation to Tune-User-Defined-Function.md

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add clarifying comments to hierarchical search space examples

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix formatting issues with pre-commit

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 14:40:56 +08:00
Copilot
d9e74031e0 Expose task-level and estimator-level preprocessors as public API (#1497)
* Initial plan

* Add public preprocess() API methods for AutoML and estimators

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add documentation for preprocess() API methods

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add example script demonstrating preprocess() API usage

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback - fix type hints and simplify test logic

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix formatting issues with pre-commit hooks

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Remove example.py, make tests faster

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:38:25 +08:00
Copilot
7ec1414e9b Clarify period parameter and automatic label lagging in time series forecasting (#1495)
* Initial plan

* Add comprehensive documentation for period parameter and automatic label lagging

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback on docstring clarity

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clarify period vs prediction output length per @thinkall's feedback

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refine terminology per code review feedback

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 14:19:23 +08:00
Copilot
9233a52736 Add configurable label overlap handling for classification holdout strategy (#1491)
* Initial plan

* Fix training/test set overlap in holdout classification by only adding missing labels when needed

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: add bounds checking and fix edge cases

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix bounds checking: use correct comparison operator for array indexing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix potential ValueError with max() on empty lists and simplify test assertions

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add extra bounds checking for label_matches indices

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix pandas_on_spark compatibility by using iloc_pandas_on_spark util method

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve missing label handling to avoid overlap when possible

For classes with multiple instances that end up in one set, properly
re-split them between train and val instead of duplicating. Only add
to both sets when the class has exactly 1 instance (unavoidable overlap).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix: ensure at least 1 instance remains in original set when re-splitting

Addresses comments on lines 580 and 724 - prevents moving ALL instances
from validation to training or vice versa by using min(num_instances - 1, ...).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refactor: Extract missing label handling into two methods (fast/slow)

- Created _handle_missing_labels_fast(): Fast version that adds first instance to both sets (may overlap)
- Created _handle_missing_labels_no_overlap(): Precise version that avoids overlap when possible
- Added allow_label_overlap parameter to AutoML.fit() (default=True for fast version)
- Updated documentation with new parameter
- Both versions maintain label completeness while offering speed/precision tradeoff

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add usage example for allow_label_overlap to Best-Practices.md

- Added comprehensive documentation in Classification section
- Included code examples showing both fast and precise versions
- Explained tradeoffs between speed and precision
- Noted that parameter only affects holdout evaluation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: update documentation and tests

- Updated docstrings to clarify fast version only adds instances to missing sets
- Fixed documentation to reflect actual behavior (not "both sets" but "set with missing label")
- Completely rewrote test_no_overlap.py to test both allow_label_overlap modes
- Added tests with sample_weights for better code coverage
- Added test for single-instance class handling
- All 5 tests passing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix AttributeError: initialize _allow_label_overlap in settings and retrain_from_log

- Added allow_label_overlap to settings initialization with default=True
- Added parameter defaulting in fit() method to use settings value if not provided
- Added _allow_label_overlap initialization in retrain_from_log method
- Fixes test failures in test_multiclass, test_regression, and spark tests

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add docstring to fit()

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:03:48 +08:00
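The "fast" missing-label handling that #1491 describes — copy the first instance of a label into whichever split is missing it, accepting overlap for single-instance classes — can be sketched on plain `(x, y)` pairs. The function name and list-based representation are illustrative; FLAML's real implementation operates on dataframes and sample weights.

```python
def add_missing_labels(train, val):
    """Ensure every label appears in both splits (fast version, may overlap).

    For each label missing from one split, copy its first occurrence from
    the other split. A single-instance class therefore ends up in both
    splits -- the unavoidable-overlap case noted in the commit message.
    """
    train, val = list(train), list(val)
    train_labels = {y for _, y in train}
    val_labels = {y for _, y in val}
    for x, y in list(train):
        if y not in val_labels:
            val.append((x, y))
            val_labels.add(y)
    for x, y in list(val):
        if y not in train_labels:
            train.append((x, y))
            train_labels.add(y)
    return train, val
```

The precise no-overlap variant would instead move instances of multi-instance classes between splits, only duplicating when a class has exactly one instance.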
Copilot
7ac076d544 Use scientific notation for best error in logger output (#1498)
* Initial plan

* Change best error format from .4f to .4e for scientific notation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 09:06:19 +08:00
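The one-character format change in #1498 matters for small losses: a fixed-point `.4f` rounds very small best errors to `0.0000`, while `.4e` preserves four significant digits at any magnitude. A minimal illustration:

```python
best_error = 3.2e-07

fixed = f"best error: {best_error:.4f}"       # old style, loses the value
scientific = f"best error: {best_error:.4e}"  # new style, keeps 4 sig. digits
```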
Copilot
3d489f1aaa Add validation and clear error messages for custom_metric parameter (#1500)
* Initial plan

* Add validation and documentation for custom_metric parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refactor validation into reusable method and improve error handling

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 08:58:11 +08:00
Copilot
c64eeb5e8d Document that final_estimator parameters in ensemble are not auto-tuned (#1499)
* Initial plan

* Document final_estimator parameter behavior in ensemble configuration

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: fix syntax in examples and use float comparison

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 21:59:31 +08:00
Copilot
bf35f98a24 Document missing value handling behavior for AutoML estimators (#1473)
* Initial plan

* Add comprehensive documentation on missing value handling in FAQ

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply mdformat to FAQ.md

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Correct FAQ: FLAML does preprocess missing values with SimpleImputer

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 21:53:10 +08:00
Copilot
1687ca9a94 Fix eval_set preprocessing for XGBoost estimators with categorical features (#1470)
* Initial plan

* Initial analysis - reproduced eval_set preprocessing bug

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix eval_set preprocessing for XGBoost estimators with categorical features

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add eval_set tests to test_xgboost function

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix linting issues with ruff and black

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 20:41:21 +08:00
Copilot
7a597adcc9 Add GitHub Copilot instructions for FLAML repository (#1502)
* Initial plan

* Add comprehensive Copilot instructions for FLAML repository

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Update forecast dependencies list to be complete

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clarify Python version support details

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
2026-01-20 18:06:47 +08:00
Copilot
4ea9650f99 Fix nested dictionary merge in SearchThread losing sampled hyperparameters (#1494)
* Initial plan

* Add recursive dict update to fix nested config merge

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 15:50:18 +08:00
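The bug fixed in #1494 is the classic shallow-merge pitfall: `dict.update()` replaces a nested sub-dict wholesale, dropping sampled hyperparameters inside it. A recursive merge, sketched below with an illustrative name (`deep_update` is not FLAML's actual identifier), merges nested dicts key by key instead:

```python
def deep_update(base: dict, overrides: dict) -> dict:
    """Merge `overrides` into `base` recursively, in place.

    Nested dicts are merged key by key rather than replaced, so keys in
    `base` that `overrides` does not mention survive the merge.
    """
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base
```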
Li Jiang
fa1a32afb6 Fix indents (#1493) 2026-01-20 11:18:58 +08:00
Copilot
5eb7d623b0 Expand docs to include all flamlized estimators (#1472)
* Initial plan

* Add documentation for all flamlized estimators (RandomForest, ExtraTrees, LGBMClassifier, XGBRegressor)

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix markdown formatting per pre-commit

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 10:59:48 +08:00
Copilot
22dcfcd3c0 Add comprehensive metric documentation and URL reference to AutoML docstrings (#1471)
* Initial plan

* Update AutoML metric documentation with full list and documentation link

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply black and mdformat formatting to code and documentation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 10:34:54 +08:00
Li Jiang
d7208b32d0 Bump version to 2.5.0 (#1492) 2026-01-20 10:30:39 +08:00
Copilot
5f1aa2dda8 Fix: Preserve FLAML_sample_size in best_config_per_estimator (#1475)
* Initial plan

* Fix: Preserve FLAML_sample_size in best_config_per_estimator

Modified best_config_per_estimator property to keep FLAML_sample_size when returning best configurations. Previously, AutoMLState.sanitize() was removing this key, which caused the sample size information to be lost when using starting_points from a previous run.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add a test to verify the improvement of starting_points

* Update documentation to reflect FLAML_sample_size preservation

Updated Task-Oriented-AutoML.md to document that best_config_per_estimator now preserves FLAML_sample_size:
- Added note in "Warm start" section explaining that FLAML_sample_size is preserved for effective warm-starting
- Added note in "Get best configuration" section with example showing FLAML_sample_size in output
- Explains importance of sample size preservation for continuing optimization with correct sample sizes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix unintended code change

* Improve docstrings and docs

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 07:42:31 +08:00
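The fix in #1475 amounts to exempting one key from config sanitization. A hedged sketch of the idea — `sanitize` and `_INTERNAL_KEYS` are hypothetical names, not FLAML's actual `AutoMLState.sanitize()` internals — keeps `FLAML_sample_size` so warm-started runs retain the sample size:

```python
# Illustrative internal keys; only FLAML_sample_size is named in the commit.
_INTERNAL_KEYS = {"FLAML_sample_size", "learner"}

def sanitize(config: dict, preserve_sample_size: bool = True) -> dict:
    """Strip internal keys from a best config, optionally keeping
    FLAML_sample_size so starting_points can warm-start correctly."""
    keep = {"FLAML_sample_size"} if preserve_sample_size else set()
    return {k: v for k, v in config.items()
            if k not in _INTERNAL_KEYS or k in keep}
```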
Copilot
67bdcde4d5 Fix BlendSearch OptunaSearch warning for non-hierarchical spaces with Ray Tune domains (#1477)
* Initial plan

* Fix BlendSearch OptunaSearch warning for non-hierarchical spaces

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clean up test file

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add regression test for BlendSearch UDF mode warning fix

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve the fix and tests

* Fix the "Define-by-run function passed in argument is not yet supported when using" warning

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 00:01:41 +08:00
Copilot
46a406edd4 Add objective parameter to LGBMEstimator search space (#1474)
* Initial plan

* Add objective parameter to LGBMEstimator search_space

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add test for LGBMEstimator objective parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Remove changes, just add a test to verify the current supported usage

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-19 21:10:21 +08:00
Li Jiang
f1817ea7b1 Add support to python 3.13 (#1486) 2026-01-19 18:31:43 +08:00
Li Jiang
f6a5163e6a Fix isinstance usage issues (#1488)
* Fix isinstance usage issues

* Pin python version to 3.12 for pre-commit

* Update mdformat to 0.7.22
2026-01-19 15:19:05 +08:00
Li Jiang
e64b486528 Fix Best Practices not shown (#1483)
* Simplify automl.fit calls in Best Practices

Removed 'retrain_full' and 'eval_method' parameters from automl.fit calls.

* Fix best practices not shown
2026-01-13 14:25:28 +08:00
Li Jiang
a74354f7a9 Update documents, Bump version to 2.4.1, Sync Fabric till 088cfb98 (#1482)
* Add best practices

* Update docs to reflect on the recent changes

* Improve model persisting best practices

* Bump version to 2.4.1

* List all estimators

* Remove autogen

* Update dependencies
2026-01-13 12:49:36 +08:00
Li Jiang
ced1d6f331 Support pickling the whole AutoML instance, Sync Fabric till 0d4ab16f (#1481) 2026-01-12 23:04:38 +08:00
Li Jiang
bb213e7ebd Add timeout for tests and remove macos test envs (#1479) 2026-01-10 22:48:54 +08:00
Li Jiang
d241e8de90 Update readme, enable all python versions for macos tests (#1478)
* Fix macOS hang with running coverage

* Run coverage only in ubuntu

* Fix syntax error

* Fix run tests logic

* Update readme

* Don't test python 3.10 on macos as it's stuck

* Enable all python versions for macos
2026-01-10 20:03:24 +08:00
Copilot
0b138d9193 Fix log_training_metric causing IndexError for time series models (#1469)
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-10 18:07:17 +08:00
Li Jiang
1c9835dc0a Add support to Python 3.12, Sync Fabric till dc382961 (#1467)
* Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models

- Fix license-file
- Bump version to 2.3.5.post2
- Distribute source and wheel
- Log better models only
- Add artifact_path to register_automl_pipeline
- Improve logging of _automl_user_configurations

----
This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5.

The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields.
- `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`.
- `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`.

Related work items: #4252053

* Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover

- Add warning message when feature_importances_ is None (#3982120)
- Catch RuntimeError and wait for spark cluster to recover (#3982133)

----
Bug fix.

This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message.
- `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early.
- `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message.

Related work items: #3982120, #3982133

* Removed deprecated metadata section

* Fix log_params, log_artifact doesn't support run_id in mlflow 2.6.0

* Remove autogen

* Remove autogen

* Remove autogen

* Merged PR 1776547: Fix flaky test test_automl

Don't throw error when time budget is not enough

----
#### AI description (iteration 1)
#### PR Classification
Bug fix addressing a failing test in the AutoML notebook example.

#### PR Summary
This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures.
- `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found.

Related work items: #4573514

* Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed

Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file'

----
Bug fix addressing package metadata configuration.

This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration.
- In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata.

Related work items: #4560034

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Disable openai workflow

* Add python 3.12 to test envs

* Manually trigger openai

* Support markdown files with underscore-prefixed file names

* Improve save dependencies

* SynapseML is not installed

* Fix syntax error: Module !flaml/autogen was never imported

* macos 3.12 also hangs

* fix syntax error

* Update python version in actions

* Install setuptools for using pkg_resources

* Fix test_automl_performance in Github actions

* Fix test_nested_run
2026-01-10 12:17:21 +08:00
Li Jiang
1285700d7a Update readme, bump version to 2.4.0, fix CI errors (#1466)
* Update gitignore

* Bump version to 2.4.0

* Update readme

* Pre-download california housing data

* Use pre-downloaded california housing data

* Pin lightning<=2.5.6

* Fix typo in find and replace

* Fix estimators has no attribute __sklearn_tags__

* Pin torch to 2.2.2 in tests

* Fix conflict

* Update pytorch-forecasting

* Update pytorch-forecasting

* Update pytorch-forecasting

* Use numpy<2 for testing

* Update scikit-learn

* Run Build and UT every other day

* Pin pip<24.1

* Pin pip<24.1 in pipeline

* Loosen pip, install pytorch_forecasting only in py311

* Add support for new versions of nlp dependencies

* Fix formats

* Remove redefinition

* Update mlflow versions

* Fix mlflow version syntax

* Update gitignore

* Clean up cache to free space

* Remove clean up action cache

* Fix blendsearch

* Update test workflow

* Update setup.py

* Fix catboost version

* Update workflow

* Prepare for python 3.14

* Support no catboost

* Fix tests

* Fix python_requires

* Update test workflow

* Fix vw tests

* Remove python 3.9

* Fix nlp tests

* Fix prophet

* Print pip freeze for better debugging

* Fix Optuna search does not support parameters of type Float with samplers of type Quantized

* Save dependencies for later inspection

* Fix coverage.xml not exists

* Fix github action permission

* Handle python 3.13

* Address openml is not installed

* Check dependencies before run tests

* Update dependencies

* Fix syntax error

* Use bash

* Update dependencies

* Fix git error

* Loose mlflow constraints

* Add rerun, use mlflow-skinny

* Fix git error

* Remove ray tests

* Update xgboost versions

* Fix automl pickle error

* Don't test python 3.10 on macos as it's stuck

* Rebase before push

* Reduce number of branches
2026-01-09 13:40:52 +08:00
dependabot[bot]
7f42bece89 Bump algoliasearch-helper from 3.11.1 to 3.26.0 in /website (#1461)
* Bump algoliasearch-helper from 3.11.1 to 3.26.0 in /website

Bumps [algoliasearch-helper](https://github.com/algolia/instantsearch) from 3.11.1 to 3.26.0.
- [Release notes](https://github.com/algolia/instantsearch/releases)
- [Commits](https://github.com/algolia/instantsearch/commits/algoliasearch-helper@3.26.0)

---
updated-dependencies:
- dependency-name: algoliasearch-helper
  dependency-version: 3.26.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix format error

* Fix format error

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2025-10-09 14:37:31 +08:00
Keita Onabuta
e19107407b update loc second args - column (#1458)
Set the second argument of the loc function to time_col instead of the dataframe X.
2025-08-30 11:07:19 +08:00
Li Jiang
f5d6693253 Bump version to 2.3.7 (#1457) 2025-08-26 14:59:32 +08:00
Azamatkhan Arifkhanov
d4e43c50a2 Fix OSError: [Errno 24] Too many open files: 'nul' (#1455)
* Update model.py

Added closing of save_fds.

* Updated model.py for pre-commit requirements
2025-08-26 12:50:22 +08:00
dependabot[bot]
13aec414ea Bump brace-expansion from 1.1.11 to 1.1.12 in /website (#1453)
Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 1.1.11 to 1.1.12.
- [Release notes](https://github.com/juliangruber/brace-expansion/releases)
- [Commits](https://github.com/juliangruber/brace-expansion/compare/1.1.11...v1.1.12)

---
updated-dependencies:
- dependency-name: brace-expansion
  dependency-version: 1.1.12
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-08-14 10:50:51 +08:00
Li Jiang
bb16dcde93 Bump version to 2.3.6 (#1451) 2025-08-05 14:29:36 +08:00
Li Jiang
be81a76da9 Fix TypeError of customized kfold method which needs 'y' (#1450) 2025-08-02 08:05:50 +08:00
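One way to support both `split(X)` and `split(X, y)` signatures for a user-supplied kfold object, as #1450 addresses, is to inspect the method's signature before calling it. This is a stdlib-only sketch under that assumption; the helper and toy splitter classes are illustrative, not FLAML's code:

```python
import inspect

def call_split(kfold, X, y=None):
    """Call kfold.split, passing y only if the method accepts it."""
    params = inspect.signature(kfold.split).parameters
    if "y" in params and y is not None:
        return kfold.split(X, y)
    return kfold.split(X)

class NeedsY:
    def split(self, X, y):
        # toy splitter: pair each row with its label
        return list(zip(X, y))

class NoY:
    def split(self, X):
        return list(X)
```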
Li Jiang
2d16089529 Improve FAQ docs (#1448)
* Fix settings usage error

* Add new code example
2025-07-09 18:33:10 +08:00
Li Jiang
01c3c83653 Install wheel and setuptools (#1443) 2025-05-28 12:56:48 +08:00
Li Jiang
9b66103f7c Fix typo, add quotes to python-version (#1442) 2025-05-28 12:24:00 +08:00
Li Jiang
48dfd72e64 Fix CD actions (#1441)
* Fix CD actions

* Skip Build if no relevant changes
2025-05-28 10:45:27 +08:00
Li Jiang
dec92e5b02 Upgrade python 3.8 to 3.10 in github actions (#1440) 2025-05-27 21:34:21 +08:00
Li Jiang
22911ea1ef Merged PR 1685054: Add more logs and function wait_futures for easier post analysis (#1438)
- Add function wait_futures for easier post analysis
- Use logger instead of print

----
#### AI description (iteration 1)
#### PR Classification
A code enhancement for debugging asynchronous mlflow logging and improving post-run analysis.

#### PR Summary
This PR adds detailed debug logging to the mlflow integration and introduces a new `wait_futures` function to streamline the collection of asynchronous task results for improved analysis.
- `flaml/fabric/mlflow.py`: Added debug log statements around starting and ending mlflow runs to trace run IDs and execution flow.
- `flaml/automl/automl.py`: Implemented the `wait_futures` function to handle asynchronous task results and replaced a print call with `logger.info` for consistent logging.

Related work items: #4029592
2025-05-27 15:32:56 +08:00
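A `wait_futures` helper like the one described in #1438 can be sketched with `concurrent.futures` alone. The exact signature of FLAML's function is not shown in the commit, so this is a minimal stdlib version that collects results (and exceptions, rather than raising) for post-run analysis:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def wait_futures(futures):
    """Block until all futures finish; return results in completion order.

    Exceptions are collected instead of raised so one failed logging task
    does not hide the others during post analysis.
    """
    results = []
    for fut in as_completed(futures):
        try:
            results.append(fut.result())
        except Exception as exc:
            results.append(exc)
    return results

with ThreadPoolExecutor(max_workers=2) as pool:
    futs = [pool.submit(lambda v=v: v * v) for v in range(4)]
    squares = sorted(wait_futures(futs))
```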
murunlin
12183e5f73 Add the detailed info for parameter 'verbose' (#1435)
* explain-verbose-parameter

* concise-verbose-docstring

* explain-verbose-parameter

* explain-verbose-parameter

* test-ignore

* test-ignore

* sklearn-version-califonia

* submit-0526

---------

Co-authored-by: Runlin Mu (FESCO Adecco Human Resources) <v-runlinmu@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-05-27 10:01:01 +08:00
Li Jiang
c2b25310fc Sync Fabric till 2cd1c3da (#1433)
* Sync Fabric till 2cd1c3da

* Remove synapseml from tag names

* Fix 'NoneType' object has no attribute 'DataFrame'

* Deprecated 3.8 support

* Fix 'NoneType' object has no attribute 'DataFrame'

* Still use python 3.8 for pydoc

* Don't run tests in parallel

* Remove autofe and lowcode
2025-05-23 10:19:31 +08:00
murunlin
0f9420590d fix: best_model_for_estimator returns inconsistent feature_importances_ compared to automl.model (#1429)
* mrl-issue1422-0513

* fix version dependency

* fix datasets version

* test completion

---------

Co-authored-by: Runlin Mu (FESCO Adecco Human Resources) <v-runlinmu@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-05-15 09:37:34 +08:00
hexiang-x
5107c506b4 fix: When use_spark = True and mlflow_logging = True are set, an error is reported when logging the best model: 'NoneType' object has no attribute 'save' (#1432) 2025-05-14 19:34:06 +08:00
dependabot[bot]
9e219ef8dc Bump http-proxy-middleware from 2.0.7 to 2.0.9 in /website (#1425)
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.7 to 2.0.9.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.9/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.7...v2.0.9)

---
updated-dependencies:
- dependency-name: http-proxy-middleware
  dependency-version: 2.0.9
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-04-23 14:22:12 +08:00
Li Jiang
6e4083743b Revert "Numpy 2.x is not supported yet. (#1424)" (#1426)
This reverts commit 17e95edd9e.
2025-04-22 21:31:44 +08:00
Li Jiang
17e95edd9e Numpy 2.x is not supported yet. (#1424) 2025-04-22 12:11:27 +08:00
Stickic-cyber
468bc62d27 Fix issue with "list index out of range" when max_iter=1 (#1419) 2025-04-09 21:54:17 +08:00
dependabot[bot]
437c239c11 Bump @babel/helpers from 7.20.1 to 7.26.10 in /website (#1413)
Bumps [@babel/helpers](https://github.com/babel/babel/tree/HEAD/packages/babel-helpers) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-helpers)

---
updated-dependencies:
- dependency-name: "@babel/helpers"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-14 15:51:06 +08:00
dependabot[bot]
8e753f1092 Bump @babel/runtime from 7.20.1 to 7.26.10 in /website (#1414)
Bumps [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-runtime)

---
updated-dependencies:
- dependency-name: "@babel/runtime"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-13 21:34:02 +08:00
dependabot[bot]
a3b57e11d4 Bump prismjs from 1.29.0 to 1.30.0 in /website (#1411)
Bumps [prismjs](https://github.com/PrismJS/prism) from 1.29.0 to 1.30.0.
- [Release notes](https://github.com/PrismJS/prism/releases)
- [Changelog](https://github.com/PrismJS/prism/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PrismJS/prism/compare/v1.29.0...v1.30.0)

---
updated-dependencies:
- dependency-name: prismjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-13 14:06:41 +08:00
dependabot[bot]
a80dcf9925 Bump @babel/runtime-corejs3 from 7.20.1 to 7.26.10 in /website (#1412)
Bumps [@babel/runtime-corejs3](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime-corejs3) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-runtime-corejs3)

---
updated-dependencies:
- dependency-name: "@babel/runtime-corejs3"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-13 10:04:03 +08:00
SkBlaz
7157af44e0 Improved error handling in case no scikit present (#1402)
* Improved error handling in case no scikit present

Currently there is no description for when this error is thrown. Being explicit seems of value.
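A minimal sketch of the kind of explicit guard this commit describes (the helper name and message are illustrative, not FLAML's actual API in histgb.py):

```python
import importlib


def import_or_explain(module_name, hint):
    """Import a module, raising an explicit error with an install hint if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a description instead of surfacing a bare ImportError
        raise ImportError(f"{module_name} is required for this estimator: {hint}") from exc
```

For example, an estimator wrapper could call `import_or_explain("sklearn.ensemble", "pip install scikit-learn")` at construction time so the user sees what to install rather than an unexplained traceback.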

* Update histgb.py

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-03 15:39:43 +08:00
Li Jiang
1798c4591e Upgrade setuptools (#1410) 2025-03-01 08:05:51 +08:00
Li Jiang
dd26263330 Bump version to 2.3.5 (#1409) 2025-02-17 22:26:59 +08:00
Li Jiang
2ba5f8bed1 Fix params pop error (#1408) 2025-02-17 15:06:05 +08:00
Daniel Grindrod
d0a11958a5 fix: Fixed bug where group folds and sample weights couldn't be used in the same automl instance (#1405) 2025-02-15 10:41:27 +08:00
dependabot[bot]
0ef9b00a75 Bump serialize-javascript from 6.0.0 to 6.0.2 in /website (#1407)
Bumps [serialize-javascript](https://github.com/yahoo/serialize-javascript) from 6.0.0 to 6.0.2.
- [Release notes](https://github.com/yahoo/serialize-javascript/releases)
- [Commits](https://github.com/yahoo/serialize-javascript/compare/v6.0.0...v6.0.2)

---
updated-dependencies:
- dependency-name: serialize-javascript
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-02-14 12:36:49 +08:00
Will Charles
840f76e5e5 Changed tune.report import for ray>=2 (#1392)
* Changed tune.report import for ray>=2

* env: Changed pydantic restriction in env

* Reverted Pydantic install conditions

* Reverted Pydantic install conditions

* test: Check if GPU is available

* tests: uncommented a line

* tests: Better fix for Ray GPU checking

* tests: Added timeout to dataset loading

* tests: Deleted _test_hf_data()

* test: Reduce lrl2 dataset size

* bug: timeout error

* bug: timeout error

* fix: Added threading check for timeout issue

* Undo old commits

* Timeout fix from #1406

---------

Co-authored-by: Daniel Grindrod <dannycg1996@gmail.com>
2025-02-14 09:38:33 +08:00
Li Jiang
d8b7d25b80 Fix test hang issue (#1406)
* Add try except to resource.setrlimit

* Set time limit only in main thread

* Check only test model

* Pytest debug

* Test separately

* Move test_model.py to automl folder
2025-02-13 19:50:35 +08:00
Li Jiang
6d53929803 Bump version to 2.3.4 (#1389) 2024-12-18 12:49:59 +08:00
Daniel Grindrod
c038fbca07 fix: KeyError no longer occurs when using groupfolds for regression tasks. (#1385)
* fix: Now resetting indexes for regression datasets when using group folds

* refactor: Simplified if statement to include all fold types

* docs: Updated docs to make it clear that group folds can be used for regression tasks

---------

Co-authored-by: Daniel Grindrod <daniel.grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-12-18 10:06:58 +08:00
dependabot[bot]
6a99202492 Bump nanoid from 3.3.6 to 3.3.8 in /website (#1387)
Bumps [nanoid](https://github.com/ai/nanoid) from 3.3.6 to 3.3.8.
- [Release notes](https://github.com/ai/nanoid/releases)
- [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ai/nanoid/compare/3.3.6...3.3.8)

---
updated-dependencies:
- dependency-name: nanoid
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-12-17 19:26:34 +08:00
Daniel Grindrod
42d1dcfa0e fix: Fixed bug with catboost and groups (#1383)
Co-authored-by: Daniel Grindrod <daniel.grindrod@evotec.com>
2024-12-17 13:54:49 +08:00
EgorKraevTransferwise
b83c8a7d3b Pass cost_attr and cost_budget from flaml.tune.run() to the search algo (#1382) 2024-12-04 20:50:15 +08:00
dependabot[bot]
b9194cdcf2 Bump cross-spawn from 7.0.3 to 7.0.6 in /website (#1379)
Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 7.0.3 to 7.0.6.
- [Changelog](https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6)

---
updated-dependencies:
- dependency-name: cross-spawn
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-20 15:48:39 +08:00
Li Jiang
9a1f6b0291 Bump version to 2.3.3 (#1378) 2024-11-13 11:44:34 +08:00
kernelmethod
07f4413aae Fix logging nuisances that can arise when importing flaml (#1377) 2024-11-13 07:49:55 +08:00
Daniel Grindrod
5a74227bc3 Flaml: fix lgbm reproducibility (#1369)
* fix: Fixed bug where every underlying LGBMRegressor or LGBMClassifier had n_estimators = 1

* test: Added test showing case where FLAMLised CatBoostModel result isn't reproducible

* fix: Fixing issue where callbacks cause LGBM results to not be reproducible

* Update test/automl/test_regression.py

Co-authored-by: Li Jiang <bnujli@gmail.com>

* fix: Adding back the LGBM EarlyStopping

* refactor: Fix tweaked to ensure other models aren't likely to be affected

* test: Fixed test to allow reproduced results to be better than the FLAML results, when LGBM earlystopping is involved

---------

Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-11-01 10:06:15 +08:00
Ranuga
7644958e21 Add documentation for automl.model.estimator usage (#1311)
* Added documentation for automl.model.estimator usage

Updated documentation across various examples and model.py to describe automl.model.estimator, giving users clear guidance on how to use this attribute in their AutoML workflows.

* fix: Ran pre-commit hook on docs

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Daniel Grindrod <dannycg1996@gmail.com>
Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
2024-10-31 20:53:54 +08:00
Daniel Grindrod
a316f84fe1 fix: LinearSVC results now reproducible (#1376)
Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
2024-10-31 14:02:16 +08:00
Daniel Grindrod
72881d3a2b fix: Fixing the random state of ElasticNetClassifier by default, to ensure reproducibility. Also included elasticnet in reproducibility tests (#1374)
Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-10-29 14:21:43 +08:00
Li Jiang
69da685d1e Fix data transform issue, spark log_loss metric compute error and json dumps TypeError (Sync Fabric till 3c545e67) (#1371)
* Merged PR 1444697: Fix json dumps TypeError

Fix json dumps TypeError

----
Bug fix to address a `TypeError` in `json.dumps`.

This pull request fixes a `TypeError` encountered when using `json.dumps` on `automl._automl_user_configurations` by introducing a safe JSON serialization function.
- Added `safe_json_dumps` function in `flaml/fabric/mlflow.py` to handle non-serializable objects.
- Updated `MLflowIntegration` class in `flaml/fabric/mlflow.py` to use `safe_json_dumps` for JSON serialization.
- Modified `test/automl/test_multiclass.py` to test the new `safe_json_dumps` function.
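The `safe_json_dumps` helper described above can be sketched roughly as follows (a hypothetical minimal version for illustration, not the exact function in `flaml/fabric/mlflow.py`):

```python
import json


def safe_json_dumps(obj, **kwargs):
    """json.dumps that never raises TypeError on non-serializable values.

    Values the default encoder cannot handle (estimators, numpy scalars,
    etc.) are rendered via str() instead of aborting the whole dump.
    """
    return json.dumps(obj, default=str, **kwargs)
```

With this, serializing a config dict that happens to contain a model object degrades the model to its string repr instead of raising `TypeError: Object of type ... is not JSON serializable`.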

Related work items: #3439408

* Fix data transform issue and spark log_loss metric compute error
2024-10-29 11:58:40 +08:00
Li Jiang
c01c3910eb Update version.py (#1372) 2024-10-29 09:33:23 +08:00
dependabot[bot]
98d3fd2f48 Bump http-proxy-middleware from 2.0.6 to 2.0.7 in /website (#1370)
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.6 to 2.0.7.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.7/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.6...v2.0.7)

---
updated-dependencies:
- dependency-name: http-proxy-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-28 10:43:28 +08:00
Li Jiang
9724c626cc Remove outdated comment (#1366) 2024-10-24 12:17:21 +08:00
smty2018
0d92400200 Documented that retrain_full = True does not include the user-provided validation data. #1228 (#1245)
* Update Task-Oriented-AutoML.md

* Update Task-Oriented-AutoML.md

* Update marker

* Fix format

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-10-23 16:48:45 +08:00
Daniel Grindrod
d224218ecf fix: FLAML catboost metrics aren't reproducible (#1364)
* fix: CatBoostRegressors metrics are now reproducible

* test: Made tests live, which ensure the reproducibility of catboost models

* fix: Added defunct line of code as a comment

* fix: Re-adding removed if statement, and test to show one issue that if statement can cause

* fix: Stopped ending CatBoost training early when time budget is running out

---------

Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
2024-10-23 13:51:23 +08:00
Daniel Grindrod
a2a5e1abb9 test: Adding tests to verify model reproducibility (#1362) 2024-10-12 09:53:16 +08:00
Daniel Grindrod
5c0f18b7bc fix: Cross validation process isn't always run to completion (#1360) 2024-10-01 08:24:53 +08:00
dependabot[bot]
e5d95f5674 Bump express from 4.19.2 to 4.21.0 in /website (#1357) 2024-09-22 11:01:00 +08:00
Li Jiang
49ba962d47 Support logger_formatter without automl dependencies (#1356) 2024-09-21 20:04:46 +08:00
Li Jiang
8e171bc402 Remove temporary pickle files (#1354)
* Remove temporary pickle files

* Update version to 2.3.1

* Use TemporaryDirectory for pickle and log_artifact

* Fix 'CatBoostClassifier' object has no attribute '_get_param_names'
2024-09-21 15:46:32 +08:00
dependabot[bot]
c90946f303 Bump webpack from 5.76.1 to 5.94.0 in /website (#1342)
Bumps [webpack](https://github.com/webpack/webpack) from 5.76.1 to 5.94.0.
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](https://github.com/webpack/webpack/compare/v5.76.1...v5.94.0)

---
updated-dependencies:
- dependency-name: webpack
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 11:56:42 +08:00
dependabot[bot]
64f30af603 Bump micromatch from 4.0.5 to 4.0.8 in /website (#1343)
Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-09-05 15:18:26 +08:00
Li Jiang
f45582d3c7 Add info of tutorial automl 2024 (#1344)
* Add info of tutorial automl 2024

* Add notebooks

* Fix links

* Update usage of built-in LLMs
2024-09-04 19:35:09 +08:00
Li Jiang
bf4bca2195 Add contributors wall (#1341)
* Add contributors wall

* code format
2024-08-30 22:33:44 +08:00
Li Jiang
efaba26d2e Update version and readme (#1338)
* Update version and readme

* Update pr template
2024-08-22 22:33:23 +00:00
Li Jiang
62194f321d Update issue templates (#1337) 2024-08-21 10:00:48 +00:00
Li Jiang
5bfa0b1cd3 Improve mlflow integration and add more models (#1331)
* Add more spark models and improved mlflow integration

* Update test_extra_models, setup and gitignore

* Remove autofe

* Remove autofe

* Remove autofe

* Sync changes in internal

* Fix test for env without pyspark

* Fix import errors

* Fix tests

* Fix typos

* Fix pytorch-forecasting version

* Remove internal funcs, rename _mlflow.py

* Fix import error

* Fix dependency

* Fix experiment name setting

* Fix dependency

* Update pandas version

* Update pytorch-forecasting version

* Add warning message for not has_automl

* Fix test errors with nltk 3.8.2

* Don't enable mlflow logging w/o an active run

* Fix pytorch-forecasting can't be pickled issue

* Update pyspark tests condition

* Update synapseml

* Update synapseml

* No parent run, no logging for OSS

* Log when autolog is enabled

* upgrade code

* Enable autolog for tune

* Increase time budget for test

* End run before start a new run

* Update parent run

* Fix import error

* clean up

* skip macos and win

* Update notes

* Update default value of model_history
2024-08-13 07:53:47 +00:00
dependabot[bot]
bd34b4e75a Bump express from 4.18.2 to 4.19.2 in /website (#1293)
Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 12:55:25 +00:00
dependabot[bot]
7670945298 Bump follow-redirects from 1.15.4 to 1.15.6 in /website (#1291)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 12:52:11 +00:00
dependabot[bot]
43537cb539 Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website (#1292)
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)

---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 12:50:17 +00:00
Gökhan Geyik
f913b79225 Fix(doc): Page Not Found (#1296)
- Fix the redirect link that received a page not found error.

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-08-12 12:01:46 +00:00
dependabot[bot]
a092a39b5e Bump braces from 3.0.2 to 3.0.3 in /website (#1336)
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 08:37:56 +00:00
Jirka Borovec
04bf1b8741 update py versions, sourced from PyPI (#1332)
* update py versions, sourced from PyPI

* lint

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 04:53:48 +00:00
Jirka Borovec
b348cb1136 configure & apply pyupgrade with py3.8+ (#1333)
* configure pyupgrade with `py3.8+`

* apply update

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 02:54:18 +00:00
Jirka Borovec
cd0e88e383 fix missing req. arg for new datasets package (#1334)
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 02:19:11 +00:00
Li Jiang
a17c6e392e Fix test errors of nltk and numpy (#1335)
* Fix test errors with nltk 3.8.2

* Fix test errors with numpy large

* Fix test errors with numpy large
2024-08-12 00:14:21 +00:00
Li Jiang
52627ff14b Add 3.11 icon (#1330) 2024-08-08 06:18:49 +00:00
Li Jiang
7729855f49 Bump version to 2.2.0 (#1329) 2024-08-08 01:05:53 +00:00
Noël Barron
0fe284b21f Doc and comment typos improvements (#1319)
* typographical corrections in the descriptions, comment improvements, general formatting for consistency

* consistent indentation for better readability, improved comments, typographical corrections

* updated docstrings for better clarity, added type hint for **kwargs, typographical corrections (no functionality changes)

* Fix format

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-06 15:29:37 +00:00
Yang, Bo
853c9501bc Keep searching hyperparameters when r2_score raises an error (#1325)
* Keep searching hyperparameters when `r2_score` raises an error

* Add log info

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-06 15:01:10 +00:00
Yang, Bo
8e63dd417b Don't pass callbacks=None to XGBoostSklearnEstimator._fit (#1322)
* Don't pass `callbacks=None` to `XGBoostSklearnEstimator._fit`

The original implementation would pass `callbacks=None` to `XGBoostSklearnEstimator._fit` and eventually lead to a `TypeError` of `XGBModel.fit() got an unexpected keyword argument 'callbacks'`. This PR instead does not pass the `callbacks=None` parameter to avoid the error.
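A sketch of the workaround (the helper name is hypothetical; in FLAML the change lives inside `XGBoostSklearnEstimator._fit`):

```python
def fit_with_optional_callbacks(model, X, y, callbacks=None, **kwargs):
    """Forward `callbacks` to model.fit only when it is actually set.

    Passing callbacks=None explicitly can raise
    "TypeError: fit() got an unexpected keyword argument 'callbacks'"
    on xgboost builds whose fit() signature has no such parameter,
    so the key is omitted from the kwargs entirely when it is None.
    """
    if callbacks is not None:
        kwargs["callbacks"] = callbacks
    return model.fit(X, y, **kwargs)
```

The design point is that omitting a keyword is always safe, while forwarding an explicit `None` requires the callee to declare that parameter.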

* Update setup.py to allow for xgboost 2.x

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-06 09:24:11 +00:00
Li Jiang
f27f98c6d7 Fix test mac os python 3.11 (#1328)
* add test

* Skip test_autohf_classificationhead.py for MacOS py311

* Skip test/nlp/test_default.py for MacOS py311

* Check test_tune

* Check test_lexiflow

* Check test_tune

* Remove checks

* Skip test_nested_run for macos py311

* Skip test_nested_space for macos py311

* Test tune on MacOS Python 3.11 w/o pytest

* Split tests by folder

* Skip test lexiflow for MacOS py311

* Enable test_tune for MacOS py311

* Clean up
2024-08-06 05:50:44 +00:00
Li Jiang
a68d073ccf Add support to python 3.11 (#1326)
* Add support to python 3.11

* Fix workflow python version comparison

* Ray is not supported in python 3.11

* Fix test_numpy
2024-07-31 00:18:41 +00:00
Li Jiang
15fda2206b Add example of how to get best config and convert it to parameters (#1323) 2024-07-24 08:20:36 +00:00
leafy-lee
a9d7b7f971 Handle IntLogUniformDistribution Deprecation before Optuna<=v4.0.0 (#1324)
Co-authored-by: Yifei Li <v-liyifei@microsoft.com>
2024-07-24 07:02:06 +00:00
Li Jiang
d24d2e0088 Upgrade Optuna (#1321) 2024-07-23 01:21:20 +00:00
Ranuga
67f4048667 Update ts_model.py (#1312)
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-07-22 05:32:51 +00:00
Li Jiang
d8129b9211 Fix typos, upgrade yarn packages, add some improvements (#1290)
* Fix typos, upgrade yarn packages, add some improvements

* Fix joblib 1.4.0 breaks joblib-spark

* Fix xgboost test error

* Pin xgboost<2.0.0

* Try update prophet to 1.5.1

* Update github workflow

* Revert prophet version

* Update github workflow

* Update install libomp

* Fix test errors

* Fix test errors

* Add retry to test and coverage

* Revert "Add retry to test and coverage"

This reverts commit ce13097cd5.

* Increase test budget

* Add more data to test_models, try fixing ValueError: Found array with 0 sample(s) (shape=(0, 252)) while a minimum of 1 is required.
2024-07-19 13:40:04 +00:00
Jirka Borovec
165d7467f9 precommit: introduce mdformat (#1276)
* precommit: introduce `mdformat`

* precommit: apply
2024-03-19 22:46:56 +00:00
Gleb Levitski
3de0dc667e Add ruff sort to pre-commit and sort imports in the library (#1259)
* lint

* bump ver

* bump ver

* fixed circular import

---------

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-03-12 21:28:57 +00:00
dependabot[bot]
6840dc2b09 Bump follow-redirects from 1.15.2 to 1.15.4 in /website (#1266)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.2 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-03-12 16:50:01 +00:00
Chi Wang
1a9fa3ac23 Np.inf (#1289)
* np.Inf -> np.inf

* bump version to 2.1.2
2024-03-12 16:27:05 +00:00
Jack Gerrits
325baa40a5 Don't specify a pre-release in the numpy dependency (#1286) 2024-03-12 14:43:49 +00:00
Dhruv Thakur
550d1cfe9b Update AutoML-NLP.md (#1239)
* Update AutoML-NLP.md

#834

* more space

---------

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2024-02-10 07:32:57 +00:00
Jirka Borovec
249f0f1708 docs: fix link to reference (#1263)
* docs: fix link to reference

* Apply suggestions from code review

Co-authored-by: Li Jiang <bnujli@gmail.com>

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-02-09 16:48:51 +00:00
Li Jiang
b645da3ea7 Fix spark errors (#1274)
* Fix mlflow not found error

* Fix joblib>1.2.0 force cancel error

* Remove joblib version constraint

* Update log

* Improve joblib exception catch

* Added permissions
2024-02-09 01:08:24 +00:00
ScottzCodez
0415638dd1 Update Installation.md (#1258)
Typo Fixed.
2023-11-29 01:39:20 +00:00
Gleb Levitski
6b93c2e394 [ENH] Add support for sklearn HistGradientBoostingEstimator (#1230)
* Update model.py

HistGradientBoosting support

* Create __init__.py

* Update model.py

* Create histgb.py

* Update __init__.py

* Update test_model.py

* added histgb to estimator list

* Update Task-Oriented-AutoML.md

added docs

* lint

* fixed bugs

---------

Co-authored-by: Gleb <gleb@Glebs-MacBook-Pro.local>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-10-31 14:45:23 +00:00
dependabot[bot]
a93bf39720 Bump @babel/traverse from 7.20.1 to 7.23.2 in /website (#1248)
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.20.1 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse)

---
updated-dependencies:
- dependency-name: "@babel/traverse"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-21 14:48:46 +00:00
dependabot[bot]
dc8060a21b Bump postcss from 8.4.18 to 8.4.31 in /website (#1238)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.18 to 8.4.31.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.18...8.4.31)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-10-12 07:56:29 +00:00
Aindree Chatterjee
30db685cee Update README.md with autogen links (#1235)
* Update README.md

Added links to the Discord, website, and GitHub repo for AutoGen in README.md's first news item.
In relation to issue #1231

* Update README.md
2023-10-09 15:32:39 +00:00
Chi Wang
fda9fa0103 improve docstr of preprocessors (#1227)
* improve docstr of preprocessors

* Update SynapseML version

* Fix test

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-09-29 03:07:21 +00:00
Qingyun Wu
830ec4541c Update autogen links (#1214)
* update links

* update autogen doc link

* wording

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-09-23 16:55:30 +00:00
Dominik Moritz
46162578f8 Fix typo Whetehr -> Whether (#1220)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-09-22 15:27:02 +00:00
Dominik Moritz
8658e51182 fix ref to research (#1218)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-09-22 15:26:21 +00:00
Chi Wang
868e7dd1ca support xgboost 2.0 (#1219)
* support xgboost 2.0

* try classes_

* test version

* quote

* use_label_encoder

* Fix xgboost test error

* remove deprecated files

* remove deprecated files

* remove deprecated import

* replace deprecated import in integrate_spark.ipynb

* replace deprecated import in automl_lightgbm.ipynb

* formatted integrate_spark.ipynb

* replace deprecated import

* try fix driver python path

* Update python-package.yml

* replace deprecated reference

* move spark python env var to other section

* Update setup.py, install xgb<2 for MacOS

* Fix typo

* assert

* Try assert xgboost version

* Fail fast

* Keep all test/spark to try fail fast

* No need to skip spark test in Mac or Win

* Remove assert xgb version

* Remove fail fast

* Found root cause, fix test_sparse_matrix_xgboost

* Revert "No need to skip spark test in Mac or Win"

This reverts commit a09034817f.

* remove assertion

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: levscaut <57213911+levscaut@users.noreply.github.com>
Co-authored-by: levscaut <lwd2010530@qq.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-09-22 06:55:00 +00:00
Chi Wang
4886cb5689 Rename Responsive -> Conversable (#1202)
* responsive -> conversable

* preview

* rename

* register reply

* rename and version

* bump version to 2.1.0

* notebook

* bug fix
2023-09-12 00:07:35 +00:00
Chi Wang
599731cb22 rename human to user_proxy (#1215)
* rename human to user_proxy

* notebook update and bug fix
2023-09-11 14:33:47 +00:00
Chi Wang
0cb79dfdff group chat for visualization (#1213)
* group chat for visualization

* show figure

* webpage update

* link update

* example 2

* example 2

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-09-10 23:20:45 +00:00
Qingyun Wu
f70df312f4 Migration headsup (#1204)
* add readme

* migration headsup

* remove move date

* Update README.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-09-09 00:08:24 +00:00
275 changed files with 23150 additions and 4062 deletions


@@ -1,5 +1,7 @@
 [run]
 branch = True
-source = flaml
+source =
+    flaml
 omit =
-    *test*
+    */test/*
+    */flaml/autogen/*

.github/ISSUE_TEMPLATE.md vendored Normal file

@@ -0,0 +1,73 @@
### Description
<!-- A clear and concise description of the issue or feature request. -->
### Environment
- FLAML version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
- Python version: <!-- Specify the Python version (e.g., 3.8) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
### Steps to Reproduce (for bugs)
<!-- Provide detailed steps to reproduce the issue. Include code snippets, configuration files, or any other relevant information. -->
1. Step 1
1. Step 2
1. ...
### Expected Behavior
<!-- Describe what you expected to happen. -->
### Actual Behavior
<!-- Describe what actually happened. Include any error messages, stack traces, or unexpected behavior. -->
### Screenshots / Logs (if applicable)
<!-- If relevant, include screenshots or logs that help illustrate the issue. -->
### Additional Information
<!-- Include any additional information that might be helpful, such as specific configurations, data samples, or context about the environment. -->
### Possible Solution (if you have one)
<!-- If you have suggestions on how to address the issue, provide them here. -->
### Is this a Bug or Feature Request?
<!-- Choose one: Bug | Feature Request -->
### Priority
<!-- Choose one: High | Medium | Low -->
### Difficulty
<!-- Choose one: Easy | Moderate | Hard -->
### Any related issues?
<!-- If this is related to another issue, reference it here. -->
### Any relevant discussions?
<!-- If there are any discussions or forum threads related to this issue, provide links. -->
### Checklist
<!-- Please check the items that you have completed -->
- [ ] I have searched for similar issues and didn't find any duplicates.
- [ ] I have provided a clear and concise description of the issue.
- [ ] I have included the necessary environment details.
- [ ] I have outlined the steps to reproduce the issue.
- [ ] I have included any relevant logs or screenshots.
- [ ] I have indicated whether this is a bug or a feature request.
- [ ] I have set the priority and difficulty levels.
### Additional Comments
<!-- Any additional comments or context that you think would be helpful. -->

53
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file

@@ -0,0 +1,53 @@
name: Bug Report
description: File a bug report
title: "[Bug]: "
labels: ["bug"]
body:
- type: textarea
id: description
attributes:
label: Describe the bug
description: A clear and concise description of what the bug is.
placeholder: What went wrong?
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: |
Steps to reproduce the behavior:
1. Step 1
2. Step 2
3. ...
4. See error
placeholder: How can we replicate the issue?
- type: textarea
id: modelused
attributes:
label: Model Used
description: A description of the model that was used when the error was encountered
placeholder: e.g., gpt-4, mistral-7B
- type: textarea
id: expected_behavior
attributes:
label: Expected Behavior
description: A clear and concise description of what you expected to happen.
placeholder: What should have happened?
- type: textarea
id: screenshots
attributes:
label: Screenshots and logs
description: If applicable, add screenshots and logs to help explain your problem.
placeholder: Add screenshots here
- type: textarea
id: additional_information
attributes:
label: Additional Information
description: |
- FLAML Version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
- Python Version: <!-- Specify the Python version (e.g., 3.8) -->
- Related Issues: <!-- Link to any related issues here (e.g., #1) -->
- Any other relevant information.
placeholder: Any additional details

1
.github/ISSUE_TEMPLATE/config.yml vendored Normal file

@@ -0,0 +1 @@
blank_issues_enabled: true


@@ -0,0 +1,26 @@
name: Feature Request
description: File a feature request
labels: ["enhancement"]
title: "[Feature Request]: "
body:
- type: textarea
id: problem_description
attributes:
label: Is your feature request related to a problem? Please describe.
description: A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
placeholder: What problem are you trying to solve?
- type: textarea
id: solution_description
attributes:
label: Describe the solution you'd like
description: A clear and concise description of what you want to happen.
placeholder: How do you envision the solution?
- type: textarea
id: additional_context
attributes:
label: Additional context
description: Add any other context or screenshots about the feature request here.
placeholder: Any additional information


@@ -0,0 +1,41 @@
name: General Issue
description: File a general issue
title: "[Issue]: "
labels: []
body:
- type: textarea
id: description
attributes:
label: Describe the issue
description: A clear and concise description of what the issue is.
placeholder: What went wrong?
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: |
Steps to reproduce the behavior:
1. Step 1
2. Step 2
3. ...
4. See error
placeholder: How can we replicate the issue?
- type: textarea
id: screenshots
attributes:
label: Screenshots and logs
description: If applicable, add screenshots and logs to help explain your problem.
placeholder: Add screenshots here
- type: textarea
id: additional_information
attributes:
label: Additional Information
description: |
- FLAML Version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
- Python Version: <!-- Specify the Python version (e.g., 3.8) -->
- Related Issues: <!-- Link to any related issues here (e.g., #1) -->
- Any other relevant information.
placeholder: Any additional details


@@ -12,7 +12,7 @@
## Checks
<!-- - I've used [pre-commit](https://microsoft.github.io/FLAML/docs/Contribute#pre-commit) to lint the changes in this PR (note the same is integrated in our CI checks). -->
- [ ] I've used [pre-commit](https://microsoft.github.io/FLAML/docs/Contribute#pre-commit) to lint the changes in this PR (note the same is integrated in our CI checks).
- [ ] I've included any doc changes needed for https://microsoft.github.io/FLAML/. See https://microsoft.github.io/FLAML/docs/Contribute#documentation to build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [ ] I've made sure all auto checks have passed.

243
.github/copilot-instructions.md vendored Normal file

@@ -0,0 +1,243 @@
# GitHub Copilot Instructions for FLAML
## Project Overview
FLAML (Fast Library for Automated Machine Learning & Tuning) is a lightweight Python library for efficient automation of machine learning and AI operations. It automates workflows built on large language models, machine learning models, etc., and optimizes their performance.
**Key Components:**
- `flaml/automl/`: AutoML functionality for classification and regression
- `flaml/tune/`: Generic hyperparameter tuning
- `flaml/default/`: Zero-shot AutoML with default configurations
- `flaml/autogen/`: Legacy autogen code (note: AutoGen has moved to a separate repository)
- `flaml/fabric/`: Microsoft Fabric integration
- `test/`: Comprehensive test suite
## Build and Test Commands
### Installation
```bash
# Basic installation
pip install -e .
# Install with test dependencies
pip install -e .[test]
# Install with automl dependencies
pip install -e .[automl]
# Install with forecast dependencies (Linux only)
pip install -e .[forecast]
```
### Running Tests
```bash
# Run all tests (excluding autogen)
pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
# Run tests with coverage
coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
coverage xml
# Check dependencies
python test/check_dependency.py
```
### Linting and Formatting
```bash
# Run pre-commit hooks
pre-commit run --all-files
# Format with black (line length: 120)
black . --line-length 120
# Run ruff for linting and auto-fix
ruff check . --fix
```
## Code Style and Formatting
### Python Style
- **Line length:** 120 characters (configured in both Black and Ruff)
- **Formatter:** Black (v23.3.0+)
- **Linter:** Ruff with Pyflakes and pycodestyle rules
- **Import sorting:** Use isort (via Ruff)
- **Python version:** Supports Python >= 3.10 (full support for 3.10, 3.11, 3.12 and 3.13)
### Code Quality Rules
- Follow Black formatting conventions
- Keep imports sorted and organized
- Avoid unused imports (F401) - these are flagged but not auto-fixed
- Avoid wildcard imports (F403) where possible
- Complexity: Max McCabe complexity of 10
- Use type hints where appropriate
- Write clear docstrings for public APIs
### Pre-commit Hooks
The repository uses pre-commit hooks for:
- Checking for large files, AST syntax, YAML/TOML/JSON validity
- Detecting merge conflicts and private keys
- Trailing whitespace and end-of-file fixes
- pyupgrade for Python 3.8+ syntax
- Black formatting
- Markdown formatting (mdformat with GFM and frontmatter support)
- Ruff linting with auto-fix
## Testing Strategy
### Test Organization
- Tests are in the `test/` directory, organized by module
- `test/automl/`: AutoML feature tests
- `test/tune/`: Hyperparameter tuning tests
- `test/default/`: Zero-shot AutoML tests
- `test/nlp/`: NLP-related tests
- `test/spark/`: Spark integration tests
### Test Requirements
- Write tests for new functionality
- Ensure tests pass on multiple Python versions (3.10, 3.11, 3.12 and 3.13)
- Tests should work on both Ubuntu and Windows
- Use pytest markers for platform-specific tests (e.g., `@pytest.mark.spark`)
- Tests should be idempotent and not depend on external state
- Use `--reruns 2 --reruns-delay 10` for flaky tests
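The marker conventions above can be sketched as a minimal test module (the test names and bodies here are hypothetical, for illustration only):

```python
import sys

import pytest


# Hypothetical example: mark a test as Spark-specific so it can be selected
# with `pytest -m spark` or excluded with `pytest -m "not spark"`.
@pytest.mark.spark
def test_spark_feature():
    assert 1 + 1 == 2


# Platform-gated variant: skip outside Linux, as with the prophet-based
# forecast tests mentioned elsewhere in this guide.
@pytest.mark.skipif(sys.platform != "linux", reason="Linux-only dependency")
def test_linux_only_feature():
    assert True
```

Custom markers such as `spark` should also be registered (for example under `[tool.pytest.ini_options]` in `pyproject.toml`) to avoid `PytestUnknownMarkWarning`.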
### Coverage
- Aim for good test coverage on new code
- Coverage reports are generated for Python 3.11 builds
- Coverage reports are uploaded to Codecov
## Git Workflow and Best Practices
### Branching
- Main branch: `main`
- Create feature branches from `main`
- PR reviews are required before merging
### Commit Messages
- Use clear, descriptive commit messages
- Reference issue numbers when applicable
- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
### Pull Requests
- Ensure all tests pass before requesting review
- Update documentation if adding new features
- Follow the PR template in `.github/PULL_REQUEST_TEMPLATE.md`
- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
## Project Structure
```
flaml/
├── automl/ # AutoML functionality
├── tune/ # Hyperparameter tuning
├── default/ # Zero-shot AutoML
├── autogen/ # Legacy autogen (deprecated, moved to separate repo)
├── fabric/ # Microsoft Fabric integration
├── onlineml/ # Online learning
└── version.py # Version information
test/ # Test suite
├── automl/
├── tune/
├── default/
├── nlp/
└── spark/
notebook/ # Example notebooks
website/ # Documentation website
```
## Dependencies and Package Management
### Core Dependencies
- NumPy >= 1.17
- Python >= 3.10 (officially supported: 3.10, 3.11, 3.12 and 3.13)
### Optional Dependencies
- `[automl]`: lightgbm, xgboost, scipy, pandas, scikit-learn
- `[test]`: Full test suite dependencies
- `[spark]`: PySpark and joblib dependencies
- `[forecast]`: holidays, prophet, statsmodels, hcrystalball, pytorch-forecasting, pytorch-lightning, tensorboardX
- `[hf]`: Hugging Face transformers and datasets
- See `setup.py` for complete list
### Version Constraints
- Be mindful of Python version-specific dependencies (check setup.py)
- XGBoost versions differ based on Python version
- NumPy 2.0+ only for Python >= 3.13
- Some features (like vowpalwabbit) only work with older Python versions
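As a minimal stdlib sketch (not FLAML's actual `setup.py` logic — `numpy_pin` and its exact bounds are assumptions for illustration), version-gated pins like those above can be selected from the running interpreter version:

```python
import sys


def numpy_pin(version_info=sys.version_info):
    """Pick a NumPy requirement string based on the Python version.

    Hypothetical helper mirroring the constraint above: NumPy 2.0+ is
    used only for Python >= 3.13; older interpreters stay on the 1.x line.
    """
    if version_info >= (3, 13):
        return "numpy>=2.0"
    return "numpy>=1.17,<2.0"


print(numpy_pin((3, 13)))  # numpy>=2.0
print(numpy_pin((3, 11)))  # numpy>=1.17,<2.0
```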
## Boundaries and Restrictions
### Do NOT Modify
- `.git/` directory and Git configuration
- `LICENSE` file
- Version information in `flaml/version.py` (unless explicitly updating version)
- GitHub Actions workflows without careful consideration
- Existing test files unless fixing bugs or adding coverage
### Be Cautious With
- `setup.py`: Changes to dependencies should be carefully reviewed
- `pyproject.toml`: Linting and testing configuration
- `.pre-commit-config.yaml`: Pre-commit hook configuration
- Backward compatibility: FLAML is a library with external users
### Security Considerations
- Never commit secrets or API keys
- Be careful with external data sources in tests
- Validate user inputs in public APIs
- Follow secure coding practices for ML operations
## Special Notes
### AutoGen Migration
- AutoGen has moved to a separate repository: https://github.com/microsoft/autogen
- The `flaml/autogen/` directory contains legacy code
- Tests in `test/autogen/` are ignored in the main test suite
- Direct users to the new AutoGen repository for AutoGen-related issues
### Platform-Specific Considerations
- Some tests only run on Linux (e.g., forecast tests with prophet)
- Windows and Ubuntu are the primary supported platforms
- macOS support exists but requires special libomp setup for lgbm/xgboost
### Performance
- FLAML focuses on efficient automation and tuning
- Consider computational cost when adding new features
- Optimize for low resource usage where possible
## Documentation
- Main documentation: https://microsoft.github.io/FLAML/
- Update documentation when adding new features
- Provide clear examples in docstrings
- Add notebook examples for significant new features
## Contributing
- Follow the contributing guide: https://microsoft.github.io/FLAML/docs/Contribute
- Sign the Microsoft CLA when making your first contribution
- Be respectful and follow the Microsoft Open Source Code of Conduct
- Join the Discord community for discussions: https://discord.gg/Cppx2vSPVP


@@ -12,26 +12,17 @@ jobs:
deploy:
strategy:
matrix:
os: ['ubuntu-latest']
python-version: [3.8]
os: ["ubuntu-latest"]
python-version: ["3.12"]
runs-on: ${{ matrix.os }}
environment: package
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Cache conda
uses: actions/cache@v3
uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
path: ~/conda_pkgs_dir
key: conda-${{ matrix.os }}-python-${{ matrix.python-version }}-${{ hashFiles('environment.yml') }}
- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
auto-activate-base: false
activate-environment: hcrystalball
python-version: ${{ matrix.python-version }}
use-only-tar-bz2: true
- name: Install from source
# This is required for the pre-commit tests
shell: pwsh
@@ -42,7 +33,7 @@ jobs:
- name: Build
shell: pwsh
run: |
pip install twine
pip install twine wheel setuptools
python setup.py sdist bdist_wheel
- name: Publish to PyPI
env:


@@ -17,6 +17,9 @@ on:
merge_group:
types: [checks_requested]
permissions:
contents: write
jobs:
checks:
if: github.event_name != 'push'
@@ -34,11 +37,11 @@ jobs:
- name: setup python
uses: actions/setup-python@v4
with:
python-version: "3.8"
python-version: "3.12"
- name: pydoc-markdown install
run: |
python -m pip install --upgrade pip
pip install pydoc-markdown==4.5.0
pip install pydoc-markdown==4.7.0 setuptools
- name: pydoc-markdown run
run: |
pydoc-markdown
@@ -70,11 +73,11 @@ jobs:
- name: setup python
uses: actions/setup-python@v4
with:
python-version: "3.8"
python-version: "3.12"
- name: pydoc-markdown install
run: |
python -m pip install --upgrade pip
pip install pydoc-markdown==4.5.0
pip install pydoc-markdown==4.7.0 setuptools
- name: pydoc-markdown run
run: |
pydoc-markdown


@@ -4,14 +4,17 @@
name: OpenAI
on:
pull_request:
branches: ['main']
paths:
- 'flaml/autogen/**'
- 'test/autogen/**'
- 'notebook/autogen_openai_completion.ipynb'
- 'notebook/autogen_chatgpt_gpt4.ipynb'
- '.github/workflows/openai.yml'
workflow_dispatch:
# pull_request:
# branches: ['main']
# paths:
# - 'flaml/autogen/**'
# - 'test/autogen/**'
# - 'notebook/autogen_openai_completion.ipynb'
# - 'notebook/autogen_chatgpt_gpt4.ipynb'
# - '.github/workflows/openai.yml'
permissions: {}
jobs:
test:


@@ -1,15 +1,14 @@
name: Code formatting
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on: # Trigger the workflow on push or pull request, but only for the main branch
push:
branches: [main]
on:
pull_request: {}
defaults:
run:
shell: bash
permissions: {}
jobs:
pre-commit-check:


@@ -14,9 +14,20 @@ on:
- 'setup.py'
pull_request:
branches: ['main']
paths:
- 'flaml/**'
- 'test/**'
- 'notebook/**'
- '.github/workflows/python-package.yml'
- 'setup.py'
merge_group:
types: [checks_requested]
schedule:
# Every other day at 02:00 UTC
- cron: '0 2 */2 * *'
permissions:
contents: write
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
@@ -28,20 +39,18 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-2019]
python-version: ["3.8", "3.9", "3.10"]
os: [ubuntu-latest, windows-latest]
python-version: ["3.10", "3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: On mac + python 3.10, install libomp to facilitate lgbm and xgboost install
if: matrix.os == 'macOS-latest' && matrix.python-version == '3.10'
- name: On mac, install libomp to facilitate lgbm and xgboost install
if: matrix.os == 'macos-latest'
run: |
# remove libomp version constraint after xgboost works with libomp>11.1.0 on python 3.10
wget https://raw.githubusercontent.com/Homebrew/homebrew-core/679923b4eb48a8dc7ecc1f05d06063cd79b3fc00/Formula/libomp.rb -O $(find $(brew --repository) -name libomp.rb)
brew unlink libomp
brew update
brew install libomp
export CC=/usr/bin/clang
export CXX=/usr/bin/clang++
@@ -51,74 +60,82 @@ jobs:
export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/opt/libomp/lib -L/usr/local/opt/libomp/lib -lomp"
- name: Install packages and dependencies
run: |
python -m pip install --upgrade pip wheel
python -m pip install --upgrade pip wheel setuptools
pip install -e .
python -c "import flaml"
pip install -e .[test]
- name: On Ubuntu python 3.8, install pyspark 3.2.3
if: matrix.python-version == '3.8' && matrix.os == 'ubuntu-latest'
- name: On Ubuntu python 3.11, install pyspark 3.5.1
if: matrix.python-version == '3.11' && matrix.os == 'ubuntu-latest'
run: |
pip install pyspark==3.2.3
pip install pyspark==3.5.1
pip list | grep "pyspark"
- name: If linux, install ray 2
- name: On Ubuntu python 3.12, install pyspark 4.0.1
if: matrix.python-version == '3.12' && matrix.os == 'ubuntu-latest'
run: |
pip install pyspark==4.0.1
pip list | grep "pyspark"
- name: On Ubuntu python 3.13, install pyspark 4.1.0
if: matrix.python-version == '3.13' && matrix.os == 'ubuntu-latest'
run: |
pip install pyspark==4.1.0
pip list | grep "pyspark"
# # TODO: support ray
# - name: If linux and python<3.11, install ray 2
# if: matrix.os == 'ubuntu-latest' && matrix.python-version < '3.11'
# run: |
# pip install "ray[tune]<2.5.0"
- name: Install prophet when on linux
if: matrix.os == 'ubuntu-latest'
run: |
pip install "ray[tune]<2.5.0"
- name: If mac, install ray
if: matrix.os == 'macOS-latest'
run: |
pip install -e .[ray]
- name: If linux or mac, install prophet on python < 3.9
if: (matrix.os == 'macOS-latest' || matrix.os == 'ubuntu-latest') && matrix.python-version != '3.9' && matrix.python-version != '3.10'
run: |
pip install -e .[forecast]
- name: Install vw on python < 3.10
if: matrix.python-version != '3.10'
# TODO: support vw for python 3.10+
- name: If linux and python<3.10, install vw
if: matrix.os == 'ubuntu-latest' && matrix.python-version < '3.10'
run: |
pip install -e .[vw]
- name: Uninstall pyspark on (python 3.9) or (python 3.8 + windows)
if: matrix.python-version == '3.9' || (matrix.python-version == '3.8' && matrix.os == 'windows-2019')
- name: Pip freeze
run: |
# Uninstall pyspark to test env without pyspark
pip uninstall -y pyspark
pip freeze
- name: Check dependencies
run: |
python test/check_dependency.py
- name: Clear pip cache
run: |
pip cache purge
- name: Test with pytest
if: matrix.python-version != '3.10'
timeout-minutes: 120
if: matrix.python-version != '3.11'
run: |
pytest test
pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
- name: Coverage
if: matrix.python-version == '3.10'
timeout-minutes: 120
if: matrix.python-version == '3.11'
run: |
pip install coverage
coverage run -a -m pytest test
coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
coverage xml
- name: Upload coverage to Codecov
if: matrix.python-version == '3.10'
if: matrix.python-version == '3.11'
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: unittests
- name: Save dependencies
if: github.ref == 'refs/heads/main'
shell: bash
run: |
git config --global user.name 'github-actions[bot]'
git config --global user.email 'github-actions[bot]@users.noreply.github.com'
git config advice.addIgnoredFile false
# docs:
BRANCH=unit-tests-installed-dependencies
git fetch origin
git checkout -B "$BRANCH" "origin/$BRANCH"
# runs-on: ubuntu-latest
# steps:
# - uses: actions/checkout@v3
# - name: Setup Python
# uses: actions/setup-python@v4
# with:
# python-version: '3.8'
# - name: Compile documentation
# run: |
# pip install -e .
# python -m pip install sphinx sphinx_rtd_theme
# cd docs
# make html
# - name: Deploy to GitHub pages
# if: ${{ github.ref == 'refs/heads/main' }}
# uses: JamesIves/github-pages-deploy-action@3.6.2
# with:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# BRANCH: gh-pages
# FOLDER: docs/_build/html
# CLEAN: true
pip freeze > installed_all_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
python test/check_dependency.py > installed_first_tier_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
git add installed_*dependencies*.txt
mv coverage.xml ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
git add -f ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
git commit -m "Update installed dependencies for Python ${{ matrix.python-version }} on ${{ matrix.os }}" || exit 0
git push origin "$BRANCH" --force

24
.gitignore vendored

@@ -60,6 +60,7 @@ coverage.xml
.hypothesis/
.pytest_cache/
cover/
junit
# Translations
*.mo
@@ -163,5 +164,28 @@ output/
flaml/tune/spark/mylearner.py
*.pkl
data/
benchmark/pmlb/csv_datasets
benchmark/*.csv
checkpoints/
test/default
test/housing.json
test/nlp/default/transformer_ms/seq-classification.json
flaml/fabric/fanova/*fanova.c
# local config files
*.config.local
local_debug/
patch.diff
# Test things
notebook/lightning_logs/
lightning_logs/
flaml/autogen/extensions/tmp/
test/autogen/my_tmp/
catboost_*
# Internal configs
.pypirc


@@ -22,10 +22,28 @@ repos:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: no-commit-to-branch
- repo: https://github.com/asottile/pyupgrade
rev: v2.31.1
hooks:
- id: pyupgrade
args: [--py38-plus]
name: Upgrade code
- repo: https://github.com/psf/black
rev: 23.3.0
hooks:
- id: black
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.22
hooks:
- id: mdformat
additional_dependencies:
- mdformat-gfm
- mdformat-black
- mdformat_frontmatter
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.0.261
hooks:


@@ -1,5 +1,5 @@
# basic setup
FROM python:3.7
FROM mcr.microsoft.com/devcontainers/python:3.10
RUN apt-get update && apt-get -y update
RUN apt-get install -y sudo git npm

371
NOTICE.md

@@ -1,221 +1,222 @@
NOTICES
# NOTICES
This repository incorporates material as listed below or described in the code.
#
## Component. Ray.
Code in tune/[analysis.py, sample.py, trial.py, result.py],
searcher/[suggestion.py, variant_generator.py], and scheduler/trial_scheduler.py is adapted from
https://github.com/ray-project/ray/blob/master/python/ray/tune/
## Open Source License/Copyright Notice.
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
1. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
1. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
1. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "{}"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright {yyyy} {name of copyright owner}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
--------------------------------------------------------------------------------
Code in python/ray/rllib/{evolution_strategies, dqn} adapted from
https://github.com/openai (MIT License)
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
--------------------------------------------------------------------------------
Code in python/ray/rllib/impala/vtrace.py from
https://github.com/deepmind/scalable_agent
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
--------------------------------------------------------------------------------
Code in python/ray/rllib/ars is adapted from https://github.com/modestyachts/ARS
Copyright (c) 2018, ARS contributors (Horia Mania, Aurelia Guy, Benjamin Recht)
Redistribution and use of ARS in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
------------------
Code in python/ray/_private/prometheus_exporter.py is adapted from https://github.com/census-instrumentation/opencensus-python/blob/master/contrib/opencensus-ext-prometheus/opencensus/ext/prometheus/stats_exporter/__init__.py

--------------------------------------------------------------------------------

[![PyPI version](https://badge.fury.io/py/FLAML.svg)](https://badge.fury.io/py/FLAML)
![Conda version](https://img.shields.io/conda/vn/conda-forge/flaml)
[![Build](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml/badge.svg)](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/FLAML)](https://pypi.org/project/FLAML/)
[![Downloads](https://pepy.tech/badge/flaml)](https://pepy.tech/project/flaml)
[![](https://img.shields.io/discord/1025786666260111483?logo=discord&style=flat)](https://discord.gg/Cppx2vSPVP)
<!-- [![Join the chat at https://gitter.im/FLAMLer/community](https://badges.gitter.im/FLAMLer/community.svg)](https://gitter.im/FLAMLer/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -->
# A Fast Library for Automated Machine Learning & Tuning
<br>
</p>
:fire: FLAML is highlighted in OpenAI's [cookbook](https://github.com/openai/openai-cookbook#related-resources-from-around-the-web).
:fire: FLAML supports AutoML and Hyperparameter Tuning in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/automated-machine-learning-fabric). In addition, we've introduced Python 3.11 and 3.12 support, along with a range of new estimators, and comprehensive integration with MLflow—thanks to contributions from the Microsoft Fabric product team.
:fire: Heads-up: [AutoGen](https://microsoft.github.io/autogen/) has moved to a dedicated [GitHub repository](https://github.com/microsoft/autogen). FLAML no longer includes the `autogen` module—please use AutoGen directly.
## What is FLAML
FLAML is a lightweight Python library for efficient automation of machine
learning and AI operations. It automates workflows based on large language models,
machine learning models, and more, and optimizes their performance.
- FLAML enables economical automation and tuning for ML/AI workflows, including model selection and hyperparameter optimization under resource constraints.
- For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend, offering a smooth range of customization levels.
- It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
FLAML is powered by a series of [research studies](https://microsoft.github.io/FLAML/docs/Research/) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.
FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET.
## Installation
The latest version of FLAML requires **Python >= 3.10 and < 3.14**. While other Python versions may work for core components, full model support is not guaranteed. FLAML can be installed via `pip`:
```bash
pip install flaml
```
Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`automl`](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML) module.
```bash
pip install "flaml[automl]"
```
Find more options in [Installation](https://microsoft.github.io/FLAML/docs/Installation).
## Quickstart
- With three lines of code, you can start using this economical and fast
AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).
```python
from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
```
- You can restrict the learners and use FLAML as a fast hyperparameter tuning
tool for XGBoost, LightGBM, Random Forest etc. or a [customized learner](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#estimator-and-search-space).
```python
automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
```
- You can also run generic hyperparameter tuning for a [custom function](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function).
```python
from flaml import tune
tune.run(
evaluation_function, config={}, low_cost_partial_config={}, time_budget_s=3600
)
```
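As a rough illustration of what `tune.run` automates, here is a hypothetical `evaluation_function` (the function body and metric key are assumptions for illustration, not FLAML's API contract) optimized by a naive random search in plain Python:

```python
import random

# Hypothetical objective in the shape a tuner expects:
# it takes a config dict and returns a metric dict.
def evaluation_function(config):
    x = config["x"]
    # A simple quadratic with its maximum at x = 3.
    return {"score": -((x - 3) ** 2)}

# Naive random search over x in [0, 10] -- a stand-in for the
# cost-aware search strategies that tune.run applies.
random.seed(0)
best = max(
    (evaluation_function({"x": random.uniform(0, 10)}) for _ in range(200)),
    key=lambda r: r["score"],
)
assert best["score"] > -0.25  # some sample landed near x = 3
```

FLAML's searchers (e.g. CFO, BlendSearch) replace this blind loop with economical search that accounts for evaluation cost and budgets.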
- [Zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.
```python
from flaml.default import LGBMRegressor
```
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Contributors Wall
<a href="https://github.com/microsoft/flaml/graphs/contributors">
<img src="https://contrib.rocks/image?repo=microsoft/flaml&max=204" />
</a>

--------------------------------------------------------------------------------

Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](<https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)>), please report it to us as described below.
## Reporting Security Issues
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
- Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
- Full paths of source file(s) related to the manifestation of the issue
- The location of the affected source code (tag/branch/commit or direct URL)
- Any special configuration required to reproduce the issue
- Step-by-step instructions to reproduce the issue
- Proof-of-concept or exploit code (if possible)
- Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.

--------------------------------------------------------------------------------

import logging
import warnings

try:
    from flaml.automl import AutoML, logger_formatter

    has_automl = True
except ImportError:
    has_automl = False

from flaml.onlineml.autovw import AutoVW
from flaml.tune.searcher import CFO, FLOW2, BlendSearch, BlendSearchTuner, RandomSearch
from flaml.version import __version__

# Set the root logger.
logger = logging.getLogger(__name__)
if logger.level == logging.NOTSET:
    logger.setLevel(logging.INFO)

if not has_automl:
    warnings.warn("flaml.automl is not available. Please install flaml[automl] to enable AutoML functionalities.")
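The guarded import above is a standard optional-dependency pattern; a generic, self-contained sketch of the same idea (the module name below is deliberately fake, standing in for `flaml.automl`):

```python
import warnings

# Probe for an optional dependency; fall back gracefully when the
# corresponding extra is not installed (fake module name for illustration).
try:
    import flaml_fake_optional_extra  # noqa: F401 -- stand-in for flaml.automl
    has_extra = True
except ImportError:
    has_extra = False

if not has_extra:
    warnings.warn("optional extra is not available; install it to enable the feature.")

print(has_extra)  # False when the extra is missing
```

This lets the top-level package import cleanly while features backed by heavy extras stay disabled until installed.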

--------------------------------------------------------------------------------

import warnings

from .agentchat import *
from .code_utils import DEFAULT_MODEL, FAST_MODEL
from .oai import *

warnings.warn(
    "The `flaml.autogen` module is deprecated and will be removed in a future release. "
    "Please refer to `https://github.com/microsoft/autogen` for latest usage.",
    DeprecationWarning,
    stacklevel=2,
)
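Note that `DeprecationWarning` is filtered out by default in many contexts; a small sketch of how a caller can surface a warning like the one above (the helper function name is hypothetical):

```python
import warnings

def emit_deprecation_notice():
    # Same pattern as the module-level warning above.
    warnings.warn(
        "this module is deprecated and will be removed in a future release.",
        DeprecationWarning,
        stacklevel=2,
    )

# Record warnings explicitly so the (normally hidden) notice is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    emit_deprecation_notice()

print(caught[0].category is DeprecationWarning)  # True
```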

--------------------------------------------------------------------------------

from .agent import Agent
from .responsive_agent import ResponsiveAgent
from .assistant_agent import AssistantAgent
from .conversable_agent import ConversableAgent
from .groupchat import GroupChat, GroupChatManager
from .user_proxy_agent import UserProxyAgent
__all__ = [
"Agent",
"ResponsiveAgent",
"ConversableAgent",
"AssistantAgent",
"UserProxyAgent",
"GroupChat",

--------------------------------------------------------------------------------

class Agent:
return self._name
def send(self, message: Union[Dict, str], recipient: "Agent", request_reply: Optional[bool] = None):
"""(Abstract method) Send a message to another agent."""
async def a_send(self, message: Union[Dict, str], recipient: "Agent", request_reply: Optional[bool] = None):
"""(Abstract async method) Send a message to another agent."""
def receive(self, message: Union[Dict, str], sender: "Agent", request_reply: Optional[bool] = None):
"""(Abstract method) Receive a message from another agent."""
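The abstract `send`/`receive` surface excerpted above can be sketched with `abc`; the `EchoAgent` subclass is purely hypothetical, added only to show how the two methods pair up:

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Union

class Agent(ABC):
    """Minimal sketch of the abstract agent interface shown above."""

    def __init__(self, name: str):
        self._name = name

    @property
    def name(self) -> str:
        return self._name

    @abstractmethod
    def send(self, message: Union[Dict, str], recipient: "Agent", request_reply: Optional[bool] = None):
        """(Abstract method) Send a message to another agent."""

    @abstractmethod
    def receive(self, message: Union[Dict, str], sender: "Agent", request_reply: Optional[bool] = None):
        """(Abstract method) Receive a message from another agent."""

# Hypothetical concrete subclass for illustration only.
class EchoAgent(Agent):
    def __init__(self, name):
        super().__init__(name)
        self.inbox = []

    def send(self, message, recipient, request_reply=None):
        recipient.receive(message, self, request_reply)

    def receive(self, message, sender, request_reply=None):
        self.inbox.append((sender.name, message))

a, b = EchoAgent("a"), EchoAgent("b")
a.send("hello", b)
print(b.inbox)  # [('a', 'hello')]
```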

--------------------------------------------------------------------------------

from typing import Callable, Dict, Optional, Union
from .conversable_agent import ConversableAgent
class AssistantAgent(ConversableAgent):
"""(In preview) Assistant agent, designed to solve tasks with LLM.
AssistantAgent is a subclass of ConversableAgent configured with a default system message.
The default system message is designed to solve tasks with LLM,
including suggesting Python code blocks and debugging.
`human_input_mode` defaults to "NEVER"
and `code_execution_config` defaults to False.
This agent doesn't execute code by default and expects the user to execute the code.
"""
DEFAULT_SYSTEM_MESSAGE = """You are a helpful AI assistant.
Solve tasks using your coding and language skills.
In the following cases, suggest Python code (in a Python coding block) or shell script (in an sh coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use the 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
Reply "TERMINATE" in the end when everything is done.
@@ -35,24 +36,24 @@ Reply "TERMINATE" in the end when everything is done.
max_consecutive_auto_reply: Optional[int] = None,
human_input_mode: Optional[str] = "NEVER",
code_execution_config: Optional[Union[Dict, bool]] = False,
**kwargs,
**kwargs: Dict,
):
"""
Args:
name (str): agent name.
system_message (str): system message for the ChatCompletion inference.
Please override this attribute if you want to reprogram the agent.
llm_config (dict): llm inference configuration.
Please refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
name (str): Agent name.
system_message (Optional[str]): System message for the ChatCompletion inference.
Override this attribute if you want to reprogram the agent.
llm_config (Optional[Union[Dict, bool]]): LLM inference configuration.
Refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
for available options.
is_termination_msg (function): a function that takes a message in the form of a dictionary
is_termination_msg (Optional[Callable[[Dict], bool]]): A function that takes a message in the form of a dictionary
and returns a boolean value indicating if this received message is a termination message.
The dict can contain the following keys: "content", "role", "name", "function_call".
max_consecutive_auto_reply (int): the maximum number of consecutive auto replies.
default to None (no limit provided, class attribute MAX_CONSECUTIVE_AUTO_REPLY will be used as the limit in this case).
max_consecutive_auto_reply (Optional[int]): The maximum number of consecutive auto replies.
Defaults to None (no limit provided, class attribute MAX_CONSECUTIVE_AUTO_REPLY will be used as the limit in this case).
The limit only plays a role when human_input_mode is not "ALWAYS".
**kwargs (dict): Please refer to other kwargs in
[ResponsiveAgent](responsive_agent#__init__).
**kwargs (Dict): Additional keyword arguments. Refer to other kwargs in
[ConversableAgent](conversable_agent#__init__).
"""
super().__init__(
name,
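The assistant_agent diff above keeps the pattern of a subclass that only overrides defaults (`human_input_mode="NEVER"`, `code_execution_config=False`) and forwards everything else to the parent. A minimal sketch of that pattern — class names here are illustrative stand-ins, not the real `flaml.autogen` classes:

```python
class BaseAgentSketch:
    """Stand-in for a generic conversable agent with permissive defaults."""

    def __init__(self, name, human_input_mode="ALWAYS", code_execution_config=True):
        self.name = name
        self.human_input_mode = human_input_mode
        self.code_execution_config = code_execution_config


class AssistantSketch(BaseAgentSketch):
    """Mirrors the diff: flip the defaults, pass extra kwargs through."""

    def __init__(self, name, human_input_mode="NEVER", code_execution_config=False, **kwargs):
        super().__init__(
            name,
            human_input_mode=human_input_mode,
            code_execution_config=code_execution_config,
            **kwargs,
        )


assistant = AssistantSketch("helper")
```

The subclass carries no behavior of its own; callers can still override either default explicitly.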


@@ -1,14 +1,14 @@
import re
import os
from pydantic import BaseModel, Extra, root_validator
from typing import Any, Callable, Dict, List, Optional, Union
import re
from time import sleep
from typing import Any, Callable, Dict, List, Optional, Union
from pydantic import BaseModel, Extra, root_validator
from flaml.autogen.agentchat import Agent, UserProxyAgent
from flaml.autogen.code_utils import UNKNOWN, extract_code, execute_code, infer_lang
from flaml.autogen.code_utils import UNKNOWN, execute_code, extract_code, infer_lang
from flaml.autogen.math_utils import get_answer
PROMPTS = {
# default
"default": """Let's use Python to solve a math problem.
@@ -156,7 +156,7 @@ class MathUserProxyAgent(UserProxyAgent):
when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True.
default_auto_reply (str or dict or None): the default auto reply message when no code execution or llm based reply is generated.
max_invalid_q_per_step (int): (ADDED) the maximum number of invalid queries per step.
**kwargs (dict): other kwargs in [UserProxyAgent](user_proxy_agent#__init__).
**kwargs (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).
"""
super().__init__(
name=name,
@@ -165,7 +165,7 @@ class MathUserProxyAgent(UserProxyAgent):
default_auto_reply=default_auto_reply,
**kwargs,
)
self.register_auto_reply([Agent, None], MathUserProxyAgent._generate_math_reply, 1)
self.register_reply([Agent, None], MathUserProxyAgent._generate_math_reply, 1)
# fixed var
self._max_invalid_q_per_step = max_invalid_q_per_step


@@ -1,6 +1,7 @@
from typing import Any, Callable, Dict, List, Optional, Tuple, Union
from flaml.autogen.agentchat.agent import Agent
from flaml.autogen.agentchat.assistant_agent import AssistantAgent
from typing import Callable, Dict, Optional, Union, List, Tuple, Any
class RetrieveAssistantAgent(AssistantAgent):
@@ -16,7 +17,7 @@ class RetrieveAssistantAgent(AssistantAgent):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.register_auto_reply(Agent, RetrieveAssistantAgent._generate_retrieve_assistant_reply)
self.register_reply(Agent, RetrieveAssistantAgent._generate_retrieve_assistant_reply)
def _generate_retrieve_assistant_reply(
self,


@@ -1,12 +1,13 @@
import chromadb
from flaml.autogen.agentchat.agent import Agent
from flaml.autogen.agentchat import UserProxyAgent
from flaml.autogen.retrieve_utils import create_vector_db_from_dir, query_vector_db, num_tokens_from_text
from flaml.autogen.code_utils import extract_code
from typing import Any, Callable, Dict, List, Optional, Tuple, Union
from typing import Callable, Dict, Optional, Union, List, Tuple, Any
import chromadb
from IPython import get_ipython
from flaml.autogen.agentchat import UserProxyAgent
from flaml.autogen.agentchat.agent import Agent
from flaml.autogen.code_utils import extract_code
from flaml.autogen.retrieve_utils import create_vector_db_from_dir, num_tokens_from_text, query_vector_db
try:
from termcolor import colored
except ImportError:
@@ -122,7 +123,7 @@ class RetrieveUserProxyAgent(UserProxyAgent):
can be found at `https://www.sbert.net/docs/pretrained_models.html`. The default model is a
fast model. If you want to use a high performance model, `all-mpnet-base-v2` is recommended.
- customized_prompt (Optional, str): the customized prompt for the retrieve chat. Default is None.
**kwargs (dict): other kwargs in [UserProxyAgent](user_proxy_agent#__init__).
**kwargs (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).
"""
super().__init__(
name=name,
@@ -148,7 +149,7 @@ class RetrieveUserProxyAgent(UserProxyAgent):
self._ipython = get_ipython()
self._doc_idx = -1 # the index of the current used doc
self._results = {} # the results of the current query
self.register_auto_reply(Agent, RetrieveUserProxyAgent._generate_retrieve_user_reply)
self.register_reply(Agent, RetrieveUserProxyAgent._generate_retrieve_user_reply)
@staticmethod
def get_max_tokens(model="gpt-3.5-turbo"):


@@ -1,10 +1,10 @@
import asyncio
from collections import defaultdict
import copy
import json
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Tuple, Type, Union
from flaml.autogen import oai
from .agent import Agent
from flaml.autogen.code_utils import (
DEFAULT_MODEL,
UNKNOWN,
@@ -13,6 +13,8 @@ from flaml.autogen.code_utils import (
infer_lang,
)
from .agent import Agent
try:
from termcolor import colored
except ImportError:
@@ -21,11 +23,11 @@ except ImportError:
return x
class ResponsiveAgent(Agent):
"""(Experimental) A class for generic responsive agents which can be configured as assistant or user proxy.
class ConversableAgent(Agent):
"""(In preview) A class for generic conversable agents which can be configured as assistant or user proxy.
After receiving each message, the agent will send a reply to the sender unless the msg is a termination msg.
For example, AssistantAgent and UserProxyAgent are subclasses of ResponsiveAgent,
For example, AssistantAgent and UserProxyAgent are subclasses of this class,
configured with different default settings.
To modify auto reply, override `generate_reply` method.
@@ -119,12 +121,12 @@ class ResponsiveAgent(Agent):
self._default_auto_reply = default_auto_reply
self._reply_func_list = []
self.reply_at_receive = defaultdict(bool)
self.register_auto_reply([Agent, None], ResponsiveAgent.generate_oai_reply)
self.register_auto_reply([Agent, None], ResponsiveAgent.generate_code_execution_reply)
self.register_auto_reply([Agent, None], ResponsiveAgent.generate_function_call_reply)
self.register_auto_reply([Agent, None], ResponsiveAgent.check_termination_and_human_reply)
self.register_reply([Agent, None], ConversableAgent.generate_oai_reply)
self.register_reply([Agent, None], ConversableAgent.generate_code_execution_reply)
self.register_reply([Agent, None], ConversableAgent.generate_function_call_reply)
self.register_reply([Agent, None], ConversableAgent.check_termination_and_human_reply)
def register_auto_reply(
def register_reply(
self,
trigger: Union[Type[Agent], str, Agent, Callable[[Agent], bool], List],
reply_func: Callable,
@@ -151,7 +153,7 @@ class ResponsiveAgent(Agent):
The function takes a recipient agent, a list of messages, a sender agent and a config as input and returns a reply message.
```python
def reply_func(
recipient: ResponsiveAgent,
recipient: ConversableAgent,
messages: Optional[List[Dict]] = None,
sender: Optional[Agent] = None,
config: Optional[Any] = None,
@@ -499,7 +501,7 @@ class ResponsiveAgent(Agent):
def initiate_chat(
self,
recipient: "ResponsiveAgent",
recipient: "ConversableAgent",
clear_history: Optional[bool] = True,
silent: Optional[bool] = False,
**context,
@@ -522,7 +524,7 @@ class ResponsiveAgent(Agent):
async def a_initiate_chat(
self,
recipient: "ResponsiveAgent",
recipient: "ConversableAgent",
clear_history: Optional[bool] = True,
silent: Optional[bool] = False,
**context,
@@ -611,7 +613,7 @@ class ResponsiveAgent(Agent):
if messages is None:
messages = self._oai_messages[sender]
last_n_messages = code_execution_config.pop("last_n_messages", 1)
for i in range(last_n_messages):
for i in range(min(len(messages), last_n_messages)):
message = messages[-(i + 1)]
code_blocks = extract_code(message["content"])
if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
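The conversable_agent hunks above rename `register_auto_reply` to `register_reply` and bound the code-execution scan with `min(len(messages), last_n_messages)`. A stdlib-only sketch of both ideas, assuming (as the registration order in the diff suggests) that reply functions registered later are tried first; `MiniAgent` and its methods are hypothetical stand-ins, not the real `ConversableAgent` API:

```python
class MiniAgent:
    """Hypothetical stand-in for the reply-registration pattern."""

    def __init__(self):
        self._reply_func_list = []

    def register_reply(self, reply_func):
        # Later registrations take precedence: insert at the front.
        self._reply_func_list.insert(0, reply_func)

    def generate_reply(self, messages):
        # Each func returns (final, reply); the first final reply wins.
        for func in self._reply_func_list:
            final, reply = func(messages)
            if final:
                return reply
        return None


def recent_messages(messages, last_n_messages=1):
    # Guard against scanning more messages than exist, as in the diff's fix.
    return [messages[-(i + 1)] for i in range(min(len(messages), last_n_messages))]


agent = MiniAgent()
agent.register_reply(lambda msgs: (True, "oai reply"))
agent.register_reply(lambda msgs: (False, None))  # tried first, passes through
```

Without the `min(...)` guard, `messages[-(i + 1)]` would raise `IndexError` whenever `last_n_messages` exceeds the history length.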


@@ -1,8 +1,9 @@
from dataclasses import dataclass
import sys
from dataclasses import dataclass
from typing import Dict, List, Optional, Union
from .agent import Agent
from .responsive_agent import ResponsiveAgent
from .conversable_agent import ConversableAgent
@dataclass
@@ -39,7 +40,7 @@ class GroupChat:
Read the following conversation.
Then select the next role from {self.agent_names} to play. Only return the role."""
def select_speaker(self, last_speaker: Agent, selector: ResponsiveAgent):
def select_speaker(self, last_speaker: Agent, selector: ConversableAgent):
"""Select the next speaker."""
selector.update_system_message(self.select_speaker_msg())
final, name = selector.generate_oai_reply(
@@ -63,7 +64,7 @@ Then select the next role from {self.agent_names} to play. Only return the role.
return "\n".join([f"{agent.name}: {agent.system_message}" for agent in self.agents])
class GroupChatManager(ResponsiveAgent):
class GroupChatManager(ConversableAgent):
"""(In preview) A chat manager agent that can manage a group chat of multiple agents."""
def __init__(
@@ -84,7 +85,7 @@ class GroupChatManager(ResponsiveAgent):
system_message=system_message,
**kwargs,
)
self.register_auto_reply(Agent, GroupChatManager.run_chat, config=groupchat, reset_config=GroupChat.reset)
self.register_reply(Agent, GroupChatManager.run_chat, config=groupchat, reset_config=GroupChat.reset)
# self._random = random.Random(seed)
def run_chat(
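`select_speaker` above builds a selection prompt from the agent names and asks the selector LLM to return a role. A sketch of the surrounding plumbing, where `resolve_speaker` is a hypothetical name-matching helper (the real class delegates the choice to `generate_oai_reply`):

```python
def select_speaker_msg(agent_names):
    # Mirrors the prompt shown in the diff.
    return (
        "Read the following conversation.\n"
        f"Then select the next role from {agent_names} to play. Only return the role."
    )


def resolve_speaker(reply_text, agent_names):
    # Hypothetical helper: map the model's free-text reply to a known role name.
    for name in agent_names:
        if name in reply_text:
            return name
    return None


names = ["Engineer", "Critic"]
prompt = select_speaker_msg(names)
```

The name-matching fallback matters because an LLM reply may wrap the role in extra words ("I pick Critic") rather than returning it verbatim.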


@@ -1,14 +1,15 @@
from .responsive_agent import ResponsiveAgent
from typing import Callable, Dict, Optional, Union
from .conversable_agent import ConversableAgent
class UserProxyAgent(ResponsiveAgent):
class UserProxyAgent(ConversableAgent):
"""(In preview) A proxy agent for the user, that can execute code and provide feedback to the other agents.
UserProxyAgent is a subclass of ResponsiveAgent configured with `human_input_mode` to ALWAYS
UserProxyAgent is a subclass of ConversableAgent configured with `human_input_mode` to ALWAYS
and `llm_config` to False. By default, the agent will prompt for human input every time a message is received.
Code execution is enabled by default. LLM-based auto reply is disabled by default.
To modify auto reply, register a method with (`register_auto_reply`)[responsive_agent#register_auto_reply].
To modify auto reply, register a method with (`register_reply`)[conversable_agent#register_reply].
To modify the way to get human input, override `get_human_input` method.
To modify the way to execute code blocks, single code block, or function call, override `execute_code_blocks`,
`run_code`, and `execute_function` methods respectively.


@@ -1,13 +1,14 @@
import logging
import os
import pathlib
import re
import signal
import subprocess
import sys
import os
import pathlib
from typing import List, Dict, Tuple, Optional, Union, Callable
import re
import time
from hashlib import md5
import logging
from typing import Callable, Dict, List, Optional, Tuple, Union
from flaml.autogen import oai
try:
@@ -124,7 +125,7 @@ def improve_function(file_name, func_name, objective, **config):
"""(work in progress) Improve the function to achieve the objective."""
params = {**_IMPROVE_FUNCTION_CONFIG, **config}
# read the entire file into a str
with open(file_name, "r") as f:
with open(file_name) as f:
file_string = f.read()
response = oai.Completion.create(
{"func_name": func_name, "objective": objective, "file_string": file_string}, **params
@@ -157,7 +158,7 @@ def improve_code(files, objective, suggest_only=True, **config):
code = ""
for file_name in files:
# read the entire file into a string
with open(file_name, "r") as f:
with open(file_name) as f:
file_string = f.read()
code += f"""{file_name}:
{file_string}


@@ -1,5 +1,6 @@
from typing import Optional
from flaml.autogen import oai, DEFAULT_MODEL
from flaml.autogen import DEFAULT_MODEL, oai
_MATH_PROMPT = "{problem} Solve the problem carefully. Simplify your answer as much as possible. Put the final answer in \\boxed{{}}."
_MATH_CONFIG = {
@@ -129,7 +130,7 @@ def _fix_a_slash_b(string: str) -> str:
try:
a = int(a_str)
b = int(b_str)
assert string == "{}/{}".format(a, b)
assert string == f"{a}/{b}"
new_string = "\\frac{" + str(a) + "}{" + str(b) + "}"
return new_string
except Exception:


@@ -1,10 +1,10 @@
from flaml.autogen.oai.completion import Completion, ChatCompletion
from flaml.autogen.oai.completion import ChatCompletion, Completion
from flaml.autogen.oai.openai_utils import (
get_config_list,
config_list_from_json,
config_list_from_models,
config_list_gpt4_gpt35,
config_list_openai_aoai,
config_list_from_models,
config_list_from_json,
get_config_list,
)
__all__ = [


@@ -1,28 +1,31 @@
from time import sleep
import logging
import time
from typing import List, Optional, Dict, Callable, Union
import sys
import shutil
import sys
import time
from time import sleep
from typing import Callable, Dict, List, Optional, Union
import numpy as np
from flaml import tune, BlendSearch
from flaml.tune.space import is_constant
from flaml import BlendSearch, tune
from flaml.automl.logger import logger_formatter
from flaml.tune.space import is_constant
from .openai_utils import get_key
try:
import openai
from openai.error import (
ServiceUnavailableError,
RateLimitError,
APIError,
InvalidRequestError,
APIConnectionError,
Timeout,
AuthenticationError,
)
from openai import Completion as openai_Completion
import diskcache
import openai
from openai import Completion as openai_Completion
from openai.error import (
APIConnectionError,
APIError,
AuthenticationError,
InvalidRequestError,
RateLimitError,
ServiceUnavailableError,
Timeout,
)
ERROR = None
except ImportError:
@@ -697,7 +700,7 @@ class Completion(openai_Completion):
E.g., `prompt="Complete the following sentence: {prefix}, context={"prefix": "Today I feel"}`.
The actual prompt will be:
"Complete the following sentence: Today I feel".
More examples can be found at [templating](/docs/Use-Cases/Autogen#templating).
More examples can be found at [templating](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#templating).
use_cache (bool, Optional): Whether to use cached responses.
config_list (List, Optional): List of configurations for the completion to try.
The first one that does not raise an error will be used.
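The templating described in the `Completion.create` docstring above is plain `str.format` substitution of context fields into the prompt template. A minimal sketch (`apply_template` is an illustrative name, not the library API):

```python
def apply_template(prompt: str, context: dict) -> str:
    # Substitute context fields into the {field}-style prompt template.
    return prompt.format(**context)


filled = apply_template(
    "Complete the following sentence: {prefix}",
    {"prefix": "Today I feel"},
)
print(filled)  # Complete the following sentence: Today I feel
```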


@@ -1,7 +1,7 @@
import os
import json
from typing import List, Optional, Dict, Set, Union
import logging
import os
from typing import Dict, List, Optional, Set, Union
NON_CACHE_KEY = ["api_key", "api_base", "api_type", "api_version"]


@@ -1,13 +1,14 @@
from typing import List, Union, Dict, Tuple
import os
import requests
from urllib.parse import urlparse
import glob
import tiktoken
import chromadb
from chromadb.api import API
import chromadb.utils.embedding_functions as ef
import logging
import os
from typing import Dict, List, Tuple, Union
from urllib.parse import urlparse
import chromadb
import chromadb.utils.embedding_functions as ef
import requests
import tiktoken
from chromadb.api import API
logger = logging.getLogger(__name__)
TEXT_FORMATS = ["txt", "json", "csv", "tsv", "md", "html", "htm", "rtf", "rst", "jsonl", "log", "xml", "yaml", "yml"]
@@ -125,7 +126,7 @@ def split_files_to_chunks(
"""Split a list of files into chunks of max_tokens."""
chunks = []
for file in files:
with open(file, "r") as f:
with open(file) as f:
text = f.read()
chunks += split_text_to_chunks(text, max_tokens, chunk_mode, must_break_at_empty_line)
return chunks


@@ -1,5 +1,9 @@
from flaml.automl.automl import AutoML, size
from flaml.automl.logger import logger_formatter
from flaml.automl.state import SearchState, AutoMLState
__all__ = ["AutoML", "AutoMLState", "SearchState", "logger_formatter", "size"]
try:
from flaml.automl.automl import AutoML, size
from flaml.automl.state import AutoMLState, SearchState
__all__ = ["AutoML", "AutoMLState", "SearchState", "logger_formatter", "size"]
except ImportError:
__all__ = ["logger_formatter"]

File diff suppressed because it is too large.


@@ -0,0 +1 @@
from .histgb import HistGradientBoostingEstimator


@@ -0,0 +1,75 @@
try:
from sklearn.ensemble import HistGradientBoostingClassifier, HistGradientBoostingRegressor
except ImportError as e:
print(f"scikit-learn is required for HistGradientBoostingEstimator. Please install it; error: {e}")
from flaml import tune
from flaml.automl.model import SKLearnEstimator
from flaml.automl.task import Task
class HistGradientBoostingEstimator(SKLearnEstimator):
"""The class for tuning Histogram Gradient Boosting."""
ITER_HP = "max_iter"
HAS_CALLBACK = False
DEFAULT_ITER = 100
@classmethod
def search_space(cls, data_size: int, task, **params) -> dict:
upper = max(5, min(32768, int(data_size[0]))) # upper must be larger than lower
return {
"n_estimators": {
"domain": tune.lograndint(lower=4, upper=upper),
"init_value": 4,
"low_cost_init_value": 4,
},
"max_leaves": {
"domain": tune.lograndint(lower=4, upper=upper),
"init_value": 4,
"low_cost_init_value": 4,
},
"min_samples_leaf": {
"domain": tune.lograndint(lower=2, upper=2**7 + 1),
"init_value": 20,
},
"learning_rate": {
"domain": tune.loguniform(lower=1 / 1024, upper=1.0),
"init_value": 0.1,
},
"log_max_bin": { # log transformed with base 2, <= 256
"domain": tune.lograndint(lower=3, upper=9),
"init_value": 8,
},
"l2_regularization": {
"domain": tune.loguniform(lower=1 / 1024, upper=1024),
"init_value": 1.0,
},
}
def config2params(self, config: dict) -> dict:
params = super().config2params(config)
if "log_max_bin" in params:
params["max_bins"] = (1 << params.pop("log_max_bin")) - 1
if "max_leaves" in params:
params["max_leaf_nodes"] = params.get("max_leaf_nodes", params.pop("max_leaves"))
if "n_estimators" in params:
params["max_iter"] = params.get("max_iter", params.pop("n_estimators"))
if "random_state" not in params:
params["random_state"] = 24092023
if "n_jobs" in params:
params.pop("n_jobs")
return params
def __init__(
self,
task: Task,
**config,
):
super().__init__(task, **config)
self.params["verbose"] = 0
if self._task.is_classification():
self.estimator_class = HistGradientBoostingClassifier
else:
self.estimator_class = HistGradientBoostingRegressor
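The `config2params` transform above can be exercised in isolation. This sketch replicates just the dictionary mapping (no estimator class), showing the `log_max_bin` → `max_bins` conversion and the alias renames the estimator needs for sklearn's `HistGradientBoosting*` API:

```python
def histgb_config2params(config: dict) -> dict:
    params = dict(config)
    if "log_max_bin" in params:
        # log-transformed with base 2, minus one: log_max_bin=8 -> max_bins=255.
        params["max_bins"] = (1 << params.pop("log_max_bin")) - 1
    if "max_leaves" in params:
        params["max_leaf_nodes"] = params.get("max_leaf_nodes", params.pop("max_leaves"))
    if "n_estimators" in params:
        params["max_iter"] = params.get("max_iter", params.pop("n_estimators"))
    if "random_state" not in params:
        params["random_state"] = 24092023
    # HistGradientBoosting estimators do not accept n_jobs.
    params.pop("n_jobs", None)
    return params


out = histgb_config2params(
    {"log_max_bin": 8, "max_leaves": 10, "n_estimators": 100, "n_jobs": 4}
)
```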


@@ -2,21 +2,29 @@
# * Copyright (c) Microsoft Corporation. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
import numpy as np
from datetime import datetime
from typing import TYPE_CHECKING, Union
import json
import os
import random
import re
import uuid
from datetime import datetime, timedelta
from decimal import ROUND_HALF_UP, Decimal
from typing import TYPE_CHECKING, Union
import numpy as np
from flaml.automl.spark import DataFrame, F, Series, T, pd, ps, psDataFrame, psSeries
from flaml.automl.training_log import training_log_reader
from flaml.automl.spark import ps, psDataFrame, psSeries, DataFrame, Series, pd
try:
from scipy.sparse import vstack, issparse
from scipy.sparse import issparse, vstack
except ImportError:
pass
if TYPE_CHECKING:
from flaml.automl.task import Task
TS_TIMESTAMP_COL = "ds"
TS_VALUE_COL = "y"
@@ -41,8 +49,12 @@ def load_openml_dataset(dataset_id, data_dir=None, random_state=0, dataset_forma
y_train: A series or array of labels for training data.
y_test: A series or array of labels for test data.
"""
import openml
import pickle
try:
import openml
except ImportError:
openml = None
from sklearn.model_selection import train_test_split
filename = "openml_ds" + str(dataset_id) + ".pkl"
@@ -53,15 +65,15 @@ def load_openml_dataset(dataset_id, data_dir=None, random_state=0, dataset_forma
dataset = pickle.load(f)
else:
print("download dataset from openml")
dataset = openml.datasets.get_dataset(dataset_id)
dataset = openml.datasets.get_dataset(dataset_id) if openml else None
if not os.path.exists(data_dir):
os.makedirs(data_dir)
with open(filepath, "wb") as f:
pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
print("Dataset name:", dataset.name)
print("Dataset name:", dataset.name) if dataset else None
try:
X, y, *__ = dataset.get_data(target=dataset.default_target_attribute, dataset_format=dataset_format)
except ValueError:
except (ValueError, AttributeError, TypeError):
from sklearn.datasets import fetch_openml
X, y = fetch_openml(data_id=dataset_id, return_X_y=True)
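`load_openml_dataset` above wraps the download in a pickle-file cache: load from disk if present, otherwise compute and persist. The pattern in isolation, as a generic sketch (`cached_pickle` is an illustrative name, not a FLAML helper):

```python
import os
import pickle


def cached_pickle(filepath, compute):
    """Return the pickled value at filepath if present; else compute, cache, return."""
    if os.path.exists(filepath):
        with open(filepath, "rb") as f:
            return pickle.load(f)
    value = compute()
    os.makedirs(os.path.dirname(filepath) or ".", exist_ok=True)
    with open(filepath, "wb") as f:
        pickle.dump(value, f, pickle.HIGHEST_PROTOCOL)
    return value
```

On the second call with the same path, `compute` is never invoked — which is exactly why the diff's `dataset = openml.datasets.get_dataset(...)` only runs on a cache miss.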
@@ -93,9 +105,10 @@ def load_openml_task(task_id, data_dir):
y_train: A series of labels for training data.
y_test: A series of labels for test data.
"""
import openml
import pickle
import openml
task = openml.tasks.get_task(task_id)
filename = "openml_task" + str(task_id) + ".pkl"
filepath = os.path.join(data_dir, filename)
@@ -289,7 +302,7 @@ class DataTransformer:
y = y.rename(TS_VALUE_COL)
for column in X.columns:
# sklearn\utils\validation.py needs int/float values
if X[column].dtype.name in ("object", "category"):
if X[column].dtype.name in ("object", "category", "string"):
if X[column].nunique() == 1 or X[column].nunique(dropna=True) == n - X[column].isnull().sum():
X.drop(columns=column, inplace=True)
drop = True
@@ -341,8 +354,8 @@ class DataTransformer:
drop = True
else:
drop = False
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
self.transformer = ColumnTransformer(
[
@@ -441,3 +454,343 @@ class DataTransformer:
def group_counts(groups):
_, i, c = np.unique(groups, return_counts=True, return_index=True)
return c[np.argsort(i)]
def get_random_dataframe(n_rows: int = 200, ratio_none: float = 0.1, seed: int = 42) -> DataFrame:
"""Generate a random pandas DataFrame with various data types for testing.
This function creates a DataFrame with multiple column types including:
- Timestamps
- Integers
- Floats
- Categorical values
- Booleans
- Lists (tags)
- Decimal strings
- UUIDs
- Binary data (as hex strings)
- JSON blobs
- Nullable text fields
Parameters
----------
n_rows : int, default=200
Number of rows in the generated DataFrame
ratio_none : float, default=0.1
Probability of generating None values in applicable columns
seed : int, default=42
Random seed for reproducibility
Returns
-------
pd.DataFrame
A DataFrame with 14 columns of various data types
Examples
--------
>>> df = get_random_dataframe(100, 0.05, 123)
>>> df.shape
(100, 14)
>>> df.dtypes
timestamp datetime64[ns]
id int64
score float64
status object
flag object
count object
value object
tags object
rating object
uuid object
binary object
json_blob object
category category
nullable_text object
dtype: object
"""
np.random.seed(seed)
random.seed(seed)
def random_tags():
tags = ["AI", "ML", "data", "robotics", "vision"]
return random.sample(tags, k=random.randint(1, 3)) if random.random() > ratio_none else None
def random_decimal():
return (
str(Decimal(random.uniform(1, 5)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
if random.random() > ratio_none
else None
)
def random_json_blob():
blob = {"a": random.randint(1, 10), "b": random.random()}
return json.dumps(blob) if random.random() > ratio_none else None
def random_binary():
return bytes(random.randint(0, 255) for _ in range(4)).hex() if random.random() > ratio_none else None
data = {
"timestamp": [
datetime(2020, 1, 1) + timedelta(days=np.random.randint(0, 1000)) if np.random.rand() > ratio_none else None
for _ in range(n_rows)
],
"id": range(1, n_rows + 1),
"score": np.random.uniform(0, 100, n_rows),
"status": np.random.choice(
["active", "inactive", "pending", None],
size=n_rows,
p=[(1 - ratio_none) / 3, (1 - ratio_none) / 3, (1 - ratio_none) / 3, ratio_none],
),
"flag": np.random.choice(
[True, False, None], size=n_rows, p=[(1 - ratio_none) / 2, (1 - ratio_none) / 2, ratio_none]
),
"count": [np.random.randint(0, 100) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"value": [round(np.random.normal(50, 15), 2) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"tags": [random_tags() for _ in range(n_rows)],
"rating": [random_decimal() for _ in range(n_rows)],
"uuid": [str(uuid.uuid4()) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"binary": [random_binary() for _ in range(n_rows)],
"json_blob": [random_json_blob() for _ in range(n_rows)],
"category": pd.Categorical(
np.random.choice(
["A", "B", "C", None],
size=n_rows,
p=[(1 - ratio_none) / 3, (1 - ratio_none) / 3, (1 - ratio_none) / 3, ratio_none],
)
),
"nullable_text": [random.choice(["Good", "Bad", "Average", None]) for _ in range(n_rows)],
}
return pd.DataFrame(data)
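Two details of `get_random_dataframe` worth calling out: `ROUND_HALF_UP` gives conventional decimal rounding (Python's `round()` uses banker's rounding instead), and `ratio_none` is the per-cell probability of a missing value. A stdlib-only sketch of both helpers:

```python
import random
from decimal import ROUND_HALF_UP, Decimal


def quantize_rating(value: str) -> str:
    # ROUND_HALF_UP: "2.675" -> "2.68" (banker's rounding could give 2.67).
    return str(Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


def maybe_none(value, ratio_none: float):
    # Each cell is independently replaced by None with probability ratio_none.
    return value if random.random() > ratio_none else None


random.seed(42)
nones = sum(maybe_none(1, 0.1) is None for _ in range(10_000))
```

With 10,000 draws the observed None fraction lands close to the requested 0.1.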
def auto_convert_dtypes_spark(
df: psDataFrame,
na_values: list = None,
category_threshold: float = 0.3,
convert_threshold: float = 0.6,
sample_ratio: float = 0.1,
) -> tuple[psDataFrame, dict]:
"""Automatically convert data types in a PySpark DataFrame using heuristics.
This function analyzes a sample of the DataFrame to infer appropriate data types
and applies the conversions. It handles timestamps, numeric values, booleans,
and categorical fields.
Args:
df: A PySpark DataFrame to convert.
na_values: List of strings to be considered as NA/NaN. Defaults to
['NA', 'na', 'NULL', 'null', ''].
category_threshold: Maximum ratio of unique values to total values
to consider a column categorical. Defaults to 0.3.
convert_threshold: Minimum ratio of successfully converted values required
to apply a type conversion. Defaults to 0.6.
sample_ratio: Fraction of data to sample for type inference. Defaults to 0.1.
Returns:
tuple: (The DataFrame with converted types, A dictionary mapping column names to
their inferred types as strings)
Note:
- 'category' in the schema dict is conceptual as PySpark doesn't have a true
category type like pandas
- The function uses sampling for efficiency with large datasets
"""
n_rows = df.count()
if na_values is None:
na_values = ["NA", "na", "NULL", "null", ""]
# Normalize NA-like values
for colname, coltype in df.dtypes:
if coltype == "string":
df = df.withColumn(
colname,
F.when(F.trim(F.lower(F.col(colname))).isin([v.lower() for v in na_values]), None).otherwise(
F.col(colname)
),
)
schema = {}
for colname in df.columns:
# Sample once at an appropriate ratio
sample_ratio_to_use = min(1.0, sample_ratio if n_rows * sample_ratio > 100 else 100 / n_rows)
col_sample = df.select(colname).sample(withReplacement=False, fraction=sample_ratio_to_use).dropna()
sample_count = col_sample.count()
inferred_type = "string" # Default
if col_sample.dtypes[0][1] != "string":
schema[colname] = col_sample.dtypes[0][1]
continue
if sample_count == 0:
schema[colname] = "string"
continue
# Check if timestamp
ts_col = col_sample.withColumn("parsed", F.to_timestamp(F.col(colname)))
# Check numeric
if (
col_sample.withColumn("n", F.col(colname).cast("double")).filter("n is not null").count()
>= sample_count * convert_threshold
):
# All whole numbers?
all_whole = (
col_sample.withColumn("n", F.col(colname).cast("double"))
.filter("n is not null")
.withColumn("frac", F.abs(F.col("n") % 1))
.filter("frac > 0.000001")
.count()
== 0
)
inferred_type = "int" if all_whole else "double"
# Check low-cardinality (category-like)
elif (
sample_count > 0
and col_sample.select(F.countDistinct(F.col(colname))).collect()[0][0] / sample_count <= category_threshold
):
inferred_type = "category" # Will just be string, but marked as such
# Check if timestamp
elif ts_col.filter(F.col("parsed").isNotNull()).count() >= sample_count * convert_threshold:
inferred_type = "timestamp"
schema[colname] = inferred_type
# Apply inferred schema
for colname, inferred_type in schema.items():
if inferred_type == "int":
df = df.withColumn(colname, F.col(colname).cast(T.IntegerType()))
elif inferred_type == "double":
df = df.withColumn(colname, F.col(colname).cast(T.DoubleType()))
elif inferred_type == "boolean":
df = df.withColumn(
colname,
F.when(F.lower(F.col(colname)).isin("true", "yes", "1"), True)
.when(F.lower(F.col(colname)).isin("false", "no", "0"), False)
.otherwise(None),
)
elif inferred_type == "timestamp":
df = df.withColumn(colname, F.to_timestamp(F.col(colname)))
elif inferred_type == "category":
df = df.withColumn(colname, F.col(colname).cast(T.StringType())) # Marked conceptually
# otherwise keep as string (or original type)
return df, schema
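The same heuristic can be illustrated without Spark. This stdlib-only sketch applies the inference step to one column of strings, keeping the thresholds and the check order from the function above (numeric first, then cardinality, then timestamp); `infer_column_type` is an illustrative helper, not part of FLAML, and `datetime.fromisoformat` stands in for Spark's `to_timestamp`:

```python
from datetime import datetime


def infer_column_type(values, category_threshold=0.3, convert_threshold=0.6):
    sample = [v for v in values if v is not None]
    if not sample:
        return "string"
    # Numeric check: enough values cast to float?
    numeric = []
    for v in sample:
        try:
            numeric.append(float(v))
        except ValueError:
            pass
    if len(numeric) >= len(sample) * convert_threshold:
        all_whole = all(abs(n % 1) <= 1e-6 for n in numeric)
        return "int" if all_whole else "double"
    # Low-cardinality check: few distinct values relative to sample size?
    if len(set(sample)) / len(sample) <= category_threshold:
        return "category"
    # Timestamp check: enough values parse as ISO dates?
    parsed = 0
    for v in sample:
        try:
            datetime.fromisoformat(v)
            parsed += 1
        except ValueError:
            pass
    if parsed >= len(sample) * convert_threshold:
        return "timestamp"
    return "string"
```

Note the ordering matters: a column of `"1"`/`"0"` strings is claimed by the numeric check before the cardinality check can label it a category.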
def auto_convert_dtypes_pandas(
df: DataFrame,
na_values: list = None,
category_threshold: float = 0.3,
convert_threshold: float = 0.6,
sample_ratio: float = 1.0,
) -> tuple[DataFrame, dict]:
"""Automatically convert data types in a pandas DataFrame using heuristics.
This function analyzes the DataFrame to infer appropriate data types
and applies the conversions. It handles timestamps, timedeltas, numeric values,
and categorical fields.
Args:
df: A pandas DataFrame to convert.
na_values: List of strings to be considered as NA/NaN. Defaults to
['NA', 'na', 'NULL', 'null', ''].
category_threshold: Maximum ratio of unique values to total values
to consider a column categorical. Defaults to 0.3.
convert_threshold: Minimum ratio of successfully converted values required
to apply a type conversion. Defaults to 0.6.
sample_ratio: Fraction of data to sample for type inference. Defaults to 1.0.
Returns:
tuple: (The DataFrame with converted types, A dictionary mapping column names to
their inferred types as strings)
"""
if na_values is None:
na_values = {"NA", "na", "NULL", "null", ""}
# Remove the empty string separately (handled by the regex `^\s*$`)
vals = [re.escape(v) for v in na_values if v != ""]
# Build inner alternation group
inner = "|".join(vals) if vals else ""
if inner:
pattern = re.compile(rf"^\s*(?:{inner})?\s*$")
else:
pattern = re.compile(r"^\s*$")
df_converted = df.convert_dtypes()
schema = {}
# Sample if needed (for API compatibility)
if sample_ratio < 1.0:
df = df.sample(frac=sample_ratio)
n_rows = len(df)
for col in df.columns:
series = df[col]
# Replace NA-like values if string
if series.dtype == object:
mask = series.astype(str).str.match(pattern)
series_cleaned = series.where(~mask, np.nan)
else:
series_cleaned = series
# Skip conversion for non-object dtypes; nullable boolean/string columns may still be converted
if (
not isinstance(series_cleaned.dtype, pd.BooleanDtype)
and not isinstance(series_cleaned.dtype, pd.StringDtype)
and series_cleaned.dtype != "object"
):
# Keep the original data type for non-object dtypes
df_converted[col] = series
schema[col] = str(series_cleaned.dtype)
continue
# print(f"type: {series_cleaned.dtype}, column: {series_cleaned.name}")
if not isinstance(series_cleaned.dtype, pd.BooleanDtype):
# Try numeric (int or float)
numeric = pd.to_numeric(series_cleaned, errors="coerce")
if numeric.notna().sum() >= n_rows * convert_threshold:
if (numeric.dropna() % 1 == 0).all():
try:
df_converted[col] = numeric.astype("int")  # raises if NA values remain; caught below
schema[col] = "int"
continue
except Exception:
pass
df_converted[col] = numeric.astype("double")
schema[col] = "double"
continue
# Try datetime
datetime_converted = pd.to_datetime(series_cleaned, errors="coerce")
if datetime_converted.notna().sum() >= n_rows * convert_threshold:
df_converted[col] = datetime_converted
schema[col] = "timestamp"
continue
# Try timedelta
try:
timedelta_converted = pd.to_timedelta(series_cleaned, errors="coerce")
if timedelta_converted.notna().sum() >= n_rows * convert_threshold:
df_converted[col] = timedelta_converted
schema[col] = "timedelta"
continue
except TypeError:
pass
# Try category
try:
unique_ratio = series_cleaned.nunique(dropna=True) / n_rows if n_rows > 0 else 1.0
if unique_ratio <= category_threshold:
df_converted[col] = series_cleaned.astype("category")
schema[col] = "category"
continue
except Exception:
pass
df_converted[col] = series_cleaned.astype("string")
schema[col] = "string"
return df_converted, schema
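The numeric branch of the heuristic above can be illustrated in isolation. This is a stand-alone sketch (not the library function itself): a column is treated as numeric only when at least `convert_threshold` of its values parse, and as `int` only when every parsed value is a whole number.

```python
import pandas as pd

def infer_numeric(series: pd.Series, convert_threshold: float = 0.6) -> str:
    # Coerce unparseable values to NaN, then check the success ratio.
    numeric = pd.to_numeric(series, errors="coerce")
    if numeric.notna().sum() >= len(series) * convert_threshold:
        # All whole numbers -> "int", otherwise "double".
        return "int" if (numeric.dropna() % 1 == 0).all() else "double"
    return "string"

print(infer_numeric(pd.Series(["1", "2", "3", "x"])))    # 3/4 parse -> int
print(infer_numeric(pd.Series(["1.5", "2.5", "oops"])))  # 2/3 parse -> double
print(infer_numeric(pd.Series(["a", "b", "1"])))         # 1/3 parse -> string
```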

View File

@@ -1,7 +1,37 @@
import logging
import os
class ColoredFormatter(logging.Formatter):
# ANSI escape codes for colors
COLORS = {
# logging.DEBUG: "\033[36m", # Cyan
# logging.INFO: "\033[32m", # Green
logging.WARNING: "\033[33m", # Yellow
logging.ERROR: "\033[31m", # Red
logging.CRITICAL: "\033[1;31m", # Bright Red
}
RESET = "\033[0m" # Reset to default
def __init__(self, fmt, datefmt, use_color=True):
super().__init__(fmt, datefmt)
self.use_color = use_color
def format(self, record):
formatted = super().format(record)
if self.use_color:
color = self.COLORS.get(record.levelno, "")
if color:
return f"{color}{formatted}{self.RESET}"
return formatted
logger = logging.getLogger(__name__)
logger_formatter = logging.Formatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S"
use_color = True
if os.getenv("FLAML_LOG_NO_COLOR"):
use_color = False
logger_formatter = ColoredFormatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S", use_color
)
logger.propagate = False
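The new formatter's behavior can be checked without touching FLAML's logger. Below is a minimal stand-alone copy of the same idea: wrap the base formatter's output in an ANSI code chosen by level, and pass it through unchanged when color is disabled.

```python
import logging

COLORS = {logging.WARNING: "\033[33m", logging.ERROR: "\033[31m"}  # yellow, red
RESET = "\033[0m"

class TinyColoredFormatter(logging.Formatter):
    def __init__(self, fmt, use_color=True):
        super().__init__(fmt)
        self.use_color = use_color

    def format(self, record):
        formatted = super().format(record)
        color = COLORS.get(record.levelno, "") if self.use_color else ""
        return f"{color}{formatted}{RESET}" if color else formatted

record = logging.LogRecord("demo", logging.WARNING, __file__, 1, "disk almost full", None, None)
print(TinyColoredFormatter("%(levelname)s - %(message)s").format(record))
# -> the message wrapped in yellow; with use_color=False, the plain text
```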

View File

@@ -2,30 +2,31 @@
# * Copyright (c) FLAML authors. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
import time
from typing import Union, Callable, TypeVar, Optional, Tuple
import logging
import time
from typing import Callable, Optional, Tuple, TypeVar, Union
import numpy as np
from flaml.automl.data import group_counts
from flaml.automl.task.task import Task
from flaml.automl.model import BaseEstimator, TransformersEstimator
from flaml.automl.spark import psDataFrame, psSeries, ERROR as SPARK_ERROR, Series, DataFrame
from flaml.automl.spark import ERROR as SPARK_ERROR
from flaml.automl.spark import DataFrame, Series, psDataFrame, psSeries
from flaml.automl.task.task import Task
from flaml.automl.time_series import TimeSeriesDataset
try:
from sklearn.metrics import (
mean_squared_error,
r2_score,
roc_auc_score,
accuracy_score,
mean_absolute_error,
log_loss,
average_precision_score,
f1_score,
log_loss,
mean_absolute_error,
mean_absolute_percentage_error,
mean_squared_error,
ndcg_score,
r2_score,
roc_auc_score,
)
except ImportError:
pass
@@ -33,7 +34,6 @@ except ImportError:
if SPARK_ERROR is None:
from flaml.automl.spark.metrics import spark_metric_loss_score
from flaml.automl.time_series import TimeSeriesDataset
logger = logging.getLogger(__name__)
@@ -89,6 +89,11 @@ huggingface_metric_to_mode = {
"wer": "min",
}
huggingface_submetric_to_metric = {"rouge1": "rouge", "rouge2": "rouge"}
spark_metric_name_dict = {
"Regression": ["r2", "rmse", "mse", "mae", "var"],
"Binary Classification": ["pr_auc", "roc_auc"],
"Multi-class Classification": ["accuracy", "log_loss", "f1", "micro_f1", "macro_f1"],
}
def metric_loss_score(
@@ -122,9 +127,21 @@ def metric_loss_score(
import datasets
datasets_metric_name = huggingface_submetric_to_metric.get(metric_name, metric_name.split(":")[0])
metric = datasets.load_metric(datasets_metric_name)
metric_mode = huggingface_metric_to_mode[datasets_metric_name]
# datasets>=3 removed load_metric; prefer evaluate if available
try:
import evaluate
metric = evaluate.load(datasets_metric_name, trust_remote_code=True)
except Exception:
if hasattr(datasets, "load_metric"):
metric = datasets.load_metric(datasets_metric_name, trust_remote_code=True)
else:
from datasets import load_metric as _load_metric # older datasets
metric = _load_metric(datasets_metric_name, trust_remote_code=True)
if metric_name.startswith("seqeval"):
y_processed_true = [[labels[tr] for tr in each_list] for each_list in y_processed_true]
elif metric in ("pearsonr", "spearmanr"):
@@ -294,14 +311,14 @@ def get_y_pred(estimator, X, eval_metric, task: Task):
else:
y_pred = estimator.predict(X)
if isinstance(y_pred, Series) or isinstance(y_pred, DataFrame):
if isinstance(y_pred, (Series, DataFrame)):
y_pred = y_pred.values
return y_pred
def to_numpy(x):
if isinstance(x, Series or isinstance(x, DataFrame)):
if isinstance(x, (Series, DataFrame)):
x = x.values
else:
x = np.ndarray(x)
@@ -323,7 +340,7 @@ def compute_estimator(
estimator_name: str,
eval_method: str,
eval_metric: Union[str, Callable],
best_val_loss=np.Inf,
best_val_loss=np.inf,
n_jobs: Optional[int] = 1, # some estimators of EstimatorSubclass don't accept n_jobs. Should be None in that case.
estimator_class: Optional[EstimatorSubclass] = None,
cv_score_agg_func: Optional[callable] = None,
@@ -334,6 +351,14 @@ def compute_estimator(
if fit_kwargs is None:
fit_kwargs = {}
fe_params = {}
for param, value in config_dic.items():
if param.startswith("fe."):
fe_params[param] = value
for param, value in fe_params.items():
config_dic.pop(param)
estimator_class = estimator_class or task.estimator_class_from_str(estimator_name)
estimator = estimator_class(
**config_dic,
@@ -401,12 +426,21 @@ def train_estimator(
free_mem_ratio=0,
) -> Tuple[EstimatorSubclass, float]:
start_time = time.time()
fe_params = {}
for param, value in config_dic.items():
if param.startswith("fe."):
fe_params[param] = value
for param, value in fe_params.items():
config_dic.pop(param)
estimator_class = estimator_class or task.estimator_class_from_str(estimator_name)
estimator = estimator_class(
**config_dic,
task=task,
n_jobs=n_jobs,
)
if fit_kwargs is None:
fit_kwargs = {}
@@ -552,7 +586,7 @@ def _eval_estimator(
# TODO: why are integer labels being cast to str in the first place?
if isinstance(val_pred_y, Series) or isinstance(val_pred_y, DataFrame) or isinstance(val_pred_y, np.ndarray):
if isinstance(val_pred_y, (Series, DataFrame, np.ndarray)):
test = val_pred_y if isinstance(val_pred_y, np.ndarray) else val_pred_y.values
if not np.issubdtype(test.dtype, np.number):
# some NLP models return a list
@@ -567,17 +601,27 @@ def _eval_estimator(
pred_time = (time.time() - pred_start) / num_val_rows
val_loss = metric_loss_score(
eval_metric,
y_processed_predict=val_pred_y,
y_processed_true=y_val,
labels=labels,
sample_weight=weight_val,
groups=groups_val,
)
try:
val_loss = metric_loss_score(
eval_metric,
y_processed_predict=val_pred_y,
y_processed_true=y_val,
labels=labels,
sample_weight=weight_val,
groups=groups_val,
)
except ValueError as e:
# `r2_score` and other metrics may raise a `ValueError` when a model returns `inf` or `nan` values. In this case, we set the val_loss to infinity.
val_loss = np.inf
logger.warning(f"ValueError {e} happened in `metric_loss_score`, set `val_loss` to `np.inf`")
metric_for_logging = {"pred_time": pred_time}
if log_training_metric:
train_pred_y = get_y_pred(estimator, X_train, eval_metric, task)
# For time series forecasting, X_train may be a sampled dataset whose
# test partition can be empty. Use the training partition from X_val
# (which is the dataset used to define y_train above) to keep shapes
# aligned and avoid empty prediction inputs.
X_train_for_metric = X_val.X_train if isinstance(X_val, TimeSeriesDataset) else X_train
train_pred_y = get_y_pred(estimator, X_train_for_metric, eval_metric, task)
metric_for_logging["train_loss"] = metric_loss_score(
eval_metric,
train_pred_y,

File diff suppressed because it is too large

View File

@@ -4,16 +4,15 @@ This directory contains utility functions used by AutoNLP. Currently we support
Please refer to this [link](https://microsoft.github.io/FLAML/docs/Examples/AutoML-NLP) for examples.
# Troubleshooting fine-tuning HPO for pre-trained language models
Frequent updates to transformers may cause tuning results to fluctuate. To help users quickly troubleshoot AutoNLP when a tuning failure occurs (e.g., failing to reproduce previous results), we provide the following Jupyter notebook:
* [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)
- [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)
Our findings on troubleshooting fine-tuning the Electra and RoBERTa model for the GLUE dataset can be seen in the following paper published in ACL 2021:
* [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.
- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.
```bibtex
@inproceedings{liu2021hpo,

View File

@@ -1,17 +1,18 @@
from dataclasses import dataclass
from transformers.data.data_collator import (
DataCollatorWithPadding,
DataCollatorForTokenClassification,
DataCollatorForSeq2Seq,
)
from collections import OrderedDict
from dataclasses import dataclass
from transformers.data.data_collator import (
DataCollatorForSeq2Seq,
DataCollatorForTokenClassification,
DataCollatorWithPadding,
)
from flaml.automl.task.task import (
TOKENCLASSIFICATION,
MULTICHOICECLASSIFICATION,
SUMMARIZATION,
SEQCLASSIFICATION,
SEQREGRESSION,
SUMMARIZATION,
TOKENCLASSIFICATION,
)
@@ -19,6 +20,7 @@ from flaml.automl.task.task import (
class DataCollatorForMultipleChoiceClassification(DataCollatorWithPadding):
def __call__(self, features):
from itertools import chain
import torch
label_name = "label" if "label" in features[0].keys() else "labels"
@@ -30,7 +32,7 @@ class DataCollatorForMultipleChoiceClassification(DataCollatorWithPadding):
[{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
]
flattened_features = list(chain(*flattened_features))
batch = super(DataCollatorForMultipleChoiceClassification, self).__call__(flattened_features)
batch = super().__call__(flattened_features)
# Un-flatten
batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
# Add back labels
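The flatten/un-flatten step in the collator above can be shown with a toy array in place of real tensors: (batch, num_choices) examples are flattened into batch*num_choices rows for padding, then reshaped back, mirroring `v.view(batch_size, num_choices, -1)`.

```python
import numpy as np

batch_size, num_choices, seq_len = 2, 3, 4
# Pretend this is the padded, flattened batch of tokenized choices.
flat = np.arange(batch_size * num_choices * seq_len).reshape(-1, seq_len)
# Un-flatten back to (batch, choices, seq_len), as the collator does.
unflattened = flat.reshape(batch_size, num_choices, -1)
print(unflattened.shape)  # (2, 3, 4)
```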

View File

@@ -1,10 +1,11 @@
import argparse
from dataclasses import dataclass, field
from typing import Optional, List
from typing import List, Optional
from flaml.automl.task.task import NLG_TASKS
try:
from transformers import TrainingArguments
from transformers import Seq2SeqTrainingArguments as TrainingArguments
except ImportError:
TrainingArguments = object
@@ -76,6 +77,14 @@ class TrainingArgumentsForAuto(TrainingArguments):
logging_steps: int = field(default=500, metadata={"help": "Log every X updates steps."})
# Newer versions of HuggingFace Transformers may access `TrainingArguments.generation_config`
# (e.g., in generation-aware trainers/callbacks). Keep this attribute to remain compatible
# while defaulting to None for non-generation tasks.
generation_config: Optional[object] = field(
default=None,
metadata={"help": "Optional generation config (or path) used by generation-aware trainers."},
)
@staticmethod
def load_args_from_console():
from dataclasses import fields

View File

@@ -1,14 +1,16 @@
from itertools import chain
import numpy as np
from flaml.automl.task.task import (
SUMMARIZATION,
SEQREGRESSION,
SEQCLASSIFICATION,
MULTICHOICECLASSIFICATION,
TOKENCLASSIFICATION,
NLG_TASKS,
)
from flaml.automl.data import pd
from flaml.automl.task.task import (
MULTICHOICECLASSIFICATION,
NLG_TASKS,
SEQCLASSIFICATION,
SEQREGRESSION,
SUMMARIZATION,
TOKENCLASSIFICATION,
)
def todf(X, Y, column_name):
@@ -209,29 +211,28 @@ def tokenize_onedataframe(
hf_args=None,
prefix_str=None,
):
with tokenizer.as_target_tokenizer():
_, tokenized_column_names = tokenize_row(
dict(X.iloc[0]),
_, tokenized_column_names = tokenize_row(
dict(X.iloc[0]),
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
return_column_name=True,
)
d = X.apply(
lambda x: tokenize_row(
x,
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
return_column_name=True,
)
d = X.apply(
lambda x: tokenize_row(
x,
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
),
axis=1,
result_type="expand",
)
X_tokenized = pd.DataFrame(columns=tokenized_column_names)
X_tokenized[tokenized_column_names] = d
return X_tokenized
),
axis=1,
result_type="expand",
)
X_tokenized = pd.DataFrame(columns=tokenized_column_names)
X_tokenized[tokenized_column_names] = d
return X_tokenized
def tokenize_row(
@@ -243,7 +244,7 @@ def tokenize_row(
return_column_name=False,
):
if prefix:
this_row = tuple(["".join(x) for x in zip(prefix, this_row)])
this_row = tuple("".join(x) for x in zip(prefix, this_row))
# tokenizer.pad_token = tokenizer.eos_token
tokenized_example = tokenizer(
@@ -377,6 +378,7 @@ def load_model(checkpoint_path, task, num_labels=None):
transformers.logging.set_verbosity_error()
from transformers import AutoConfig
from flaml.automl.task.task import (
SEQCLASSIFICATION,
SEQREGRESSION,
@@ -384,14 +386,16 @@ def load_model(checkpoint_path, task, num_labels=None):
)
def get_this_model(checkpoint_path, task, model_config):
from transformers import AutoModelForSequenceClassification
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoModelForMultipleChoice
from transformers import AutoModelForTokenClassification
from transformers import (
AutoModelForMultipleChoice,
AutoModelForSeq2SeqLM,
AutoModelForSequenceClassification,
AutoModelForTokenClassification,
)
if task in (SEQCLASSIFICATION, SEQREGRESSION):
return AutoModelForSequenceClassification.from_pretrained(
checkpoint_path, config=model_config, ignore_mismatched_sizes=True
checkpoint_path, config=model_config, ignore_mismatched_sizes=True, trust_remote_code=True
)
elif task == TOKENCLASSIFICATION:
return AutoModelForTokenClassification.from_pretrained(checkpoint_path, config=model_config)
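The `get_this_model` helper above branches on task to pick an AutoModel class. A table-driven sketch of that dispatch; the task keys and the mapping here are illustrative (class names kept as strings so this runs without transformers installed):

```python
TASK_TO_MODEL = {
    "seq-classification": "AutoModelForSequenceClassification",
    "seq-regression": "AutoModelForSequenceClassification",
    "token-classification": "AutoModelForTokenClassification",
    "multichoice-classification": "AutoModelForMultipleChoice",
    "summarization": "AutoModelForSeq2SeqLM",
}

def model_class_for(task: str) -> str:
    # Raises KeyError for unknown tasks, like an unhandled branch would.
    return TASK_TO_MODEL[task]

print(model_class_for("summarization"))  # AutoModelForSeq2SeqLM
```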

View File

@@ -1,11 +1,12 @@
from typing import Dict, Any
from typing import Any, Dict
import numpy as np
from flaml.automl.task.task import (
SUMMARIZATION,
SEQREGRESSION,
SEQCLASSIFICATION,
MULTICHOICECLASSIFICATION,
SEQCLASSIFICATION,
SEQREGRESSION,
SUMMARIZATION,
TOKENCLASSIFICATION,
)
@@ -24,14 +25,12 @@ def load_default_huggingface_metric_for_task(task):
def is_a_list_of_str(this_obj):
return (isinstance(this_obj, list) or isinstance(this_obj, np.ndarray)) and all(
isinstance(x, str) for x in this_obj
)
return isinstance(this_obj, (list, np.ndarray)) and all(isinstance(x, str) for x in this_obj)
def _clean_value(value: Any) -> str:
if isinstance(value, float):
return "{:.5}".format(value)
return f"{value:.5}"
else:
return str(value).replace("/", "_")
@@ -85,7 +84,7 @@ class Counter:
@staticmethod
def get_trial_fold_name(local_dir, trial_config, trial_id):
Counter.counter += 1
experiment_tag = "{0}_{1}".format(str(Counter.counter), format_vars(trial_config))
experiment_tag = f"{str(Counter.counter)}_{format_vars(trial_config)}"
logdir = get_logdir_name(_generate_dirname(experiment_tag, trial_id=trial_id), local_dir)
return logdir
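The f-string rewrite of `_clean_value` above keeps the same behavior: floats render with 5 significant digits, and everything else is made path-safe by replacing `/`. A stand-alone copy:

```python
def clean_value(value) -> str:
    if isinstance(value, float):
        return f"{value:.5}"  # 5 significant digits, general format
    return str(value).replace("/", "_")  # avoid path separators in dir names

print(clean_value(0.123456789))  # 0.12346
print(clean_value("a/b/c"))      # a_b_c
```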

View File

@@ -1,3 +1,5 @@
import atexit
import logging
import os
os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
@@ -6,15 +8,18 @@ try:
import pyspark.pandas as ps
import pyspark.sql.functions as F
import pyspark.sql.types as T
from pyspark.pandas import DataFrame as psDataFrame
from pyspark.pandas import Series as psSeries
from pyspark.pandas import set_option
from pyspark.sql import DataFrame as sparkDataFrame
from pyspark.pandas import DataFrame as psDataFrame, Series as psSeries, set_option
from pyspark.sql import SparkSession
from pyspark.util import VersionUtils
except ImportError:
class psDataFrame:
pass
F = T = ps = sparkDataFrame = psSeries = psDataFrame
F = T = ps = sparkDataFrame = SparkSession = psSeries = psDataFrame
_spark_major_minor_version = set_option = None
ERROR = ImportError(
"""Please run pip install flaml[spark]
@@ -30,3 +35,60 @@ try:
from pandas import DataFrame, Series
except ImportError:
DataFrame = Series = pd = None
logger = logging.getLogger(__name__)
def disable_spark_ansi_mode():
"""Disable Spark ANSI mode if it is enabled."""
spark = SparkSession.getActiveSession() if hasattr(SparkSession, "getActiveSession") else None
adjusted = False
try:
ps_conf = ps.get_option("compute.fail_on_ansi_mode")
except Exception:
ps_conf = None
ansi_conf = [None, ps_conf] # ansi_conf and ps_conf original values
# Spark may store the config as string 'true'/'false' (or boolean in some contexts)
if spark is not None:
ansi_conf[0] = spark.conf.get("spark.sql.ansi.enabled")
ansi_enabled = (
(isinstance(ansi_conf[0], str) and ansi_conf[0].lower() == "true")
or (isinstance(ansi_conf[0], bool) and ansi_conf[0] is True)
or ansi_conf[0] is None
)
try:
if ansi_enabled:
logger.debug("Adjusting spark.sql.ansi.enabled to false")
spark.conf.set("spark.sql.ansi.enabled", "false")
adjusted = True
except Exception:
# If reading or setting the option fails for some reason, keep going and let
# pandas-on-Spark raise a meaningful error later.
logger.exception("Failed to set spark.sql.ansi.enabled")
if ansi_conf[1]:
logger.debug("Adjusting pandas-on-Spark compute.fail_on_ansi_mode to False")
ps.set_option("compute.fail_on_ansi_mode", False)
adjusted = True
return spark, ansi_conf, adjusted
def restore_spark_ansi_mode(spark, ansi_conf, adjusted):
"""Restore Spark ANSI mode to its original setting."""
# Restore the original spark.sql.ansi.enabled to avoid persistent side-effects.
if adjusted and spark and ansi_conf[0] is not None:
try:
logger.debug(f"Restoring spark.sql.ansi.enabled to {ansi_conf[0]}")
spark.conf.set("spark.sql.ansi.enabled", ansi_conf[0])
except Exception:
logger.exception("Failed to restore spark.sql.ansi.enabled")
if adjusted and ansi_conf[1]:
logger.debug(f"Restoring pandas-on-Spark compute.fail_on_ansi_mode to {ansi_conf[1]}")
ps.set_option("compute.fail_on_ansi_mode", ansi_conf[1])
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
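The two helpers above follow a save/adjust/restore pattern for Spark's ANSI config. The same shape with a plain dict standing in for the Spark session config (note that, as above, an unset flag is treated as enabled):

```python
def disable_flag(conf):
    original = conf.get("ansi.enabled")
    adjusted = False
    if original in ("true", True, None):  # string, bool, or unset
        conf["ansi.enabled"] = "false"
        adjusted = True
    return original, adjusted

def restore_flag(conf, original, adjusted):
    # Only restore what we actually changed, and only if there was a value.
    if adjusted and original is not None:
        conf["ansi.enabled"] = original

conf = {"ansi.enabled": "true"}
original, adjusted = disable_flag(conf)
print(conf["ansi.enabled"])  # false
restore_flag(conf, original, adjusted)
print(conf["ansi.enabled"])  # true
```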

View File

@@ -1,97 +0,0 @@
ParamList_LightGBM_Base = [
"baggingFraction",
"baggingFreq",
"baggingSeed",
"binSampleCount",
"boostFromAverage",
"boostingType",
"catSmooth",
"categoricalSlotIndexes",
"categoricalSlotNames",
"catl2",
"chunkSize",
"dataRandomSeed",
"defaultListenPort",
"deterministic",
"driverListenPort",
"dropRate",
"dropSeed",
"earlyStoppingRound",
"executionMode",
"extraSeed" "featureFraction",
"featureFractionByNode",
"featureFractionSeed",
"featuresCol",
"featuresShapCol",
"fobj" "improvementTolerance",
"initScoreCol",
"isEnableSparse",
"isProvideTrainingMetric",
"labelCol",
"lambdaL1",
"lambdaL2",
"leafPredictionCol",
"learningRate",
"matrixType",
"maxBin",
"maxBinByFeature",
"maxCatThreshold",
"maxCatToOnehot",
"maxDeltaStep",
"maxDepth",
"maxDrop",
"metric",
"microBatchSize",
"minDataInLeaf",
"minDataPerBin",
"minDataPerGroup",
"minGainToSplit",
"minSumHessianInLeaf",
"modelString",
"monotoneConstraints",
"monotoneConstraintsMethod",
"monotonePenalty",
"negBaggingFraction",
"numBatches",
"numIterations",
"numLeaves",
"numTasks",
"numThreads",
"objectiveSeed",
"otherRate",
"parallelism",
"passThroughArgs",
"posBaggingFraction",
"predictDisableShapeCheck",
"predictionCol",
"repartitionByGroupingColumn",
"seed",
"skipDrop",
"slotNames",
"timeout",
"topK",
"topRate",
"uniformDrop",
"useBarrierExecutionMode",
"useMissing",
"useSingleDatasetMode",
"validationIndicatorCol",
"verbosity",
"weightCol",
"xGBoostDartMode",
"zeroAsMissing",
"objective",
]
ParamList_LightGBM_Classifier = ParamList_LightGBM_Base + [
"isUnbalance",
"probabilityCol",
"rawPredictionCol",
"thresholds",
]
ParamList_LightGBM_Regressor = ParamList_LightGBM_Base + ["tweedieVariancePower"]
ParamList_LightGBM_Ranker = ParamList_LightGBM_Base + [
"groupCol",
"evalAt",
"labelGain",
"maxPosition",
]
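Worth noting: the removed list above contains entries like `"extraSeed" "featureFraction"` and `"fobj" "improvementTolerance"` with no comma between them. Python silently concatenates adjacent string literals, so each such pair became one merged parameter name instead of two:

```python
params = [
    "dropSeed",
    "extraSeed" "featureFraction",  # missing comma: one element, not two
    "featureFractionByNode",
]
print(len(params))  # 3, not 4
print(params[1])    # extraSeedfeatureFraction
```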

View File

@@ -1,14 +1,17 @@
import numpy as np
import json
from typing import Union
from flaml.automl.spark import psSeries, F
import numpy as np
from pyspark.ml.evaluation import (
BinaryClassificationEvaluator,
RegressionEvaluator,
MulticlassClassificationEvaluator,
MultilabelClassificationEvaluator,
RankingEvaluator,
RegressionEvaluator,
)
from flaml.automl.spark import F, T, psDataFrame, psSeries, sparkDataFrame
def ps_group_counts(groups: Union[psSeries, np.ndarray]) -> np.ndarray:
if isinstance(groups, np.ndarray):
@@ -34,6 +37,16 @@ def _compute_label_from_probability(df, probability_col, prediction_col):
return df
def string_to_array(s):
try:
return json.loads(s)
except json.JSONDecodeError:
return []
string_to_array_udf = F.udf(string_to_array, T.ArrayType(T.DoubleType()))
def spark_metric_loss_score(
metric_name: str,
y_predict: psSeries,
@@ -133,6 +146,11 @@ def spark_metric_loss_score(
)
elif metric_name == "log_loss":
# For log_loss, prediction_col should be probability, and we need to convert it to label
# handle data like "{'type': '1', 'values': '[1, 2, 3]'}"
# Fix cannot resolve "array_max(prediction)" due to data type mismatch: Parameter 1 requires the "ARRAY" type,
# however "prediction" has the type "STRUCT<type: TINYINT, size: INT, indices: ARRAY<INT>, values: ARRAY<DOUBLE>>"
df = df.withColumn(prediction_col, df[prediction_col].cast(T.StringType()))
df = df.withColumn(prediction_col, string_to_array_udf(df[prediction_col]))
df = _compute_label_from_probability(df, prediction_col, prediction_col + "_label")
evaluator = MulticlassClassificationEvaluator(
metricName="logLoss",
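The `string_to_array` helper added above is small enough to test on its own: parse a JSON array from a stringified probability vector, returning `[]` on malformed input.

```python
import json

def string_to_array(s):
    try:
        return json.loads(s)
    except json.JSONDecodeError:
        return []  # malformed input yields an empty array

print(string_to_array("[0.1, 0.9]"))  # [0.1, 0.9]
print(string_to_array("not json"))    # []
```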

View File

@@ -1,17 +1,19 @@
import logging
from typing import Union, List, Optional, Tuple
from typing import List, Optional, Tuple, Union
import numpy as np
from flaml.automl.spark import (
sparkDataFrame,
ps,
DataFrame,
F,
Series,
T,
_spark_major_minor_version,
ps,
psDataFrame,
psSeries,
_spark_major_minor_version,
DataFrame,
Series,
set_option,
sparkDataFrame,
)
logger = logging.getLogger(__name__)
@@ -57,17 +59,29 @@ def to_pandas_on_spark(
```
"""
set_option("compute.default_index_type", default_index_type)
if isinstance(df, (DataFrame, Series)):
return ps.from_pandas(df)
elif isinstance(df, sparkDataFrame):
if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
return df.to_pandas_on_spark(index_col=index_col)
try:
orig_ps_conf = ps.get_option("compute.fail_on_ansi_mode")
except Exception:
orig_ps_conf = None
if orig_ps_conf:
ps.set_option("compute.fail_on_ansi_mode", False)
try:
if isinstance(df, (DataFrame, Series)):
return ps.from_pandas(df)
elif isinstance(df, sparkDataFrame):
if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
return df.to_pandas_on_spark(index_col=index_col)
else:
return df.pandas_api(index_col=index_col)
elif isinstance(df, (psDataFrame, psSeries)):
return df
else:
return df.pandas_api(index_col=index_col)
elif isinstance(df, (psDataFrame, psSeries)):
return df
else:
raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
finally:
# Restore original config
if orig_ps_conf:
ps.set_option("compute.fail_on_ansi_mode", orig_ps_conf)
def train_test_split_pyspark(

View File

@@ -1,13 +1,15 @@
import inspect
import copy
import inspect
import time
from typing import Any, Optional
import numpy as np
from flaml import tune
from flaml.automl.logger import logger
from flaml.automl.ml import compute_estimator, train_estimator
from flaml.automl.spark import DataFrame, Series, psDataFrame, psSeries
from flaml.automl.time_series.ts_data import TimeSeriesDataset
from flaml.automl.spark import psDataFrame, psSeries, DataFrame, Series
class SearchState:
@@ -35,10 +37,9 @@ class SearchState:
if isinstance(domain_one_dim, sample.Domain):
renamed_type = list(inspect.signature(domain_one_dim.is_valid).parameters.values())[0].annotation
type_match = (
renamed_type == Any
renamed_type is Any
or isinstance(value_one_dim, renamed_type)
or isinstance(value_one_dim, int)
and renamed_type is float
or (renamed_type is float and isinstance(value_one_dim, int))
)
if not (type_match and domain_one_dim.is_valid(value_one_dim)):
return False
@@ -63,6 +64,7 @@ class SearchState:
custom_hp=None,
max_iter=None,
budget=None,
featurization="auto",
):
self.init_eci = learner_class.cost_relative2lgbm() if budget >= 0 else 1
self._search_space_domain = {}
@@ -80,6 +82,7 @@ class SearchState:
else:
data_size = data.shape
search_space = learner_class.search_space(data_size=data_size, task=task)
self.data_size = data_size
if custom_hp is not None:
@@ -89,9 +92,7 @@ class SearchState:
starting_point = AutoMLState.sanitize(starting_point)
if max_iter > 1 and not self.valid_starting_point(starting_point, search_space):
# If the number of iterations is larger than 1, remove invalid point
logger.warning(
"Starting point {} removed because it is outside of the search space".format(starting_point)
)
logger.warning(f"Starting point {starting_point} removed because it is outside of the search space")
starting_point = None
elif isinstance(starting_point, list):
starting_point = [AutoMLState.sanitize(x) for x in starting_point]
@@ -206,7 +207,7 @@ class SearchState:
self.val_loss, self.config = obj, config
def get_hist_config_sig(self, sample_size, config):
config_values = tuple([config[k] for k in self._hp_names if k in config])
config_values = tuple(config[k] for k in self._hp_names if k in config)
config_sig = str(sample_size) + "_" + str(config_values)
return config_sig
@@ -288,9 +289,11 @@ class AutoMLState:
budget = (
None
if state.time_budget < 0
else state.time_budget - state.time_from_start
if sample_size == state.data_size[0]
else (state.time_budget - state.time_from_start) / 2 * sample_size / state.data_size[0]
else (
state.time_budget - state.time_from_start
if sample_size == state.data_size[0]
else (state.time_budget - state.time_from_start) / 2 * sample_size / state.data_size[0]
)
)
(
@@ -351,6 +354,7 @@ class AutoMLState:
estimator: str,
config_w_resource: dict,
sample_size: Optional[int] = None,
is_retrain: bool = False,
):
if not sample_size:
sample_size = config_w_resource.get("FLAML_sample_size", len(self.y_train_all))
@@ -376,9 +380,8 @@ class AutoMLState:
this_estimator_kwargs[
"groups"
] = groups # NOTE: _train_with_config is after kwargs is updated to fit_kwargs_by_estimator
this_estimator_kwargs.update({"is_retrain": is_retrain})
budget = None if self.time_budget < 0 else self.time_budget - self.time_from_start
estimator, train_time = train_estimator(
X_train=sampled_X_train,
y_train=sampled_y_train,

View File

@@ -1,8 +1,9 @@
from typing import Optional, Union
import numpy as np
from flaml.automl.data import DataFrame, Series
from flaml.automl.task.task import Task, TS_FORECAST
from flaml.automl.task.task import TS_FORECAST, Task
def task_factory(

View File

@@ -1,43 +1,39 @@
import logging
import time
from typing import List, Optional
import numpy as np
from flaml.automl.data import TS_TIMESTAMP_COL, concat
from flaml.automl.ml import EstimatorSubclass, get_val_loss, default_cv_score_agg_func
from flaml.automl.task.task import (
Task,
get_classification_objective,
TS_FORECAST,
TS_FORECASTPANEL,
)
from flaml.config import RANDOM_SEED
from flaml.automl.spark import ps, psDataFrame, psSeries, pd
import numpy as np
from flaml.automl.data import TS_TIMESTAMP_COL, concat
from flaml.automl.ml import EstimatorSubclass, default_cv_score_agg_func, get_val_loss
from flaml.automl.spark import pd, ps, psDataFrame, psSeries
from flaml.automl.spark.utils import (
iloc_pandas_on_spark,
len_labels,
set_option,
spark_kFold,
train_test_split_pyspark,
unique_pandas_on_spark,
unique_value_first_index,
len_labels,
set_option,
)
from flaml.automl.task.task import TS_FORECAST, TS_FORECASTPANEL, Task, get_classification_objective
from flaml.config import RANDOM_SEED
try:
from scipy.sparse import issparse
except ImportError:
pass
try:
from sklearn.utils import shuffle
from sklearn.model_selection import (
train_test_split,
RepeatedStratifiedKFold,
RepeatedKFold,
GroupKFold,
TimeSeriesSplit,
GroupShuffleSplit,
RepeatedKFold,
RepeatedStratifiedKFold,
StratifiedGroupKFold,
TimeSeriesSplit,
train_test_split,
)
from sklearn.utils import shuffle
except ImportError:
pass
@@ -49,19 +45,31 @@ class GenericTask(Task):
def estimators(self):
if self._estimators is None:
# put this into a function to avoid circular dependency
from flaml.automl.contrib.histgb import HistGradientBoostingEstimator
from flaml.automl.model import (
CatBoostEstimator,
ElasticNetEstimator,
ExtraTreesEstimator,
KNeighborsEstimator,
LassoLarsEstimator,
LGBMEstimator,
LRL1Classifier,
LRL2Classifier,
RandomForestEstimator,
SGDEstimator,
SparkAFTSurvivalRegressionEstimator,
SparkGBTEstimator,
SparkGLREstimator,
SparkLGBMEstimator,
SparkLinearRegressionEstimator,
SparkLinearSVCEstimator,
SparkNaiveBayesEstimator,
SparkRandomForestEstimator,
SVCEstimator,
TransformersEstimator,
TransformersEstimatorModelSelection,
XGBoostLimitDepthEstimator,
XGBoostSklearnEstimator,
)
self._estimators = {
@@ -70,6 +78,7 @@ class GenericTask(Task):
"rf": RandomForestEstimator,
"lgbm": LGBMEstimator,
"lgbm_spark": SparkLGBMEstimator,
"rf_spark": SparkRandomForestEstimator,
"lrl1": LRL1Classifier,
"lrl2": LRL2Classifier,
"catboost": CatBoostEstimator,
@@ -77,6 +86,17 @@ class GenericTask(Task):
"kneighbor": KNeighborsEstimator,
"transformer": TransformersEstimator,
"transformer_ms": TransformersEstimatorModelSelection,
"histgb": HistGradientBoostingEstimator,
"svc": SVCEstimator,
"sgd": SGDEstimator,
"nb_spark": SparkNaiveBayesEstimator,
"enet": ElasticNetEstimator,
"lassolars": LassoLarsEstimator,
"glr_spark": SparkGLREstimator,
"lr_spark": SparkLinearRegressionEstimator,
"svc_spark": SparkLinearSVCEstimator,
"gbt_spark": SparkGBTEstimator,
"aft_spark": SparkAFTSurvivalRegressionEstimator,
}
return self._estimators
@@ -268,8 +288,8 @@ class GenericTask(Task):
seed=RANDOM_SEED,
)
columns_to_drop = [c for c in df_all_train.columns if c in [stratify_column, "sample_weight"]]
X_train = df_all_train.drop(columns=columns_to_drop)
X_val = df_all_val.drop(columns=columns_to_drop)
y_train = df_all_train[stratify_column]
y_val = df_all_val[stratify_column]
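The hunk above switches `DataFrame.drop` to the explicit `columns=` keyword: the positional first argument is interpreted as index labels, which raises a `KeyError` when column names are passed. A minimal sketch of the difference, using a hypothetical toy frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "stratify": [0, 1]})

# Positional argument targets the index, so column names fail:
try:
    df.drop(["stratify"])
except KeyError:
    pass  # "['stratify'] not found in axis"

# The keyword form drops the intended columns:
trimmed = df.drop(columns=["stratify"])
assert list(trimmed.columns) == ["a", "b"]
```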
@@ -345,6 +365,465 @@ class GenericTask(Task):
X_train, X_val, y_train, y_val = GenericTask._split_pyspark(state, X, y, split_ratio, stratify)
return X_train, X_val, y_train, y_val
def _handle_missing_labels_fast(
self,
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
):
"""Handle missing labels by adding first instance to the set with missing label.
This is the faster version that may create some overlap but ensures all labels
are present in both sets. If a label is missing from train, it adds the first
instance to train. If a label is missing from val, it adds the first instance to val.
If no labels are missing, no instances are duplicated.
Args:
state: The state object containing fit parameters
X_train, X_val: Training and validation features
y_train, y_val: Training and validation labels
X_train_all, y_train_all: Complete dataset
is_spark_dataframe: Whether data is pandas_on_spark
data_is_df: Whether data is DataFrame/Series
Returns:
Tuple of (X_train, X_val, y_train, y_val) with missing labels added
"""
# Check which labels are present in train and val sets
if is_spark_dataframe:
label_set_train, _ = unique_pandas_on_spark(y_train)
label_set_val, _ = unique_pandas_on_spark(y_val)
label_set_all, first = unique_value_first_index(y_train_all)
else:
label_set_all, first = unique_value_first_index(y_train_all)
label_set_train = np.unique(y_train)
label_set_val = np.unique(y_val)
# Find missing labels
missing_in_train = np.setdiff1d(label_set_all, label_set_train)
missing_in_val = np.setdiff1d(label_set_all, label_set_val)
# Add first instance of missing labels to train set
if len(missing_in_train) > 0:
missing_train_indices = []
for label in missing_in_train:
label_matches = np.where(label_set_all == label)[0]
if len(label_matches) > 0 and label_matches[0] < len(first):
missing_train_indices.append(first[label_matches[0]])
if len(missing_train_indices) > 0:
X_missing_train = (
iloc_pandas_on_spark(X_train_all, missing_train_indices)
if is_spark_dataframe
else X_train_all.iloc[missing_train_indices]
if data_is_df
else X_train_all[missing_train_indices]
)
y_missing_train = (
iloc_pandas_on_spark(y_train_all, missing_train_indices)
if is_spark_dataframe
else y_train_all.iloc[missing_train_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[missing_train_indices]
)
X_train = concat(X_missing_train, X_train)
y_train = concat(y_missing_train, y_train) if data_is_df else np.concatenate([y_missing_train, y_train])
# Handle sample_weight if present
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(missing_train_indices) < len(sample_weight_source):
missing_weights = (
sample_weight_source[missing_train_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[missing_train_indices]
)
state.fit_kwargs["sample_weight"] = concat(missing_weights, state.fit_kwargs["sample_weight"])
# Add first instance of missing labels to val set
if len(missing_in_val) > 0:
missing_val_indices = []
for label in missing_in_val:
label_matches = np.where(label_set_all == label)[0]
if len(label_matches) > 0 and label_matches[0] < len(first):
missing_val_indices.append(first[label_matches[0]])
if len(missing_val_indices) > 0:
X_missing_val = (
iloc_pandas_on_spark(X_train_all, missing_val_indices)
if is_spark_dataframe
else X_train_all.iloc[missing_val_indices]
if data_is_df
else X_train_all[missing_val_indices]
)
y_missing_val = (
iloc_pandas_on_spark(y_train_all, missing_val_indices)
if is_spark_dataframe
else y_train_all.iloc[missing_val_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[missing_val_indices]
)
X_val = concat(X_missing_val, X_val)
y_val = concat(y_missing_val, y_val) if data_is_df else np.concatenate([y_missing_val, y_val])
# Handle sample_weight if present
if (
"sample_weight" in state.fit_kwargs
and hasattr(state, "weight_val")
and state.weight_val is not None
):
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(missing_val_indices) < len(sample_weight_source):
missing_weights = (
sample_weight_source[missing_val_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[missing_val_indices]
)
state.weight_val = concat(missing_weights, state.weight_val)
return X_train, X_val, y_train, y_val
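The fast strategy above can be sketched in isolation: find the labels absent from a split with `np.setdiff1d` and prepend the first occurrence of each from the full label array. This is a simplified numpy-only sketch with toy data, not the FLAML code path (which also handles pandas-on-Spark and sample weights):

```python
import numpy as np

def add_missing_labels(y_all, y_split):
    """Prepend the first instance of each label missing from y_split."""
    labels_all, first_idx = np.unique(y_all, return_index=True)
    missing = np.setdiff1d(labels_all, np.unique(y_split))
    # indices of the first occurrence of each missing label in y_all
    take = [first_idx[np.where(labels_all == m)[0][0]] for m in missing]
    return np.concatenate([y_all[take], y_split])

y_all = np.array([0, 1, 2, 2, 1, 0])
y_train = np.array([0, 1, 1, 0])  # label 2 is missing from the split
fixed = add_missing_labels(y_all, y_train)
assert set(fixed) == {0, 1, 2}
```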
def _handle_missing_labels_no_overlap(
self,
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
split_ratio,
):
"""Handle missing labels intelligently to avoid overlap when possible.
This is the slower but more precise version that:
- For single-instance classes: Adds to both sets (unavoidable overlap)
- For multi-instance classes: Re-splits them properly to avoid overlap
Args:
state: The state object containing fit parameters
X_train, X_val: Training and validation features
y_train, y_val: Training and validation labels
X_train_all, y_train_all: Complete dataset
is_spark_dataframe: Whether data is pandas_on_spark
data_is_df: Whether data is DataFrame/Series
split_ratio: The ratio for splitting
Returns:
Tuple of (X_train, X_val, y_train, y_val) with missing labels handled
"""
# Check which labels are present in train and val sets
if is_spark_dataframe:
label_set_train, _ = unique_pandas_on_spark(y_train)
label_set_val, _ = unique_pandas_on_spark(y_val)
label_set_all, first = unique_value_first_index(y_train_all)
else:
label_set_all, first = unique_value_first_index(y_train_all)
label_set_train = np.unique(y_train)
label_set_val = np.unique(y_val)
# Find missing labels
missing_in_train = np.setdiff1d(label_set_all, label_set_train)
missing_in_val = np.setdiff1d(label_set_all, label_set_val)
# Handle missing labels intelligently
# For classes with only 1 instance: add to both sets (unavoidable overlap)
# For classes with multiple instances: move/split them properly to avoid overlap
if len(missing_in_train) > 0:
# Process missing labels in training set
for label in missing_in_train:
# Find all indices for this label in the original data
if is_spark_dataframe:
label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
else:
label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
num_instances = len(label_indices)
if num_instances == 1:
# Single instance: must add to both train and val (unavoidable overlap)
X_single = (
iloc_pandas_on_spark(X_train_all, label_indices)
if is_spark_dataframe
else X_train_all.iloc[label_indices]
if data_is_df
else X_train_all[label_indices]
)
y_single = (
iloc_pandas_on_spark(y_train_all, label_indices)
if is_spark_dataframe
else y_train_all.iloc[label_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[label_indices]
)
X_train = concat(X_single, X_train)
y_train = concat(y_single, y_train) if data_is_df else np.concatenate([y_single, y_train])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
single_weight = (
sample_weight_source[label_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[label_indices]
)
state.fit_kwargs["sample_weight"] = concat(single_weight, state.fit_kwargs["sample_weight"])
else:
# Multiple instances: move some from val to train (no overlap needed)
# Calculate how many to move to train (leave at least 1 in val)
num_to_train = max(1, min(num_instances - 1, int(num_instances * (1 - split_ratio))))
indices_to_move = label_indices[:num_to_train]
X_to_move = (
iloc_pandas_on_spark(X_train_all, indices_to_move)
if is_spark_dataframe
else X_train_all.iloc[indices_to_move]
if data_is_df
else X_train_all[indices_to_move]
)
y_to_move = (
iloc_pandas_on_spark(y_train_all, indices_to_move)
if is_spark_dataframe
else y_train_all.iloc[indices_to_move]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[indices_to_move]
)
# Add to train
X_train = concat(X_to_move, X_train)
y_train = concat(y_to_move, y_train) if data_is_df else np.concatenate([y_to_move, y_train])
# Remove from val (they are currently all in val)
if is_spark_dataframe:
val_mask = ~y_val.isin([label])
X_val = X_val[val_mask]
y_val = y_val[val_mask]
else:
val_mask = np.asarray(y_val) != label
if data_is_df:
X_val = X_val[val_mask]
y_val = y_val[val_mask]
else:
X_val = X_val[val_mask]
y_val = y_val[val_mask]
# Add remaining instances back to val
remaining_indices = label_indices[num_to_train:]
if len(remaining_indices) > 0:
X_remaining = (
iloc_pandas_on_spark(X_train_all, remaining_indices)
if is_spark_dataframe
else X_train_all.iloc[remaining_indices]
if data_is_df
else X_train_all[remaining_indices]
)
y_remaining = (
iloc_pandas_on_spark(y_train_all, remaining_indices)
if is_spark_dataframe
else y_train_all.iloc[remaining_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[remaining_indices]
)
X_val = concat(X_remaining, X_val)
y_val = concat(y_remaining, y_val) if data_is_df else np.concatenate([y_remaining, y_val])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
weights_to_move = (
sample_weight_source[indices_to_move]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[indices_to_move]
)
state.fit_kwargs["sample_weight"] = concat(
weights_to_move, state.fit_kwargs["sample_weight"]
)
if (
len(remaining_indices) > 0
and hasattr(state, "weight_val")
and state.weight_val is not None
):
# Remove and re-add weights for val
if isinstance(state.weight_val, np.ndarray):
state.weight_val = state.weight_val[val_mask]
else:
state.weight_val = state.weight_val[val_mask]
if max(remaining_indices) < len(sample_weight_source):
remaining_weights = (
sample_weight_source[remaining_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[remaining_indices]
)
state.weight_val = concat(remaining_weights, state.weight_val)
if len(missing_in_val) > 0:
# Process missing labels in validation set
for label in missing_in_val:
# Find all indices for this label in the original data
if is_spark_dataframe:
label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
else:
label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
num_instances = len(label_indices)
if num_instances == 1:
# Single instance: must add to both train and val (unavoidable overlap)
X_single = (
iloc_pandas_on_spark(X_train_all, label_indices)
if is_spark_dataframe
else X_train_all.iloc[label_indices]
if data_is_df
else X_train_all[label_indices]
)
y_single = (
iloc_pandas_on_spark(y_train_all, label_indices)
if is_spark_dataframe
else y_train_all.iloc[label_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[label_indices]
)
X_val = concat(X_single, X_val)
y_val = concat(y_single, y_val) if data_is_df else np.concatenate([y_single, y_val])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs and hasattr(state, "weight_val"):
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
single_weight = (
sample_weight_source[label_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[label_indices]
)
if state.weight_val is not None:
state.weight_val = concat(single_weight, state.weight_val)
else:
# Multiple instances: move some from train to val (no overlap needed)
# Calculate how many to move to val (leave at least 1 in train)
num_to_val = max(1, min(num_instances - 1, int(num_instances * split_ratio)))
indices_to_move = label_indices[:num_to_val]
X_to_move = (
iloc_pandas_on_spark(X_train_all, indices_to_move)
if is_spark_dataframe
else X_train_all.iloc[indices_to_move]
if data_is_df
else X_train_all[indices_to_move]
)
y_to_move = (
iloc_pandas_on_spark(y_train_all, indices_to_move)
if is_spark_dataframe
else y_train_all.iloc[indices_to_move]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[indices_to_move]
)
# Add to val
X_val = concat(X_to_move, X_val)
y_val = concat(y_to_move, y_val) if data_is_df else np.concatenate([y_to_move, y_val])
# Remove from train (they are currently all in train)
if is_spark_dataframe:
train_mask = ~y_train.isin([label])
X_train = X_train[train_mask]
y_train = y_train[train_mask]
else:
train_mask = np.asarray(y_train) != label
if data_is_df:
X_train = X_train[train_mask]
y_train = y_train[train_mask]
else:
X_train = X_train[train_mask]
y_train = y_train[train_mask]
# Add remaining instances back to train
remaining_indices = label_indices[num_to_val:]
if len(remaining_indices) > 0:
X_remaining = (
iloc_pandas_on_spark(X_train_all, remaining_indices)
if is_spark_dataframe
else X_train_all.iloc[remaining_indices]
if data_is_df
else X_train_all[remaining_indices]
)
y_remaining = (
iloc_pandas_on_spark(y_train_all, remaining_indices)
if is_spark_dataframe
else y_train_all.iloc[remaining_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[remaining_indices]
)
X_train = concat(X_remaining, X_train)
y_train = concat(y_remaining, y_train) if data_is_df else np.concatenate([y_remaining, y_train])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
weights_to_move = (
sample_weight_source[indices_to_move]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[indices_to_move]
)
if hasattr(state, "weight_val") and state.weight_val is not None:
state.weight_val = concat(weights_to_move, state.weight_val)
if len(remaining_indices) > 0:
# Remove and re-add weights for train
if isinstance(state.fit_kwargs["sample_weight"], np.ndarray):
state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
else:
state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
if max(remaining_indices) < len(sample_weight_source):
remaining_weights = (
sample_weight_source[remaining_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[remaining_indices]
)
state.fit_kwargs["sample_weight"] = concat(
remaining_weights, state.fit_kwargs["sample_weight"]
)
return X_train, X_val, y_train, y_val
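The no-overlap variant moves instances of a multi-instance class between splits instead of duplicating them. The sizing rule used above, `max(1, min(n - 1, int(n * (1 - split_ratio))))`, guarantees at least one instance lands on each side; a quick check of that arithmetic:

```python
def split_count(n_instances, split_ratio):
    # number of instances of a missing class moved to train,
    # always leaving at least one in the validation split
    return max(1, min(n_instances - 1, int(n_instances * (1 - split_ratio))))

# With a 0.2 validation ratio, 10 instances send 8 to train, keep 2 in val.
assert split_count(10, 0.2) == 8
# Even tiny classes keep one instance on each side.
assert split_count(2, 0.2) == 1
assert split_count(3, 0.9) == 1
```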
def prepare_data(
self,
state,
@@ -357,6 +836,7 @@ class GenericTask(Task):
n_splits,
data_is_df,
sample_weight_full,
allow_label_overlap=True,
) -> int:
X_val, y_val = state.X_val, state.y_val
if issparse(X_val):
@@ -422,8 +902,8 @@ class GenericTask(Task):
X_train_all, y_train_all = shuffle(X_train_all, y_train_all, random_state=RANDOM_SEED)
if data_is_df:
X_train_all.reset_index(drop=True, inplace=True)
if isinstance(y_train_all, pd.Series):
y_train_all.reset_index(drop=True, inplace=True)
X_train, y_train = X_train_all, y_train_all
state.groups_all = state.groups
@@ -485,31 +965,47 @@ class GenericTask(Task):
elif self.is_classification():
# for classification, make sure the labels are complete in both
# training and validation data
stratify = y_train_all if split_type == "stratified" else None
X_train, X_val, y_train, y_val = self._train_test_split(
state, X_train_all, y_train_all, split_ratio=split_ratio, stratify=stratify
)
# Handle missing labels using the appropriate strategy
if allow_label_overlap:
# Fast version: adds first instance to set with missing label (may create overlap)
X_train, X_val, y_train, y_val = self._handle_missing_labels_fast(
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
)
else:
# Precise version: avoids overlap when possible (slower)
X_train, X_val, y_train, y_val = self._handle_missing_labels_no_overlap(
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
split_ratio,
)
if isinstance(y_train, (psDataFrame, pd.DataFrame)) and y_train.shape[1] == 1:
y_train = y_train[y_train.columns[0]]
y_val = y_val[y_val.columns[0]]
# Only set name if y_train_all is a Series (not a DataFrame)
if isinstance(y_train_all, (pd.Series, psSeries)):
y_train.name = y_val.name = y_train_all.name
elif self.is_regression():
X_train, X_val, y_train, y_val = self._train_test_split(
state, X_train_all, y_train_all, split_ratio=split_ratio
@@ -656,7 +1152,6 @@ class GenericTask(Task):
fit_kwargs = {}
if cv_score_agg_func is None:
cv_score_agg_func = default_cv_score_agg_func
val_loss_folds = []
log_metric_folds = []
metric = None
@@ -698,7 +1193,10 @@ class GenericTask(Task):
elif isinstance(kf, TimeSeriesSplit):
kf = kf.split(X_train_split, y_train_split)
else:
try:
kf = kf.split(X_train_split)
except TypeError:
kf = kf.split(X_train_split, y_train_split)
for train_index, val_index in kf:
if shuffle:
@@ -721,10 +1219,10 @@ class GenericTask(Task):
if not is_spark_dataframe:
y_train, y_val = y_train_split[train_index], y_train_split[val_index]
if weight is not None:
fit_kwargs["sample_weight"], weight_val = (
weight[train_index],
weight[val_index],
fit_kwargs["sample_weight"] = (
weight[train_index] if isinstance(weight, np.ndarray) else weight.iloc[train_index]
)
weight_val = weight[val_index] if isinstance(weight, np.ndarray) else weight.iloc[val_index]
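The weight-indexing fix above selects rows positionally regardless of container type: plain `[]` for numpy arrays, `.iloc` for pandas Series, where `[]` would do label-based lookup and break under a non-default index. A small sketch of the distinction (toy data, not the FLAML state objects):

```python
import numpy as np
import pandas as pd

idx = [0, 2]
arr = np.array([0.5, 1.0, 2.0])
ser = pd.Series([0.5, 1.0, 2.0], index=[10, 11, 12])  # non-default labels

# Same pattern as in the diff: [] for ndarray, .iloc for Series.
picked = arr[idx] if isinstance(arr, np.ndarray) else arr.iloc[idx]
assert picked.tolist() == [0.5, 2.0]

# For the Series, .iloc gives positions; plain ser[idx] would look up
# labels 0 and 2, which do not exist here.
assert ser.iloc[idx].tolist() == [0.5, 2.0]
```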
if groups is not None:
fit_kwargs["groups"] = (
groups[train_index] if isinstance(groups, np.ndarray) else groups.iloc[train_index]
@@ -763,8 +1261,6 @@ class GenericTask(Task):
if is_spark_dataframe:
X_train.spark.unpersist() # uncache data to free memory
X_val.spark.unpersist() # uncache data to free memory
val_loss, metric = cv_score_agg_func(val_loss_folds, log_metric_folds)
n = total_fold_num
pred_time /= n
@@ -807,27 +1303,23 @@ class GenericTask(Task):
elif self.is_ts_forecastpanel():
estimator_list = ["tft"]
else:
estimator_list = [
"lgbm",
"rf",
"xgboost",
"extra_tree",
"xgb_limitdepth",
"lgbm_spark",
"rf_spark",
"sgd",
]
try:
import catboost
estimator_list += ["catboost"]
except ImportError:
pass
# if self.is_ts_forecast():
# # catboost is removed because it has a `name` parameter, making it incompatible with hcrystalball
# if "catboost" in estimator_list:
@@ -859,9 +1351,7 @@ class GenericTask(Task):
return metric
if self.is_nlp():
from flaml.automl.nlp.utils import load_default_huggingface_metric_for_task
return load_default_huggingface_metric_for_task(self.name)
elif self.is_binary():

View File

@@ -1,6 +1,8 @@
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, List, Optional, Tuple, Union
import numpy as np
from flaml.automl.data import DataFrame, Series, psDataFrame, psSeries
if TYPE_CHECKING:
@@ -190,7 +192,7 @@ class Task(ABC):
* Valid str options depend on different tasks.
For classification tasks, valid choices are
["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified.
For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group'].
"auto" -> uniform.
For time series forecast tasks, must be "auto" or 'time'.
For ranking task, must be "auto" or 'group'.

View File

@@ -2,26 +2,25 @@ import logging
import time
from typing import List
import numpy as np
import pandas as pd
from scipy.sparse import issparse
from sklearn.model_selection import (
GroupKFold,
TimeSeriesSplit,
)
from flaml.automl.ml import default_cv_score_agg_func, get_val_loss
from flaml.automl.task.task import (
TS_FORECAST,
TS_FORECASTPANEL,
Task,
get_classification_objective,
)
from flaml.automl.time_series.ts_data import (
DataTransformerTS,
TimeSeriesDataset,
normalize_ts_data,
)
logger = logging.getLogger(__name__)
@@ -33,18 +32,24 @@ class TimeSeriesTask(Task):
if self._estimators is None:
# put this into a function to avoid circular dependency
from flaml.automl.time_series import (
ARIMA,
LGBM_TS,
RF_TS,
SARIMAX,
Average,
CatBoost_TS,
ExtraTrees_TS,
HoltWinters,
LassoLars_TS,
Naive,
Orbit,
Prophet,
SeasonalAverage,
SeasonalNaive,
TCNEstimator,
TemporalFusionTransformerEstimator,
XGBoost_TS,
XGBoostLimitDepth_TS,
)
self._estimators = {
@@ -58,8 +63,19 @@ class TimeSeriesTask(Task):
"holt-winters": HoltWinters,
"catboost": CatBoost_TS,
"tft": TemporalFusionTransformerEstimator,
"lassolars": LassoLars_TS,
"tcn": TCNEstimator,
"snaive": SeasonalNaive,
"naive": Naive,
"savg": SeasonalAverage,
"avg": Average,
}
if self._estimators["tcn"] is None:
# remove TCN if import failed
del self._estimators["tcn"]
logger.info("Couldn't import pytorch_lightning, skipping TCN estimator")
try:
from prophet import Prophet as foo
@@ -72,7 +88,7 @@ class TimeSeriesTask(Task):
self._estimators["orbit"] = Orbit
except ImportError:
logger.info("Couldn't import orbit, skipping")
return self._estimators
@@ -135,7 +151,7 @@ class TimeSeriesTask(Task):
raise ValueError("Must supply either X_train_all and y_train_all, or dataframe and label")
try:
dataframe.loc[:, self.time_col] = pd.to_datetime(dataframe[self.time_col])
except Exception:
raise ValueError(
f"For '{TS_FORECAST}' task, time column {self.time_col} must contain timestamp values."
@@ -370,9 +386,8 @@ class TimeSeriesTask(Task):
return X
def preprocess(self, X, transformer=None):
if isinstance(X, (pd.DataFrame, np.ndarray, pd.Series)):
X = normalize_ts_data(X.copy(), self.target_names, self.time_col)
return self._preprocess(X, transformer)
elif isinstance(X, int):
return X
@@ -513,7 +528,7 @@ def remove_ts_duplicates(
duplicates = X.duplicated()
if any(duplicates):
logger.warning("Duplicate timestamp values found in timestamp column. " f"\n{X.loc[duplicates, time_col]}")
X = X.drop_duplicates()
logger.warning("Removed duplicate rows based on all columns")
assert (

View File

@@ -1,17 +1,27 @@
from .tft import TemporalFusionTransformerEstimator
from .ts_model import (
ARIMA,
LGBM_TS,
RF_TS,
SARIMAX,
Average,
CatBoost_TS,
ExtraTrees_TS,
HoltWinters,
LassoLars_TS,
Naive,
Orbit,
Prophet,
SeasonalAverage,
SeasonalNaive,
TimeSeriesEstimator,
XGBoost_TS,
XGBoostLimitDepth_TS,
)
try:
from .tcn import TCNEstimator
except ImportError:
TCNEstimator = None
from .ts_data import TimeSeriesDataset

View File

@@ -1,5 +1,5 @@
import datetime
import math
from functools import lru_cache
import pandas as pd

View File

@@ -12,29 +12,35 @@ except ImportError:
DataFrame = Series = None
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):
"""Transform input data X, y into autoregressive form - shift
them appropriately based on horizon and create `lags` columns.
"""Transform input data X, y into autoregressive form by creating `lags` columns.
This function is called automatically by FLAML during the training process
to convert time series data into a format suitable for sklearn-based regression
models (e.g., lgbm, rf, xgboost). Users do NOT need to manually call this function
or create lagged features themselves.
Parameters
----------
X : pandas.DataFrame
Input feature DataFrame, which may contain temporal features and/or exogenous variables.
y : array_like, (1d)
Target vector (time series values to forecast).
lags : int
Number of lagged time steps to use as features.
Returns
-------
pandas.DataFrame
Shifted dataframe with `lags` columns for each original feature.
The target variable y is also lagged to prevent data leakage
(i.e., we use y(t-1), y(t-2), ..., y(t-lags) to predict y(t)).
"""
lag_features = []
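As the docstring explains, lagged copies of the features (and of y itself) turn forecasting into tabular regression. A minimal pandas sketch of the idea, independent of the FLAML helper and using `Series.shift` on toy data:

```python
import pandas as pd

def simple_lag_features(y: pd.Series, lags: int) -> pd.DataFrame:
    # column "lag_k" holds y shifted forward by k steps, i.e. y(t-k)
    frame = pd.DataFrame({f"lag_{k}": y.shift(k) for k in range(1, lags + 1)})
    return frame.dropna()  # the first `lags` rows have no full history

y = pd.Series([10, 20, 30, 40, 50])
feats = simple_lag_features(y, lags=2)
assert list(feats.columns) == ["lag_1", "lag_2"]
assert feats.iloc[0].tolist() == [20.0, 10.0]  # at t=2: y(1)=20, y(0)=10
```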
@@ -55,6 +61,17 @@ def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):
class SklearnWrapper:
"""Wrapper class for using sklearn-based models for time series forecasting.
This wrapper automatically handles the transformation of time series data into
a supervised learning format by creating lagged features. It trains separate
models for each step in the forecast horizon.
Users typically don't interact with this class directly - it's used internally
by FLAML when sklearn-based estimators (lgbm, rf, xgboost, etc.) are selected
for time series forecasting tasks.
"""
def __init__(
self,
model_class: type,
@@ -76,6 +93,8 @@ class SklearnWrapper:
self.pca = None
def fit(self, X: pd.DataFrame, y: pd.Series, **kwargs):
if "is_retrain" in kwargs:
kwargs.pop("is_retrain")
self._X = X
self._y = y
@@ -92,7 +111,14 @@ class SklearnWrapper:
for i, model in enumerate(self.models):
offset = i + self.lags
if len(X) - offset > 2:
# series of length 2 trigger sklearn's "All features are either constant or ignored" error
# TODO: see why the non-constant features are ignored. Selector?
model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
elif len(X) > offset and "catboost" not in str(model).lower():
model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
else:
print("[INFO]: Data length should be longer than period + lags.")
return self
def predict(self, X, X_train=None, y_train=None):

View File

@@ -0,0 +1,286 @@
# This file is adapted from
# https://github.com/locuslab/TCN/blob/master/TCN/tcn.py
# https://github.com/locuslab/TCN/blob/master/TCN/adding_problem/add_test.py
import datetime
import logging
import time
import pandas as pd
import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.optim as optim
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
from torch.nn.utils import weight_norm
from torch.utils.data import DataLoader, TensorDataset
from flaml import tune
from flaml.automl.data import add_time_idx_col
from flaml.automl.logger import logger, logger_formatter
from flaml.automl.time_series.ts_data import TimeSeriesDataset
from flaml.automl.time_series.ts_model import TimeSeriesEstimator
class Chomp1d(nn.Module):
def __init__(self, chomp_size):
super().__init__()
self.chomp_size = chomp_size
def forward(self, x):
return x[:, :, : -self.chomp_size].contiguous()
class TemporalBlock(nn.Module):
def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout=0.2):
super().__init__()
self.conv1 = weight_norm(
nn.Conv1d(n_inputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation)
)
self.chomp1 = Chomp1d(padding)
self.relu1 = nn.ReLU()
self.dropout1 = nn.Dropout(dropout)
self.conv2 = weight_norm(
nn.Conv1d(n_outputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation)
)
self.chomp2 = Chomp1d(padding)
self.relu2 = nn.ReLU()
self.dropout2 = nn.Dropout(dropout)
self.net = nn.Sequential(
self.conv1, self.chomp1, self.relu1, self.dropout1, self.conv2, self.chomp2, self.relu2, self.dropout2
)
self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
self.relu = nn.ReLU()
self.init_weights()
def init_weights(self):
self.conv1.weight.data.normal_(0, 0.01)
self.conv2.weight.data.normal_(0, 0.01)
if self.downsample is not None:
self.downsample.weight.data.normal_(0, 0.01)
def forward(self, x):
out = self.net(x)
res = x if self.downsample is None else self.downsample(x)
return self.relu(out + res)
class TCNForecaster(nn.Module):
def __init__(
self,
input_feature_num,
num_outputs,
num_channels,
kernel_size=2,
dropout=0.2,
):
super().__init__()
layers = []
num_levels = len(num_channels)
for i in range(num_levels):
dilation_size = 2**i
in_channels = input_feature_num if i == 0 else num_channels[i - 1]
out_channels = num_channels[i]
layers += [
TemporalBlock(
in_channels,
out_channels,
kernel_size,
stride=1,
dilation=dilation_size,
padding=(kernel_size - 1) * dilation_size,
dropout=dropout,
)
]
self.network = nn.Sequential(*layers)
self.linear = nn.Linear(num_channels[-1], num_outputs)
def forward(self, x):
y1 = self.network(x)
return self.linear(y1[:, :, -1])
class TCNForecasterLightningModule(pl.LightningModule):
def __init__(self, model: TCNForecaster, learning_rate: float = 1e-3):
super().__init__()
self.model = model
self.learning_rate = learning_rate
self.loss_fn = nn.MSELoss()
def forward(self, x):
return self.model(x)
def step(self, batch, batch_idx):
x, y = batch
y_hat = self.model(x)
loss = self.loss_fn(y_hat, y)
return loss
def training_step(self, batch, batch_idx):
loss = self.step(batch, batch_idx)
self.log("train_loss", loss)
return loss
def validation_step(self, batch, batch_idx):
loss = self.step(batch, batch_idx)
self.log("val_loss", loss)
return loss
def configure_optimizers(self):
return torch.optim.Adam(self.parameters(), lr=self.learning_rate)
class DataframeDataset(torch.utils.data.Dataset):
def __init__(self, dataframe, target_column, features_columns, sequence_length, train=True):
self.data = torch.tensor(dataframe[features_columns].to_numpy(), dtype=torch.float)
self.sequence_length = sequence_length
if train:
self.labels = torch.tensor(dataframe[target_column].to_numpy(), dtype=torch.float)
self.is_train = train
def __len__(self):
return len(self.data) - self.sequence_length + 1
def __getitem__(self, idx):
data = self.data[idx : idx + self.sequence_length]
data = data.permute(1, 0)
if self.is_train:
label = self.labels[idx : idx + self.sequence_length]
return data, label
else:
return data
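`DataframeDataset` slices rolling windows out of the feature matrix: with `N` rows and window length `L` it yields `N - L + 1` samples, window `i` covering rows `i .. i+L-1` (then permuted to channels-first). The indexing scheme on a plain list:

```python
def windows(rows, sequence_length):
    n_samples = len(rows) - sequence_length + 1  # __len__
    return [rows[i : i + sequence_length] for i in range(n_samples)]  # __getitem__

print(windows([10, 11, 12, 13, 14], 3))  # [[10, 11, 12], [11, 12, 13], [12, 13, 14]]
```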
class TCNEstimator(TimeSeriesEstimator):
"""The class for tuning TCN Forecaster"""
@classmethod
def search_space(cls, data, task, pred_horizon, **params):
space = {
"num_levels": {
"domain": tune.randint(lower=4, upper=20), # hidden = 2^num_hidden
"init_value": 4,
},
"num_hidden": {
"domain": tune.randint(lower=4, upper=8), # hidden = 2^num_hidden
"init_value": 5,
},
"kernel_size": {
"domain": tune.choice([2, 3, 5, 7]), # common choices for kernel size
"init_value": 3,
},
"dropout": {
"domain": tune.uniform(lower=0.0, upper=0.5), # standard range for dropout
"init_value": 0.1,
},
"learning_rate": {
"domain": tune.loguniform(lower=1e-4, upper=1e-1), # typical range for learning rate
"init_value": 1e-3,
},
}
return space
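`fit()` below expands the two sampled integers into the channel layout passed to `TCNForecaster`: every level gets `2**num_hidden` channels, so the tuned width stays a power of two:

```python
def channels_from_config(num_hidden, num_levels):
    # mirrors [2 ** self.params["num_hidden"]] * self.params["num_levels"]
    return [2 ** num_hidden] * num_levels

print(channels_from_config(num_hidden=5, num_levels=4))  # [32, 32, 32, 32]
```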
def __init__(self, task="ts_forecast", n_jobs=1, **params):
super().__init__(task, **params)
logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
def fit(self, X_train: TimeSeriesDataset, y_train=None, budget=None, **kwargs):
start_time = time.time()
if budget is not None:
deltabudget = datetime.timedelta(seconds=budget)
else:
deltabudget = None
X_train = self.enrich(X_train)
super().fit(X_train, y_train, budget, **kwargs)
self.batch_size = kwargs.get("batch_size", 64)
self.horizon = kwargs.get("period", 1)
self.feature_cols = X_train.time_varying_known_reals
self.target_col = X_train.target_names[0]
train_dataset = DataframeDataset(
X_train.train_data,
self.target_col,
self.feature_cols,
self.horizon,
)
train_loader = DataLoader(train_dataset, batch_size=self.batch_size, shuffle=False)
if not X_train.test_data.empty:
val_dataset = DataframeDataset(
X_train.test_data,
self.target_col,
self.feature_cols,
self.horizon,
)
else:
val_dataset = DataframeDataset(
X_train.train_data.sample(frac=0.2, random_state=kwargs.get("random_state", 0)),
self.target_col,
self.feature_cols,
self.horizon,
)
val_loader = DataLoader(val_dataset, batch_size=self.batch_size, shuffle=False)
model = TCNForecaster(
len(self.feature_cols),
self.horizon,
[2 ** self.params["num_hidden"]] * self.params["num_levels"],
self.params["kernel_size"],
self.params["dropout"],
)
pl_module = TCNForecasterLightningModule(model, self.params["learning_rate"])
        # Training loop.
        # `gpus` was deprecated in Lightning v1.7 and removed in v2.0;
        # accelerator="auto" handles all hardware configurations.
trainer = pl.Trainer(
max_epochs=kwargs.get("max_epochs", 10),
accelerator="auto",
callbacks=[
EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min"),
LearningRateMonitor(),
],
logger=TensorBoardLogger(kwargs.get("log_dir", "logs/lightning_logs")), # logging results to a tensorboard
max_time=deltabudget,
enable_model_summary=False,
enable_progress_bar=False,
)
trainer.fit(
pl_module,
train_dataloaders=train_loader,
val_dataloaders=val_loader,
)
best_model = trainer.model
self._model = best_model
train_time = time.time() - start_time
return train_time
def predict(self, X):
X = self.enrich(X)
if isinstance(X, TimeSeriesDataset):
# Use X_train if X_val is empty (e.g., when computing training metrics)
df = X.X_val if len(X.test_data) > 0 else X.X_train
else:
df = X
dataset = DataframeDataset(
df,
self.target_col,
self.feature_cols,
self.horizon,
train=False,
)
data_loader = DataLoader(dataset, batch_size=self.batch_size, shuffle=False)
self._model.eval()
raw_preds = []
for batch_x in data_loader:
raw_pred = self._model(batch_x)
raw_preds.append(raw_pred)
raw_preds = torch.cat(raw_preds, dim=0)
preds = pd.Series(raw_preds.detach().numpy().ravel())
return preds


@@ -1,3 +1,4 @@
import inspect
import time
try:
@@ -105,12 +106,18 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
def fit(self, X_train, y_train, budget=None, **kwargs):
import warnings
import pytorch_lightning as pl
try:
import lightning.pytorch as pl
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
from lightning.pytorch.loggers import TensorBoardLogger
except ImportError:
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
import torch
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
    # a bit of monkey patching to fix the macOS test;
    # all the log_prediction method appears to do is plot, which seems to break GitHub tests
@@ -131,12 +138,26 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
lr_logger = LearningRateMonitor() # log the learning rate
logger = TensorBoardLogger(kwargs.get("log_dir", "lightning_logs")) # logging results to a tensorboard
default_trainer_kwargs = dict(
gpus=self._kwargs.get("gpu_per_trial", [0]) if torch.cuda.is_available() else None,
max_epochs=max_epochs,
gradient_clip_val=gradient_clip_val,
callbacks=[lr_logger, early_stop_callback],
logger=logger,
)
# PyTorch Lightning >=2.0 replaced `gpus` with `accelerator`/`devices`.
# Also, passing `gpus=None` is not accepted on newer versions.
trainer_sig_params = inspect.signature(pl.Trainer.__init__).parameters
if torch.cuda.is_available() and "gpus" in trainer_sig_params:
gpus = self._kwargs.get("gpu_per_trial", None)
if gpus is not None:
default_trainer_kwargs["gpus"] = gpus
elif torch.cuda.is_available() and "devices" in trainer_sig_params:
devices = self._kwargs.get("gpu_per_trial", None)
if devices == -1:
devices = "auto"
if devices is not None:
default_trainer_kwargs["accelerator"] = "gpu"
default_trainer_kwargs["devices"] = devices
trainer = pl.Trainer(
**default_trainer_kwargs,
)
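The signature probe generalizes: any constructor argument that exists only on some library versions can be gated on `inspect.signature`. A self-contained sketch with two hypothetical trainer classes standing in for Lightning before and after 2.0 (the real code inspects `pl.Trainer.__init__` the same way):

```python
import inspect

class OldTrainer:  # hypothetical stand-in for Lightning < 2.0
    def __init__(self, max_epochs=10, gpus=None):
        self.kwargs = dict(max_epochs=max_epochs, gpus=gpus)

class NewTrainer:  # hypothetical stand-in for Lightning >= 2.0
    def __init__(self, max_epochs=10, accelerator="auto", devices="auto"):
        self.kwargs = dict(max_epochs=max_epochs, accelerator=accelerator, devices=devices)

def gpu_kwargs(trainer_cls, gpu_per_trial):
    # pick version-appropriate GPU arguments by probing the constructor signature
    params = inspect.signature(trainer_cls.__init__).parameters
    if "gpus" in params:
        return {"gpus": gpu_per_trial}
    if "devices" in params:
        devices = "auto" if gpu_per_trial == -1 else gpu_per_trial
        return {"accelerator": "gpu", "devices": devices}
    return {}

print(gpu_kwargs(OldTrainer, 1))   # {'gpus': 1}
print(gpu_kwargs(NewTrainer, -1))  # {'accelerator': 'gpu', 'devices': 'auto'}
```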
@@ -156,7 +177,14 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
val_dataloaders=val_dataloader,
)
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
# PyTorch 2.6 changed `torch.load` default `weights_only` from False -> True.
# Some Lightning checkpoints (including those produced here) can require full unpickling.
# This path is generated locally during training, so it's trusted.
load_sig_params = inspect.signature(TemporalFusionTransformer.load_from_checkpoint).parameters
if "weights_only" in load_sig_params:
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path, weights_only=False)
else:
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
train_time = time.time() - current_time
self._model = best_tft
return train_time
@@ -169,7 +197,11 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
last_data_cols = self.group_ids.copy()
last_data_cols.append(self.target_names[0])
last_data = self.data[lambda x: x.time_idx == x.time_idx.max()][last_data_cols]
decoder_data = X.X_val if isinstance(X, TimeSeriesDataset) else X
# Use X_train if test_data is empty (e.g., when computing training metrics)
if isinstance(X, TimeSeriesDataset):
decoder_data = X.X_val if len(X.test_data) > 0 else X.X_train
else:
decoder_data = X
if "time_idx" not in decoder_data:
decoder_data = add_time_idx_col(decoder_data)
decoder_data["time_idx"] += encoder_data["time_idx"].max() + 1 - decoder_data["time_idx"].min()


@@ -2,17 +2,18 @@ import copy
import datetime
import math
from dataclasses import dataclass, field
from typing import List, Optional, Callable, Dict, Generator, Union
from typing import Callable, Dict, Generator, List, Optional, Union
import numpy as np
try:
import pandas as pd
from pandas import DataFrame, Series, to_datetime
from pandas.api.types import is_datetime64_any_dtype
from scipy.sparse import issparse
from sklearn.preprocessing import LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder
from .feature import monthly_fourier_features
except ImportError:
@@ -26,6 +27,8 @@ except ImportError:
DataFrame = Series = None
# dataclass would drop an empty default value even with field(default_factory=lambda: []),
# so use default=None to keep the attribute in place.
@dataclass
class TimeSeriesDataset:
train_data: pd.DataFrame
@@ -34,10 +37,10 @@ class TimeSeriesDataset:
target_names: List[str]
frequency: str
test_data: pd.DataFrame
time_varying_known_categoricals: List[str] = field(default_factory=lambda: [])
time_varying_known_reals: List[str] = field(default_factory=lambda: [])
time_varying_unknown_categoricals: List[str] = field(default_factory=lambda: [])
time_varying_unknown_reals: List[str] = field(default_factory=lambda: [])
time_varying_known_categoricals: List[str] = field(default=None)
time_varying_known_reals: List[str] = field(default=None)
time_varying_unknown_categoricals: List[str] = field(default=None)
time_varying_unknown_reals: List[str] = field(default=None)
def __init__(
self,
@@ -118,7 +121,12 @@ class TimeSeriesDataset:
@property
def X_all(self) -> pd.DataFrame:
return pd.concat([self.X_train, self.X_val], axis=0)
# Remove empty or all-NA columns before concatenation
X_train_filtered = self.X_train.dropna(axis=1, how="all")
X_val_filtered = self.X_val.dropna(axis=1, how="all")
# Concatenate the filtered DataFrames
return pd.concat([X_train_filtered, X_val_filtered], axis=0)
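The all-NA filter mirrors `DataFrame.dropna(axis=1, how="all")`; newer pandas has deprecated silently including empty or all-NA columns in `concat`, hence the pre-filter. The same filter on a dict-of-lists stand-in for a frame:

```python
def drop_all_na_columns(frame):
    # keep a column only if it has at least one non-missing value
    return {col: vals for col, vals in frame.items() if any(v is not None for v in vals)}

frame = {"a": [1, 2], "b": [None, None], "c": [None, 3]}
print(drop_all_na_columns(frame))  # {'a': [1, 2], 'c': [None, 3]}
```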
@property
def y_train(self) -> pd.DataFrame:
@@ -390,8 +398,17 @@ class DataTransformerTS:
assert len(self.num_columns) == 0, "Trying to call fit() twice, something is wrong"
for column in X.columns:
# Never treat the time column as a feature for sklearn preprocessing
if column == self.time_col:
continue
# Robust datetime detection (covers datetime64[ms/us/ns], tz-aware, etc.)
if is_datetime64_any_dtype(X[column]):
self.datetime_columns.append(column)
continue
# sklearn/utils/validation.py needs int/float values
if X[column].dtype.name in ("object", "category"):
if X[column].dtype.name in ("object", "category", "string"):
if (
# drop columns where all values are the same
X[column].nunique() == 1
@@ -403,7 +420,7 @@ class DataTransformerTS:
self.cat_columns.append(column)
elif X[column].nunique(dropna=True) < 2:
self.drop_columns.append(column)
elif X[column].dtype.name == "datetime64[ns]":
elif X[column].dtype.name in ["datetime64[ns]", "datetime64[s]"]:
pass # these will be processed at model level,
# so they can also be done in the predict method
else:
@@ -460,7 +477,7 @@ class DataTransformerTS:
if "__NAN__" not in X[col].cat.categories:
X[col] = X[col].cat.add_categories("__NAN__").fillna("__NAN__")
else:
X[col] = X[col].fillna("__NAN__")
X[col] = X[col].fillna("__NAN__").infer_objects(copy=False)
X[col] = X[col].astype("category")
for column in self.num_columns:
@@ -529,14 +546,12 @@ def normalize_ts_data(X_train_all, target_names, time_col, y_train_all=None):
def validate_data_basic(X_train_all, y_train_all):
assert isinstance(X_train_all, np.ndarray) or issparse(X_train_all) or isinstance(X_train_all, pd.DataFrame), (
"X_train_all must be a numpy array, a pandas dataframe, " "or Scipy sparse matrix."
)
assert isinstance(X_train_all, (np.ndarray, DataFrame)) or issparse(
X_train_all
), "X_train_all must be a numpy array, a pandas dataframe, or Scipy sparse matrix."
assert (
isinstance(y_train_all, np.ndarray)
or isinstance(y_train_all, pd.Series)
or isinstance(y_train_all, pd.DataFrame)
assert isinstance(
y_train_all, (np.ndarray, pd.Series, pd.DataFrame)
), "y_train_all must be a numpy array or a pandas series or DataFrame."
assert X_train_all.size != 0 and y_train_all.size != 0, "Input data must not be empty, use None if no data"


@@ -1,8 +1,8 @@
import time
import logging
import os
from datetime import datetime
import math
import os
import time
from datetime import datetime
from typing import List, Optional, Union
try:
@@ -22,26 +22,27 @@ except ImportError:
import numpy as np
from flaml import tune
from flaml.model import (
suppress_stdout_stderr,
SKLearnEstimator,
logger,
LGBMEstimator,
XGBoostSklearnEstimator,
RandomForestEstimator,
ExtraTreesEstimator,
XGBoostLimitDepthEstimator,
from flaml.automl.data import TS_TIMESTAMP_COL, TS_VALUE_COL
from flaml.automl.model import (
CatBoostEstimator,
)
from flaml.data import TS_TIMESTAMP_COL, TS_VALUE_COL
from flaml.automl.time_series.ts_data import (
TimeSeriesDataset,
enrich_dataset,
enrich_dataframe,
normalize_ts_data,
create_forward_frame,
ExtraTreesEstimator,
LassoLarsEstimator,
LGBMEstimator,
RandomForestEstimator,
SKLearnEstimator,
XGBoostLimitDepthEstimator,
XGBoostSklearnEstimator,
logger,
suppress_stdout_stderr,
)
from flaml.automl.task import Task
from flaml.automl.time_series.ts_data import (
TimeSeriesDataset,
create_forward_frame,
enrich_dataframe,
enrich_dataset,
normalize_ts_data,
)
class TimeSeriesEstimator(SKLearnEstimator):
@@ -143,6 +144,7 @@ class TimeSeriesEstimator(SKLearnEstimator):
def score(self, X_val: DataFrame, y_val: Series, **kwargs):
from sklearn.metrics import r2_score
from ..ml import metric_loss_score
y_pred = self.predict(X_val, **kwargs)
@@ -192,7 +194,13 @@ class Orbit(TimeSeriesEstimator):
elif isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[[self.time_col] + X.regressors]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
if self._model is not None:
forecast = self._model.predict(X, **kwargs)
@@ -299,7 +307,13 @@ class Prophet(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[data.regressors + [data.time_col]]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
X = X.rename(columns={self.time_col: "ds"})
if self._model is not None:
@@ -325,11 +339,19 @@ class StatsModelsEstimator(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[data.regressors + [data.time_col]]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
else:
X = X[self.regressors + [self.time_col]]
if isinstance(X, DataFrame):
if X.shape[0] == 0:
return pd.Series([], name=self.target_names[0], dtype=float)
start = X[self.time_col].iloc[0]
end = X[self.time_col].iloc[-1]
if len(self.regressors):
@@ -610,15 +632,13 @@ class HoltWinters(StatsModelsEstimator):
): # this would prevent heuristic initialization to work properly
self.params["seasonal"] = None
if (
self.params["seasonal"] == "mul" and (train_df.y == 0).sum() > 0
self.params["seasonal"] == "mul" and (train_df[target_col] == 0).sum() > 0
): # cannot have multiplicative seasonality in this case
self.params["seasonal"] = "add"
if self.params["trend"] == "mul" and (train_df.y == 0).sum() > 0:
if self.params["trend"] == "mul" and (train_df[target_col] == 0).sum() > 0:
self.params["trend"] = "add"
if not self.params["seasonal"] or self.params["trend"] not in ["mul", "add"]:
self.params["damped_trend"] = False
model = HWExponentialSmoothing(
train_df[[target_col]],
damped_trend=self.params["damped_trend"],
@@ -632,6 +652,125 @@ class HoltWinters(StatsModelsEstimator):
return train_time
class SimpleForecaster(StatsModelsEstimator):
"""Base class for Naive Forecaster like Seasonal Naive, Naive, Seasonal Average, Average"""
@classmethod
def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
return {
"season": {
"domain": tune.randint(1, pred_horizon),
"init_value": pred_horizon,
}
}
def joint_preprocess(self, X_train, y_train=None):
X_train = self.enrich(X_train)
self.regressors = []
if isinstance(X_train, TimeSeriesDataset):
data = X_train
target_col = data.target_names[0]
# this class only supports univariate regression
train_df = data.train_data[self.regressors + [target_col]]
train_df.index = to_datetime(data.train_data[data.time_col])
        else:
            target_col = TS_VALUE_COL
            train_df = self._join(X_train, y_train)
        if isinstance(X_train, TimeSeriesDataset):
            # time_col/target_names are only available on TimeSeriesDataset inputs
            self.time_col = X_train.time_col
            self.target_names = X_train.target_names
train_df = self._preprocess(train_df)
return train_df, target_col
def fit(self, X_train, y_train=None, budget=None, **kwargs):
import warnings
warnings.filterwarnings("ignore")
from statsmodels.tsa.holtwinters import SimpleExpSmoothing
self.season = self.params.get("season", 1)
current_time = time.time()
super().fit(X_train, y_train, budget=budget, **kwargs)
train_df, target_col = self.joint_preprocess(X_train, y_train)
model = SimpleExpSmoothing(
train_df[[target_col]],
)
with suppress_stdout_stderr():
model = model.fit(smoothing_level=self.smoothing_level)
train_time = time.time() - current_time
self._model = model
return train_time
class SeasonalNaive(SimpleForecaster):
smoothing_level = 1.0
def predict(self, X, **kwargs):
if isinstance(X, int):
forecasts = []
for i in range(X):
forecast = self._model.forecast(steps=self.season)[0]
forecasts.append(forecast)
return pd.Series(forecasts)
else:
return super().predict(X, **kwargs)
class Naive(SimpleForecaster):
smoothing_level = 0.0
@classmethod
def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
return {}
def predict(self, X, **kwargs):
if isinstance(X, int):
last_observation = self._model.params["initial_level"]
return pd.Series([last_observation] * X)
else:
return super().predict(X, **kwargs)
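The two `smoothing_level` settings above encode the two naive strategies: with `alpha = 1.0` the fitted level of simple exponential smoothing tracks the last observation (seasonal naive), and with `alpha = 0.0` it never moves off the initial level (naive). A pure-Python sketch of the recursion `level = alpha * y + (1 - alpha) * level`:

```python
def exp_smooth_level(series, alpha, initial_level=None):
    # simple exponential smoothing: the final level is the one-step forecast
    level = series[0] if initial_level is None else initial_level
    for y in series:
        level = alpha * y + (1 - alpha) * level
    return level

series = [5.0, 7.0, 9.0]
print(exp_smooth_level(series, alpha=1.0))  # 9.0 -> forecast = last value
print(exp_smooth_level(series, alpha=0.0))  # 5.0 -> forecast = initial level
```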
class SeasonalAverage(SimpleForecaster):
def fit(self, X_train, y_train=None, budget=None, **kwargs):
from statsmodels.tsa.ar_model import AutoReg, ar_select_order
start_time = time.time()
self.season = kwargs.get("season", 1) # seasonality period
train_df, target_col = self.joint_preprocess(X_train, y_train)
selection_res = ar_select_order(train_df[target_col], maxlag=self.season)
# Fit autoregressive model with optimal order
model = AutoReg(train_df[target_col], lags=selection_res.ar_lags)
self._model = model.fit()
end_time = time.time()
return end_time - start_time
class Average(SimpleForecaster):
@classmethod
def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
return {}
def fit(self, X_train, y_train=None, budget=None, **kwargs):
from statsmodels.tsa.ar_model import AutoReg
start_time = time.time()
train_df, target_col = self.joint_preprocess(X_train, y_train)
model = AutoReg(train_df[target_col], lags=0)
self._model = model.fit()
end_time = time.time()
return end_time - start_time
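`AutoReg` with `lags=0` fits only an intercept, so the `Average` forecaster predicts the series mean at every horizon. A pure-Python sketch of that behavior:

```python
def average_forecast(series, steps):
    # lags=0 autoregression reduces to forecasting the historical mean
    mean = sum(series) / len(series)
    return [mean] * steps

print(average_forecast([2.0, 4.0, 6.0], steps=2))  # [4.0, 4.0]
```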
class TS_SKLearn(TimeSeriesEstimator):
"""The class for tuning SKLearn Regressors for time-series forecasting"""
@@ -710,6 +849,13 @@ class TS_SKLearn(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data
else:
X = data.train_data
if self._model is not None:
X = X[self.regressors]
@@ -758,3 +904,7 @@ class XGBoostLimitDepth_TS(TS_SKLearn):
# catboost regressor is invalid because it has a `name` parameter, making it incompatible with hcrystalball
class CatBoost_TS(TS_SKLearn):
base_class = CatBoostEstimator
class LassoLars_TS(TS_SKLearn):
base_class = LassoLarsEstimator


@@ -4,14 +4,14 @@
"""
import json
from typing import IO
from contextlib import contextmanager
import logging
from contextlib import contextmanager
from typing import IO
logger = logging.getLogger("flaml.automl")
class TrainingLogRecord(object):
class TrainingLogRecord:
def __init__(
self,
record_id: int,
@@ -52,7 +52,7 @@ class TrainingLogCheckPoint(TrainingLogRecord):
self.curr_best_record_id = curr_best_record_id
class TrainingLogWriter(object):
class TrainingLogWriter:
def __init__(self, output_filename: str):
self.output_filename = output_filename
self.file = None
@@ -79,7 +79,7 @@ class TrainingLogWriter(object):
sample_size,
):
if self.file is None:
raise IOError("Call open() to open the output file first.")
raise OSError("Call open() to open the output file first.")
if validation_loss is None:
raise ValueError("TEST LOSS NONE ERROR!!!")
record = TrainingLogRecord(
@@ -109,7 +109,7 @@ class TrainingLogWriter(object):
def checkpoint(self):
if self.file is None:
raise IOError("Call open() to open the output file first.")
raise OSError("Call open() to open the output file first.")
if self.current_best_loss_record_id is None:
logger.warning("flaml.training_log: checkpoint() called before any record is written, skipped.")
return
@@ -124,7 +124,7 @@ class TrainingLogWriter(object):
self.file = None # for pickle
class TrainingLogReader(object):
class TrainingLogReader:
def __init__(self, filename: str):
self.filename = filename
self.file = None
@@ -134,7 +134,7 @@ class TrainingLogReader(object):
def records(self):
if self.file is None:
raise IOError("Call open() before reading log file.")
raise OSError("Call open() before reading log file.")
for line in self.file:
data = json.loads(line)
if len(data) == 1:
@@ -149,7 +149,7 @@ class TrainingLogReader(object):
def get_record(self, record_id) -> TrainingLogRecord:
if self.file is None:
raise IOError("Call open() before reading log file.")
raise OSError("Call open() before reading log file.")
for rec in self.records():
if rec.record_id == record_id:
return rec
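The reader consumes a JSON-lines log: one JSON object per line, where a single-key line marks a checkpoint rather than a record (the `len(data) == 1` branch above). A minimal sketch of that format and filter:

```python
import io
import json

log = io.StringIO(
    '{"record_id": 0, "validation_loss": 0.5}\n'
    '{"curr_best_record_id": 0}\n'  # checkpoint line: single key, skipped
    '{"record_id": 1, "validation_loss": 0.4}\n'
)
records = [d for d in map(json.loads, log) if len(d) > 1]
print([r["record_id"] for r in records])  # [0, 1]
```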


@@ -1,9 +0,0 @@
import warnings
from flaml.automl.data import *
warnings.warn(
"Importing from `flaml.data` is deprecated. Please use `flaml.automl.data`.",
DeprecationWarning,
)


@@ -14,7 +14,6 @@ estimator.fit(X_train, y_train)
estimator.predict(X_test, y_test)
```
1. Use `AutoML.fit()`: set `starting_points="data"` and `max_iter=0`.
```python
@@ -36,10 +35,17 @@ automl.fit(X_train, y_train, **automl_settings)
from flaml.default import preprocess_and_suggest_hyperparams
X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
hyperparams, estimator_class, X_transformed, y_transformed, feature_transformer, label_transformer = preprocess_and_suggest_hyperparams(
"classification", X_train, y_train, "lgbm"
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.33, random_state=42
)
(
hyperparams,
estimator_class,
X_transformed,
y_transformed,
feature_transformer,
label_transformer,
) = preprocess_and_suggest_hyperparams("classification", X_train, y_train, "lgbm")
model = estimator_class(**hyperparams) # estimator_class is LGBMClassifier
model.fit(X_transformed, y_train) # LGBMClassifier can handle raw labels
X_test = feature_transformer.transform(X_test) # preprocess test data
@@ -172,7 +178,7 @@ Change "binary" into "multiclass" or "regression" for the other tasks.
For more technical details, please check our research paper.
* [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. arXiv preprint arXiv:2202.09927 (2022).
- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. arXiv preprint arXiv:2202.09927 (2022).
```bibtex
@article{Kayali2022default,

View File

@@ -1,18 +1,18 @@
from .suggest import (
suggest_config,
suggest_learner,
suggest_hyperparams,
preprocess_and_suggest_hyperparams,
meta_feature,
)
from .estimator import (
flamlize_estimator,
LGBMClassifier,
LGBMRegressor,
XGBClassifier,
XGBRegressor,
RandomForestClassifier,
RandomForestRegressor,
ExtraTreesClassifier,
ExtraTreesRegressor,
LGBMClassifier,
LGBMRegressor,
RandomForestClassifier,
RandomForestRegressor,
XGBClassifier,
XGBRegressor,
flamlize_estimator,
)
from .suggest import (
meta_feature,
preprocess_and_suggest_hyperparams,
suggest_config,
suggest_hyperparams,
suggest_learner,
)


@@ -1,5 +1,7 @@
from functools import wraps
from flaml.automl.task.task import CLASSIFICATION
from .suggest import preprocess_and_suggest_hyperparams
DEFAULT_LOCATION = "default_location"
@@ -93,6 +95,27 @@ def flamlize_estimator(super_class, name: str, task: str, alternatives=None):
def fit(self, X, y, *args, **params):
hyperparams, estimator_name, X, y_transformed = self.suggest_hyperparams(X, y)
self.set_params(**hyperparams)
# Transform eval_set if present
if "eval_set" in params and params["eval_set"] is not None:
transformed_eval_set = []
for eval_X, eval_y in params["eval_set"]:
# Transform features
eval_X_transformed = self._feature_transformer.transform(eval_X)
# Transform labels if applicable
if self._label_transformer and estimator_name in [
"rf",
"extra_tree",
"xgboost",
"xgb_limitdepth",
"choose_xgb",
]:
eval_y_transformed = self._label_transformer.transform(eval_y)
transformed_eval_set.append((eval_X_transformed, eval_y_transformed))
else:
transformed_eval_set.append((eval_X_transformed, eval_y))
params["eval_set"] = transformed_eval_set
if self._label_transformer and estimator_name in [
"rf",
"extra_tree",
@@ -105,7 +128,12 @@ def flamlize_estimator(super_class, name: str, task: str, alternatives=None):
# if hasattr(self, "_classes"):
# self._classes = self._label_transformer.classes_
# else:
self.classes_ = self._label_transformer.classes_
try:
self.classes_ = self._label_transformer.classes_
except AttributeError:
# xgboost 2: AttributeError: can't set attribute
if "xgb" not in estimator_name:
raise
if "xgb" not in estimator_name:
# rf and et would do inverse transform automatically; xgb doesn't
self._label_transformer = None
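The `try/except AttributeError` above guards against estimators where `classes_` is a read-only property (xgboost 2). The pattern, sketched with a hypothetical stand-in class:

```python
class ReadOnlyClasses:  # hypothetical stand-in for an xgboost >= 2 estimator
    @property
    def classes_(self):
        return ["a", "b"]

def set_classes(est, classes):
    try:
        est.classes_ = classes
    except AttributeError:
        pass  # read-only property; the estimator manages classes_ itself

est = ReadOnlyClasses()
set_classes(est, ["x"])
print(est.classes_)  # ['a', 'b'] -- assignment was safely ignored
```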


@@ -1,7 +1,7 @@
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler
from sklearn.metrics import pairwise_distances
from sklearn.preprocessing import RobustScaler
def _augment(row):
@@ -12,7 +12,7 @@ def _augment(row):
def construct_portfolio(regret_matrix, meta_features, regret_bound):
"""The portfolio construction algorithm.
(Reference)[https://arxiv.org/abs/2202.09927].
Reference: [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927).
Args:
regret_matrix: A dataframe of regret matrix.
@@ -32,6 +32,7 @@ def construct_portfolio(regret_matrix, meta_features, regret_bound):
if meta_features is not None:
scaler = RobustScaler()
meta_features = meta_features.loc[tasks]
meta_features = meta_features.astype(float)
meta_features.loc[:, :] = scaler.fit_transform(meta_features)
nearest_task = {}
for t in tasks:


@@ -1,11 +1,13 @@
import pandas as pd
import numpy as np
import argparse
from pathlib import Path
import json
from pathlib import Path
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler
from flaml.default import greedy
from flaml.default.regret import load_result, build_regret
from flaml.default.regret import build_regret, load_result
from flaml.version import __version__
regret_bound = 0.01
@@ -24,6 +26,7 @@ def config_predictor_tuple(tasks, configs, meta_features, regret_matrix):
# pre-processing
scaler = RobustScaler()
meta_features_norm = meta_features.loc[tasks] # this makes a copy
meta_features_norm = meta_features_norm.astype(float)
meta_features_norm.loc[:, :] = scaler.fit_transform(meta_features_norm)
proc = {
@@ -67,7 +70,7 @@ def build_portfolio(meta_features, regret, strategy):
def load_json(filename):
"""Returns the contents of json file filename."""
with open(filename, "r") as f:
with open(filename) as f:
return json.load(f)


@@ -1,5 +1,6 @@
import argparse
from os import path
import pandas as pd


@@ -1,11 +1,13 @@
import numpy as np
import json
import logging
import pathlib
import json
import numpy as np
from flaml.automl.data import DataTransformer
from flaml.automl.task.task import CLASSIFICATION, get_classification_objective
from flaml.automl.task.generic_task import len_labels
from flaml.automl.task.factory import task_factory
from flaml.automl.task.generic_task import len_labels
from flaml.automl.task.task import CLASSIFICATION, get_classification_objective
from flaml.version import __version__
try:
@@ -41,7 +43,7 @@ def meta_feature(task, X_train, y_train, meta_feature_names):
# 'numpy.ndarray' object has no attribute 'select_dtypes'
this_feature.append(1) # all features are numeric
else:
raise ValueError("Feature {} not implemented. ".format(each_feature_name))
raise ValueError(f"Feature {each_feature_name} not implemented. ")
return this_feature
@@ -55,7 +57,7 @@ def load_config_predictor(estimator_name, task, location=None):
task = "multiclass" if task == "multi" else task # TODO: multi -> multiclass?
try:
location = location or LOCATION
with open(f"{location}/{estimator_name}/{task}.json", "r") as f:
with open(f"{location}/{estimator_name}/{task}.json") as f:
CONFIG_PREDICTORS[key] = predictor = json.load(f)
except FileNotFoundError:
raise FileNotFoundError(f"Portfolio has not been built for {estimator_name} on {task} task.")

flaml/fabric/__init__.py (new file, 0 lines)

flaml/fabric/mlflow.py (new file, 1039 lines; diff suppressed because it is too large)


@@ -2,7 +2,6 @@ import warnings
from flaml.automl.ml import *
warnings.warn(
"Importing from `flaml.ml` is deprecated. Please use `flaml.automl.ml`.",
DeprecationWarning,


@@ -1,9 +0,0 @@
import warnings
from flaml.automl.model import *
warnings.warn(
"Importing from `flaml.model` is deprecated. Please use `flaml.automl.model`.",
DeprecationWarning,
)


@@ -1,10 +1,11 @@
# ChaCha for Online AutoML
FLAML includes *ChaCha* which is an automatic hyperparameter tuning solution for online machine learning. Online machine learning has the following properties: (1) data comes in sequential order; and (2) the performance of the machine learning model is evaluated online, i.e., at every iteration. *ChaCha* performs online AutoML respecting the aforementioned properties of online learning, and at the same time respecting the following constraints: (1) only a small constant number of 'live' models are allowed to perform online learning at the same time; and (2) no model persistence or offline training is allowed, which means that once we decide to replace a 'live' model with a new one, the replaced model can no longer be retrieved.
For more technical details about *ChaCha*, please check our paper.
* [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.
- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.
```
@inproceedings{wu2021chacha,
title={ChaCha for online AutoML},
@@ -23,8 +24,9 @@ An example of online namespace interactions tuning in VW:
```python
# require: pip install flaml[vw]
from flaml import AutoVW
'''create an AutoVW instance for tuning namespace interactions'''
autovw = AutoVW(max_live_model_num=5, search_space={'interactions': AutoVW.AUTOMATIC})
"""create an AutoVW instance for tuning namespace interactions"""
autovw = AutoVW(max_live_model_num=5, search_space={"interactions": AutoVW.AUTOMATIC})
```
An example of online tuning of both namespace interactions and learning rate in VW:
@@ -33,12 +35,18 @@ An example of online tuning of both namespace interactions and learning rate in
# require: pip install flaml[vw]
from flaml import AutoVW
from flaml.tune import loguniform
''' create an AutoVW instance for tuning namespace interactions and learning rate'''
""" create an AutoVW instance for tuning namespace interactions and learning rate"""
# set up the search space and init config
search_space_nilr = {'interactions': AutoVW.AUTOMATIC, 'learning_rate': loguniform(lower=2e-10, upper=1.0)}
init_config_nilr = {'interactions': set(), 'learning_rate': 0.5}
search_space_nilr = {
"interactions": AutoVW.AUTOMATIC,
"learning_rate": loguniform(lower=2e-10, upper=1.0),
}
init_config_nilr = {"interactions": set(), "learning_rate": 0.5}
# create an AutoVW instance
autovw = AutoVW(max_live_model_num=5, search_space=search_space_nilr, init_config=init_config_nilr)
autovw = AutoVW(
max_live_model_num=5, search_space=search_space_nilr, init_config=init_config_nilr
)
```
A user can use the resulting AutoVW instances `autovw` in a similar way to a vanilla Vowpal Wabbit instance, i.e., `pyvw.vw`, to perform online learning by iteratively calling its `predict(data_example)` and `learn(data_example)` functions at each data example.
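The predict/learn loop described above can be sketched as follows. AutoVW itself requires `pip install flaml[vw]`, so this sketch substitutes a hypothetical stand-in model exposing the same `predict`/`learn` interface; the stand-in class, its update rule, and the data stream are illustrative assumptions, not part of FLAML.

```python
class _StandInModel:
    """Hypothetical stand-in exposing AutoVW's predict/learn interface."""

    def __init__(self):
        self._weight = 0.0

    def predict(self, data_example):
        # return a prediction for one incoming example
        x, _ = data_example
        return self._weight * x

    def learn(self, data_example):
        # update the model using the revealed label of the same example
        x, y = data_example
        self._weight += 0.1 * (y - self.predict(data_example)) * x


model = _StandInModel()  # with flaml[vw] installed, this would be `autovw`
stream = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy (feature, label) stream
predictions = []
for data_example in stream:
    predictions.append(model.predict(data_example))  # predict first ...
    model.learn(data_example)  # ... then learn, one example at a time
```

The key property of the online setting is visible in the loop order: each example is predicted before its label is used for learning, so performance is evaluated at every iteration.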


@@ -1,16 +1,17 @@
from typing import Optional, Union
import logging
from typing import Optional, Union
from flaml.onlineml import OnlineTrialRunner
from flaml.onlineml.trial import get_ns_feature_dim_from_vw_example
from flaml.tune import (
Trial,
Categorical,
Float,
PolynomialExpansionSet,
Trial,
polynomial_expansion_set,
)
from flaml.onlineml import OnlineTrialRunner
from flaml.tune.scheduler import ChaChaScheduler
from flaml.tune.searcher import ChampionFrontierSearcher
from flaml.onlineml.trial import get_ns_feature_dim_from_vw_example
logger = logging.getLogger(__name__)
@@ -140,7 +141,7 @@ class AutoVW:
max_live_model_num=self._max_live_model_num,
searcher=searcher,
scheduler=scheduler,
**self._automl_runner_args
**self._automl_runner_args,
)
def predict(self, data_sample):


@@ -1,14 +1,16 @@
import numpy as np
import logging
import time
import math
import copy
import collections
import copy
import logging
import math
import time
from typing import Optional, Union
import numpy as np
from flaml.tune import Trial
try:
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.metrics import mean_absolute_error, mean_squared_error
except ImportError:
pass


@@ -1,10 +1,11 @@
import numpy as np
import logging
import math
import numpy as np
from flaml.tune import Trial
from flaml.tune.scheduler import TrialScheduler
import logging
logger = logging.getLogger(__name__)


@@ -5,45 +5,47 @@ It can be used standalone, or together with ray tune or nni. Please find detaile
Below are some quick examples.
* Example for sequential tuning (recommended when compute resource is limited and each trial can consume all the resources):
- Example for sequential tuning (recommended when compute resource is limited and each trial can consume all the resources):
```python
# require: pip install flaml[blendsearch]
from flaml import tune
import time
def evaluate_config(config):
'''evaluate a hyperparameter configuration'''
"""evaluate a hyperparameter configuration"""
# we use a toy example with 2 hyperparameters
metric = (round(config['x'])-85000)**2 - config['x']/config['y']
metric = (round(config["x"]) - 85000) ** 2 - config["x"] / config["y"]
# usually the evaluation takes a non-negligible cost
# and the cost could be related to certain hyperparameters
# in this example, we assume it's proportional to x
time.sleep(config['x']/100000)
time.sleep(config["x"] / 100000)
# use tune.report to report the metric to optimize
tune.report(metric=metric)
analysis = tune.run(
evaluate_config, # the function to evaluate a config
evaluate_config, # the function to evaluate a config
config={
'x': tune.lograndint(lower=1, upper=100000),
'y': tune.randint(lower=1, upper=100000)
}, # the search space
low_cost_partial_config={'x':1}, # a initial (partial) config with low cost
metric='metric', # the name of the metric used for optimization
mode='min', # the optimization mode, 'min' or 'max'
num_samples=-1, # the maximal number of configs to try, -1 means infinite
time_budget_s=60, # the time budget in seconds
local_dir='logs/', # the local directory to store logs
"x": tune.lograndint(lower=1, upper=100000),
"y": tune.randint(lower=1, upper=100000),
}, # the search space
low_cost_partial_config={"x": 1},  # an initial (partial) config with low cost
metric="metric", # the name of the metric used for optimization
mode="min", # the optimization mode, 'min' or 'max'
num_samples=-1, # the maximal number of configs to try, -1 means infinite
time_budget_s=60, # the time budget in seconds
local_dir="logs/", # the local directory to store logs
# verbose=0, # verbosity
# use_ray=True, # uncomment when performing parallel tuning using ray
)
)
print(analysis.best_trial.last_result) # the best trial's result
print(analysis.best_config) # the best config
print(analysis.best_config) # the best config
```
* Example for using ray tune's API:
- Example for using ray tune's API:
```python
# require: pip install flaml[blendsearch,ray]
@@ -51,36 +53,39 @@ from ray import tune as raytune
from flaml import CFO, BlendSearch
import time
def evaluate_config(config):
'''evaluate a hyperparameter configuration'''
"""evaluate a hyperparameter configuration"""
# we use a toy example with 2 hyperparameters
metric = (round(config['x'])-85000)**2 - config['x']/config['y']
metric = (round(config["x"]) - 85000) ** 2 - config["x"] / config["y"]
# usually the evaluation takes a non-negligible cost
# and the cost could be related to certain hyperparameters
# in this example, we assume it's proportional to x
time.sleep(config['x']/100000)
time.sleep(config["x"] / 100000)
# use tune.report to report the metric to optimize
tune.report(metric=metric)
# provide a time budget (in seconds) for the tuning process
time_budget_s = 60
# provide the search space
config_search_space = {
'x': tune.lograndint(lower=1, upper=100000),
'y': tune.randint(lower=1, upper=100000)
}
"x": tune.lograndint(lower=1, upper=100000),
"y": tune.randint(lower=1, upper=100000),
}
# provide the low cost partial config
low_cost_partial_config={'x':1}
low_cost_partial_config = {"x": 1}
# set up CFO
cfo = CFO(low_cost_partial_config=low_cost_partial_config)
# set up BlendSearch
blendsearch = BlendSearch(
metric="metric", mode="min",
metric="metric",
mode="min",
space=config_search_space,
low_cost_partial_config=low_cost_partial_config,
time_budget_s=time_budget_s
time_budget_s=time_budget_s,
)
# NOTE: when using BlendSearch as a search_alg in ray tune, you need to
# configure the 'time_budget_s' for BlendSearch accordingly such that
@@ -89,28 +94,28 @@ blendsearch = BlendSearch(
# automatically in flaml.
analysis = raytune.run(
evaluate_config, # the function to evaluate a config
evaluate_config, # the function to evaluate a config
config=config_search_space,
metric='metric', # the name of the metric used for optimization
mode='min', # the optimization mode, 'min' or 'max'
num_samples=-1, # the maximal number of configs to try, -1 means infinite
time_budget_s=time_budget_s, # the time budget in seconds
local_dir='logs/', # the local directory to store logs
search_alg=blendsearch # or cfo
metric="metric", # the name of the metric used for optimization
mode="min", # the optimization mode, 'min' or 'max'
num_samples=-1, # the maximal number of configs to try, -1 means infinite
time_budget_s=time_budget_s, # the time budget in seconds
local_dir="logs/", # the local directory to store logs
search_alg=blendsearch, # or cfo
)
print(analysis.best_trial.last_result) # the best trial's result
print(analysis.best_config) # the best config
```
* Example for using NNI: An example of using BlendSearch with NNI can be seen in [test](https://github.com/microsoft/FLAML/tree/main/test/nni). CFO can be used as well in a similar manner. To run the example, first make sure you have [NNI](https://nni.readthedocs.io/en/stable/) installed, then run:
- Example for using NNI: An example of using BlendSearch with NNI can be seen in [test](https://github.com/microsoft/FLAML/tree/main/test/nni). CFO can be used as well in a similar manner. To run the example, first make sure you have [NNI](https://nni.readthedocs.io/en/stable/) installed, then run:
```shell
$ nnictl create --config ./config.yml
```
* For more examples, please check out
[notebooks](https://github.com/microsoft/FLAML/tree/main/notebook/).
- For more examples, please check out
[notebooks](https://github.com/microsoft/FLAML/tree/main/notebook/).
`flaml` offers two HPO methods: CFO and BlendSearch.
`flaml.tune` uses BlendSearch by default.
@@ -185,16 +190,16 @@ tune.run(...
)
```
* Recommended scenario: cost-related hyperparameters exist, a low-cost
initial point is known, and the search space is complex such that local search
is prone to be stuck at local optima.
- Recommended scenario: cost-related hyperparameters exist, a low-cost
initial point is known, and the search space is complex such that local search
is prone to be stuck at local optima.
* Suggestion about using larger search space in BlendSearch:
In hyperparameter optimization, a larger search space is desirable because it is more likely to include the optimal configuration (or one of the optimal configurations) in hindsight. However the performance (especially anytime performance) of most existing HPO methods is undesirable if the cost of the configurations in the search space has a large variation. Thus hand-crafted small search spaces (with relatively homogeneous cost) are often used in practice for these methods, which is subject to idiosyncrasy. BlendSearch combines the benefits of local search and global search, which enables a smart (economical) way of deciding where to explore in the search space even though it is larger than necessary. This allows users to specify a larger search space in BlendSearch, which is often easier and a better practice than narrowing down the search space by hand.
- Suggestion about using larger search space in BlendSearch:
In hyperparameter optimization, a larger search space is desirable because it is more likely to include the optimal configuration (or one of the optimal configurations) in hindsight. However the performance (especially anytime performance) of most existing HPO methods is undesirable if the cost of the configurations in the search space has a large variation. Thus hand-crafted small search spaces (with relatively homogeneous cost) are often used in practice for these methods, which is subject to idiosyncrasy. BlendSearch combines the benefits of local search and global search, which enables a smart (economical) way of deciding where to explore in the search space even though it is larger than necessary. This allows users to specify a larger search space in BlendSearch, which is often easier and a better practice than narrowing down the search space by hand.
For more technical details, please check our papers.
* [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
```bibtex
@inproceedings{wu2021cfo,
@@ -205,7 +210,7 @@ For more technical details, please check our papers.
}
```
* [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
```bibtex
@inproceedings{wang2021blendsearch,


@@ -3,16 +3,16 @@ try:
assert ray_version >= "1.10.0"
from ray.tune import (
uniform,
lograndint,
loguniform,
qlograndint,
qloguniform,
qrandint,
qrandn,
quniform,
randint,
qrandint,
randn,
qrandn,
loguniform,
qloguniform,
lograndint,
qlograndint,
uniform,
)
if ray_version.startswith("1."):
@@ -20,21 +20,20 @@ try:
else:
from ray.tune.search import sample
except (ImportError, AssertionError):
from . import sample
from .sample import (
uniform,
lograndint,
loguniform,
qlograndint,
qloguniform,
qrandint,
qrandn,
quniform,
randint,
qrandint,
randn,
qrandn,
loguniform,
qloguniform,
lograndint,
qlograndint,
uniform,
)
from . import sample
from .tune import run, report, INCUMBENT_RESULT
from .sample import polynomial_expansion_set
from .sample import PolynomialExpansionSet, Categorical, Float
from .sample import Categorical, Float, PolynomialExpansionSet, polynomial_expansion_set
from .trial import Trial
from .tune import INCUMBENT_RESULT, report, run
from .utils import choice


@@ -15,10 +15,12 @@
# This source file is adapted here because ray does not fully support Windows.
# Copyright (c) Microsoft Corporation.
from typing import Dict, Optional
import numpy as np
from .trial import Trial
import logging
from typing import Dict, Optional
import numpy as np
from .trial import Trial
logger = logging.getLogger(__name__)

flaml/tune/logger.py (new file)

@@ -0,0 +1,37 @@
import logging
import os
class ColoredFormatter(logging.Formatter):
# ANSI escape codes for colors
COLORS = {
# logging.DEBUG: "\033[36m", # Cyan
# logging.INFO: "\033[32m", # Green
logging.WARNING: "\033[33m", # Yellow
logging.ERROR: "\033[31m", # Red
logging.CRITICAL: "\033[1;31m", # Bright Red
}
RESET = "\033[0m" # Reset to default
def __init__(self, fmt, datefmt, use_color=True):
super().__init__(fmt, datefmt)
self.use_color = use_color
def format(self, record):
formatted = super().format(record)
if self.use_color:
color = self.COLORS.get(record.levelno, "")
if color:
return f"{color}{formatted}{self.RESET}"
return formatted
logger = logging.getLogger(__name__)
use_color = True
if os.getenv("FLAML_LOG_NO_COLOR"):
use_color = False
logger_formatter = ColoredFormatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S", use_color
)
logger.propagate = False
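To see how a formatter like the one above is wired into a handler, here is a self-contained sketch. It inlines a minimal single-color version of the formatter so it runs standalone; the logger name and message are illustrative, and the real class lives in flaml/tune/logger.py.

```python
import logging

YELLOW, RESET = "\033[33m", "\033[0m"  # ANSI color codes, as in COLORS above


class _MiniColoredFormatter(logging.Formatter):
    """Minimal sketch of the ColoredFormatter idea: wrap warnings in yellow."""

    def format(self, record):
        formatted = super().format(record)
        if record.levelno >= logging.WARNING:
            return f"{YELLOW}{formatted}{RESET}"
        return formatted


handler = logging.StreamHandler()
handler.setFormatter(_MiniColoredFormatter("%(levelname)s - %(message)s"))
demo_logger = logging.getLogger("flaml.demo")
demo_logger.addHandler(handler)
demo_logger.propagate = False  # mirrors the module above

# format one record directly to inspect the colored output
record = demo_logger.makeRecord(
    "flaml.demo", logging.WARNING, __file__, 1, "low budget", (), None
)
colored = handler.format(record)  # yellow-wrapped "WARNING - low budget"
```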


@@ -19,6 +19,7 @@ import logging
from copy import copy
from math import isclose
from typing import Any, Dict, List, Optional, Sequence, Union
import numpy as np
# Backwards compatibility


@@ -1,6 +1,6 @@
from .trial_scheduler import TrialScheduler
from .online_scheduler import (
ChaChaScheduler,
OnlineScheduler,
OnlineSuccessiveDoublingScheduler,
ChaChaScheduler,
)
from .trial_scheduler import TrialScheduler


@@ -1,9 +1,12 @@
import numpy as np
import logging
from typing import Dict
from flaml.tune.scheduler import TrialScheduler
import numpy as np
from flaml.tune import Trial
from .trial_scheduler import TrialScheduler
logger = logging.getLogger(__name__)


@@ -2,10 +2,11 @@
# * Copyright (c) Microsoft Corporation. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
from typing import Dict, Optional, List, Tuple, Callable, Union
import numpy as np
import time
import pickle
import time
from typing import Callable, Dict, List, Optional, Tuple, Union
import numpy as np
try:
from ray import __version__ as ray_version
@@ -18,17 +19,17 @@ try:
from ray.tune.search import Searcher
from ray.tune.search.optuna import OptunaSearch as GlobalSearch
except (ImportError, AssertionError):
from .suggestion import Searcher
from .suggestion import OptunaSearch as GlobalSearch
from ..trial import unflatten_dict, flatten_dict
from .. import INCUMBENT_RESULT
from .search_thread import SearchThread
from .flow2 import FLOW2
from ..space import add_cost_to_space, indexof, normalize, define_by_run_func
from ..result import TIME_TOTAL_S
from .suggestion import Searcher
import logging
from .. import INCUMBENT_RESULT
from ..result import TIME_TOTAL_S
from ..space import add_cost_to_space, define_by_run_func, indexof, normalize
from ..trial import flatten_dict, unflatten_dict
from .flow2 import FLOW2
from .search_thread import SearchThread
SEARCH_THREAD_EPS = 1.0
PENALTY = 1e10 # penalty term for constraints
logger = logging.getLogger(__name__)
@@ -216,7 +217,24 @@ class BlendSearch(Searcher):
if global_search_alg is not None:
self._gs = global_search_alg
elif getattr(self, "__name__", None) != "CFO":
if space and self._ls.hierarchical:
# Use define-by-run for OptunaSearch when needed:
# - Hierarchical/conditional spaces are best supported via define-by-run.
# - Ray Tune domain/grid specs can trigger an "unresolved search space" warning
# unless we switch to define-by-run.
use_define_by_run = bool(getattr(self._ls, "hierarchical", False))
if (not use_define_by_run) and isinstance(space, dict) and space:
try:
from .variant_generator import parse_spec_vars
_, domain_vars, grid_vars = parse_spec_vars(space)
use_define_by_run = bool(domain_vars or grid_vars)
except Exception:
# Be conservative: if we can't determine whether the space is
# unresolved, fall back to the original behavior.
use_define_by_run = False
self._use_define_by_run = use_define_by_run
if use_define_by_run:
from functools import partial
gs_space = partial(define_by_run_func, space=space)
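For context on the define-by-run path chosen above: instead of a static dict, Optuna is handed a function that samples parameters imperatively from a trial object, which is how conditional/hierarchical spaces are expressed. The sketch below shows the general shape; the parameter names are illustrative assumptions, not FLAML's actual space or its `define_by_run_func`.

```python
def define_by_run_space(trial):
    """Illustrative define-by-run space with one conditional branch."""
    booster = trial.suggest_categorical("booster", ["gbtree", "dart"])
    config = {
        "booster": booster,
        "eta": trial.suggest_float("eta", 1e-3, 1.0, log=True),
    }
    if booster == "dart":
        # conditional parameter: only sampled when the branch is taken
        config["rate_drop"] = trial.suggest_float("rate_drop", 0.0, 0.5)
    return config
```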
@@ -243,13 +261,32 @@ class BlendSearch(Searcher):
evaluated_rewards=evaluated_rewards,
)
except (AssertionError, ValueError):
self._gs = GlobalSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
try:
self._gs = GlobalSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
except ValueError:
# Ray Tune's OptunaSearch converts Tune domains into Optuna
# distributions. Optuna disallows integer log distributions
# with step != 1 (e.g., qlograndint with q>1), which can
# raise here. Fall back to FLAML's OptunaSearch wrapper,
# which handles these spaces more permissively.
if getattr(GlobalSearch, "__module__", "").startswith("ray.tune"):
from .suggestion import OptunaSearch as _FallbackOptunaSearch
self._gs = _FallbackOptunaSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
else:
raise
self._gs.space = space
else:
self._gs = None
@@ -467,7 +504,7 @@ class BlendSearch(Searcher):
self._ls_bound_max,
self._subspace.get(trial_id, self._ls.space),
)
if self._gs is not None and self._experimental and (not self._ls.hierarchical):
if self._gs is not None and self._experimental and (not getattr(self, "_use_define_by_run", False)):
self._gs.add_evaluated_point(flatten_dict(config), objective)
# TODO: recover when supported
# converted = convert_key(config, self._gs.space)
@@ -931,27 +968,27 @@ try:
assert ray_version >= "1.10.0"
from ray.tune import (
uniform,
quniform,
choice,
randint,
qrandint,
randn,
qrandn,
loguniform,
qloguniform,
qrandint,
qrandn,
quniform,
randint,
randn,
uniform,
)
except (ImportError, AssertionError):
from ..sample import (
uniform,
quniform,
choice,
randint,
qrandint,
randn,
qrandn,
loguniform,
qloguniform,
qrandint,
qrandn,
quniform,
randint,
randn,
uniform,
)
try:
@@ -978,7 +1015,7 @@ class BlendSearchTuner(BlendSearch, NNITuner):
result = {
"config": parameters,
self._metric: extract_scalar_reward(value),
self.cost_attr: 1 if isinstance(value, float) else value.get(self.cost_attr, value.get("sequence", 1))
self.cost_attr: 1 if isinstance(value, float) else value.get(self.cost_attr, value.get("sequence", 1)),
# if nni does not report training cost,
# using sequence as an approximation.
# if no sequence, using a constant 1


@@ -2,8 +2,8 @@
# * Copyright (c) Microsoft Corporation. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
from .flow2 import FLOW2
from .blendsearch import CFO
from .flow2 import FLOW2
class FLOW2Cat(FLOW2):


@@ -2,31 +2,34 @@
# * Copyright (c) Microsoft Corporation. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
from typing import Dict, Optional, Tuple
import numpy as np
import logging
from collections import defaultdict
from typing import Dict, Optional, Tuple
import numpy as np
try:
from ray import __version__ as ray_version
assert ray_version >= "1.0.0"
if ray_version.startswith("1."):
from ray.tune.suggest import Searcher
from ray.tune import sample
from ray.tune.suggest import Searcher
else:
from ray.tune.search import Searcher, sample
from ray.tune.utils.util import flatten_dict, unflatten_dict
except (ImportError, AssertionError):
from .suggestion import Searcher
from flaml.tune import sample
from ..trial import flatten_dict, unflatten_dict
from .suggestion import Searcher
from flaml.config import SAMPLE_MULTIPLY_FACTOR
from ..space import (
complete_config,
denormalize,
normalize,
generate_variants_compatible,
normalize,
)
logger = logging.getLogger(__name__)
@@ -106,7 +109,7 @@ class FLOW2(Searcher):
else:
mode = "min"
super(FLOW2, self).__init__(metric=metric, mode=mode)
super().__init__(metric=metric, mode=mode)
# internally minimizes, so "max" => -1
if mode == "max":
self.metric_op = -1.0
@@ -135,7 +138,7 @@ class FLOW2(Searcher):
self.max_resource = max_resource
self._resource = None
self._f_best = None  # only used for lexico_compare. It represents the best value achieved by lexico_flow.
self._step_lb = np.Inf
self._step_lb = np.inf
self._histories = None  # only used for lexico_compare. It records the results of historical configurations.
if space is not None:
self._init_search()
@@ -347,7 +350,7 @@ class FLOW2(Searcher):
else:
assert (
self.lexico_objectives["tolerances"][k_metric][-1] == "%"
), "String tolerance of {} should use %% as the suffix".format(k_metric)
), f"String tolerance of {k_metric} should use %% as the suffix"
tolerance_bound = self._f_best[k_metric] * (
1 + 0.01 * float(self.lexico_objectives["tolerances"][k_metric].replace("%", ""))
)
@@ -382,7 +385,7 @@ class FLOW2(Searcher):
else:
assert (
self.lexico_objectives["tolerances"][k_metric][-1] == "%"
), "String tolerance of {} should use %% as the suffix".format(k_metric)
), f"String tolerance of {k_metric} should use %% as the suffix"
tolerance_bound = self._f_best[k_metric] * (
1 + 0.01 * float(self.lexico_objectives["tolerances"][k_metric].replace("%", ""))
)
@@ -638,8 +641,10 @@ class FLOW2(Searcher):
else:
# key must be in space
domain = space[key]
if self.hierarchical and not (
domain is None or type(domain) in (str, int, float) or isinstance(domain, sample.Domain)
if (
self.hierarchical
and domain is not None
and not isinstance(domain, (str, int, float, sample.Domain))
):
# not domain or hashable
# get rid of list type for hierarchical search space.


@@ -1,9 +1,11 @@
import numpy as np
import logging
import itertools
from typing import Dict, Optional, List
from flaml.tune import Categorical, Float, PolynomialExpansionSet, Trial
import logging
from typing import Dict, List, Optional
import numpy as np
from flaml.onlineml import VowpalWabbitTrial
from flaml.tune import Categorical, Float, PolynomialExpansionSet, Trial
from flaml.tune.searcher import CFO
logger = logging.getLogger(__name__)
@@ -64,7 +66,7 @@ class ChampionFrontierSearcher(BaseSearcher):
POLY_EXPANSION_ADDITION_NUM = 1
# the order of polynomial expansions to add based on the given seed interactions
EXPANSION_ORDER = 2
# the number of new challengers with new numerical hyperparamter configs
# the number of new challengers with new numerical hyperparameter configs
NUMERICAL_NUM = 2
# In order to use CFO, a loss name and loss values of configs are needed
@@ -78,7 +80,7 @@ class ChampionFrontierSearcher(BaseSearcher):
CFO_SEARCHER_METRIC_NAME = "pseudo_loss"
CFO_SEARCHER_LARGE_LOSS = 1e6
# the random seed used in generating numerical hyperparamter configs (when CFO is not used)
# the random seed used in generating numerical hyperparameter configs (when CFO is not used)
NUM_RANDOM_SEED = 111
CHAMPION_TRIAL_NAME = "champion_trial"
@@ -205,7 +207,7 @@ class ChampionFrontierSearcher(BaseSearcher):
hyperparameter_config_groups.append(partial_new_configs)
# does not have searcher_trial_ids
searcher_trial_ids_groups.append([])
elif isinstance(config_domain, Float) or isinstance(config_domain, Categorical):
elif isinstance(config_domain, (Float, Categorical)):
# otherwise we need to deal with them in group
nonpoly_config[k] = v
if k not in self._space_of_nonpoly_hp:
@@ -317,7 +319,7 @@ class ChampionFrontierSearcher(BaseSearcher):
candidate_configs = [set(seed_interactions) | set(item) for item in space]
final_candidate_configs = []
for c in candidate_configs:
new_c = set([e for e in c if len(e) > 1])
new_c = {e for e in c if len(e) > 1}
final_candidate_configs.append(new_c)
return final_candidate_configs


@@ -3,6 +3,7 @@
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
from typing import Dict, Optional
import numpy as np
try:
@@ -15,14 +16,40 @@ try:
from ray.tune.search import Searcher
except (ImportError, AssertionError):
from .suggestion import Searcher
from .flow2 import FLOW2
from ..space import add_cost_to_space, unflatten_hierarchical
from ..result import TIME_TOTAL_S
import logging
from ..result import TIME_TOTAL_S
from ..space import add_cost_to_space, unflatten_hierarchical
from .flow2 import FLOW2
logger = logging.getLogger(__name__)
def _recursive_dict_update(target: Dict, source: Dict) -> None:
"""Recursively update target dictionary with source dictionary.
Unlike dict.update(), this function merges nested dictionaries instead of
replacing them entirely. This is crucial for configurations with nested
structures (e.g., XGBoost params).
Args:
target: The dictionary to be updated (modified in place).
source: The dictionary containing values to merge into target.
Example:
>>> target = {'params': {'eta': 0.1, 'max_depth': 3}}
>>> source = {'params': {'verbosity': 0}}
>>> _recursive_dict_update(target, source)
>>> target
{'params': {'eta': 0.1, 'max_depth': 3, 'verbosity': 0}}
"""
for key, value in source.items():
if isinstance(value, dict) and key in target and isinstance(target[key], dict):
_recursive_dict_update(target[key], value)
else:
target[key] = value
class SearchThread:
"""Class of global or local search thread."""
@@ -63,7 +90,7 @@ class SearchThread:
try:
config = self._search_alg.suggest(trial_id)
if isinstance(self._search_alg._space, dict):
config.update(self._const)
_recursive_dict_update(config, self._const)
else:
# define by run
config, self.space = unflatten_hierarchical(config, self._space)
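The merge behavior that motivates swapping `config.update(self._const)` for `_recursive_dict_update` can be demonstrated on a nested config. The function body below is copied from the diff above; the XGBoost-style keys are illustrative.

```python
def _recursive_dict_update(target, source):
    """Merge source into target, recursing into nested dicts."""
    for key, value in source.items():
        if isinstance(value, dict) and key in target and isinstance(target[key], dict):
            _recursive_dict_update(target[key], value)
        else:
            target[key] = value


config = {"params": {"eta": 0.1, "max_depth": 3}}
const = {"params": {"verbosity": 0}, "n_jobs": 1}
# dict.update() would replace "params" wholesale, dropping eta/max_depth;
# the recursive merge keeps them and adds verbosity.
_recursive_dict_update(config, const)
```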


@@ -15,15 +15,17 @@
# This source file is adapted here because ray does not fully support Windows.
# Copyright (c) Microsoft Corporation.
import time
import functools
import warnings
import copy
import numpy as np
import functools
import logging
from typing import Any, Dict, Optional, Union, List, Tuple, Callable
import pickle
from .variant_generator import parse_spec_vars
import time
import warnings
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Tuple, Union
import numpy as np
from ..sample import (
Categorical,
Domain,
@@ -33,8 +35,75 @@ from ..sample import (
Quantized,
Uniform,
)
# If Ray is installed, flaml.tune may re-export Ray Tune sampling functions.
# In that case, the search space contains Ray Tune Domain/Sampler objects,
# which should be accepted by our Optuna search-space conversion.
try:
from ray import __version__ as _ray_version # type: ignore
if str(_ray_version).startswith("1."):
from ray.tune.sample import ( # type: ignore
Categorical as _RayCategorical,
)
from ray.tune.sample import (
Domain as _RayDomain,
)
from ray.tune.sample import (
Float as _RayFloat,
)
from ray.tune.sample import (
Integer as _RayInteger,
)
from ray.tune.sample import (
LogUniform as _RayLogUniform,
)
from ray.tune.sample import (
Quantized as _RayQuantized,
)
from ray.tune.sample import (
Uniform as _RayUniform,
)
else:
from ray.tune.search.sample import ( # type: ignore
Categorical as _RayCategorical,
)
from ray.tune.search.sample import (
Domain as _RayDomain,
)
from ray.tune.search.sample import (
Float as _RayFloat,
)
from ray.tune.search.sample import (
Integer as _RayInteger,
)
from ray.tune.search.sample import (
LogUniform as _RayLogUniform,
)
from ray.tune.search.sample import (
Quantized as _RayQuantized,
)
from ray.tune.search.sample import (
Uniform as _RayUniform,
)
_FLOAT_TYPES = (Float, _RayFloat)
_INTEGER_TYPES = (Integer, _RayInteger)
_CATEGORICAL_TYPES = (Categorical, _RayCategorical)
_DOMAIN_TYPES = (Domain, _RayDomain)
_QUANTIZED_TYPES = (Quantized, _RayQuantized)
_UNIFORM_TYPES = (Uniform, _RayUniform)
_LOGUNIFORM_TYPES = (LogUniform, _RayLogUniform)
except Exception: # pragma: no cover
_FLOAT_TYPES = (Float,)
_INTEGER_TYPES = (Integer,)
_CATEGORICAL_TYPES = (Categorical,)
_DOMAIN_TYPES = (Domain,)
_QUANTIZED_TYPES = (Quantized,)
_UNIFORM_TYPES = (Uniform,)
_LOGUNIFORM_TYPES = (LogUniform,)
from ..trial import flatten_dict, unflatten_dict
from collections import defaultdict
from .variant_generator import parse_spec_vars
logger = logging.getLogger(__name__)
@@ -183,13 +252,13 @@ class ConcurrencyLimiter(Searcher):
"""
def __init__(self, searcher: Searcher, max_concurrent: int, batch: bool = False):
assert type(max_concurrent) is int and max_concurrent > 0
assert isinstance(max_concurrent, int) and max_concurrent > 0
self.searcher = searcher
self.max_concurrent = max_concurrent
self.batch = batch
self.live_trials = set()
self.cached_results = {}
super(ConcurrencyLimiter, self).__init__(metric=self.searcher.metric, mode=self.searcher.mode)
super().__init__(metric=self.searcher.metric, mode=self.searcher.mode)
def suggest(self, trial_id: str) -> Optional[Dict]:
assert trial_id not in self.live_trials, f"Trial ID {trial_id} must be unique: already found in set."
@@ -252,8 +321,8 @@ try:
import optuna as ot
from optuna.distributions import BaseDistribution as OptunaDistribution
from optuna.samplers import BaseSampler
from optuna.trial import TrialState as OptunaTrialState
from optuna.trial import Trial as OptunaTrial
from optuna.trial import TrialState as OptunaTrialState
except ImportError:
ot = None
OptunaDistribution = None
@@ -283,25 +352,21 @@ def validate_warmstart(
"""
if points_to_evaluate:
if not isinstance(points_to_evaluate, list):
raise TypeError("points_to_evaluate expected to be a list, got {}.".format(type(points_to_evaluate)))
raise TypeError(f"points_to_evaluate expected to be a list, got {type(points_to_evaluate)}.")
for point in points_to_evaluate:
if not isinstance(point, (dict, list)):
raise TypeError(f"points_to_evaluate expected to include list or dict, got {point}.")
if validate_point_name_lengths and (not len(point) == len(parameter_names)):
raise ValueError(
"Dim of point {}".format(point)
+ " and parameter_names {}".format(parameter_names)
+ " do not match."
)
raise ValueError(f"Dim of point {point} and parameter_names {parameter_names} do not match.")
if points_to_evaluate and evaluated_rewards:
if not isinstance(evaluated_rewards, list):
raise TypeError("evaluated_rewards expected to be a list, got {}.".format(type(evaluated_rewards)))
raise TypeError(f"evaluated_rewards expected to be a list, got {type(evaluated_rewards)}.")
if not len(evaluated_rewards) == len(points_to_evaluate):
raise ValueError(
"Dim of evaluated_rewards {}".format(evaluated_rewards)
+ " and points_to_evaluate {}".format(points_to_evaluate)
f"Dim of evaluated_rewards {evaluated_rewards}"
+ f" and points_to_evaluate {points_to_evaluate}"
+ " do not match."
)
@@ -545,7 +610,7 @@ class OptunaSearch(Searcher):
evaluated_rewards: Optional[List] = None,
):
assert ot is not None, "Optuna must be installed! Run `pip install optuna`."
super(OptunaSearch, self).__init__(metric=metric, mode=mode)
super().__init__(metric=metric, mode=mode)
if isinstance(space, dict) and space:
resolved_vars, domain_vars, grid_vars = parse_spec_vars(space)
@@ -559,7 +624,15 @@ class OptunaSearch(Searcher):
self._space = space
self._points_to_evaluate = points_to_evaluate or []
self._evaluated_rewards = evaluated_rewards
# rewards should be a list of floats, not a dict
# After Optuna > 3.5.0, there is a check for NaN in the list "any(math.isnan(x) for x in self._values)"
# which will raise an error when encountering a dict
if evaluated_rewards is not None:
self._evaluated_rewards = [
list(item.values())[0] if isinstance(item, dict) else item for item in evaluated_rewards
]
else:
self._evaluated_rewards = evaluated_rewards
self._study_name = "optuna" # Fixed study name for in-memory storage
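The normalization above exists because Optuna > 3.5.0 runs `any(math.isnan(x) for x in self._values)` over the stored rewards, which raises when an element is a dict rather than a float. A standalone sketch of the same conversion (the function name here is illustrative, not part of FLAML's API):

```python
import math

def normalize_rewards(evaluated_rewards):
    """Convert a mixed list of floats and single-entry metric dicts to floats."""
    if evaluated_rewards is None:
        return None
    return [
        list(item.values())[0] if isinstance(item, dict) else item
        for item in evaluated_rewards
    ]

rewards = normalize_rewards([0.9, {"accuracy": 0.85}, 0.7])
print(rewards)  # [0.9, 0.85, 0.7]
# The NaN check Optuna performs now works on plain floats:
print(any(math.isnan(x) for x in rewards))  # False
```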
@@ -844,19 +917,22 @@ class OptunaSearch(Searcher):
def resolve_value(domain: Domain) -> ot.distributions.BaseDistribution:
quantize = None
sampler = domain.get_sampler()
if isinstance(sampler, Quantized):
# Ray Tune Domains and FLAML Domains both provide get_sampler(), but
# fall back to the .sampler attribute for robustness.
sampler = domain.get_sampler() if hasattr(domain, "get_sampler") else getattr(domain, "sampler", None)
if isinstance(sampler, _QUANTIZED_TYPES) or type(sampler).__name__ == "Quantized":
quantize = sampler.q
sampler = sampler.sampler
if isinstance(sampler, LogUniform):
sampler = getattr(sampler, "sampler", None) or sampler.get_sampler()
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
logger.warning(
"Optuna does not handle quantization in loguniform "
"sampling. The parameter will be passed but it will "
"probably be ignored."
)
if isinstance(domain, Float):
if isinstance(sampler, LogUniform):
if isinstance(domain, _FLOAT_TYPES) or type(domain).__name__ == "Float":
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
if quantize:
logger.warning(
"Optuna does not support both quantization and "
@@ -864,17 +940,17 @@ class OptunaSearch(Searcher):
)
return ot.distributions.LogUniformDistribution(domain.lower, domain.upper)
elif isinstance(sampler, Uniform):
elif isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
if quantize:
return ot.distributions.DiscreteUniformDistribution(domain.lower, domain.upper, quantize)
return ot.distributions.UniformDistribution(domain.lower, domain.upper)
elif isinstance(domain, Integer):
if isinstance(sampler, LogUniform):
return ot.distributions.IntLogUniformDistribution(
domain.lower, domain.upper - 1, step=quantize or 1
)
elif isinstance(sampler, Uniform):
elif isinstance(domain, _INTEGER_TYPES) or type(domain).__name__ == "Integer":
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
# The ``step`` argument was deprecated in v2.0.0 and must be 1 for log
# distributions; its removal is currently scheduled for v4.0.0.
return ot.distributions.IntLogUniformDistribution(domain.lower, domain.upper - 1, step=1)
elif isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
# Upper bound should be inclusive for quantization and
# exclusive otherwise
return ot.distributions.IntUniformDistribution(
@@ -882,16 +958,16 @@ class OptunaSearch(Searcher):
domain.upper - int(bool(not quantize)),
step=quantize or 1,
)
elif isinstance(domain, Categorical):
if isinstance(sampler, Uniform):
elif isinstance(domain, _CATEGORICAL_TYPES) or type(domain).__name__ == "Categorical":
if isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
return ot.distributions.CategoricalDistribution(domain.categories)
raise ValueError(
"Optuna search does not support parameters of type "
"`{}` with samplers of type `{}`".format(type(domain).__name__, type(domain.sampler).__name__)
"`{}` with samplers of type `{}`".format(type(domain).__name__, type(sampler).__name__)
)
# Parameter name is e.g. "a/b/c" for nested dicts
values = {"/".join(path): resolve_value(domain) for path, domain in domain_vars}
return values
return values
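The pattern used throughout `resolve_value` — an `isinstance` check against a tuple of possible classes, backed by a `type(x).__name__` comparison — keeps the code working whether a domain object comes from FLAML's own `sample` module or from `ray.tune`, which defines distinct classes with the same names. A distilled sketch with hypothetical stand-in classes:

```python
class Float:
    """Stand-in for flaml.tune.sample.Float."""

class RayFloat:
    """Stand-in for ray.tune's Float class."""

# Upstream, Ray's class is also literally named "Float"; simulate that here.
RayFloat.__name__ = "Float"

_FLOAT_TYPES = (Float, RayFloat)

def is_float_domain(domain):
    # isinstance covers the classes we managed to import; the __name__
    # fallback covers same-named classes reached via an import path we
    # did not anticipate (e.g. a future ray.tune module layout).
    return isinstance(domain, _FLOAT_TYPES) or type(domain).__name__ == "Float"

print(is_float_domain(Float()))     # True
print(is_float_domain(RayFloat()))  # True
print(is_float_domain(42))          # False
```

The name-based fallback is deliberately loose: it trades a little type safety for resilience against Ray reorganizing its module paths between releases.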


@@ -17,9 +17,11 @@
# Copyright (c) Microsoft Corporation.
import copy
import logging
from typing import Any, Dict, Generator, List, Tuple
import numpy
import random
from typing import Any, Dict, Generator, List, Tuple
import numpy
from ..sample import Categorical, Domain, RandomState
try:
@@ -250,7 +252,7 @@ def _try_resolve(v) -> Tuple[bool, Any]:
# Grid search values
grid_values = v["grid_search"]
if not isinstance(grid_values, list):
raise TuneError("Grid search expected list of values, got: {}".format(grid_values))
raise TuneError(f"Grid search expected list of values, got: {grid_values}")
return False, Categorical(grid_values).grid()
return True, v
@@ -300,13 +302,13 @@ def has_unresolved_values(spec: Dict) -> bool:
class _UnresolvedAccessGuard(dict):
def __init__(self, *args, **kwds):
super(_UnresolvedAccessGuard, self).__init__(*args, **kwds)
super().__init__(*args, **kwds)
self.__dict__ = self
def __getattribute__(self, item):
value = dict.__getattribute__(self, item)
if not _is_resolved(value):
raise RecursiveDependencyError("`{}` recursively depends on {}".format(item, value))
raise RecursiveDependencyError(f"`{item}` recursively depends on {value}")
elif isinstance(value, dict):
return _UnresolvedAccessGuard(value)
else:


@@ -11,9 +11,10 @@ try:
except (ImportError, AssertionError):
from . import sample
from .searcher.variant_generator import generate_variants
from typing import Dict, Optional, Any, Tuple, Generator, List, Union
import numpy as np
import logging
from typing import Any, Dict, Generator, List, Optional, Tuple, Union
import numpy as np
logger = logging.getLogger(__name__)
@@ -260,7 +261,7 @@ def add_cost_to_space(space: Dict, low_cost_point: Dict, choice_cost: Dict):
low_cost[i] = point
if len(low_cost) > len(domain.categories):
if domain.ordered:
low_cost[-1] = int(np.where(ind == low_cost[-1])[0])
low_cost[-1] = int(np.where(ind == low_cost[-1])[0].item())
domain.low_cost_point = low_cost[-1]
return
if low_cost:
@@ -489,7 +490,7 @@ def complete_config(
elif domain.bounded:
up, low, gauss_std = 1, 0, 1.0
else:
up, low, gauss_std = np.Inf, -np.Inf, 1.0
up, low, gauss_std = np.inf, -np.inf, 1.0
if domain.bounded:
if isinstance(up, list):
up[-1] = min(up[-1], 1)
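The `np.Inf` → `np.inf` change above matters because the capitalized aliases (`np.Inf`, `np.NaN`, `np.NINF`, ...) were removed in NumPy 2.0, while the lowercase `np.inf` works in every NumPy version. A quick check:

```python
import numpy as np

# np.inf is the canonical spelling and exists in both NumPy 1.x and 2.x;
# np.Inf was an alias removed in NumPy 2.0, so code using it raises
# AttributeError there.
up, low, gauss_std = np.inf, -np.inf, 1.0
print(np.isinf(up), np.isinf(low))  # True True
print(up > 1e308)                   # True: larger than any finite float
```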


@@ -1,8 +1,8 @@
from flaml.tune.spark.utils import (
broadcast_code,
check_spark,
get_n_cpus,
with_parameters,
broadcast_code,
)
__all__ = ["check_spark", "get_n_cpus", "with_parameters", "broadcast_code"]


@@ -5,7 +5,6 @@ import threading
import time
from functools import lru_cache, partial
logger = logging.getLogger(__name__)
logger_formatter = logging.Formatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S"
@@ -13,10 +12,10 @@ logger_formatter = logging.Formatter(
logger.propagate = False
os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
try:
import py4j
import pyspark
from pyspark.sql import SparkSession
from pyspark.util import VersionUtils
import py4j
except ImportError:
_have_spark = False
py4j = None
@@ -163,6 +162,10 @@ def broadcast_code(custom_code="", file_name="mylearner"):
assert isinstance(MyLargeLGBM(), LGBMEstimator)
```
"""
# Check if Spark is available
spark_available, _ = check_spark()
# Write to local driver file system
flaml_path = os.path.dirname(os.path.abspath(__file__))
custom_code = textwrap.dedent(custom_code)
custom_path = os.path.join(flaml_path, file_name + ".py")
@@ -170,6 +173,24 @@ def broadcast_code(custom_code="", file_name="mylearner"):
with open(custom_path, "w") as f:
f.write(custom_code)
# If using Spark, broadcast the code content to executors
if spark_available:
spark = SparkSession.builder.getOrCreate()
bc_code = spark.sparkContext.broadcast(custom_code)
# Execute a job to ensure the code is distributed to all executors
def _write_code(bc):
code = bc.value
import os
module_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), file_name + ".py")
os.makedirs(os.path.dirname(module_path), exist_ok=True)
with open(module_path, "w") as f:
f.write(code)
return True
spark.sparkContext.parallelize(range(1)).map(lambda _: _write_code(bc_code)).collect()
return custom_path
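Locally, `broadcast_code` dedents the code string and writes it out as an importable module file; the new Spark branch then ships the same text to every executor via a broadcast variable. The driver-side half of that behavior, in isolation (paths and names here are illustrative):

```python
import importlib.util
import os
import tempfile
import textwrap

custom_code = """
    def answer():
        return 41 + 1
"""

# Dedent so the indented triple-quoted snippet becomes valid top-level
# code, then write it out as a module file on the local filesystem.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "mylearner.py")
with open(module_path, "w") as f:
    f.write(textwrap.dedent(custom_code))

# Load the freshly written module and use it, as a consumer of
# broadcast_code would after importing the generated module.
spec = importlib.util.spec_from_file_location("mylearner", module_path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
print(mod.answer())  # 42
```

On the cluster side, the diff triggers a trivial job (`parallelize(range(1)).map(...)`) purely to force each executor to materialize the broadcast value and write the same file locally.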
@@ -286,6 +307,7 @@ class PySparkOvertimeMonitor:
def __exit__(self, exc_type, exc_value, exc_traceback):
"""Exit the context manager.
This will wait for the monitor thread to nicely exit."""
logger.debug(f"monitor exited: {exc_type}, {exc_value}, {exc_traceback}")
if self._force_cancel and _have_spark:
self._finished_flag = True
self._monitor_daemon.join()
@@ -296,6 +318,11 @@ class PySparkOvertimeMonitor:
if not exc_type:
return True
elif exc_type == py4j.protocol.Py4JJavaError:
logger.debug("Py4JJavaError Exception: %s", exc_value)
return True
elif exc_type == TypeError:
# When force cancel, joblib>1.2.0 will raise joblib.externals.loky.process_executor._ExceptionWithTraceback
logger.debug("TypeError Exception: %s", exc_value)
return True
else:
return False
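The `__exit__` additions above lean on the context-manager protocol: returning a truthy value from `__exit__` suppresses the in-flight exception, which is how the monitor swallows the `Py4JJavaError`/`TypeError` raised when a Spark job is force-cancelled. The mechanism in miniature (the handled exception type here is a placeholder):

```python
class SwallowTypeError:
    """Suppress TypeError raised inside the with-block; let others propagate."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if exc_type is None:
            return True   # no exception: nothing to suppress
        if exc_type is TypeError:
            return True   # truthy return value suppresses the exception
        return False      # falsy return value re-raises anything else

with SwallowTypeError():
    raise TypeError("raised by a cancelled worker")  # silently suppressed
print("continued past the suppressed TypeError")

try:
    with SwallowTypeError():
        raise ValueError("not handled")
except ValueError:
    print("ValueError propagated")
```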


@@ -15,10 +15,10 @@
# This source file is adapted here because ray does not fully support Windows.
# Copyright (c) Microsoft Corporation.
import uuid
import time
from numbers import Number
import uuid
from collections import deque
from numbers import Number
def flatten_dict(dt, delimiter="/", prevent_delimiter=False):
@@ -110,7 +110,7 @@ class Trial:
}
self.metric_n_steps[metric] = {}
for n in self.n_steps:
key = "last-{:d}-avg".format(n)
key = f"last-{n:d}-avg"
self.metric_analysis[metric][key] = value
# Store n as string for correct restore.
self.metric_n_steps[metric][str(n)] = deque([value], maxlen=n)
@@ -124,7 +124,7 @@ class Trial:
self.metric_analysis[metric]["last"] = value
for n in self.n_steps:
key = "last-{:d}-avg".format(n)
key = f"last-{n:d}-avg"
self.metric_n_steps[metric][str(n)].append(value)
self.metric_analysis[metric][key] = sum(self.metric_n_steps[metric][str(n)]) / len(
self.metric_n_steps[metric][str(n)]
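The `last-{n}-avg` bookkeeping above keeps a `deque(maxlen=n)` per metric, so appending a new value automatically evicts the oldest and the rolling mean of the last n reports stays cheap to recompute. Stripped of the `Trial` plumbing:

```python
from collections import deque

n = 3
window = deque(maxlen=n)  # values older than the last n fall off automatically

averages = []
for value in [10, 20, 30, 40]:
    window.append(value)
    key = f"last-{n:d}-avg"  # same key format as the diff's f-string rewrite
    averages.append((key, sum(window) / len(window)))

print(averages[-1])  # ('last-3-avg', 30.0) — mean of [20, 30, 40]
```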

Some files were not shown because too many files have changed in this diff.