Update version and readme (#1338)

* Update version and readme
* Update pr template
Update issue templates (#1337)
120 changed files with 4907 additions and 1497 deletions
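
The conditional install steps in the updated `.github/workflows/python-package.yml` are easier to review as a table of what runs in each matrix cell. The following is a minimal sketch, not part of the change itself. It transcribes the new `if:` conditions by hand into Python predicates and enumerates the 3 x 4 matrix. The step labels (`libomp`, `ray<2.5.0`, and so on) are shorthand chosen here for illustration, and the helper names `same`, `optional_steps`, and `plan` are hypothetical. The sketch assumes string comparison in workflow expressions ignores case, which is why `'macOS-latest'` in a condition is treated as matching the `macos-latest` matrix entry.

```python
from itertools import product

# Matrix values as they appear in the updated workflow.
OS_LIST = ["ubuntu-latest", "macos-latest", "windows-2019"]
PY_LIST = ["3.8", "3.9", "3.10", "3.11"]


def same(a: str, b: str) -> bool:
    """Case-insensitive equality (assumed to mirror workflow expression semantics)."""
    return a.lower() == b.lower()


# Illustrative step label -> predicate over (os, python-version),
# transcribed by hand from the `if:` lines on the `+` side of the diff.
STEPS = {
    "libomp": lambda os_, py: same(os_, "macOS-latest"),
    "pyspark 3.4.1": lambda os_, py: py == "3.10" and same(os_, "ubuntu-latest"),
    "pyspark 3.5.1": lambda os_, py: py == "3.11" and same(os_, "ubuntu-latest"),
    "ray<2.5.0": lambda os_, py: same(os_, "ubuntu-latest") and py != "3.11",
    "ray + xgboost<2": lambda os_, py: same(os_, "macOS-latest") and py == "3.10",
    "prophet": lambda os_, py: same(os_, "ubuntu-latest") and py == "3.8",
    "vw": lambda os_, py: py in ("3.8", "3.9"),
}


def optional_steps(os_name: str, py: str) -> list:
    """Return the labels of the conditional steps that would run for one matrix cell."""
    return [name for name, cond in STEPS.items() if cond(os_name, py)]


def plan() -> dict:
    """Map every (os, python-version) cell to its list of conditional steps."""
    return {(o, p): optional_steps(o, p) for o, p in product(OS_LIST, PY_LIST)}


if __name__ == "__main__":
    for (o, p), steps in plan().items():
        print(f"{o:14} py{p:5} {', '.join(steps) or '-'}")
```

Read this way, the matrix shows that pyspark is now pinned only on Ubuntu with Python 3.10 and 3.11, and that none of the conditional install steps listed here apply to the Windows jobs on Python 3.10 and 3.11.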
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -0,0 +1,73 @@
+### Description
+
+<!-- A clear and concise description of the issue or feature request. -->
+
+### Environment
+
+- FLAML version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
+- Python version: <!-- Specify the Python version (e.g., 3.8) -->
+- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
+
+### Steps to Reproduce (for bugs)
+
+<!-- Provide detailed steps to reproduce the issue. Include code snippets, configuration files, or any other relevant information. -->
+
+1. Step 1
+1. Step 2
+1. ...
+
+### Expected Behavior
+
+<!-- Describe what you expected to happen. -->
+
+### Actual Behavior
+
+<!-- Describe what actually happened. Include any error messages, stack traces, or unexpected behavior. -->
+
+### Screenshots / Logs (if applicable)
+
+<!-- If relevant, include screenshots or logs that help illustrate the issue. -->
+
+### Additional Information
+
+<!-- Include any additional information that might be helpful, such as specific configurations, data samples, or context about the environment. -->
+
+### Possible Solution (if you have one)
+
+<!-- If you have suggestions on how to address the issue, provide them here. -->
+
+### Is this a Bug or Feature Request?
+
+<!-- Choose one: Bug | Feature Request -->
+
+### Priority
+
+<!-- Choose one: High | Medium | Low -->
+
+### Difficulty
+
+<!-- Choose one: Easy | Moderate | Hard -->
+
+### Any related issues?
+
+<!-- If this is related to another issue, reference it here. -->
+
+### Any relevant discussions?
+
+<!-- If there are any discussions or forum threads related to this issue, provide links. -->
+
+### Checklist
+
+<!-- Please check the items that you have completed -->
+
+- [ ] I have searched for similar issues and didn't find any duplicates.
+- [ ] I have provided a clear and concise description of the issue.
+- [ ] I have included the necessary environment details.
+- [ ] I have outlined the steps to reproduce the issue.
+- [ ] I have included any relevant logs or screenshots.
+- [ ] I have indicated whether this is a bug or a feature request.
+- [ ] I have set the priority and difficulty levels.
+
+### Additional Comments
+
+<!-- Any additional comments or context that you think would be helpful. -->
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,53 @@
+name: Bug Report
+description: File a bug report
+title: "[Bug]: "
+labels: ["bug"]
+
+body:
+  - type: textarea
+    id: description
+    attributes:
+      label: Describe the bug
+      description: A clear and concise description of what the bug is.
+      placeholder: What went wrong?
+  - type: textarea
+    id: reproduce
+    attributes:
+      label: Steps to reproduce
+      description: |
+        Steps to reproduce the behavior:
+
+        1. Step 1
+        2. Step 2
+        3. ...
+        4. See error
+      placeholder: How can we replicate the issue?
+  - type: textarea
+    id: modelused
+    attributes:
+      label: Model Used
+      description: A description of the model that was used when the error was encountered
+      placeholder: gpt-4, mistral-7B etc
+  - type: textarea
+    id: expected_behavior
+    attributes:
+      label: Expected Behavior
+      description: A clear and concise description of what you expected to happen.
+      placeholder: What should have happened?
+  - type: textarea
+    id: screenshots
+    attributes:
+      label: Screenshots and logs
+      description: If applicable, add screenshots and logs to help explain your problem.
+      placeholder: Add screenshots here
+  - type: textarea
+    id: additional_information
+    attributes:
+      label: Additional Information
+      description: |
+        - FLAML Version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
+        - Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
+        - Python Version: <!-- Specify the Python version (e.g., 3.8) -->
+        - Related Issues: <!-- Link to any related issues here (e.g., #1) -->
+        - Any other relevant information.
+      placeholder: Any additional details
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
+blank_issues_enabled: true
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,26 @@
+name: Feature Request
+description: File a feature request
+labels: ["enhancement"]
+title: "[Feature Request]: "
+
+body:
+  - type: textarea
+    id: problem_description
+    attributes:
+      label: Is your feature request related to a problem? Please describe.
+      description: A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+      placeholder: What problem are you trying to solve?
+
+  - type: textarea
+    id: solution_description
+    attributes:
+      label: Describe the solution you'd like
+      description: A clear and concise description of what you want to happen.
+      placeholder: How do you envision the solution?
+
+  - type: textarea
+    id: additional_context
+    attributes:
+      label: Additional context
+      description: Add any other context or screenshots about the feature request here.
+      placeholder: Any additional information
--- a/.github/ISSUE_TEMPLATE/general_issue.yml
+++ b/.github/ISSUE_TEMPLATE/general_issue.yml
@@ -0,0 +1,41 @@
+name: General Issue
+description: File a general issue
+title: "[Issue]: "
+labels: []
+
+body:
+  - type: textarea
+    id: description
+    attributes:
+      label: Describe the issue
+      description: A clear and concise description of what the issue is.
+      placeholder: What went wrong?
+  - type: textarea
+    id: reproduce
+    attributes:
+      label: Steps to reproduce
+      description: |
+        Steps to reproduce the behavior:
+
+        1. Step 1
+        2. Step 2
+        3. ...
+        4. See error
+      placeholder: How can we replicate the issue?
+  - type: textarea
+    id: screenshots
+    attributes:
+      label: Screenshots and logs
+      description: If applicable, add screenshots and logs to help explain your problem.
+      placeholder: Add screenshots here
+  - type: textarea
+    id: additional_information
+    attributes:
+      label: Additional Information
+      description: |
+        - FLAML Version: <!-- Specify the FLAML version (e.g., v0.2.0) -->
+        - Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
+        - Python Version: <!-- Specify the Python version (e.g., 3.8) -->
+        - Related Issues: <!-- Link to any related issues here (e.g., #1) -->
+        - Any other relevant information.
+      placeholder: Any additional details
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -12,7 +12,7 @@

 ## Checks

-<!-- - I've used [pre-commit](https://microsoft.github.io/FLAML/docs/Contribute#pre-commit) to lint the changes in this PR (note the same in integrated in our CI checks). -->
+- [ ] I've used [pre-commit](https://microsoft.github.io/FLAML/docs/Contribute#pre-commit) to lint the changes in this PR (note the same in integrated in our CI checks).
 - [ ] I've included any doc changes needed for https://microsoft.github.io/FLAML/. See https://microsoft.github.io/FLAML/docs/Contribute#documentation to build and test documentation locally.
 - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
 - [ ] I've made sure all auto checks have passed.
--- a/.github/workflows/python-package.yml
+++ b/.github/workflows/python-package.yml
@@ -30,19 +30,17 @@ jobs:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-2019]
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.8", "3.9", "3.10", "3.11"]
    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
-      - name: On mac + python 3.10, install libomp to facilitate lgbm and xgboost install
-        if: matrix.os == 'macOS-latest' && matrix.python-version == '3.10'
+      - name: On mac, install libomp to facilitate lgbm and xgboost install
+        if: matrix.os == 'macOS-latest'
        run: |
-          # remove libomp version constraint after xgboost works with libomp>11.1.0 on python 3.10
-          wget https://raw.githubusercontent.com/Homebrew/homebrew-core/679923b4eb48a8dc7ecc1f05d06063cd79b3fc00/Formula/libomp.rb -O $(find $(brew --repository) -name libomp.rb)
-          brew unlink libomp
+          brew update
          brew install libomp
          export CC=/usr/bin/clang
          export CXX=/usr/bin/clang++
@@ -56,34 +54,34 @@ jobs:
          pip install -e .
          python -c "import flaml"
          pip install -e .[test]
-      - name: On Ubuntu python 3.8, install pyspark 3.2.3
-        if: matrix.python-version == '3.8' && matrix.os == 'ubuntu-latest'
+      - name: On Ubuntu python 3.10, install pyspark 3.4.1
+        if: matrix.python-version == '3.10' && matrix.os == 'ubuntu-latest'
        run: |
-          pip install pyspark==3.2.3
+          pip install pyspark==3.4.1
          pip list | grep "pyspark"
-      - name: If linux, install ray 2
-        if: matrix.os == 'ubuntu-latest'
+      - name: On Ubuntu python 3.11, install pyspark 3.5.1
+        if: matrix.python-version == '3.11' && matrix.os == 'ubuntu-latest'
+        run: |
+          pip install pyspark==3.5.1
+          pip list | grep "pyspark"
+      - name: If linux and python<3.11, install ray 2
+        if: matrix.os == 'ubuntu-latest' && matrix.python-version != '3.11'
        run: |
          pip install "ray[tune]<2.5.0"
-      - name: If mac, install ray and xgboost 1
-        if: matrix.os == 'macOS-latest'
+      - name: If mac and python 3.10, install ray and xgboost 1
+        if: matrix.os == 'macOS-latest' && matrix.python-version == '3.10'
        run: |
          pip install -e .[ray]
          # use macOS to test xgboost 1, but macOS also supports xgboost 2
          pip install "xgboost<2"
-      - name: If linux or mac, install prophet on python < 3.9
-        if: (matrix.os == 'macOS-latest' || matrix.os == 'ubuntu-latest') && matrix.python-version != '3.9' && matrix.python-version != '3.10'
+      - name: If linux, install prophet on python < 3.9
+        if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.8'
        run: |
          pip install -e .[forecast]
      - name: Install vw on python < 3.10
-        if: matrix.python-version != '3.10'
+        if: matrix.python-version == '3.8' || matrix.python-version == '3.9'
        run: |
          pip install -e .[vw]
-      - name: Uninstall pyspark on (python 3.9) or (python 3.8 + windows)
-        if: matrix.python-version == '3.9' || (matrix.python-version == '3.8' && matrix.os == 'windows-2019')
-        run: |
-          # Uninstall pyspark to test env without pyspark
-          pip uninstall -y pyspark
      - name: Test with pytest
        if: matrix.python-version != '3.10'
        run: |
--- a/.gitignore
+++ b/.gitignore
@@ -163,5 +163,24 @@ output/
 flaml/tune/spark/mylearner.py
 *.pkl

+data/
+benchmark/pmlb/csv_datasets
+benchmark/*.csv
+
+checkpoints/
+test/default
+test/housing.json
+test/nlp/default/transformer_ms/seq-classification.json
+
+flaml/fabric/fanova/_fanova.c
 # local config files
 *.config.local
+
+local_debug/
+patch.diff
+
+# Test things
+notebook/lightning_logs/
+lightning_logs/
+flaml/autogen/extensions/tmp/
+test/autogen/my_tmp/
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -22,10 +22,28 @@ repos:
    - id: trailing-whitespace
    - id: end-of-file-fixer
    - id: no-commit-to-branch
+
+  - repo: https://github.com/asottile/pyupgrade
+    rev: v2.31.1
+    hooks:
+      - id: pyupgrade
+        args: [--py38-plus]
+        name: Upgrade code
+
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
    - id: black
+
+  - repo: https://github.com/executablebooks/mdformat
+    rev: 0.7.17
+    hooks:
+      - id: mdformat
+        additional_dependencies:
+          - mdformat-gfm
+          - mdformat-black
+          - mdformat_frontmatter
+
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.261
    hooks:
--- a/2
+++ b/2
@@ -1,5 +1,5 @@
 # basic setup
-FROM python:3.7
+FROM mcr.microsoft.com/devcontainers/python:3.8
 RUN apt-get update && apt-get -y update
 RUN apt-get install -y sudo git npm

--- a/NOTICE.md
+++ b/NOTICE.md
@@ -1,221 +1,222 @@
-NOTICES
+# NOTICES

 This repository incorporates material as listed below or described in the code.

-#
 ## Component. Ray.

-Code in tune/[analysis.py, sample.py, trial.py, result.py],
-searcher/[suggestion.py, variant_generator.py], and scheduler/trial_scheduler.py is adapted from
+Code in tune/\[analysis.py, sample.py, trial.py, result.py\],
+searcher/\[suggestion.py, variant_generator.py\], and scheduler/trial_scheduler.py is adapted from
 https://github.com/ray-project/ray/blob/master/python/ray/tune/

-
-
 ## Open Source License/Copyright Notice.

- Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/

-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

-   1. Definitions.
+1. Definitions.

-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
+   "License" shall mean the terms and conditions for use, reproduction,
+   and distribution as defined by Sections 1 through 9 of this document.

-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
+   "Licensor" shall mean the copyright owner or entity authorized by
+   the copyright owner that is granting the License.

-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
+   "Legal Entity" shall mean the union of the acting entity and all
+   other entities that control, are controlled by, or are under common
+   control with that entity. For the purposes of this definition,
+   "control" means (i) the power, direct or indirect, to cause the
+   direction or management of such entity, whether by contract or
+   otherwise, or (ii) ownership of fifty percent (50%) or more of the
+   outstanding shares, or (iii) beneficial ownership of such entity.

-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
+   "You" (or "Your") shall mean an individual or Legal Entity
+   exercising permissions granted by this License.

-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
+   "Source" form shall mean the preferred form for making modifications,
+   including but not limited to software source code, documentation
+   source, and configuration files.

-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
+   "Object" form shall mean any form resulting from mechanical
+   transformation or translation of a Source form, including but
+   not limited to compiled object code, generated documentation,
+   and conversions to other media types.

-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
+   "Work" shall mean the work of authorship, whether in Source or
+   Object form, made available under the License, as indicated by a
+   copyright notice that is included in or attached to the work
+   (an example is provided in the Appendix below).

-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
+   "Derivative Works" shall mean any work, whether in Source or Object
+   form, that is based on (or derived from) the Work and for which the
+   editorial revisions, annotations, elaborations, or other modifications
+   represent, as a whole, an original work of authorship. For the purposes
+   of this License, Derivative Works shall not include works that remain
+   separable from, or merely link (or bind by name) to the interfaces of,
+   the Work and Derivative Works thereof.

-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
+   "Contribution" shall mean any work of authorship, including
+   the original version of the Work and any modifications or additions
+   to that Work or Derivative Works thereof, that is intentionally
+   submitted to Licensor for inclusion in the Work by the copyright owner
+   or by an individual or Legal Entity authorized to submit on behalf of
+   the copyright owner. For the purposes of this definition, "submitted"
+   means any form of electronic, verbal, or written communication sent
+   to the Licensor or its representatives, including but not limited to
+   communication on electronic mailing lists, source code control systems,
+   and issue tracking systems that are managed by, or on behalf of, the
+   Licensor for the purpose of discussing and improving the Work, but
+   excluding communication that is conspicuously marked or otherwise
+   designated in writing by the copyright owner as "Not a Contribution."

-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
+   "Contributor" shall mean Licensor and any individual or Legal Entity
+   on behalf of whom a Contribution has been received by Licensor and
+   subsequently incorporated within the Work.

-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
+1. Grant of Copyright License. Subject to the terms and conditions of
+   this License, each Contributor hereby grants to You a perpetual,
+   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+   copyright license to reproduce, prepare Derivative Works of,
+   publicly display, publicly perform, sublicense, and distribute the
+   Work and such Derivative Works in Source or Object form.

-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
+1. Grant of Patent License. Subject to the terms and conditions of
+   this License, each Contributor hereby grants to You a perpetual,
+   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+   (except as stated in this section) patent license to make, have made,
+   use, offer to sell, sell, import, and otherwise transfer the Work,
+   where such license applies only to those patent claims licensable
+   by such Contributor that are necessarily infringed by their
+   Contribution(s) alone or by combination of their Contribution(s)
+   with the Work to which such Contribution(s) was submitted. If You
+   institute patent litigation against any entity (including a
+   cross-claim or counterclaim in a lawsuit) alleging that the Work
+   or a Contribution incorporated within the Work constitutes direct
+   or contributory patent infringement, then any patent licenses
+   granted to You under this License for that Work shall terminate
+   as of the date such litigation is filed.

-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
+1. Redistribution. You may reproduce and distribute copies of the
+   Work or Derivative Works thereof in any medium, with or without
+   modifications, and in Source or Object form, provided that You
+   meet the following conditions:

-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
+   (a) You must give any other recipients of the Work or
+   Derivative Works a copy of this License; and

-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
+   (b) You must cause any modified files to carry prominent notices
+   stating that You changed the files; and

-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
+   (c) You must retain, in the Source form of any Derivative Works
+   that You distribute, all copyright, patent, trademark, and
+   attribution notices from the Source form of the Work,
+   excluding those notices that do not pertain to any part of
+   the Derivative Works; and

-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
+   (d) If the Work includes a "NOTICE" text file as part of its
+   distribution, then any Derivative Works that You distribute must
+   include a readable copy of the attribution notices contained
+   within such NOTICE file, excluding those notices that do not
+   pertain to any part of the Derivative Works, in at least one
+   of the following places: within a NOTICE text file distributed
+   as part of the Derivative Works; within the Source form or
+   documentation, if provided along with the Derivative Works; or,
+   within a display generated by the Derivative Works, if and
+   wherever such third-party notices normally appear. The contents
+   of the NOTICE file are for informational purposes only and
+   do not modify the License. You may add Your own attribution
+   notices within Derivative Works that You distribute, alongside
+   or as an addendum to the NOTICE text from the Work, provided
+   that such additional attribution notices cannot be construed
+   as modifying the License.

-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
+   You may add Your own copyright statement to Your modifications and
+   may provide additional or different license terms and conditions
+   for use, reproduction, or distribution of Your modifications, or
+   for any such Derivative Works as a whole, provided Your use,
+   reproduction, and distribution of the Work otherwise complies with
+   the conditions stated in this License.

-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
+1. Submission of Contributions. Unless You explicitly state otherwise,
+   any Contribution intentionally submitted for inclusion in the Work
+   by You to the Licensor shall be under the terms and conditions of
+   this License, without any additional terms or conditions.
+   Notwithstanding the above, nothing herein shall supersede or modify
+   the terms of any separate license agreement you may have executed
+   with Licensor regarding such Contributions.

-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
+1. Trademarks. This License does not grant permission to use the trade
+   names, trademarks, service marks, or product names of the Licensor,
+   except as required for reasonable and customary use in describing the
+   origin of the Work and reproducing the content of the NOTICE file.

-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
+1. Disclaimer of Warranty. Unless required by applicable law or
+   agreed to in writing, Licensor provides the Work (and each
+   Contributor provides its Contributions) on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+   implied, including, without limitation, any warranties or conditions
+   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+   PARTICULAR PURPOSE. You are solely responsible for determining the
+   appropriateness of using or redistributing the Work and assume any
+   risks associated with Your exercise of permissions under this License.

-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
+1. Limitation of Liability. In no event and under no legal theory,
+   whether in tort (including negligence), contract, or otherwise,
+   unless required by applicable law (such as deliberate and grossly
+   negligent acts) or agreed to in writing, shall any Contributor be
+   liable to You for damages, including any direct, indirect, special,
+   incidental, or consequential damages of any character arising as a
+   result of this License or out of the use or inability to use the
+   Work (including but not limited to damages for loss of goodwill,
+   work stoppage, computer failure or malfunction, or any and all
+   other commercial damages or losses), even if such Contributor
+   has been advised of the possibility of such damages.

-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
+1. Accepting Warranty or Additional Liability. While redistributing
+   the Work or Derivative Works thereof, You may choose to offer,
+   and charge a fee for, acceptance of support, warranty, indemnity,
+   or other liability obligations and/or rights consistent with this
+   License. However, in accepting such obligations, You may act only
+   on Your own behalf and on Your sole responsibility, not on behalf
+   of any other Contributor, and only if You agree to indemnify,
+   defend, and hold each Contributor harmless for any liability
+   incurred by, or claims asserted against, such Contributor by reason
+   of your accepting any such warranty or additional liability.

-   END OF TERMS AND CONDITIONS
+END OF TERMS AND CONDITIONS

-   APPENDIX: How to apply the Apache License to your work.
+APPENDIX: How to apply the Apache License to your work.

-      To apply the Apache License to your work, attach the following
-      boilerplate notice, with the fields enclosed by brackets "{}"
-      replaced with your own identifying information. (Don't include
-      the brackets!)  The text should be enclosed in the appropriate
-      comment syntax for the file format. We also recommend that a
-      file or class name and description of purpose be included on the
-      same "printed page" as the copyright notice for easier
-      identification within third-party archives.
+```
+  To apply the Apache License to your work, attach the following
+  boilerplate notice, with the fields enclosed by brackets "{}"
+  replaced with your own identifying information. (Don't include
+  the brackets!)  The text should be enclosed in the appropriate
+  comment syntax for the file format. We also recommend that a
+  file or class name and description of purpose be included on the
+  same "printed page" as the copyright notice for easier
+  identification within third-party archives.
+```

-   Copyright {yyyy} {name of copyright owner}
+Copyright {yyyy} {name of copyright owner}

-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at

-       http://www.apache.org/licenses/LICENSE-2.0
+```
+   http://www.apache.org/licenses/LICENSE-2.0
+```

-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.

--------------------------------------------------------------------------------
+______________________________________________________________________

 Code in python/ray/rllib/{evolution_strategies, dqn} adapted from
 https://github.com/openai (MIT License)
@@ -240,7 +241,7 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.

--------------------------------------------------------------------------------
+______________________________________________________________________

 Code in python/ray/rllib/impala/vtrace.py from
 https://github.com/deepmind/scalable_agent
@@ -251,7 +252,9 @@ Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

-    https://www.apache.org/licenses/LICENSE-2.0
+```
+https://www.apache.org/licenses/LICENSE-2.0
+```

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
@@ -259,7 +262,8 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

--------------------------------------------------------------------------------
+______________________________________________________________________
+
 Code in python/ray/rllib/ars is adapted from https://github.com/modestyachts/ARS

 Copyright (c) 2018, ARS contributors (Horia Mania, Aurelia Guy, Benjamin Recht)
@@ -269,11 +273,11 @@ Redistribution and use of ARS in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:

 1. Redistributions of source code must retain the above copyright notice, this
-list of conditions and the following disclaimer.
+   list of conditions and the following disclaimer.

-2. Redistributions in binary form must reproduce the above copyright notice,
-this list of conditions and the following disclaimer in the documentation and/or
-other materials provided with the distribution.
+1. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation and/or
+   other materials provided with the distribution.

 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
@@ -286,5 +290,6 @@ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

------------------
-Code in python/ray/_private/prometheus_exporter.py is adapted from https://github.com/census-instrumentation/opencensus-python/blob/master/contrib/opencensus-ext-prometheus/opencensus/ext/prometheus/stats_exporter/__init__.py
+______________________________________________________________________
+
+Code in python/ray/\_private/prometheus_exporter.py is adapted from https://github.com/census-instrumentation/opencensus-python/blob/master/contrib/opencensus-ext-prometheus/opencensus/ext/prometheus/stats_exporter/__init__.py
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
 [![PyPI version](https://badge.fury.io/py/FLAML.svg)](https://badge.fury.io/py/FLAML)
 ![Conda version](https://img.shields.io/conda/vn/conda-forge/flaml)
 [![Build](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml/badge.svg)](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml)
-![Python Version](https://img.shields.io/badge/3.8%20%7C%203.9%20%7C%203.10-blue)
+[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/FLAML)](https://pypi.org/project/FLAML/)
 [![Downloads](https://pepy.tech/badge/flaml)](https://pepy.tech/project/flaml)
 [![](https://img.shields.io/discord/1025786666260111483?logo=discord&style=flat)](https://discord.gg/Cppx2vSPVP)
-<!-- [![Join the chat at https://gitter.im/FLAMLer/community](https://badges.gitter.im/FLAMLer/community.svg)](https://gitter.im/FLAMLer/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -->

+<!-- [![Join the chat at https://gitter.im/FLAMLer/community](https://badges.gitter.im/FLAMLer/community.svg)](https://gitter.im/FLAMLer/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -->

 # A Fast Library for Automated Machine Learning & Tuning

@@ -14,6 +14,8 @@
    <br>
 </p>

+:fire: FLAML supports AutoML and Hyperparameter Tuning in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/automated-machine-learning-fabric). In addition, we've introduced Python 3.11 support, along with a range of new estimators, and comprehensive integration with MLflow—thanks to contributions from the Microsoft Fabric product team.
+
 :fire: Heads-up: We have migrated [AutoGen](https://microsoft.github.io/autogen/) into a dedicated [github repository](https://github.com/microsoft/autogen). Alongside this move, we have also launched a dedicated [Discord](https://discord.gg/pAbnFJrkgZ) server and a [website](https://microsoft.github.io/autogen/) for comprehensive documentation.

 :fire: The automated multi-agent chat framework in [AutoGen](https://microsoft.github.io/autogen/) is in preview from v2.0.0.
@@ -22,17 +24,15 @@

 :fire: [autogen](https://microsoft.github.io/autogen/) is released with support for ChatGPT and GPT-4, based on [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673).

-:fire: FLAML supports Code-First AutoML & Tuning – Private Preview in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/).
-
-
 ## What is FLAML
+
 FLAML is a lightweight Python library for efficient automation of machine
 learning and AI operations. It automates workflow based on large language models, machine learning models, etc.
 and optimizes their performance.

-* FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
-* For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range.
-* It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
+- FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
+- For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range.
+- It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.

 FLAML is powered by a series of [research studies](https://microsoft.github.io/FLAML/docs/Research/) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.

@@ -47,6 +47,7 @@ pip install flaml
 ```

 Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`autogen`](https://microsoft.github.io/autogen/) package.
+
 ```bash
 pip install "flaml[autogen]"
 ```
@@ -56,18 +57,24 @@ Each of the [`notebook examples`](https://github.com/microsoft/FLAML/tree/main/n

 ## Quickstart

-* (New) The [autogen](https://microsoft.github.io/autogen/) package enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
-It offers customizable and conversable agents which integrate LLMs, tools and human.
-By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
+- (New) The [autogen](https://microsoft.github.io/autogen/) package enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
+  It offers customizable and conversable agents which integrate LLMs, tools and human.
+  By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
+
 ```python
 from flaml import autogen
+
 assistant = autogen.AssistantAgent("assistant")
 user_proxy = autogen.UserProxyAgent("user_proxy")
-user_proxy.initiate_chat(assistant, message="Show me the YTD gain of 10 largest technology companies as of today.")
+user_proxy.initiate_chat(
+    assistant,
+    message="Show me the YTD gain of 10 largest technology companies as of today.",
+)
 # This initiates an automated chat between the two agents to solve the task
 ```

 Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalities like tuning, caching, templating, filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
+
 ```python
 # perform tuning
 config, analysis = autogen.Completion.tune(
@@ -82,30 +89,32 @@ config, analysis = autogen.Completion.tune(
 # perform inference for a test instance
 response = autogen.Completion.create(context=test_instance, **config)
 ```
-* With three lines of code, you can start using this economical and fast
-AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).
+
+- With three lines of code, you can start using this economical and fast
+  AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).

 ```python
 from flaml import AutoML
+
 automl = AutoML()
 automl.fit(X_train, y_train, task="classification")
 ```

-* You can restrict the learners and use FLAML as a fast hyperparameter tuning
-tool for XGBoost, LightGBM, Random Forest etc. or a [customized learner](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#estimator-and-search-space).
+- You can restrict the learners and use FLAML as a fast hyperparameter tuning
+  tool for XGBoost, LightGBM, Random Forest etc. or a [customized learner](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#estimator-and-search-space).

 ```python
 automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
 ```

-* You can also run generic hyperparameter tuning for a [custom function](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function).
+- You can also run generic hyperparameter tuning for a [custom function](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function).

 ```python
 from flaml import tune
 tune.run(evaluation_function, config={…}, low_cost_partial_config={…}, time_budget_s=3600)
 ```

-* [Zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.
+- [Zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.

 ```python
 from flaml.default import LGBMRegressor
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -4,7 +4,7 @@

 Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).

-If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.
+If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](<https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)>), please report it to us as described below.

 ## Reporting Security Issues

@@ -18,13 +18,13 @@ You should receive a response within 24 hours. If for some reason you do not, pl

 Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:

-  * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
-  * Full paths of source file(s) related to the manifestation of the issue
-  * The location of the affected source code (tag/branch/commit or direct URL)
-  * Any special configuration required to reproduce the issue
-  * Step-by-step instructions to reproduce the issue
-  * Proof-of-concept or exploit code (if possible)
-  * Impact of the issue, including how an attacker might exploit the issue
+- Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
+- Full paths of source file(s) related to the manifestation of the issue
+- The location of the affected source code (tag/branch/commit or direct URL)
+- Any special configuration required to reproduce the issue
+- Step-by-step instructions to reproduce the issue
+- Proof-of-concept or exploit code (if possible)
+- Impact of the issue, including how an attacker might exploit the issue

 This information will help us triage your report more quickly.

--- a/flaml/__init__.py
+++ b/flaml/__init__.py
@@ -1,6 +1,11 @@
 import logging

-from flaml.automl import AutoML, logger_formatter
+try:
+    from flaml.automl import AutoML, logger_formatter
+
+    has_automl = True
+except ImportError:
+    has_automl = False
 from flaml.onlineml.autovw import AutoVW
 from flaml.tune.searcher import CFO, FLOW2, BlendSearch, BlendSearchTuner, RandomSearch
 from flaml.version import __version__
@@ -8,3 +13,6 @@ from flaml.version import __version__
 # Set the root logger.
 logger = logging.getLogger(__name__)
 logger.setLevel(logging.INFO)
+
+if not has_automl:
+    logger.warning("flaml.automl is not available. Please install flaml[automl] to enable AutoML functionalities.")
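The guarded import above is a common optional-dependency pattern: try the import, record whether it succeeded, and warn instead of failing. A minimal standalone sketch of the same idea follows. The helper `try_import` and the placeholder module name are hypothetical and are not part of FLAML.

```python
import importlib
import logging

logger = logging.getLogger("optional_import_demo")


def try_import(module_name: str):
    """Return (module, True) if the module can be imported, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        return None, False


# A module that ships with Python, so the import succeeds.
json_mod, has_json = try_import("json")

# A placeholder name that is assumed not to be installed anywhere.
missing_mod, has_missing = try_import("a_module_that_does_not_exist_12345")

if not has_missing:
    # Mirror the diff: warn instead of raising, so the rest of the package stays usable.
    logger.warning("Optional module is not available; related features are disabled.")
```

Recording the flag at import time keeps the package importable when the extra is missing, while still telling the user why a feature is unavailable.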
--- a/flaml/autogen/agentchat/agent.py
+++ b/flaml/autogen/agentchat/agent.py
@@ -25,10 +25,10 @@ class Agent:
        return self._name

    def send(self, message: Union[Dict, str], recipient: "Agent", request_reply: Optional[bool] = None):
-        """(Aabstract method) Send a message to another agent."""
+        """(Abstract method) Send a message to another agent."""

    async def a_send(self, message: Union[Dict, str], recipient: "Agent", request_reply: Optional[bool] = None):
-        """(Aabstract async method) Send a message to another agent."""
+        """(Abstract async method) Send a message to another agent."""

    def receive(self, message: Union[Dict, str], sender: "Agent", request_reply: Optional[bool] = None):
        """(Abstract method) Receive a message from another agent."""
--- a/flaml/autogen/agentchat/assistant_agent.py
+++ b/flaml/autogen/agentchat/assistant_agent.py
@@ -4,24 +4,24 @@ from .conversable_agent import ConversableAgent


 class AssistantAgent(ConversableAgent):
-    """(In preview) Assistant agent, designed to solve a task with LLM.
+    """(In preview) Assistant agent, designed to solve tasks with LLM.

    AssistantAgent is a subclass of ConversableAgent configured with a default system message.
-    The default system message is designed to solve a task with LLM,
-    including suggesting python code blocks and debugging.
-    `human_input_mode` is default to "NEVER"
-    and `code_execution_config` is default to False.
-    This agent doesn't execute code by default, and expects the user to execute the code.
+    The default system message is designed to solve tasks with LLM,
+    including suggesting Python code blocks and debugging.
+    `human_input_mode` defaults to "NEVER"
+    and `code_execution_config` defaults to False.
+    This agent doesn't execute code by default and expects the user to execute the code.
    """

    DEFAULT_SYSTEM_MESSAGE = """You are a helpful AI assistant.
 Solve tasks using your coding and language skills.
-In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
+In the following cases, suggest Python code (in a Python coding block) or shell script (in an sh coding block) for the user to execute.
    1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
    2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
 Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
 When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
-If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
+If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use the 'print' function for the output when relevant. Check the execution result returned by the user.
 If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
 When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
 Reply "TERMINATE" in the end when everything is done.
@@ -36,23 +36,23 @@ Reply "TERMINATE" in the end when everything is done.
        max_consecutive_auto_reply: Optional[int] = None,
        human_input_mode: Optional[str] = "NEVER",
        code_execution_config: Optional[Union[Dict, bool]] = False,
-        **kwargs,
+        **kwargs: Dict,
    ):
        """
        Args:
-            name (str): agent name.
-            system_message (str): system message for the ChatCompletion inference.
-                Please override this attribute if you want to reprogram the agent.
-            llm_config (dict): llm inference configuration.
-                Please refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
+            name (str): Agent name.
+            system_message (Optional[str]): System message for the ChatCompletion inference.
+                Override this attribute if you want to reprogram the agent.
+            llm_config (Optional[Union[Dict, bool]]): LLM inference configuration.
+                Refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
                for available options.
-            is_termination_msg (function): a function that takes a message in the form of a dictionary
+            is_termination_msg (Optional[Callable[[Dict], bool]]): A function that takes a message in the form of a dictionary
                and returns a boolean value indicating if this received message is a termination message.
                The dict can contain the following keys: "content", "role", "name", "function_call".
-            max_consecutive_auto_reply (int): the maximum number of consecutive auto replies.
-                default to None (no limit provided, class attribute MAX_CONSECUTIVE_AUTO_REPLY will be used as the limit in this case).
+            max_consecutive_auto_reply (Optional[int]): The maximum number of consecutive auto replies.
+                Defaults to None (no limit provided, class attribute MAX_CONSECUTIVE_AUTO_REPLY will be used as the limit in this case).
                The limit only plays a role when human_input_mode is not "ALWAYS".
-            **kwargs (dict): Please refer to other kwargs in
+            **kwargs (Dict): Additional keyword arguments. Refer to other kwargs in
                [ConversableAgent](conversable_agent#__init__).
        """
        super().__init__(
--- a/flaml/autogen/code_utils.py
+++ b/flaml/autogen/code_utils.py
@@ -125,7 +125,7 @@ def improve_function(file_name, func_name, objective, **config):
    """(work in progress) Improve the function to achieve the objective."""
    params = {**_IMPROVE_FUNCTION_CONFIG, **config}
    # read the entire file into a str
-    with open(file_name, "r") as f:
+    with open(file_name) as f:
        file_string = f.read()
    response = oai.Completion.create(
        {"func_name": func_name, "objective": objective, "file_string": file_string}, **params
@@ -158,7 +158,7 @@ def improve_code(files, objective, suggest_only=True, **config):
    code = ""
    for file_name in files:
        # read the entire file into a string
-        with open(file_name, "r") as f:
+        with open(file_name) as f:
            file_string = f.read()
        code += f"""{file_name}:
 {file_string}
--- a/flaml/autogen/math_utils.py
+++ b/flaml/autogen/math_utils.py
@@ -130,7 +130,7 @@ def _fix_a_slash_b(string: str) -> str:
    try:
        a = int(a_str)
        b = int(b_str)
-        assert string == "{}/{}".format(a, b)
+        assert string == f"{a}/{b}"
        new_string = "\\frac{" + str(a) + "}{" + str(b) + "}"
        return new_string
    except Exception:
--- a/flaml/autogen/retrieve_utils.py
+++ b/flaml/autogen/retrieve_utils.py
@@ -126,7 +126,7 @@ def split_files_to_chunks(
    """Split a list of files into chunks of max_tokens."""
    chunks = []
    for file in files:
-        with open(file, "r") as f:
+        with open(file) as f:
            text = f.read()
        chunks += split_text_to_chunks(text, max_tokens, chunk_mode, must_break_at_empty_line)
    return chunks
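The `open(file, "r")` to `open(file)` changes in these hunks do not alter behavior, because `"r"` (read, text mode) is the default mode of Python's built-in `open`. A small self-contained check, using a temporary file created only for this illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w") as f:
        f.write("hello")

    # Explicit read mode, as in the removed lines.
    with open(path, "r") as f:
        explicit_mode, explicit_text = f.mode, f.read()

    # Default mode, as in the added lines.
    with open(path) as f:
        default_mode, default_text = f.mode, f.read()
```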
--- a/flaml/automl/automl.py
+++ b/flaml/automl/automl.py
@@ -7,6 +7,7 @@ from __future__ import annotations
 import json
 import logging
 import os
+import random
 import sys
 import time
 from functools import partial
@@ -16,7 +17,7 @@ import numpy as np

 from flaml import tune
 from flaml.automl.logger import logger, logger_formatter
-from flaml.automl.ml import train_estimator
+from flaml.automl.ml import huggingface_metric_to_mode, sklearn_metric_name_set, spark_metric_name_dict, train_estimator
 from flaml.automl.spark import DataFrame, Series, psDataFrame, psSeries
 from flaml.automl.state import AutoMLState, SearchState
 from flaml.automl.task.factory import task_factory
@@ -45,6 +46,7 @@ ERROR = (

 try:
    from sklearn.base import BaseEstimator
+    from sklearn.pipeline import Pipeline
 except ImportError:
    BaseEstimator = object
    ERROR = ERROR or ImportError("please install flaml[automl] option to use the flaml.automl package.")
@@ -54,6 +56,14 @@ try:
 except ImportError:
    mlflow = None

+try:
+    from flaml.fabric.mlflow import MLflowIntegration, get_mlflow_log_latency, infer_signature, is_autolog_enabled
+
+    internal_mlflow = True
+except ImportError:
+    internal_mlflow = False
+
+
 try:
    from ray import __version__ as ray_version

@@ -171,7 +181,7 @@ class AutoML(BaseEstimator):
                'better' only logs configs with better loss than previous iters
                'all' logs all the tried configs.
            model_history: A boolean of whether to keep the best
-                model per estimator. Make sure memory is large enough if setting to True.
+                model per estimator. Make sure memory is large enough if setting to True. Default False.
            log_training_metric: A boolean of whether to log the training
                metric for each model.
            mem_thres: A float of the memory size constraint in bytes.
@@ -212,9 +222,9 @@ class AutoML(BaseEstimator):
                    - if "data:path" use data-dependent defaults which are stored at path;
                    - if "static", use data-independent defaults.
                If dict, keys are the name of the estimators, and values are the starting
-                hyperparamter configurations for the corresponding estimators.
-                The value can be a single hyperparamter configuration dict or a list
-                of hyperparamter configuration dicts.
+                hyperparameter configurations for the corresponding estimators.
+                The value can be a single hyperparameter configuration dict or a list
+                of hyperparameter configuration dicts.
                In the following code example, we get starting_points from the
                `automl` object and use them in the `new_automl` object.
                e.g.,
@@ -247,7 +257,10 @@ class AutoML(BaseEstimator):
                search is considered to converge.
            force_cancel: boolean, default=False | Whether to forcibly cancel Spark jobs if the
                search time exceeded the time budget.
-            append_log: boolean, default=False | Whether to directly append the log
+            mlflow_exp_name: str, default=None | The name of the mlflow experiment. This should be specified if
+                mlflow autologging is enabled on Spark. Otherwise, all results are logged into the experiment
+                named after the basename of the main entry file.
+            append_log: boolean, default=False | Whetehr to directly append the log
                records to the input log file if it exists.
            auto_augment: boolean, default=True | Whether to automatically
                augment rare classes.
@@ -320,9 +333,7 @@ class AutoML(BaseEstimator):
            }
        }
        ```
-            mlflow_logging: boolean, default=True | Whether to log the training results to mlflow.
-                This requires mlflow to be installed and to have an active mlflow run.
-                FLAML will create nested runs.
+            mlflow_logging: boolean, default=True | Whether to log the training results to mlflow. Has no effect if mlflow is not installed.

        """
        if ERROR:
@@ -331,6 +342,8 @@ class AutoML(BaseEstimator):
        self._state = AutoMLState()
        self._state.learner_classes = {}
        self._settings = settings
+        self._automl_user_configurations = settings.copy()
+        self._settings.pop("automl_user_configurations", None)
        # no budget by default
        settings["time_budget"] = settings.get("time_budget", -1)
        settings["task"] = settings.get("task", "classification")
@@ -362,6 +375,7 @@ class AutoML(BaseEstimator):
        settings["preserve_checkpoint"] = settings.get("preserve_checkpoint", True)
        settings["early_stop"] = settings.get("early_stop", False)
        settings["force_cancel"] = settings.get("force_cancel", False)
+        settings["mlflow_exp_name"] = settings.get("mlflow_exp_name", None)
        settings["append_log"] = settings.get("append_log", False)
        settings["min_sample_size"] = settings.get("min_sample_size", MIN_SAMPLE_TRAIN)
        settings["use_ray"] = settings.get("use_ray", False)
@@ -377,6 +391,7 @@ class AutoML(BaseEstimator):
        settings["mlflow_logging"] = settings.get("mlflow_logging", True)

        self._estimator_type = "classifier" if settings["task"] in CLASSIFICATION else "regressor"
+        self.best_run_id = None

    def get_params(self, deep: bool = False) -> dict:
        return self._settings.copy()
@@ -475,14 +490,29 @@ class AutoML(BaseEstimator):
        with open(filename, "w") as f:
            json.dump(best, f)

+    @property
+    def supported_metrics(self):
+        """
+        Returns a tuple of supported metrics for the task.
+
+            Returns:
+                    metrics (Tuple): sklearn metrics from sklearn package;
+                                    huggingface metrics from datasets package;
+                                    spark metrics from pyspark package
+
+        """
+
+        return sklearn_metric_name_set, huggingface_metric_to_mode.keys(), spark_metric_name_dict
+
    @property
    def feature_transformer(self):
-        """Returns feature transformer which is used to preprocess data before applying training or inference."""
-        return getattr(self, "_transformer", None)
+        """Returns AutoML Transformer"""
+        data_processor = getattr(self, "_transformer", None)
+        return data_processor

    @property
    def label_transformer(self):
-        """Returns label transformer which is used to preprocess labels before scoring, and inverse transform labels after inference."""
+        """Returns AutoML label transformer"""
        return getattr(self, "_label_transformer", None)

    @property
@@ -521,8 +551,8 @@ class AutoML(BaseEstimator):

    def score(
        self,
-        X: Union[DataFrame, psDataFrame],
-        y: Union[Series, psSeries],
+        X: DataFrame | psDataFrame,
+        y: Series | psSeries,
        **kwargs,
    ):
        estimator = getattr(self, "_trained_estimator", None)
@@ -536,7 +566,7 @@ class AutoML(BaseEstimator):

    def predict(
        self,
-        X: Union[np.array, DataFrame, List[str], List[List[str]], psDataFrame],
+        X: np.array | DataFrame | list[str] | list[list[str]] | psDataFrame,
        **pred_kwargs,
    ):
        """Predict label from features.
@@ -611,7 +641,7 @@ class AutoML(BaseEstimator):
        """
        self._state.learner_classes[learner_name] = learner_class

-    def get_estimator_from_log(self, log_file_name: str, record_id: int, task: Union[str, Task]):
+    def get_estimator_from_log(self, log_file_name: str, record_id: int, task: str | Task):
        """Get the estimator from log file.

        Args:
@@ -653,7 +683,7 @@ class AutoML(BaseEstimator):
        dataframe=None,
        label=None,
        time_budget=np.inf,
-        task: Optional[Union[str, Task]] = None,
+        task: str | Task | None = None,
        eval_method=None,
        split_ratio=None,
        n_splits=None,
@@ -779,7 +809,7 @@ class AutoML(BaseEstimator):
                    max_epochs: int, default = 20 | Maximum number of epochs to run training,
                        only used by TemporalFusionTransformerEstimator.
                    batch_size: int, default = 64 | Batch size for training model, only
-                        used by TemporalFusionTransformerEstimator.
+                        used by TemporalFusionTransformerEstimator and TCNEstimator.
        """
        task = task or self._settings.get("task")
        if isinstance(task, str):
@@ -802,7 +832,7 @@ class AutoML(BaseEstimator):
        )
        task.validate_data(self, self._state, X_train, y_train, dataframe, label, groups=groups)

-        logger.info("log file name {}".format(log_file_name))
+        logger.info(f"log file name {log_file_name}")

        best_config = None
        best_val_loss = float("+inf")
@@ -855,9 +885,7 @@ class AutoML(BaseEstimator):
        else:
            self._state.fit_kwargs_by_estimator[best_estimator] = self._state.fit_kwargs

-        logger.info(
-            "estimator = {}, config = {}, #training instances = {}".format(best_estimator, best_config, sample_size)
-        )
+        logger.info(f"estimator = {best_estimator}, config = {best_config}, #training instances = {sample_size}")
        # Partially copied from fit() function
        # Initilize some attributes required for retrain_from_log
        self._split_type = task.decide_split_type(
@@ -1028,7 +1056,7 @@ class AutoML(BaseEstimator):
        return points

    @property
-    def resource_attr(self) -> Optional[str]:
+    def resource_attr(self) -> str | None:
        """Attribute of the resource dimension.

        Returns:
@@ -1038,7 +1066,7 @@ class AutoML(BaseEstimator):
        return "FLAML_sample_size" if self._sample else None

    @property
-    def min_resource(self) -> Optional[float]:
+    def min_resource(self) -> float | None:
        """Attribute for pruning.

        Returns:
@@ -1047,7 +1075,7 @@ class AutoML(BaseEstimator):
        return self._min_sample_size if self._sample else None

    @property
-    def max_resource(self) -> Optional[float]:
+    def max_resource(self) -> float | None:
        """Attribute for pruning.

        Returns:
@@ -1069,7 +1097,7 @@ class AutoML(BaseEstimator):
            pickle.dump(self, f, pickle.HIGHEST_PROTOCOL)

    @property
-    def trainable(self) -> Callable[[dict], Optional[float]]:
+    def trainable(self) -> Callable[[dict], float | None]:
        """Training function.
        Returns:
            A function that evaluates each config and returns the loss.
@@ -1155,7 +1183,7 @@ class AutoML(BaseEstimator):
        dataframe=None,
        label=None,
        metric=None,
-        task: Optional[Union[str, Task]] = None,
+        task: str | Task | None = None,
        n_jobs=None,
        # gpu_per_trial=0,
        log_file_name=None,
@@ -1203,6 +1231,7 @@ class AutoML(BaseEstimator):
        skip_transform=None,
        mlflow_logging=None,
        fit_kwargs_by_estimator=None,
+        mlflow_exp_name=None,
        **fit_kwargs,
    ):
        """Find a model for a given task.
@@ -1296,7 +1325,7 @@ class AutoML(BaseEstimator):
                'all' logs all the tried configs.
            model_history: A boolean of whether to keep the trained best
                model per estimator. Make sure memory is large enough if setting to True.
-                Default value is False: best_model_for_estimator would return a
+                Default value is False. If False, best_model_for_estimator would return an
                untrained model for non-best learner.
            log_training_metric: A boolean of whether to log the training
                metric for each model.
@@ -1348,9 +1377,9 @@ class AutoML(BaseEstimator):
                    - if "data:path" use data-dependent defaults which are stored at path;
                    - if "static", use data-independent defaults.
                If dict, keys are the name of the estimators, and values are the starting
-                hyperparamter configurations for the corresponding estimators.
-                The value can be a single hyperparamter configuration dict or a list
-                of hyperparamter configuration dicts.
+                hyperparameter configurations for the corresponding estimators.
+                The value can be a single hyperparameter configuration dict or a list
+                of hyperparameter configuration dicts.
                In the following code example, we get starting_points from the
                `automl` object and use them in the `new_automl` object.
                e.g.,
@@ -1382,7 +1411,10 @@ class AutoML(BaseEstimator):
            early_stop: boolean, default=False | Whether to stop early if the
                search is considered to converge.
            force_cancel: boolean, default=False | Whether to forcely cancel the PySpark job if overtime.
-            append_log: boolean, default=False | Whether to directly append the log
+            mlflow_exp_name: str, default=None | The name of the mlflow experiment. This should be specified if
+                mlflow autologging is enabled on Spark. Otherwise, all results are logged into the experiment
+                with the same name as the basename of the main entry file.
+            append_log: boolean, default=False | Whether to directly append the log
                records to the input log file if it exists.
            auto_augment: boolean, default=True | Whether to automatically
                augment rare classes.
@@ -1467,9 +1499,7 @@ class AutoML(BaseEstimator):
            skip_transform: boolean, default=False | Whether to pre-process data prior to modeling.
            mlflow_logging: boolean, default=None | Whether to log the training results to mlflow.
                Default value is None, which means the logging decision is made based on
-                AutoML.__init__'s mlflow_logging argument.
-                This requires mlflow to be installed and to have an active mlflow run.
-                FLAML will create nested runs.
+                AutoML.__init__'s mlflow_logging argument. Has no effect if mlflow is not installed.
            fit_kwargs_by_estimator: dict, default=None | The user specified keywords arguments, grouped by estimator name.
                For TransformersEstimator, available fit_kwargs can be found from
                [TrainingArgumentsForAuto](nlp/huggingface/training_args).
@@ -1519,7 +1549,7 @@ class AutoML(BaseEstimator):
                    max_epochs: int, default = 20 | Maximum number of epochs to run training,
                        only used by TemporalFusionTransformerEstimator.
                    batch_size: int, default = 64 | Batch size for training model, only
-                        used by TemporalFusionTransformerEstimator.
+                        used by TemporalFusionTransformerEstimator and TCNEstimator.
        """

        self._state._start_time_flag = self._start_time_flag = time.time()
@@ -1570,6 +1600,7 @@ class AutoML(BaseEstimator):
        )
        early_stop = self._settings.get("early_stop") if early_stop is None else early_stop
        force_cancel = self._settings.get("force_cancel") if force_cancel is None else force_cancel
+        mlflow_exp_name = self._settings.get("mlflow_exp_name") if mlflow_exp_name is None else mlflow_exp_name
        # no search budget is provided?
        no_budget = time_budget < 0 and max_iter is None and not early_stop
        append_log = self._settings.get("append_log") if append_log is None else append_log
@@ -1622,7 +1653,6 @@ class AutoML(BaseEstimator):
        self._use_ray = use_ray
        # use the following condition if we have an estimation of average_trial_time and average_trial_overhead
        # self._use_ray = use_ray or n_concurrent_trials > ( average_trial_time + average_trial_overhead) / (average_trial_time)
-
        if self._use_ray is not False:
            import ray

@@ -1656,11 +1686,29 @@ class AutoML(BaseEstimator):
        self._state.fit_kwargs = fit_kwargs
        custom_hp = custom_hp or self._settings.get("custom_hp")
        self._skip_transform = self._settings.get("skip_transform") if skip_transform is None else skip_transform
-        self._mlflow_logging = self._settings.get("mlflow_logging") if mlflow_logging is None else mlflow_logging
+        self._mlflow_logging = (
+            False
+            if mlflow is None
+            else self._settings.get("mlflow_logging")
+            if mlflow_logging is None
+            else mlflow_logging
+        )
        fit_kwargs_by_estimator = fit_kwargs_by_estimator or self._settings.get("fit_kwargs_by_estimator")
        self._state.fit_kwargs_by_estimator = fit_kwargs_by_estimator.copy()  # shallow copy of fit_kwargs_by_estimator
        self._state.weight_val = sample_weight_val
-
+        self._mlflow_exp_name = mlflow_exp_name
+        self.mlflow_integration = None
+        self.autolog_extra_tag = {
+            "extra_tag.sid": f"flaml_{flaml_version}_{int(time.time())}_{random.randint(1001, 9999)}"
+        }
+        if internal_mlflow and self._mlflow_logging and (mlflow.active_run() or is_autolog_enabled()):
+            try:
+                self.mlflow_integration = MLflowIntegration("automl", mlflow_exp_name, extra_tag=self.autolog_extra_tag)
+                self._mlflow_exp_name = self.mlflow_integration.experiment_name
+                if not (mlflow.active_run() is not None or is_autolog_enabled()):
+                    self.mlflow_integration.only_history = True
+            except KeyError:
+                print("Not in Fabric, Skipped")
        task.validate_data(
            self,
            self._state,
@@ -1688,7 +1736,7 @@ class AutoML(BaseEstimator):
            logger.info(f"Data split method: {self._split_type}")
        eval_method = self._decide_eval_method(eval_method, time_budget)
        self._state.eval_method = eval_method
-        logger.info("Evaluation method: {}".format(eval_method))
+        logger.info(f"Evaluation method: {eval_method}")
        self._state.cv_score_agg_func = cv_score_agg_func or self._settings.get("cv_score_agg_func")

        self._retrain_in_budget = retrain_full == "budget" and (eval_method == "holdout" and self._state.X_val is None)
@@ -1705,13 +1753,9 @@ class AutoML(BaseEstimator):
                if sample_size:
                    _sample_size_from_starting_points[_estimator] = sample_size
                elif _point_per_estimator and isinstance(_point_per_estimator, list):
-                    _sample_size_set = set(
-                        [
-                            config["FLAML_sample_size"]
-                            for config in _point_per_estimator
-                            if "FLAML_sample_size" in config
-                        ]
-                    )
+                    _sample_size_set = {
+                        config["FLAML_sample_size"] for config in _point_per_estimator if "FLAML_sample_size" in config
+                    }
                    if _sample_size_set:
                        _sample_size_from_starting_points[_estimator] = min(_sample_size_set)
                    if len(_sample_size_set) > 1:
@@ -1729,6 +1773,11 @@ class AutoML(BaseEstimator):
        self._min_sample_size_input = min_sample_size
        self._prepare_data(eval_method, split_ratio, n_splits)

+        # infer the signature of the input/output data
+        if self.mlflow_integration is not None:
+            self.estimator_signature = infer_signature(self._state.X_train, self._state.y_train)
+            self.pipeline_signature = infer_signature(X_train, y_train, dataframe, label)
+
        # TODO pull this to task as decide_sample_size
        if isinstance(self._min_sample_size, dict):
            self._sample = {
@@ -1827,6 +1876,11 @@ class AutoML(BaseEstimator):
            and (max_iter > 0 or retrain_full is True)
            or max_iter == 1
        )
+        if self.mlflow_integration is not None and all(
+            [self.mlflow_integration.parent_run_id is None, not self.mlflow_integration.only_history]
+        ):
+            # do not retrain the final model if there is no active run
+            self._state.retrain_final = False
        # add custom learner
        for estimator_name in estimator_list:
            if estimator_name not in self._state.learner_classes:
@@ -1898,7 +1952,7 @@ class AutoML(BaseEstimator):
                max_iter=max_iter / len(estimator_list) if self._learner_selector == "roundrobin" else max_iter,
                budget=self._state.time_budget,
            )
-        logger.info("List of ML learners in AutoML Run: {}".format(estimator_list))
+        logger.info(f"List of ML learners in AutoML Run: {estimator_list}")
        self.estimator_list = estimator_list
        self._active_estimators = estimator_list.copy()
        self._ensemble = ensemble
@@ -1940,7 +1994,7 @@ class AutoML(BaseEstimator):
                )
            ):
                logger.warning(
-                    "Time taken to find the best model is {0:.0f}% of the "
+                    "Time taken to find the best model is {:.0f}% of the "
                    "provided time budget and not all estimators' hyperparameter "
                    "search converged. Consider increasing the time budget.".format(
                        self._time_taken_best_iter / self._state.time_budget * 100
@@ -1959,6 +2013,8 @@ class AutoML(BaseEstimator):
            )  # NOTE: this is after kwargs is updated to fit_kwargs_by_estimator
            del self._state.groups, self._state.groups_all, self._state.groups_val
        logger.setLevel(old_level)
+        if self.mlflow_integration is not None:
+            self.mlflow_integration.resume_mlflow()

    def _search_parallel(self):
        if self._use_ray is not False:
@@ -2055,6 +2111,14 @@ class AutoML(BaseEstimator):

        if self._use_spark:
            # use spark as parallel backend
+            mlflow_log_latency = (
+                get_mlflow_log_latency(model_history=self._state.model_history) if self.mlflow_integration else 0
+            )
+            (
+                logger.info(f"Estimated mlflow_log_latency: {mlflow_log_latency} seconds.")
+                if mlflow_log_latency > 0
+                else None
+            )
            analysis = tune.run(
                self.trainable,
                search_alg=search_alg,
@@ -2067,6 +2131,9 @@ class AutoML(BaseEstimator):
                use_ray=False,
                use_spark=True,
                force_cancel=self._force_cancel,
+                mlflow_exp_name=self._mlflow_exp_name,
+                automl_info=(mlflow_log_latency,),  # pass automl info to tune.run
+                extra_tag=self.autolog_extra_tag,
                # raise_on_failed_trial=False,
                # keep_checkpoints_num=1,
                # checkpoint_score_attr="min-val_loss",
@@ -2127,6 +2194,8 @@ class AutoML(BaseEstimator):
                    self._search_states[estimator].best_config = config
                if better or self._log_type == "all":
                    self._log_trial(search_state, estimator)
+                if self.mlflow_integration:
+                    self.mlflow_integration.record_state(self, search_state, estimator)

    def _log_trial(self, search_state, estimator):
        if self._training_log:
@@ -2140,36 +2209,6 @@ class AutoML(BaseEstimator):
                estimator,
                search_state.sample_size,
            )
-        if self._mlflow_logging and mlflow is not None and mlflow.active_run():
-            with mlflow.start_run(nested=True):
-                mlflow.log_metric("iter_counter", self._track_iter)
-                if (search_state.metric_for_logging is not None) and (
-                    "intermediate_results" in search_state.metric_for_logging
-                ):
-                    for each_entry in search_state.metric_for_logging["intermediate_results"]:
-                        with mlflow.start_run(nested=True):
-                            mlflow.log_metrics(each_entry)
-                            mlflow.log_metric("iter_counter", self._iter_per_learner[estimator])
-                    del search_state.metric_for_logging["intermediate_results"]
-                if search_state.metric_for_logging:
-                    mlflow.log_metrics(search_state.metric_for_logging)
-                mlflow.log_metric("trial_time", search_state.trial_time)
-                mlflow.log_metric("wall_clock_time", self._state.time_from_start)
-                mlflow.log_metric("validation_loss", search_state.val_loss)
-                mlflow.log_params(search_state.config)
-                mlflow.log_param("learner", estimator)
-                mlflow.log_param("sample_size", search_state.sample_size)
-                mlflow.log_metric("best_validation_loss", search_state.best_loss)
-                mlflow.log_param("best_config", search_state.best_config)
-                mlflow.log_param("best_learner", self._best_estimator)
-                mlflow.log_metric(
-                    self._state.metric if isinstance(self._state.metric, str) else self._state.error_metric,
-                    1 - search_state.val_loss
-                    if self._state.error_metric.startswith("1-")
-                    else -search_state.val_loss
-                    if self._state.error_metric.startswith("-")
-                    else search_state.val_loss,
-                )

    def _search_sequential(self):
        try:
@@ -2323,10 +2362,19 @@ class AutoML(BaseEstimator):
                verbose=max(self.verbose - 3, 0),
                use_ray=False,
                use_spark=False,
+                force_cancel=self._force_cancel,
+                mlflow_exp_name=self._mlflow_exp_name,
+                automl_info=(0,),  # pass automl info to tune.run
+                extra_tag=self.autolog_extra_tag,
            )
            time_used = time.time() - start_run_time
            better = False
-            if analysis.trials:
+            (
+                logger.debug(f"result in automl: {analysis.trials}, {analysis.trials[-1].last_result}")
+                if analysis.trials
+                else logger.debug("result in automl: [], None")
+            )
+            if analysis.trials and analysis.trials[-1].last_result:
                result = analysis.trials[-1].last_result
                search_state.update(result, time_used=time_used)
                if self._estimator_index is None:
@@ -2388,6 +2436,8 @@ class AutoML(BaseEstimator):
                    search_state.trained_estimator.cleanup()
                if better or self._log_type == "all":
                    self._log_trial(search_state, estimator)
+                if self.mlflow_integration:
+                    self.mlflow_integration.record_state(self, search_state, estimator)

                logger.info(
                    " at {:.1f}s,\testimator {}'s best error={:.4f},\tbest estimator {}'s best error={:.4f}".format(
@@ -2440,7 +2490,7 @@ class AutoML(BaseEstimator):
                    state.best_config,
                    self.data_size_full,
                )
-                logger.info("retrain {} for {:.1f}s".format(self._best_estimator, retrain_time))
+                logger.info(f"retrain {self._best_estimator} for {retrain_time:.1f}s")
                self._retrained_config[best_config_sig] = state.best_config_train_time = retrain_time
                est_retrain_time = 0
            self._state.time_from_start = time.time() - self._start_time_flag
@@ -2462,8 +2512,8 @@ class AutoML(BaseEstimator):
        self._time_taken_best_iter = 0
        self._config_history = {}
        self._max_iter_per_learner = 10000
-        self._iter_per_learner = dict([(e, 0) for e in self.estimator_list])
-        self._iter_per_learner_fullsize = dict([(e, 0) for e in self.estimator_list])
+        self._iter_per_learner = {e: 0 for e in self.estimator_list}
+        self._iter_per_learner_fullsize = {e: 0 for e in self.estimator_list}
        self._fullsize_reached = False
        self._trained_estimator = None
        self._best_estimator = None
@@ -2488,6 +2538,12 @@ class AutoML(BaseEstimator):
            self._training_log.checkpoint()
        self._state.time_from_start = time.time() - self._start_time_flag
        if self._best_estimator:
+            if self.mlflow_integration:
+                self.mlflow_integration.log_automl(self)
+                if mlflow.active_run() is None:
+                    if self.mlflow_integration.parent_run_id is not None and self.mlflow_integration.autolog:
+                        # make sure the retrain results are autologged to the parent run
+                        mlflow.start_run(run_id=self.mlflow_integration.parent_run_id)
            self._selected = self._search_states[self._best_estimator]
            self.modelcount = sum(search_state.total_iter for search_state in self._search_states.values())
            if self._trained_estimator:
@@ -2624,11 +2680,34 @@ class AutoML(BaseEstimator):
                        self._best_estimator,
                        state.best_config,
                        self.data_size_full,
+                        is_retrain=True,
                    )
-                    logger.info("retrain {} for {:.1f}s".format(self._best_estimator, retrain_time))
+                    logger.info(f"retrain {self._best_estimator} for {retrain_time:.1f}s")
                    state.best_config_train_time = retrain_time
                    if self._trained_estimator:
                        logger.info(f"retrained model: {self._trained_estimator.model}")
+                        if self.best_run_id is not None:
+                            logger.info(f"Best MLflow run name: {self.best_run_name}")
+                            logger.info(f"Best MLflow run id: {self.best_run_id}")
+                        if self.mlflow_integration is not None:
+                            # try to log the retrained model
+                            if all(
+                                [
+                                    self.mlflow_integration.manual_log,
+                                    not self.mlflow_integration.has_model,
+                                    self.mlflow_integration.parent_run_id is not None,
+                                ]
+                            ):
+                                if mlflow.active_run() is None:
+                                    mlflow.start_run(run_id=self.mlflow_integration.parent_run_id)
+                                self.mlflow_integration.log_model(
+                                    self._trained_estimator.model,
+                                    self.best_estimator,
+                                    signature=self.estimator_signature,
+                                )
+                                self.mlflow_integration.pickle_and_log_automl_artifacts(
+                                    self, self.model, self.best_estimator, signature=self.pipeline_signature
+                                )
                else:
                    logger.info("not retraining because the time budget is too small.")

@@ -2702,3 +2781,7 @@ class AutoML(BaseEstimator):
                q += inv[i] / s
                if p < q:
                    return estimator_list[i]
+
+    @property
+    def automl_pipeline(self):
+        return None
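
Note on the `mlflow_logging` resolution in `fit()` above: the chained conditional expression assigned to `self._mlflow_logging` reads as a three-step precedence rule. The sketch below spells that rule out as a standalone function. The name `resolve_mlflow_logging` and its parameters are hypothetical and are not part of FLAML; `mlflow_module` stands in for the module-level `mlflow` name, which is assumed to be `None` when the import failed.

```python
def resolve_mlflow_logging(mlflow_module, settings, mlflow_logging=None):
    """Illustrative equivalent of the chained conditional in fit().

    mlflow_module: the imported mlflow module, or None if it is not installed.
    settings: the dict of constructor settings (as in AutoML._settings).
    mlflow_logging: the per-call argument passed to fit(), or None.
    """
    if mlflow_module is None:
        # mlflow is unavailable, so logging is always off.
        return False
    if mlflow_logging is None:
        # No per-call value: fall back to the constructor setting.
        return settings.get("mlflow_logging")
    # An explicit per-call value wins over the constructor setting.
    return mlflow_logging
```

Written this way, the precedence is: a missing mlflow installation first, then the `fit()` argument, then the `__init__` setting.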
--- a/flaml/automl/ml.py
+++ b/flaml/automl/ml.py
@@ -13,6 +13,7 @@ from flaml.automl.model import BaseEstimator, TransformersEstimator
 from flaml.automl.spark import ERROR as SPARK_ERROR
 from flaml.automl.spark import DataFrame, Series, psDataFrame, psSeries
 from flaml.automl.task.task import Task
+from flaml.automl.time_series import TimeSeriesDataset

 try:
    from sklearn.metrics import (
@@ -33,7 +34,6 @@ except ImportError:
 if SPARK_ERROR is None:
    from flaml.automl.spark.metrics import spark_metric_loss_score

-from flaml.automl.time_series import TimeSeriesDataset

 logger = logging.getLogger(__name__)

@@ -89,6 +89,11 @@ huggingface_metric_to_mode = {
    "wer": "min",
 }
 huggingface_submetric_to_metric = {"rouge1": "rouge", "rouge2": "rouge"}
+spark_metric_name_dict = {
+    "Regression": ["r2", "rmse", "mse", "mae", "var"],
+    "Binary Classification": ["pr_auc", "roc_auc"],
+    "Multi-class Classification": ["accuracy", "log_loss", "f1", "micro_f1", "macro_f1"],
+}


 def metric_loss_score(
@@ -122,7 +127,7 @@ def metric_loss_score(
            import datasets

            datasets_metric_name = huggingface_submetric_to_metric.get(metric_name, metric_name.split(":")[0])
-            metric = datasets.load_metric(datasets_metric_name)
+            metric = datasets.load_metric(datasets_metric_name, trust_remote_code=True)
            metric_mode = huggingface_metric_to_mode[datasets_metric_name]

            if metric_name.startswith("seqeval"):
@@ -334,6 +339,14 @@ def compute_estimator(
    if fit_kwargs is None:
        fit_kwargs = {}

+    fe_params = {}
+    for param, value in config_dic.items():
+        if param.startswith("fe."):
+            fe_params[param] = value
+
+    for param, value in fe_params.items():
+        config_dic.pop(param)
+
    estimator_class = estimator_class or task.estimator_class_from_str(estimator_name)
    estimator = estimator_class(
        **config_dic,
@@ -401,12 +414,21 @@ def train_estimator(
    free_mem_ratio=0,
 ) -> Tuple[EstimatorSubclass, float]:
    start_time = time.time()
+    fe_params = {}
+    for param, value in config_dic.items():
+        if param.startswith("fe."):
+            fe_params[param] = value
+
+    for param, value in fe_params.items():
+        config_dic.pop(param)
+
    estimator_class = estimator_class or task.estimator_class_from_str(estimator_name)
    estimator = estimator_class(
        **config_dic,
        task=task,
        n_jobs=n_jobs,
    )
+
    if fit_kwargs is None:
        fit_kwargs = {}

@@ -567,14 +589,19 @@ def _eval_estimator(

        pred_time = (time.time() - pred_start) / num_val_rows

-        val_loss = metric_loss_score(
-            eval_metric,
-            y_processed_predict=val_pred_y,
-            y_processed_true=y_val,
-            labels=labels,
-            sample_weight=weight_val,
-            groups=groups_val,
-        )
+        try:
+            val_loss = metric_loss_score(
+                eval_metric,
+                y_processed_predict=val_pred_y,
+                y_processed_true=y_val,
+                labels=labels,
+                sample_weight=weight_val,
+                groups=groups_val,
+            )
+        except ValueError as e:
+            # `r2_score` and other metrics may raise a `ValueError` when a model returns `inf` or `nan` values. In this case, we set the val_loss to infinity.
+            val_loss = np.inf
+            logger.warning(f"ValueError {e} happened in `metric_loss_score`, set `val_loss` to `np.inf`")
        metric_for_logging = {"pred_time": pred_time}
        if log_training_metric:
            train_pred_y = get_y_pred(estimator, X_train, eval_metric, task)
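
Note on the `try`/`except ValueError` added to `_eval_estimator` above: the pattern can be illustrated in isolation with the standard library only. This is a minimal sketch, not FLAML code. `loss_or_inf` and `mse_rejecting_non_finite` are hypothetical names, and the toy metric only imitates the behaviour the diff comment describes (a metric raising `ValueError` when predictions contain `inf` or `nan`).

```python
import logging
import math

logger = logging.getLogger(__name__)


def loss_or_inf(metric_fn, y_true, y_pred):
    """Return metric_fn(y_true, y_pred), or +inf if it raises ValueError."""
    try:
        return metric_fn(y_true, y_pred)
    except ValueError as e:
        # Treat the trial as maximally bad instead of aborting the search.
        logger.warning("ValueError %s in metric computation; using inf as the loss", e)
        return math.inf


def mse_rejecting_non_finite(y_true, y_pred):
    # Hypothetical metric: refuses non-finite predictions by raising ValueError.
    if not all(math.isfinite(v) for v in y_pred):
        raise ValueError("Input contains NaN or infinity")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because the search minimizes the loss, returning infinity means a configuration that produces non-finite predictions can never be selected as the best one, while the remaining trials continue to run.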
--- a/flaml/automl/model.py
+++ b/flaml/automl/model.py
--- a/flaml/automl/nlp/README.md
+++ b/flaml/automl/nlp/README.md
@@ -4,16 +4,15 @@ This directory contains utility functions used by AutoNLP. Currently we support

 Please refer to this [link](https://microsoft.github.io/FLAML/docs/Examples/AutoML-NLP) for examples.

-
 # Troubleshooting fine-tuning HPO for pre-trained language models

 The frequent updates of transformers may lead to fluctuations in the results of tuning. To help users quickly troubleshoot the result of AutoNLP when a tuning failure occurs (e.g., failing to reproduce previous results), we have provided the following jupyter notebook:

-* [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)
+- [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)

 Our findings on troubleshooting fine-tuning the Electra and RoBERTa model for the GLUE dataset can be seen in the following paper published in ACL 2021:

-* [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.
+- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.

 ```bibtex
@inproceedings{liu2021hpo,
--- a/flaml/automl/nlp/huggingface/data_collator.py
+++ b/flaml/automl/nlp/huggingface/data_collator.py
@@ -32,7 +32,7 @@ class DataCollatorForMultipleChoiceClassification(DataCollatorWithPadding):
            [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
        ]
        flattened_features = list(chain(*flattened_features))
-        batch = super(DataCollatorForMultipleChoiceClassification, self).__call__(flattened_features)
+        batch = super().__call__(flattened_features)
        # Un-flatten
        batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
        # Add back labels
--- a/flaml/automl/nlp/huggingface/utils.py
+++ b/flaml/automl/nlp/huggingface/utils.py
@@ -245,7 +245,7 @@ def tokenize_row(
    return_column_name=False,
 ):
    if prefix:
-        this_row = tuple(["".join(x) for x in zip(prefix, this_row)])
+        this_row = tuple("".join(x) for x in zip(prefix, this_row))

    # tokenizer.pad_token = tokenizer.eos_token
    tokenized_example = tokenizer(
--- a/flaml/automl/nlp/utils.py
+++ b/flaml/automl/nlp/utils.py
@@ -32,7 +32,7 @@ def is_a_list_of_str(this_obj):

 def _clean_value(value: Any) -> str:
    if isinstance(value, float):
-        return "{:.5}".format(value)
+        return f"{value:.5}"
    else:
        return str(value).replace("/", "_")

@@ -86,7 +86,7 @@ class Counter:
    @staticmethod
    def get_trial_fold_name(local_dir, trial_config, trial_id):
        Counter.counter += 1
-        experiment_tag = "{0}_{1}".format(str(Counter.counter), format_vars(trial_config))
+        experiment_tag = f"{str(Counter.counter)}_{format_vars(trial_config)}"
        logdir = get_logdir_name(_generate_dirname(experiment_tag, trial_id=trial_id), local_dir)
        return logdir

--- a/flaml/automl/spark/configs.py
+++ b/flaml/automl/spark/configs.py
@@ -1,97 +0,0 @@
-ParamList_LightGBM_Base = [
-    "baggingFraction",
-    "baggingFreq",
-    "baggingSeed",
-    "binSampleCount",
-    "boostFromAverage",
-    "boostingType",
-    "catSmooth",
-    "categoricalSlotIndexes",
-    "categoricalSlotNames",
-    "catl2",
-    "chunkSize",
-    "dataRandomSeed",
-    "defaultListenPort",
-    "deterministic",
-    "driverListenPort",
-    "dropRate",
-    "dropSeed",
-    "earlyStoppingRound",
-    "executionMode",
-    "extraSeed" "featureFraction",
-    "featureFractionByNode",
-    "featureFractionSeed",
-    "featuresCol",
-    "featuresShapCol",
-    "fobj" "improvementTolerance",
-    "initScoreCol",
-    "isEnableSparse",
-    "isProvideTrainingMetric",
-    "labelCol",
-    "lambdaL1",
-    "lambdaL2",
-    "leafPredictionCol",
-    "learningRate",
-    "matrixType",
-    "maxBin",
-    "maxBinByFeature",
-    "maxCatThreshold",
-    "maxCatToOnehot",
-    "maxDeltaStep",
-    "maxDepth",
-    "maxDrop",
-    "metric",
-    "microBatchSize",
-    "minDataInLeaf",
-    "minDataPerBin",
-    "minDataPerGroup",
-    "minGainToSplit",
-    "minSumHessianInLeaf",
-    "modelString",
-    "monotoneConstraints",
-    "monotoneConstraintsMethod",
-    "monotonePenalty",
-    "negBaggingFraction",
-    "numBatches",
-    "numIterations",
-    "numLeaves",
-    "numTasks",
-    "numThreads",
-    "objectiveSeed",
-    "otherRate",
-    "parallelism",
-    "passThroughArgs",
-    "posBaggingFraction",
-    "predictDisableShapeCheck",
-    "predictionCol",
-    "repartitionByGroupingColumn",
-    "seed",
-    "skipDrop",
-    "slotNames",
-    "timeout",
-    "topK",
-    "topRate",
-    "uniformDrop",
-    "useBarrierExecutionMode",
-    "useMissing",
-    "useSingleDatasetMode",
-    "validationIndicatorCol",
-    "verbosity",
-    "weightCol",
-    "xGBoostDartMode",
-    "zeroAsMissing",
-    "objective",
-]
-ParamList_LightGBM_Classifier = ParamList_LightGBM_Base + [
-    "isUnbalance",
-    "probabilityCol",
-    "rawPredictionCol",
-    "thresholds",
-]
-ParamList_LightGBM_Regressor = ParamList_LightGBM_Base + ["tweedieVariancePower"]
-ParamList_LightGBM_Ranker = ParamList_LightGBM_Base + [
-    "groupCol",
-    "evalAt",
-    "labelGain",
-    "maxPosition",
-]
--- a/flaml/automl/state.py
+++ b/flaml/automl/state.py
@@ -65,6 +65,7 @@ class SearchState:
        custom_hp=None,
        max_iter=None,
        budget=None,
+        featurization="auto",
    ):
        self.init_eci = learner_class.cost_relative2lgbm() if budget >= 0 else 1
        self._search_space_domain = {}
@@ -82,6 +83,7 @@ class SearchState:
        else:
            data_size = data.shape
            search_space = learner_class.search_space(data_size=data_size, task=task)
+
        self.data_size = data_size

        if custom_hp is not None:
@@ -91,9 +93,7 @@ class SearchState:
            starting_point = AutoMLState.sanitize(starting_point)
            if max_iter > 1 and not self.valid_starting_point(starting_point, search_space):
                # If the number of iterations is larger than 1, remove invalid point
-                logger.warning(
-                    "Starting point {} removed because it is outside of the search space".format(starting_point)
-                )
+                logger.warning(f"Starting point {starting_point} removed because it is outside of the search space")
                starting_point = None
        elif isinstance(starting_point, list):
            starting_point = [AutoMLState.sanitize(x) for x in starting_point]
@@ -208,7 +208,7 @@ class SearchState:
        self.val_loss, self.config = obj, config

    def get_hist_config_sig(self, sample_size, config):
-        config_values = tuple([config[k] for k in self._hp_names if k in config])
+        config_values = tuple(config[k] for k in self._hp_names if k in config)
        config_sig = str(sample_size) + "_" + str(config_values)
        return config_sig

@@ -290,9 +290,11 @@ class AutoMLState:
        budget = (
            None
            if state.time_budget < 0
-            else state.time_budget - state.time_from_start
-            if sample_size == state.data_size[0]
-            else (state.time_budget - state.time_from_start) / 2 * sample_size / state.data_size[0]
+            else (
+                state.time_budget - state.time_from_start
+                if sample_size == state.data_size[0]
+                else (state.time_budget - state.time_from_start) / 2 * sample_size / state.data_size[0]
+            )
        )

        (
@@ -353,6 +355,7 @@ class AutoMLState:
        estimator: str,
        config_w_resource: dict,
        sample_size: Optional[int] = None,
+        is_retrain: bool = False,
    ):
        if not sample_size:
            sample_size = config_w_resource.get("FLAML_sample_size", len(self.y_train_all))
@@ -378,9 +381,8 @@ class AutoMLState:
            this_estimator_kwargs[
                "groups"
            ] = groups  # NOTE: _train_with_config is after kwargs is updated to fit_kwargs_by_estimator
-
+        this_estimator_kwargs.update({"is_retrain": is_retrain})
        budget = None if self.time_budget < 0 else self.time_budget - self.time_from_start
-
        estimator, train_time = train_estimator(
            X_train=sampled_X_train,
            y_train=sampled_y_train,
--- a/flaml/automl/task/generic_task.py
+++ b/flaml/automl/task/generic_task.py
@@ -16,12 +16,7 @@ from flaml.automl.spark.utils import (
    unique_pandas_on_spark,
    unique_value_first_index,
 )
-from flaml.automl.task.task import (
-    TS_FORECAST,
-    TS_FORECASTPANEL,
-    Task,
-    get_classification_objective,
-)
+from flaml.automl.task.task import TS_FORECAST, TS_FORECASTPANEL, Task, get_classification_objective
 from flaml.config import RANDOM_SEED

 try:
@@ -53,13 +48,24 @@ class GenericTask(Task):
            from flaml.automl.contrib.histgb import HistGradientBoostingEstimator
            from flaml.automl.model import (
                CatBoostEstimator,
+                ElasticNetEstimator,
                ExtraTreesEstimator,
                KNeighborsEstimator,
+                LassoLarsEstimator,
                LGBMEstimator,
                LRL1Classifier,
                LRL2Classifier,
                RandomForestEstimator,
+                SGDEstimator,
+                SparkAFTSurvivalRegressionEstimator,
+                SparkGBTEstimator,
+                SparkGLREstimator,
                SparkLGBMEstimator,
+                SparkLinearRegressionEstimator,
+                SparkLinearSVCEstimator,
+                SparkNaiveBayesEstimator,
+                SparkRandomForestEstimator,
+                SVCEstimator,
                TransformersEstimator,
                TransformersEstimatorModelSelection,
                XGBoostLimitDepthEstimator,
@@ -72,6 +78,7 @@ class GenericTask(Task):
                "rf": RandomForestEstimator,
                "lgbm": LGBMEstimator,
                "lgbm_spark": SparkLGBMEstimator,
+                "rf_spark": SparkRandomForestEstimator,
                "lrl1": LRL1Classifier,
                "lrl2": LRL2Classifier,
                "catboost": CatBoostEstimator,
@@ -80,6 +87,17 @@ class GenericTask(Task):
                "transformer": TransformersEstimator,
                "transformer_ms": TransformersEstimatorModelSelection,
                "histgb": HistGradientBoostingEstimator,
+                # Above are open-source, below are internal
+                "svc": SVCEstimator,
+                "sgd": SGDEstimator,
+                "nb_spark": SparkNaiveBayesEstimator,
+                "enet": ElasticNetEstimator,
+                "lassolars": LassoLarsEstimator,
+                "glr_spark": SparkGLREstimator,
+                "lr_spark": SparkLinearRegressionEstimator,
+                "svc_spark": SparkLinearSVCEstimator,
+                "gbt_spark": SparkGBTEstimator,
+                "aft_spark": SparkAFTSurvivalRegressionEstimator,
            }
        return self._estimators

@@ -271,8 +289,8 @@ class GenericTask(Task):
            seed=RANDOM_SEED,
        )
        columns_to_drop = [c for c in df_all_train.columns if c in [stratify_column, "sample_weight"]]
-        X_train = df_all_train.drop(columns_to_drop)
-        X_val = df_all_val.drop(columns_to_drop)
+        X_train = df_all_train.drop(columns=columns_to_drop)
+        X_val = df_all_val.drop(columns=columns_to_drop)
        y_train = df_all_train[stratify_column]
        y_val = df_all_val[stratify_column]

@@ -497,14 +515,37 @@ class GenericTask(Task):
                    last = first[i] + 1
                rest.extend(range(last, len(y_train_all)))
                X_first = X_train_all.iloc[first] if data_is_df else X_train_all[first]
-                X_rest = X_train_all.iloc[rest] if data_is_df else X_train_all[rest]
-                y_rest = (
-                    y_train_all[rest]
-                    if isinstance(y_train_all, np.ndarray)
-                    else iloc_pandas_on_spark(y_train_all, rest)
-                    if is_spark_dataframe
-                    else y_train_all.iloc[rest]
-                )
+                if len(first) < len(y_train_all) / 2:
+                    # Get X_rest and y_rest by dropping `first`; np.delete can't be applied to a sparse matrix
+                    X_rest = (
+                        np.delete(X_train_all, first, axis=0)
+                        if isinstance(X_train_all, np.ndarray)
+                        else X_train_all.drop(first.tolist())
+                        if data_is_df
+                        else X_train_all[rest]
+                    )
+                    y_rest = (
+                        np.delete(y_train_all, first, axis=0)
+                        if isinstance(y_train_all, np.ndarray)
+                        else y_train_all.drop(first.tolist())
+                        if data_is_df
+                        else y_train_all[rest]
+                    )
+                else:
+                    X_rest = (
+                        iloc_pandas_on_spark(X_train_all, rest)
+                        if is_spark_dataframe
+                        else X_train_all.iloc[rest]
+                        if data_is_df
+                        else X_train_all[rest]
+                    )
+                    y_rest = (
+                        iloc_pandas_on_spark(y_train_all, rest)
+                        if is_spark_dataframe
+                        else y_train_all.iloc[rest]
+                        if data_is_df
+                        else y_train_all[rest]
+                    )
                stratify = y_rest if split_type == "stratified" else None
                X_train, X_val, y_train, y_val = self._train_test_split(
                    state, X_rest, y_rest, first, rest, split_ratio, stratify
@@ -513,6 +554,12 @@ class GenericTask(Task):
                y_train = concat(label_set, y_train) if data_is_df else np.concatenate([label_set, y_train])
                X_val = concat(X_first, X_val)
                y_val = concat(label_set, y_val) if data_is_df else np.concatenate([label_set, y_val])
+
+                if isinstance(y_train, (psDataFrame, pd.DataFrame)) and y_train.shape[1] == 1:
+                    y_train = y_train[y_train.columns[0]]
+                    y_val = y_val[y_val.columns[0]]
+                    y_train.name = y_val.name = y_rest.name
+
            elif self.is_regression():
                X_train, X_val, y_train, y_val = self._train_test_split(
                    state, X_train_all, y_train_all, split_ratio=split_ratio
@@ -810,27 +857,23 @@ class GenericTask(Task):
        elif self.is_ts_forecastpanel():
            estimator_list = ["tft"]
        else:
+            estimator_list = [
+                "lgbm",
+                "rf",
+                "xgboost",
+                "extra_tree",
+                "xgb_limitdepth",
+                "lgbm_spark",
+                "rf_spark",
+                "sgd",
+            ]
            try:
                import catboost

-                estimator_list = [
-                    "lgbm",
-                    "rf",
-                    "catboost",
-                    "xgboost",
-                    "extra_tree",
-                    "xgb_limitdepth",
-                    "lgbm_spark",
-                ]
+                estimator_list += ["catboost"]
            except ImportError:
-                estimator_list = [
-                    "lgbm",
-                    "rf",
-                    "xgboost",
-                    "extra_tree",
-                    "xgb_limitdepth",
-                    "lgbm_spark",
-                ]
+                pass
+
            # if self.is_ts_forecast():
            #     # catboost is removed because it has a `name` parameter, making it incompatible with hcrystalball
            #     if "catboost" in estimator_list:
@@ -862,9 +905,7 @@ class GenericTask(Task):
            return metric

        if self.is_nlp():
-            from flaml.automl.nlp.utils import (
-                load_default_huggingface_metric_for_task,
-            )
+            from flaml.automl.nlp.utils import load_default_huggingface_metric_for_task

            return load_default_huggingface_metric_for_task(self.name)
        elif self.is_binary():
--- a/flaml/automl/task/time_series_task.py
+++ b/flaml/automl/task/time_series_task.py
@@ -36,11 +36,17 @@ class TimeSeriesTask(Task):
                LGBM_TS,
                RF_TS,
                SARIMAX,
+                Average,
                CatBoost_TS,
                ExtraTrees_TS,
                HoltWinters,
+                LassoLars_TS,
+                Naive,
                Orbit,
                Prophet,
+                SeasonalAverage,
+                SeasonalNaive,
+                TCNEstimator,
                TemporalFusionTransformerEstimator,
                XGBoost_TS,
                XGBoostLimitDepth_TS,
@@ -57,8 +63,19 @@ class TimeSeriesTask(Task):
                "holt-winters": HoltWinters,
                "catboost": CatBoost_TS,
                "tft": TemporalFusionTransformerEstimator,
+                "lassolars": LassoLars_TS,
+                "tcn": TCNEstimator,
+                "snaive": SeasonalNaive,
+                "naive": Naive,
+                "savg": SeasonalAverage,
+                "avg": Average,
            }

+            if self._estimators["tcn"] is None:
+                # remove TCN if import failed
+                del self._estimators["tcn"]
+                logger.info("Couldn't import pytorch_lightning, skipping TCN estimator")
+
            try:
                from prophet import Prophet as foo

@@ -71,7 +88,7 @@ class TimeSeriesTask(Task):

                self._estimators["orbit"] = Orbit
            except ImportError:
-                logger.info("Couldn't import Prophet, skipping")
+                logger.info("Couldn't import orbit, skipping")

        return self._estimators

--- a/flaml/automl/time_series/__init__.py
+++ b/flaml/automl/time_series/__init__.py
@@ -1,16 +1,27 @@
 from .tft import TemporalFusionTransformerEstimator
-from .ts_data import TimeSeriesDataset
 from .ts_model import (
    ARIMA,
    LGBM_TS,
    RF_TS,
    SARIMAX,
+    Average,
    CatBoost_TS,
    ExtraTrees_TS,
    HoltWinters,
+    LassoLars_TS,
+    Naive,
    Orbit,
    Prophet,
+    SeasonalAverage,
+    SeasonalNaive,
    TimeSeriesEstimator,
    XGBoost_TS,
    XGBoostLimitDepth_TS,
 )
+
+try:
+    from .tcn import TCNEstimator
+except ImportError:
+    TCNEstimator = None
+
+from .ts_data import TimeSeriesDataset
--- a/flaml/automl/time_series/tcn.py
+++ b/flaml/automl/time_series/tcn.py
@@ -0,0 +1,285 @@
+# This file is adapted from
+# https://github.com/locuslab/TCN/blob/master/TCN/tcn.py
+# https://github.com/locuslab/TCN/blob/master/TCN/adding_problem/add_test.py
+
+import datetime
+import logging
+import time
+
+import pandas as pd
+import pytorch_lightning as pl
+import torch
+import torch.nn as nn
+import torch.optim as optim
+from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
+from pytorch_lightning.loggers import TensorBoardLogger
+from torch.nn.utils import weight_norm
+from torch.utils.data import DataLoader, TensorDataset
+
+from flaml import tune
+from flaml.automl.data import add_time_idx_col
+from flaml.automl.logger import logger, logger_formatter
+from flaml.automl.time_series.ts_data import TimeSeriesDataset
+from flaml.automl.time_series.ts_model import TimeSeriesEstimator
+
+
+class Chomp1d(nn.Module):
+    def __init__(self, chomp_size):
+        super().__init__()
+        self.chomp_size = chomp_size
+
+    def forward(self, x):
+        return x[:, :, : -self.chomp_size].contiguous()
+
+
+class TemporalBlock(nn.Module):
+    def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout=0.2):
+        super().__init__()
+        self.conv1 = weight_norm(
+            nn.Conv1d(n_inputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation)
+        )
+        self.chomp1 = Chomp1d(padding)
+        self.relu1 = nn.ReLU()
+        self.dropout1 = nn.Dropout(dropout)
+
+        self.conv2 = weight_norm(
+            nn.Conv1d(n_outputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation)
+        )
+        self.chomp2 = Chomp1d(padding)
+        self.relu2 = nn.ReLU()
+        self.dropout2 = nn.Dropout(dropout)
+
+        self.net = nn.Sequential(
+            self.conv1, self.chomp1, self.relu1, self.dropout1, self.conv2, self.chomp2, self.relu2, self.dropout2
+        )
+        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
+        self.relu = nn.ReLU()
+        self.init_weights()
+
+    def init_weights(self):
+        self.conv1.weight.data.normal_(0, 0.01)
+        self.conv2.weight.data.normal_(0, 0.01)
+        if self.downsample is not None:
+            self.downsample.weight.data.normal_(0, 0.01)
+
+    def forward(self, x):
+        out = self.net(x)
+        res = x if self.downsample is None else self.downsample(x)
+        return self.relu(out + res)
+
+
+class TCNForecaster(nn.Module):
+    def __init__(
+        self,
+        input_feature_num,
+        num_outputs,
+        num_channels,
+        kernel_size=2,
+        dropout=0.2,
+    ):
+        super().__init__()
+        layers = []
+        num_levels = len(num_channels)
+        for i in range(num_levels):
+            dilation_size = 2**i
+            in_channels = input_feature_num if i == 0 else num_channels[i - 1]
+            out_channels = num_channels[i]
+            layers += [
+                TemporalBlock(
+                    in_channels,
+                    out_channels,
+                    kernel_size,
+                    stride=1,
+                    dilation=dilation_size,
+                    padding=(kernel_size - 1) * dilation_size,
+                    dropout=dropout,
+                )
+            ]
+
+        self.network = nn.Sequential(*layers)
+        self.linear = nn.Linear(num_channels[-1], num_outputs)
+
+    def forward(self, x):
+        y1 = self.network(x)
+        return self.linear(y1[:, :, -1])
+
+
+class TCNForecasterLightningModule(pl.LightningModule):
+    def __init__(self, model: TCNForecaster, learning_rate: float = 1e-3):
+        super().__init__()
+        self.model = model
+        self.learning_rate = learning_rate
+        self.loss_fn = nn.MSELoss()
+
+    def forward(self, x):
+        return self.model(x)
+
+    def step(self, batch, batch_idx):
+        x, y = batch
+        y_hat = self.model(x)
+        loss = self.loss_fn(y_hat, y)
+        return loss
+
+    def training_step(self, batch, batch_idx):
+        loss = self.step(batch, batch_idx)
+        self.log("train_loss", loss)
+        return loss
+
+    def validation_step(self, batch, batch_idx):
+        loss = self.step(batch, batch_idx)
+        self.log("val_loss", loss)
+        return loss
+
+    def configure_optimizers(self):
+        return torch.optim.Adam(self.parameters(), lr=self.learning_rate)
+
+
+class DataframeDataset(torch.utils.data.Dataset):
+    def __init__(self, dataframe, target_column, features_columns, sequence_length, train=True):
+        self.data = torch.tensor(dataframe[features_columns].to_numpy(), dtype=torch.float)
+        self.sequence_length = sequence_length
+        if train:
+            self.labels = torch.tensor(dataframe[target_column].to_numpy(), dtype=torch.float)
+        self.is_train = train
+
+    def __len__(self):
+        return len(self.data) - self.sequence_length + 1
+
+    def __getitem__(self, idx):
+        data = self.data[idx : idx + self.sequence_length]
+        data = data.permute(1, 0)
+        if self.is_train:
+            label = self.labels[idx : idx + self.sequence_length]
+            return data, label
+        else:
+            return data
+
+
+class TCNEstimator(TimeSeriesEstimator):
+    """The class for tuning TCN Forecaster"""
+
+    @classmethod
+    def search_space(cls, data, task, pred_horizon, **params):
+        space = {
+            "num_levels": {
+                "domain": tune.randint(lower=4, upper=20),  # number of levels (temporal blocks) in the TCN
+                "init_value": 4,
+            },
+            "num_hidden": {
+                "domain": tune.randint(lower=4, upper=8),  # channels per level = 2^num_hidden
+                "init_value": 5,
+            },
+            "kernel_size": {
+                "domain": tune.choice([2, 3, 5, 7]),  # common choices for kernel size
+                "init_value": 3,
+            },
+            "dropout": {
+                "domain": tune.uniform(lower=0.0, upper=0.5),  # standard range for dropout
+                "init_value": 0.1,
+            },
+            "learning_rate": {
+                "domain": tune.loguniform(lower=1e-4, upper=1e-1),  # typical range for learning rate
+                "init_value": 1e-3,
+            },
+        }
+        return space
+
+    def __init__(self, task="ts_forecast", n_jobs=1, **params):
+        super().__init__(task, **params)
+        logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
+
+    def fit(self, X_train: TimeSeriesDataset, y_train=None, budget=None, **kwargs):
+        start_time = time.time()
+        if budget is not None:
+            deltabudget = datetime.timedelta(seconds=budget)
+        else:
+            deltabudget = None
+        X_train = self.enrich(X_train)
+        super().fit(X_train, y_train, budget, **kwargs)
+
+        self.batch_size = kwargs.get("batch_size", 64)
+        self.horizon = kwargs.get("period", 1)
+        self.feature_cols = X_train.time_varying_known_reals
+        self.target_col = X_train.target_names[0]
+
+        train_dataset = DataframeDataset(
+            X_train.train_data,
+            self.target_col,
+            self.feature_cols,
+            self.horizon,
+        )
+        train_loader = DataLoader(train_dataset, batch_size=self.batch_size, shuffle=False)
+        if not X_train.test_data.empty:
+            val_dataset = DataframeDataset(
+                X_train.test_data,
+                self.target_col,
+                self.feature_cols,
+                self.horizon,
+            )
+        else:
+            val_dataset = DataframeDataset(
+                X_train.train_data.sample(frac=0.2, random_state=kwargs.get("random_state", 0)),
+                self.target_col,
+                self.feature_cols,
+                self.horizon,
+            )
+
+        val_loader = DataLoader(val_dataset, batch_size=self.batch_size, shuffle=False)
+
+        model = TCNForecaster(
+            len(self.feature_cols),
+            self.horizon,
+            [2 ** self.params["num_hidden"]] * self.params["num_levels"],
+            self.params["kernel_size"],
+            self.params["dropout"],
+        )
+
+        pl_module = TCNForecasterLightningModule(model, self.params["learning_rate"])
+
+        # Training loop
+        # The `gpus` argument is deprecated in v1.7 and removed in v2.0;
+        # accelerator="auto" covers all cases.
+        trainer = pl.Trainer(
+            max_epochs=kwargs.get("max_epochs", 10),
+            accelerator="auto",
+            callbacks=[
+                EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min"),
+                LearningRateMonitor(),
+            ],
+            logger=TensorBoardLogger(kwargs.get("log_dir", "logs/lightning_logs")),  # log results to TensorBoard
+            max_time=deltabudget,
+            enable_model_summary=False,
+            enable_progress_bar=False,
+        )
+        trainer.fit(
+            pl_module,
+            train_dataloaders=train_loader,
+            val_dataloaders=val_loader,
+        )
+        best_model = trainer.model
+        self._model = best_model
+        train_time = time.time() - start_time
+        return train_time
+
+    def predict(self, X):
+        X = self.enrich(X)
+        if isinstance(X, TimeSeriesDataset):
+            df = X.X_val
+        else:
+            df = X
+        dataset = DataframeDataset(
+            df,
+            self.target_col,
+            self.feature_cols,
+            self.horizon,
+            train=False,
+        )
+        data_loader = DataLoader(dataset, batch_size=self.batch_size, shuffle=False)
+        self._model.eval()
+        raw_preds = []
+        for batch_x in data_loader:
+            raw_pred = self._model(batch_x)
+            raw_preds.append(raw_pred)
+        raw_preds = torch.cat(raw_preds, dim=0)
+        preds = pd.Series(raw_preds.detach().numpy().ravel())
+        return preds
--- a/flaml/automl/time_series/ts_data.py
+++ b/flaml/automl/time_series/ts_data.py
@@ -26,6 +26,8 @@ except ImportError:
    DataFrame = Series = None


+# With field(default_factory=lambda: []), the dataclass does not keep a class-level default for these attributes.
+# Use default=None instead so that the attributes are always present.
@dataclass
 class TimeSeriesDataset:
    train_data: pd.DataFrame
@@ -34,10 +36,10 @@ class TimeSeriesDataset:
    target_names: List[str]
    frequency: str
    test_data: pd.DataFrame
-    time_varying_known_categoricals: List[str] = field(default_factory=lambda: [])
-    time_varying_known_reals: List[str] = field(default_factory=lambda: [])
-    time_varying_unknown_categoricals: List[str] = field(default_factory=lambda: [])
-    time_varying_unknown_reals: List[str] = field(default_factory=lambda: [])
+    time_varying_known_categoricals: List[str] = field(default=None)
+    time_varying_known_reals: List[str] = field(default=None)
+    time_varying_unknown_categoricals: List[str] = field(default=None)
+    time_varying_unknown_reals: List[str] = field(default=None)

    def __init__(
        self,
@@ -403,7 +405,7 @@ class DataTransformerTS:
                    self.cat_columns.append(column)
            elif X[column].nunique(dropna=True) < 2:
                self.drop_columns.append(column)
-            elif X[column].dtype.name == "datetime64[ns]":
+            elif X[column].dtype.name in ["datetime64[ns]", "datetime64[s]"]:
                pass  # these will be processed at model level,
                # so they can also be done in the predict method
            else:
--- a/flaml/automl/time_series/ts_model.py
+++ b/flaml/automl/time_series/ts_model.py
@@ -26,6 +26,7 @@ from flaml.automl.data import TS_TIMESTAMP_COL, TS_VALUE_COL
 from flaml.automl.model import (
    CatBoostEstimator,
    ExtraTreesEstimator,
+    LassoLarsEstimator,
    LGBMEstimator,
    RandomForestEstimator,
    SKLearnEstimator,
@@ -611,15 +612,13 @@ class HoltWinters(StatsModelsEstimator):
        ):  # this would prevent heuristic initialization to work properly
            self.params["seasonal"] = None
        if (
-            self.params["seasonal"] == "mul" and (train_df.y == 0).sum() > 0
+            self.params["seasonal"] == "mul" and (train_df[target_col] == 0).sum() > 0
        ):  # cannot have multiplicative seasonality in this case
            self.params["seasonal"] = "add"
-        if self.params["trend"] == "mul" and (train_df.y == 0).sum() > 0:
+        if self.params["trend"] == "mul" and (train_df[target_col] == 0).sum() > 0:
            self.params["trend"] = "add"
-
        if not self.params["seasonal"] or self.params["trend"] not in ["mul", "add"]:
            self.params["damped_trend"] = False
-
        model = HWExponentialSmoothing(
            train_df[[target_col]],
            damped_trend=self.params["damped_trend"],
@@ -633,6 +632,125 @@ class HoltWinters(StatsModelsEstimator):
        return train_time


+class SimpleForecaster(StatsModelsEstimator):
+    """Base class for naive forecasters such as Seasonal Naive, Naive, Seasonal Average, and Average."""
+
+    @classmethod
+    def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
+        return {
+            "season": {
+                "domain": tune.randint(1, pred_horizon),
+                "init_value": pred_horizon,
+            }
+        }
+
+    def joint_preprocess(self, X_train, y_train=None):
+        X_train = self.enrich(X_train)
+
+        self.regressors = []
+
+        if isinstance(X_train, TimeSeriesDataset):
+            data = X_train
+            target_col = data.target_names[0]
+            # this class only supports univariate regression
+            train_df = data.train_data[self.regressors + [target_col]]
+            train_df.index = to_datetime(data.train_data[data.time_col])
+        else:
+            target_col = TS_VALUE_COL
+            train_df = self._join(X_train, y_train)
+
+        self.time_col = data.time_col
+        self.target_names = data.target_names
+
+        train_df = self._preprocess(train_df)
+        return train_df, target_col
+
+    def fit(self, X_train, y_train=None, budget=None, **kwargs):
+        import warnings
+
+        warnings.filterwarnings("ignore")
+        from statsmodels.tsa.holtwinters import SimpleExpSmoothing
+
+        self.season = self.params.get("season", 1)
+        current_time = time.time()
+        super().fit(X_train, y_train, budget=budget, **kwargs)
+
+        train_df, target_col = self.joint_preprocess(X_train, y_train)
+
+        model = SimpleExpSmoothing(
+            train_df[[target_col]],
+        )
+        with suppress_stdout_stderr():
+            model = model.fit(smoothing_level=self.smoothing_level)
+        train_time = time.time() - current_time
+        self._model = model
+        return train_time
+
+
+class SeasonalNaive(SimpleForecaster):
+    smoothing_level = 1.0
+
+    def predict(self, X, **kwargs):
+        if isinstance(X, int):
+            forecasts = []
+            for i in range(X):
+                forecast = self._model.forecast(steps=self.season)[0]
+                forecasts.append(forecast)
+            return pd.Series(forecasts)
+        else:
+            return super().predict(X, **kwargs)
+
+
+class Naive(SimpleForecaster):
+    smoothing_level = 0.0
+
+    @classmethod
+    def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
+        return {}
+
+    def predict(self, X, **kwargs):
+        if isinstance(X, int):
+            last_observation = self._model.params["initial_level"]
+            return pd.Series([last_observation] * X)
+        else:
+            return super().predict(X, **kwargs)
+
+
+class SeasonalAverage(SimpleForecaster):
+    def fit(self, X_train, y_train=None, budget=None, **kwargs):
+        from statsmodels.tsa.ar_model import AutoReg, ar_select_order
+
+        start_time = time.time()
+
+        self.season = kwargs.get("season", 1)  # seasonality period
+        train_df, target_col = self.joint_preprocess(X_train, y_train)
+        selection_res = ar_select_order(train_df[target_col], maxlag=self.season)
+
+        # Fit autoregressive model with optimal order
+        model = AutoReg(train_df[target_col], lags=selection_res.ar_lags)
+        self._model = model.fit()
+        end_time = time.time()
+
+        return end_time - start_time
+
+
+class Average(SimpleForecaster):
+    @classmethod
+    def _search_space(cls, data: TimeSeriesDataset, task: Task, pred_horizon: int, **params):
+        return {}
+
+    def fit(self, X_train, y_train=None, budget=None, **kwargs):
+        from statsmodels.tsa.ar_model import AutoReg
+
+        start_time = time.time()
+        train_df, target_col = self.joint_preprocess(X_train, y_train)
+        model = AutoReg(train_df[target_col], lags=0)
+        self._model = model.fit()
+        end_time = time.time()
+
+        return end_time - start_time
+
+
 class TS_SKLearn(TimeSeriesEstimator):
    """The class for tuning SKLearn Regressors for time-series forecasting"""

@@ -759,3 +877,7 @@ class XGBoostLimitDepth_TS(TS_SKLearn):
 # catboost regressor is invalid because it has a `name` parameter, making it incompatible with hcrystalball
 class CatBoost_TS(TS_SKLearn):
    base_class = CatBoostEstimator
+
+
+class LassoLars_TS(TS_SKLearn):
+    base_class = LassoLarsEstimator
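
Review note on the `SimpleForecaster` family added in this file: the classes above delegate to statsmodels, so it can help to compare them against the textbook definitions. The following is a minimal, dependency-free sketch of what "naive" and "seasonal naive" forecasts conventionally mean. It is not the implementation in this patch, and the function names `naive_forecast` and `seasonal_naive_forecast` are hypothetical, introduced only for illustration.

```python
from typing import List, Sequence


def naive_forecast(history: Sequence[float], steps: int) -> List[float]:
    """Repeat the last observed value for every future step."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * steps


def seasonal_naive_forecast(history: Sequence[float], season: int, steps: int) -> List[float]:
    """Repeat the last full season of observations cyclically."""
    if season < 1 or len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = list(history[-season:])
    return [last_season[i % season] for i in range(steps)]


# Two seasons of length 3.
history = [10.0, 20.0, 30.0, 12.0, 22.0, 32.0]
print(naive_forecast(history, 3))
print(seasonal_naive_forecast(history, season=3, steps=4))
```

A reference like this could serve as an expected-value oracle in a unit test for the new estimators, assuming they are meant to follow the conventional definitions.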
--- a/flaml/automl/training_log.py
+++ b/flaml/automl/training_log.py
@@ -11,7 +11,7 @@ from typing import IO
 logger = logging.getLogger("flaml.automl")


-class TrainingLogRecord(object):
+class TrainingLogRecord:
    def __init__(
        self,
        record_id: int,
@@ -52,7 +52,7 @@ class TrainingLogCheckPoint(TrainingLogRecord):
        self.curr_best_record_id = curr_best_record_id


-class TrainingLogWriter(object):
+class TrainingLogWriter:
    def __init__(self, output_filename: str):
        self.output_filename = output_filename
        self.file = None
@@ -79,7 +79,7 @@ class TrainingLogWriter(object):
        sample_size,
    ):
        if self.file is None:
-            raise IOError("Call open() to open the output file first.")
+            raise OSError("Call open() to open the output file first.")
        if validation_loss is None:
            raise ValueError("TEST LOSS NONE ERROR!!!")
        record = TrainingLogRecord(
@@ -109,7 +109,7 @@ class TrainingLogWriter(object):

    def checkpoint(self):
        if self.file is None:
-            raise IOError("Call open() to open the output file first.")
+            raise OSError("Call open() to open the output file first.")
        if self.current_best_loss_record_id is None:
            logger.warning("flaml.training_log: checkpoint() called before any record is written, skipped.")
            return
@@ -124,7 +124,7 @@ class TrainingLogWriter(object):
        self.file = None  # for pickle


-class TrainingLogReader(object):
+class TrainingLogReader:
    def __init__(self, filename: str):
        self.filename = filename
        self.file = None
@@ -134,7 +134,7 @@ class TrainingLogReader(object):

    def records(self):
        if self.file is None:
-            raise IOError("Call open() before reading log file.")
+            raise OSError("Call open() before reading log file.")
        for line in self.file:
            data = json.loads(line)
            if len(data) == 1:
@@ -149,7 +149,7 @@ class TrainingLogReader(object):

    def get_record(self, record_id) -> TrainingLogRecord:
        if self.file is None:
-            raise IOError("Call open() before reading log file.")
+            raise OSError("Call open() before reading log file.")
        for rec in self.records():
            if rec.record_id == record_id:
                return rec
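
Review note on the `IOError` to `OSError` changes in this file: since Python 3.3, `IOError` is an alias of `OSError`, so the rename should not affect callers that still catch `IOError`. A small sketch of that behavior, where `read_records` is a hypothetical stand-in for the log reader methods above:

```python
# IOError has been an alias of OSError since Python 3.3.
assert IOError is OSError


def read_records(file_handle):
    """Hypothetical stand-in mirroring the guard used in the log reader."""
    if file_handle is None:
        raise OSError("Call open() before reading log file.")
    return list(file_handle)


# Code written against the old exception name still catches the error.
try:
    read_records(None)
except IOError as exc:
    caught = str(exc)

print(caught)
```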
--- a/flaml/default/README.md
+++ b/flaml/default/README.md
@@ -14,7 +14,6 @@ estimator.fit(X_train, y_train)
 estimator.predict(X_test, y_test)
 ```

-
 1. Use AutoML.fit(). set `starting_points="data"` and `max_iter=0`.

 ```python
@@ -36,10 +35,17 @@ automl.fit(X_train, y_train, **automl_settings)
 from flaml.default import preprocess_and_suggest_hyperparams

 X, y = load_iris(return_X_y=True, as_frame=True)
-X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
-hyperparams, estimator_class, X_transformed, y_transformed, feature_transformer, label_transformer = preprocess_and_suggest_hyperparams(
-    "classification", X_train, y_train, "lgbm"
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.33, random_state=42
 )
+(
+    hyperparams,
+    estimator_class,
+    X_transformed,
+    y_transformed,
+    feature_transformer,
+    label_transformer,
+) = preprocess_and_suggest_hyperparams("classification", X_train, y_train, "lgbm")
 model = estimator_class(**hyperparams)  # estimator_class is LGBMClassifier
 model.fit(X_transformed, y_train)  # LGBMClassifier can handle raw labels
 X_test = feature_transformer.transform(X_test)  # preprocess test data
@@ -172,7 +178,7 @@ Change "binary" into "multiclass" or "regression" for the other tasks.

 For more technical details, please check our research paper.

-* [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. arXiv preprint arXiv:2202.09927 (2022).
+- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. arXiv preprint arXiv:2202.09927 (2022).

 ```bibtex
@article{Kayali2022default,
--- a/flaml/default/portfolio.py
+++ b/flaml/default/portfolio.py
@@ -69,7 +69,7 @@ def build_portfolio(meta_features, regret, strategy):

 def load_json(filename):
    """Returns the contents of json file filename."""
-    with open(filename, "r") as f:
+    with open(filename) as f:
        return json.load(f)


--- a/flaml/default/suggest.py
+++ b/flaml/default/suggest.py
@@ -43,7 +43,7 @@ def meta_feature(task, X_train, y_train, meta_feature_names):
                # 'numpy.ndarray' object has no attribute 'select_dtypes'
                this_feature.append(1)  # all features are numeric
        else:
-            raise ValueError("Feature {} not implemented. ".format(each_feature_name))
+            raise ValueError(f"Feature {each_feature_name} not implemented. ")

    return this_feature

@@ -57,7 +57,7 @@ def load_config_predictor(estimator_name, task, location=None):
    task = "multiclass" if task == "multi" else task  # TODO: multi -> multiclass?
    try:
        location = location or LOCATION
-        with open(f"{location}/{estimator_name}/{task}.json", "r") as f:
+        with open(f"{location}/{estimator_name}/{task}.json") as f:
            CONFIG_PREDICTORS[key] = predictor = json.load(f)
    except FileNotFoundError:
        raise FileNotFoundError(f"Portfolio has not been built for {estimator_name} on {task} task.")
--- a/flaml/fabric/__init__.py
+++ b/flaml/fabric/__init__.py
--- a/flaml/fabric/mlflow.py
+++ b/flaml/fabric/mlflow.py
@@ -0,0 +1,689 @@
+import json
+import os
+import pickle
+import random
+import sys
+import time
+from typing import MutableMapping
+
+import mlflow
+import pandas as pd
+from mlflow.entities import Metric, Param, RunTag
+from mlflow.exceptions import MlflowException
+from mlflow.utils.autologging_utils import AUTOLOGGING_INTEGRATIONS, autologging_is_disabled
+from scipy.sparse import issparse
+from sklearn import tree
+
+try:
+    from pyspark.ml import Pipeline as SparkPipeline
+except ImportError:
+
+    class SparkPipeline:
+        pass
+
+
+# from mlflow.store.tracking import SEARCH_MAX_RESULTS_THRESHOLD
+from sklearn.pipeline import Pipeline
+
+from flaml.automl.logger import logger
+from flaml.automl.spark import DataFrame, Series, psDataFrame, psSeries
+from flaml.version import __version__
+
+SEARCH_MAX_RESULTS = 5000  # Each training run should have at most 5000 trials
+IS_RENAME_CHILD_RUN = os.environ.get("FLAML_IS_RENAME_CHILD_RUN", "false").lower() == "true"
+
+
+def flatten_dict(d: MutableMapping, sep: str = ".") -> MutableMapping:
+    if len(d) == 0:
+        return d
+    [flat_dict] = pd.json_normalize(d, sep=sep).to_dict(orient="records")
+    keys = list(flat_dict.keys())
+    for key in keys:
+        if not isinstance(flat_dict[key], (int, float)):
+            flat_dict.pop(key)
+    return flat_dict
+
+
+def is_autolog_enabled():
+    return not all(autologging_is_disabled(k) for k in AUTOLOGGING_INTEGRATIONS.keys())
+
+
+def get_mlflow_log_latency(model_history=False):
+    st = time.time()
+    with mlflow.start_run(nested=True, run_name="get_mlflow_log_latency") as run:
+        if model_history:
+            sk_model = tree.DecisionTreeClassifier()
+            mlflow.sklearn.log_model(sk_model, "sk_models")
+            mlflow.sklearn.log_model(Pipeline([("estimator", sk_model)]), "sk_pipeline")
+            pickle_fpath = f"tmp_{int(time.time()*1000)}"
+            with open(pickle_fpath, "wb") as f:
+                pickle.dump(sk_model, f)
+            mlflow.log_artifact(pickle_fpath, "sk_model1")
+            mlflow.log_artifact(pickle_fpath, "sk_model2")
+            os.remove(pickle_fpath)
+        mlflow.set_tag("synapseml.ui.visible", "false")  # not shown inline in fabric
+    mlflow.delete_run(run.info.run_id)
+    et = time.time()
+    return et - st
+
+
+def infer_signature(X_train=None, y_train=None, dataframe=None, label=None):
+    if X_train is not None:
+        if issparse(X_train):
+            X_train = X_train.tocsr()
+        elif isinstance(X_train, psDataFrame):
+            X_train = X_train.to_spark(index_col="tmp_index_col")
+            y_train = None
+        try:
+            signature = mlflow.models.infer_signature(X_train, y_train)
+            return signature
+        except Exception as e:  # Exception already covers TypeError and MlflowException
+            logger.debug(
+                f"Failed to infer signature from X_train {type(X_train)} and y_train {type(y_train)}, error: {e}"
+            )
+    else:
+        if dataframe is not None and label is not None:
+            X = dataframe.drop(columns=label)
+            y = dataframe[label]
+            if isinstance(dataframe, psDataFrame):
+                X = X.to_spark(index_col="tmp_index_col")
+                y = None
+            try:
+                signature = mlflow.models.infer_signature(X, y)
+                return signature
+            except Exception as e:  # Exception already covers TypeError and MlflowException
+                logger.debug(
+                    f"Failed to infer signature from dataframe {type(dataframe)} and label {label}, error: {e}"
+                )
+
+
+def _mlflow_wrapper(evaluation_func, mlflow_exp_id, mlflow_config=None, extra_tags=None, autolog=False):
+    def wrapped(*args, **kwargs):
+        if mlflow_config is not None:
+            from synapse.ml.mlflow import set_mlflow_env_config
+
+            set_mlflow_env_config(mlflow_config)
+        import mlflow
+
+        if mlflow_exp_id is not None:
+            mlflow.set_experiment(experiment_id=mlflow_exp_id)
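+        if autolog:
+            # Compare parsed versions: as plain strings, "2.10.0" sorts before "2.5.0".
+            from packaging.version import Version
+
+            if Version(mlflow.__version__) > Version("2.5.0") and extra_tags is not None: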
+                mlflow.autolog(silent=True, extra_tags=extra_tags)
+            else:
+                mlflow.autolog(silent=True)
+            logger.debug("activated mlflow autologging on executor")
+        else:
+            mlflow.autolog(disable=True, silent=True)
+        # with mlflow.start_run(nested=True):
+        result = evaluation_func(*args, **kwargs)
+        return result
+
+    return wrapped
+
+
+def _get_notebook_name():
+    return None
+
+
+class MLflowIntegration:
+    def __init__(self, experiment_type="automl", mlflow_exp_name=None, extra_tag=None):
+        try:
+            from synapse.ml.mlflow import get_mlflow_env_config
+
+            self.driver_mlflow_env_config = get_mlflow_env_config()
+            self._on_internal = True
+            self._notebook_name = _get_notebook_name()
+        except ModuleNotFoundError:
+            self.driver_mlflow_env_config = None
+            self._on_internal = False
+            self._notebook_name = None
+
+        self.autolog = False
+        self.manual_log = False
+        self.parent_run_id = None
+        self.parent_run_name = None
+        self.log_type = "null"
+        self.resume_params = {}
+        self.train_func = None
+        self.best_iteration = None
+        self.best_run_id = None
+        self.child_counter = 0
+        self.infos = []
+        self.manual_run_ids = []
+        self.has_summary = False
+        self.has_model = False
+        self.only_history = False
+        self._do_log_model = True
+
+        self.extra_tag = (
+            extra_tag
+            if extra_tag is not None
+            else {"extra_tag.sid": f"flaml_{__version__}_{int(time.time())}_{random.randint(1001, 9999)}"}
+        )
+        self.start_time = time.time()
+        self.mlflow_client = mlflow.tracking.MlflowClient()
+        parent_run_info = mlflow.active_run().info if mlflow.active_run() is not None else None
+        if parent_run_info:
+            self.experiment_id = parent_run_info.experiment_id
+            self.parent_run_id = parent_run_info.run_id
+            # attribute run_name is not available before mlflow 2.0.1
+            self.parent_run_name = parent_run_info.run_name if hasattr(parent_run_info, "run_name") else "flaml_run"
+            if self.parent_run_name == "":
+                self.parent_run_name = mlflow.active_run().data.tags["mlflow.runName"]
+        else:
+            if mlflow_exp_name is None:
+                if mlflow.tracking.fluent._active_experiment_id is None:
+                    mlflow_exp_name = self._notebook_name if self._notebook_name else "flaml_default_experiment"
+                    mlflow.set_experiment(experiment_name=mlflow_exp_name)
+            else:
+                mlflow.set_experiment(experiment_name=mlflow_exp_name)
+            self.experiment_id = mlflow.tracking.fluent._active_experiment_id
+        self.experiment_name = mlflow.get_experiment(self.experiment_id).name
+        self.experiment_type = experiment_type
+        self.update_autolog_state()
+
+        if self.autolog:
+            # in the autolog scenario, only end the parent run created by the user
+            mlflow.end_run()
+
+    def set_mlflow_config(self):
+        if self.driver_mlflow_env_config is not None:
+            from synapse.ml.mlflow import set_mlflow_env_config
+
+            set_mlflow_env_config(self.driver_mlflow_env_config)
+
+    def wrap_evaluation_function(self, evaluation_function):
+        wrapped_evaluation_function = _mlflow_wrapper(
+            evaluation_function, self.experiment_id, self.driver_mlflow_env_config, self.extra_tag, self.autolog
+        )
+        return wrapped_evaluation_function
+
+    def set_best_iter(self, result):
+        # result: AutoML or ExperimentAnalysis
+        try:
+            self.best_iteration = result.best_iteration
+        except AttributeError:
+            self.best_iteration = None
+
+    def update_autolog_state(
+        self,
+    ):
+        # Currently we disable autologging for better control in AutoML
+        _autolog = is_autolog_enabled()
+        self._do_log_model = AUTOLOGGING_INTEGRATIONS["mlflow"].get("log_models", True)
+        if self.experiment_type == "automl":
+            self.autolog = False
+            self.manual_log = mlflow.active_run() is not None or _autolog
+            self.log_type = "manual"
+            if _autolog:
+                logger.debug("Disabling autologging")
+                self.resume_params = AUTOLOGGING_INTEGRATIONS["mlflow"].copy()
+                mlflow.autolog(disable=True, silent=True, log_models=self._do_log_model)
+                self.log_type = "r_autolog"  # 'r' for replace autolog with manual log
+
+        elif self.experiment_type == "tune":
+            self.autolog = _autolog
+            self.manual_log = not self.autolog and mlflow.active_run() is not None
+
+            if self.autolog:
+                self.log_type = "autolog"
+
+            if self.manual_log:
+                self.log_type = "manual"
+        else:
+            raise ValueError(f"Unknown experiment type: {self.experiment_type}")
+
+    def copy_mlflow_run(self, src_id, target_id, components=("param", "metric", "tag")):
+        src_run = self.mlflow_client.get_run(src_id)
+        if "param" in components:
+            for param_name, param_value in src_run.data.params.items():
+                try:
+                    self.mlflow_client.log_param(target_id, param_name, param_value)
+                except mlflow.exceptions.MlflowException:
+                    pass
+
+        timestamp = int(time.time() * 1000)
+
+        if "metric" in components:
+            _metrics = [Metric(key, value, timestamp, 0) for key, value in src_run.data.metrics.items()]
+        else:
+            _metrics = []
+
+        if "tag" in components:
+            _tags = [
+                RunTag(key, str(value))
+                for key, value in src_run.data.tags.items()
+                if key.startswith("flaml") or key.startswith("synapseml")
+            ]
+        else:
+            _tags = []
+        self.mlflow_client.log_batch(run_id=target_id, metrics=_metrics, params=[], tags=_tags)
+
+    def record_trial(self, result, trial, metric):
+        if isinstance(result, dict):
+            metrics = flatten_dict(result)
+            metric_name = str(list(metrics.keys()))
+        else:
+            metrics = {metric: result}
+            metric_name = metric
+
+        if "ml" in trial.config.keys():
+            params = trial.config["ml"]
+        else:
+            params = trial.config
+
+        info = {
+            "metrics": metrics,
+            "params": params,
+            "tags": {
+                "flaml.best_run": False,
+                "flaml.iteration_number": self.child_counter,
+                "flaml.version": __version__,
+                "flaml.meric": metric_name,
+                "flaml.run_source": "flaml-tune",
+                "flaml.log_type": self.log_type,
+            },
+            "submetrics": {
+                "values": [],
+            },
+        }
+
+        self.infos.append(info)
+
+        if not self.autolog and not self.manual_log:
+            return
+
+        if self.manual_log:
+            with mlflow.start_run(
+                nested=True, run_name=f"{self.parent_run_name}_child_{self.child_counter}"
+            ) as child_run:
+                self._log_info_to_run(info, child_run.info.run_id, log_params=True)
+                self.manual_run_ids.append(child_run.info.run_id)
+        self.child_counter += 1
+
+    def log_tune(self, analysis, metric):
+        self.set_best_iter(analysis)
+        if self.autolog:
+            if self.parent_run_id is not None:
+                mlflow.start_run(run_id=self.parent_run_id, experiment_id=self.experiment_id)
+                mlflow.log_metric("num_child_runs", len(self.infos))
+            self.adopt_children(analysis)
+
+        if self.manual_log:
+            if "ml" in analysis.best_config.keys():
+                mlflow.log_params(analysis.best_config["ml"])
+            else:
+                mlflow.log_params(analysis.best_config)
+            mlflow.log_metric("best_" + metric, analysis.best_result[metric])
+            best_mlflow_run_id = self.manual_run_ids[analysis.best_iteration]
+            best_mlflow_run_name = self.mlflow_client.get_run(best_mlflow_run_id).info.run_name
+            analysis.best_run_id = best_mlflow_run_id
+            analysis.best_run_name = best_mlflow_run_name
+            self.mlflow_client.set_tag(best_mlflow_run_id, "flaml.best_run", True)
+            self.best_run_id = best_mlflow_run_id
+            if not self.has_summary:
+                self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id)
+                self.has_summary = True
+
+    def log_model(self, model, estimator, signature=None):
+        if not self._do_log_model:
+            return
+        logger.debug(f"logging model {estimator}")
+        if estimator.endswith("_spark"):
+            mlflow.spark.log_model(model, estimator, signature=signature)
+            mlflow.spark.log_model(model, "model", signature=signature)
+        elif estimator in ["lgbm"]:
+            mlflow.lightgbm.log_model(model, estimator, signature=signature)
+        elif estimator in ["transformer", "transformer_ms"]:
+            mlflow.transformers.log_model(model, estimator, signature=signature)
+        elif estimator in ["arima", "sarimax", "holt-winters", "snaive", "naive", "savg", "avg", "ets"]:
+            mlflow.statsmodels.log_model(model, estimator, signature=signature)
+        elif estimator in ["tcn", "tft"]:
+            mlflow.pytorch.log_model(model, estimator, signature=signature)
+        elif estimator in ["prophet"]:
+            mlflow.prophet.log_model(model, estimator, signature=signature)
+        elif estimator in ["orbit"]:
+            pass
+        else:
+            mlflow.sklearn.log_model(model, estimator, signature=signature)
+
+    def _pickle_and_log_artifact(self, obj, artifact_name, pickle_fpath="temp_.pkl"):
+        if not self._do_log_model:
+            return
+        with open(pickle_fpath, "wb") as f:
+            pickle.dump(obj, f)
+        mlflow.log_artifact(pickle_fpath, artifact_name)
+
+    def pickle_and_log_automl_artifacts(self, automl, model, estimator, signature=None):
+        """Log AutoML artifacts to MLflow.
+        Load them back with `automl = mlflow.pyfunc.load_model(model_run_id_or_uri)`, then predict with `automl.predict(X)`.
+        """
+        logger.debug(f"logging automl artifacts {estimator}")
+        self._pickle_and_log_artifact(automl.feature_transformer, "feature_transformer", "feature_transformer.pkl")
+        self._pickle_and_log_artifact(automl.label_transformer, "label_transformer", "label_transformer.pkl")
+        # Test test_mlflow 1 and 4 will get error: TypeError: cannot pickle '_io.TextIOWrapper' object
+        # try:
+        #     self._pickle_and_log_artifact(automl, "automl", "automl.pkl")
+        # except TypeError:
+        #     pass
+        if estimator.endswith("_spark"):
+            # spark pipeline is not supported yet
+            return
+        feature_transformer = automl.feature_transformer
+        if isinstance(feature_transformer, Pipeline):
+            pipeline = feature_transformer
+            pipeline.steps.append(("estimator", model))
+        elif isinstance(feature_transformer, SparkPipeline):
+            pipeline = feature_transformer
+            pipeline.stages.append(model)
+        elif not estimator.endswith("_spark"):
+            steps = [("feature_transformer", feature_transformer)]
+            steps.append(("estimator", model))
+            pipeline = Pipeline(steps)
+        else:
+            stages = [feature_transformer]
+            stages.append(model)
+            pipeline = SparkPipeline(stages=stages)
+        if isinstance(pipeline, SparkPipeline):
+            logger.debug(f"logging spark pipeline {estimator}")
+            mlflow.spark.log_model(pipeline, "automl_pipeline", signature=signature)
+        else:
+            # Also log the pipeline under the name "model" to match default settings
+            logger.debug(f"logging sklearn pipeline {estimator}")
+            mlflow.sklearn.log_model(pipeline, "automl_pipeline", signature=signature)
+            mlflow.sklearn.log_model(pipeline, "model", signature=signature)
+
+    def record_state(self, automl, search_state, estimator):
+        _st = time.time()
+        automl_metric_name = (
+            automl._state.metric if isinstance(automl._state.metric, str) else automl._state.error_metric
+        )
+
+        if automl._state.error_metric.startswith("1-"):
+            automl_metric_value = 1 - search_state.val_loss
+        elif automl._state.error_metric.startswith("-"):
+            automl_metric_value = -search_state.val_loss
+        else:
+            automl_metric_value = search_state.val_loss
+
+        if "ml" in search_state.config:
+            config = search_state.config["ml"]
+        else:
+            config = search_state.config
+
+        info = {
+            "metrics": {
+                "iter_counter": automl._track_iter,
+                "trial_time": search_state.trial_time,
+                "wall_clock_time": automl._state.time_from_start,
+                "validation_loss": search_state.val_loss,
+                "best_validation_loss": search_state.best_loss,
+                automl_metric_name: automl_metric_value,
+            },
+            "tags": {
+                "flaml.best_run": False,
+                "flaml.estimator_name": estimator,
+                "flaml.estimator_class": search_state.learner_class.__name__,
+                "flaml.iteration_number": automl._track_iter,
+                "flaml.version": __version__,
+                "flaml.learner": estimator,
+                "flaml.sample_size": search_state.sample_size,
+                "flaml.meric": automl_metric_name,
+                "flaml.run_source": "flaml-automl",
+                "flaml.log_type": self.log_type,
+                "flaml.automl_user_configurations": json.dumps(automl._automl_user_configurations),
+            },
+            "params": {
+                "sample_size": search_state.sample_size,
+                "learner": estimator,
+                **config,
+            },
+            "submetrics": {
+                "iter_counter": automl._iter_per_learner[estimator],
+                "values": [],
+            },
+        }
+
+        if (search_state.metric_for_logging is not None) and (
+            "intermediate_results" in search_state.metric_for_logging
+        ):
+            info["submetrics"]["values"] = search_state.metric_for_logging["intermediate_results"]
+
+        self.infos.append(info)
+
+        if not self.autolog and not self.manual_log:
+            return
+        if self.manual_log:
+            if self.parent_run_name is not None:
+                run_name = f"{self.parent_run_name}_child_{self.child_counter}"
+            else:
+                run_name = None
+            with mlflow.start_run(nested=True, run_name=run_name) as child_run:
+                self._log_info_to_run(info, child_run.info.run_id, log_params=True)
+                if automl._state.model_history:
+                    self.log_model(
+                        search_state.trained_estimator._model, estimator, signature=automl.estimator_signature
+                    )
+                    self.pickle_and_log_automl_artifacts(
+                        automl, search_state.trained_estimator, estimator, signature=automl.pipeline_signature
+                    )
+                self.manual_run_ids.append(child_run.info.run_id)
+            self.child_counter += 1
+
+    def log_automl(self, automl):
+        self.set_best_iter(automl)
+        if self.autolog:
+            if self.parent_run_id is not None:
+                mlflow.start_run(run_id=self.parent_run_id, experiment_id=self.experiment_id)
+                mlflow.log_metric("best_validation_loss", automl._state.best_loss)
+                mlflow.log_metric("best_iteration", automl._best_iteration)
+                mlflow.log_metric("num_child_runs", len(self.infos))
+                if automl._trained_estimator is not None and not self.has_model:
+                    self.log_model(
+                        automl._trained_estimator._model, automl.best_estimator, signature=automl.estimator_signature
+                    )
+                    self.pickle_and_log_automl_artifacts(
+                        automl, automl.model, automl.best_estimator, signature=automl.pipeline_signature
+                    )
+                    self.has_model = True
+
+            self.adopt_children(automl)
+
+        if self.manual_log:
+            best_mlflow_run_id = self.manual_run_ids[automl._best_iteration]
+            best_run_name = self.mlflow_client.get_run(best_mlflow_run_id).info.run_name
+            automl.best_run_id = best_mlflow_run_id
+            automl.best_run_name = best_run_name
+            self.mlflow_client.set_tag(best_mlflow_run_id, "flaml.best_run", True)
+            self.best_run_id = best_mlflow_run_id
+            if self.parent_run_id is not None:
+                conf = automl._config_history[automl._best_iteration][1].copy()
+                if "ml" in conf.keys():
+                    conf = conf["ml"]
+
+                mlflow.log_params(conf)
+                mlflow.log_param("best_learner", automl._best_estimator)
+                if not self.has_summary:
+                    logger.info(f"logging best model {automl.best_estimator}")
+                    self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id)
+                    self.has_summary = True
+                    if automl._trained_estimator is not None and not self.has_model:
+                        self.log_model(
+                            automl._trained_estimator._model,
+                            automl.best_estimator,
+                            signature=automl.estimator_signature,
+                        )
+                        self.pickle_and_log_automl_artifacts(
+                            automl, automl.model, automl.best_estimator, signature=automl.pipeline_signature
+                        )
+                        self.has_model = True
+
+    def resume_mlflow(self):
+        if len(self.resume_params) > 0:
+            mlflow.autolog(**self.resume_params)
+
+    def _log_info_to_run(self, info, run_id, log_params=False):
+        _metrics = [Metric(key, value, int(time.time() * 1000), 0) for key, value in info["metrics"].items()]
+        _tags = [RunTag(key, str(value)) for key, value in info["tags"].items()]
+        _params = [
+            Param(key, str(value))
+            for key, value in info["params"].items()
+            if log_params or key in ["sample_size", "learner"]
+        ]
+        self.mlflow_client.log_batch(run_id=run_id, metrics=_metrics, params=_params, tags=_tags)
+
+        if len(info["submetrics"]["values"]) > 0:
+            for each_entry in info["submetrics"]["values"]:
+                with mlflow.start_run(nested=True) as run:
+                    each_entry.update({"iter_counter": info["submetrics"]["iter_counter"]})
+                    _metrics = [Metric(key, value, int(time.time() * 1000), 0) for key, value in each_entry.items()]
+                    _tags = [RunTag("mlflow.parentRunId", run_id)]
+                    self.mlflow_client.log_batch(run_id=run.info.run_id, metrics=_metrics, params=[], tags=_tags)
+            del info["submetrics"]["values"]
+
+    def adopt_children(self, result=None):
+        """
+        Set autologging child runs to nested by fetching them after all child runs are completed.
+        Note that this may cause disorder when concurrently starting multiple AutoML processes
+        with the same experiment name if the MLflow version is less than or equal to "2.5.0".
+        """
+        if self.autolog:
+            best_iteration = self.best_iteration
+            if best_iteration is None:
+                logger.warning("best_iteration is None, cannot identify best run")
+            raw_autolog_child_runs = mlflow.search_runs(
+                experiment_ids=[self.experiment_id],
+                order_by=["attributes.start_time DESC"],
+                max_results=SEARCH_MAX_RESULTS,
+                output_format="list",
+                filter_string=(
+                    f"tags.extra_tag.sid = '{self.extra_tag['extra_tag.sid']}'" if mlflow.__version__ > "2.5.0" else ""
+                ),
+            )
+            self.child_counter = 0
+
+            # From latest to earliest, remove duplicate cross-validation runs
+            _exist_child_run_params = []  # for deduplication of cross-validation child runs
+            _to_keep_autolog_child_runs = []
+            for autolog_child_run in raw_autolog_child_runs:
+                child_start_time = autolog_child_run.info.start_time / 1000
+
+                if child_start_time < self.start_time:
+                    continue
+
+                _current_child_run_params = autolog_child_run.data.params
+                # remove n_estimators as some models will train with small n_estimators to estimate time budget
+                if self.experiment_type == "automl":
+                    _current_child_run_params.pop("n_estimators", None)
+                if _current_child_run_params in _exist_child_run_params:
+                    # remove duplicate cross-validation run
+                    self.mlflow_client.delete_run(autolog_child_run.info.run_id)
+                    continue
+                else:
+                    _exist_child_run_params.append(_current_child_run_params)
+                    _to_keep_autolog_child_runs.append(autolog_child_run)
+
+            # From earliest to latest, set tags and child_counter
+            autolog_child_runs = _to_keep_autolog_child_runs[::-1]
+            for autolog_child_run in autolog_child_runs:
+                child_run_id = autolog_child_run.info.run_id
+                child_run_parent_id = autolog_child_run.data.tags.get("mlflow.parentRunId", None)
+                child_start_time = autolog_child_run.info.start_time / 1000
+
+                if child_start_time < self.start_time:
+                    continue
+
+                if all(
+                    [
+                        len(autolog_child_run.data.params) == 0,
+                        len(autolog_child_run.data.metrics) == 0,
+                        child_run_id != self.parent_run_id,
+                    ]
+                ):
+                    # remove empty run
+                    # empty run could be created by mlflow autologging
+                    self.mlflow_client.delete_run(autolog_child_run.info.run_id)
+                    continue
+
+                if all(
+                    [
+                        child_run_id != self.parent_run_id,
+                        child_run_parent_id is None or child_run_parent_id == self.parent_run_id,
+                    ]
+                ):
+                    if self.parent_run_id is not None:
+                        self.mlflow_client.set_tag(
+                            child_run_id,
+                            "mlflow.parentRunId",
+                            self.parent_run_id,
+                        )
+                        if IS_RENAME_CHILD_RUN:
+                            self.mlflow_client.set_tag(
+                                child_run_id,
+                                "mlflow.runName",
+                                f"{self.parent_run_name}_child_{self.child_counter}",
+                            )
+                        self.mlflow_client.set_tag(child_run_id, "flaml.child_counter", self.child_counter)
+
+                    # merge autolog child run and corresponding manual run
+                    flaml_info = self.infos[self.child_counter]
+                    child_run = self.mlflow_client.get_run(child_run_id)
+                    self._log_info_to_run(flaml_info, child_run_id, log_params=False)
+
+                    if self.experiment_type == "automl":
+                        if "learner" not in child_run.data.params:
+                            self.mlflow_client.log_param(child_run_id, "learner", flaml_info["params"]["learner"])
+                        if "sample_size" not in child_run.data.params:
+                            self.mlflow_client.log_param(
+                                child_run_id, "sample_size", flaml_info["params"]["sample_size"]
+                            )
+
+                    if self.child_counter == best_iteration:
+                        self.mlflow_client.set_tag(child_run_id, "flaml.best_run", True)
+                        if result is not None:
+                            result.best_run_id = child_run_id
+                            result.best_run_name = child_run.info.run_name
+                            self.best_run_id = child_run_id
+                        if self.parent_run_id is not None and not self.has_summary:
+                            self.copy_mlflow_run(child_run_id, self.parent_run_id)
+                            self.has_summary = True
+                    self.child_counter += 1
+
+    def retrain(self, train_func, config):
+        """retrain with given config, added for logging the best config and model to parent run.
+        No longer needed after v2.0.2post2, as the best config and model are no longer logged to the parent run.
+        """
+        if self.autolog:
+            self.set_mlflow_config()
+            self.has_summary = True
+            with mlflow.start_run(run_id=self.parent_run_id):
+                train_func(config)
+
+    def __del__(self):
+        # mlflow.end_run()  # this will end the parent run when re-fit an AutoML instance. Bug 2922020: Inconsistent Run Creation Output
+        self.resume_mlflow()
+
+
+def register_automl_pipeline(automl, model_name=None, signature=None):
+    pipeline = automl.automl_pipeline
+    if pipeline is None:
+        logger.warning("pipeline not found, cannot register it")
+        return
+    if model_name is None:
+        model_name = automl._mlflow_exp_name + "_pipeline"
+    if automl.best_run_id is None:
+        mlflow.sklearn.log_model(
+            pipeline,
+            "automl_pipeline",
+            registered_model_name=model_name,
+            signature=automl.pipeline_signature if signature is None else signature,
+        )
+        mvs = mlflow.search_model_versions(
+            filter_string=f"name='{model_name}'", order_by=["attribute.version_number ASC"], max_results=1
+        )
+        return mvs[0]
+    else:
+        best_run = mlflow.get_run(automl.best_run_id)
+        model_uri = f"runs:/{best_run.info.run_id}/automl_pipeline"
+        return mlflow.register_model(model_uri, model_name)
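
A note on the version check in `adopt_children`: `mlflow.__version__ > "2.5.0"` compares strings lexicographically, so a release such as 2.10.0 would sort below 2.5.0. Below is a minimal sketch of a numeric comparison, assuming plain `X.Y.Z`-style version strings. The helper name `version_tuple` is hypothetical and is not part of FLAML or MLflow.

```python
import re


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric components of a version string, e.g. "2.10.0" -> (2, 10, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


# Lexicographic string comparison misorders multi-digit components.
string_says_newer = "2.10.0" > "2.5.0"
# Numeric tuple comparison orders them as expected.
tuple_says_newer = version_tuple("2.10.0") > version_tuple("2.5.0")
```

If `packaging` is already a dependency in the environment, its `Version` class would likely be the more robust choice, since it also understands pre-release and post-release suffixes.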
--- a/flaml/onlineml/README.md
+++ b/flaml/onlineml/README.md
@@ -4,7 +4,8 @@ FLAML includes *ChaCha* which is an automatic hyperparameter tuning solution for

 For more technical details about *ChaCha*, please check our paper.

-* [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.
+- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.
+
 ```
@inproceedings{wu2021chacha,
    title={ChaCha for online AutoML},
@@ -23,8 +24,9 @@ An example of online namespace interactions tuning in VW:
 ```python
 # require: pip install flaml[vw]
 from flaml import AutoVW
-'''create an AutoVW instance for tuning namespace interactions'''
-autovw = AutoVW(max_live_model_num=5, search_space={'interactions': AutoVW.AUTOMATIC})
+
+"""create an AutoVW instance for tuning namespace interactions"""
+autovw = AutoVW(max_live_model_num=5, search_space={"interactions": AutoVW.AUTOMATIC})
 ```

 An example of online tuning of both namespace interactions and learning rate in VW:
@@ -33,12 +35,18 @@ An example of online tuning of both namespace interactions and learning rate in
 # require: pip install flaml[vw]
 from flaml import AutoVW
 from flaml.tune import loguniform
-''' create an AutoVW instance for tuning namespace interactions and learning rate'''
+
+""" create an AutoVW instance for tuning namespace interactions and learning rate"""
 # set up the search space and init config
-search_space_nilr = {'interactions': AutoVW.AUTOMATIC, 'learning_rate': loguniform(lower=2e-10, upper=1.0)}
-init_config_nilr = {'interactions': set(), 'learning_rate': 0.5}
+search_space_nilr = {
+    "interactions": AutoVW.AUTOMATIC,
+    "learning_rate": loguniform(lower=2e-10, upper=1.0),
+}
+init_config_nilr = {"interactions": set(), "learning_rate": 0.5}
 # create an AutoVW instance
-autovw = AutoVW(max_live_model_num=5, search_space=search_space_nilr, init_config=init_config_nilr)
+autovw = AutoVW(
+    max_live_model_num=5, search_space=search_space_nilr, init_config=init_config_nilr
+)
 ```

 A user can use the resulting AutoVW instances `autovw` in a similar way to a vanilla Vowpal Wabbit instance, i.e., `pyvw.vw`, to perform online learning by iteratively calling its `predict(data_example)` and `learn(data_example)` functions at each data example.
--- a/flaml/tune/README.md
+++ b/flaml/tune/README.md
@@ -5,45 +5,47 @@ It can be used standalone, or together with ray tune or nni. Please find detaile

 Below are some quick examples.

-* Example for sequential tuning (recommended when compute resource is limited and each trial can consume all the resources):
+- Example for sequential tuning (recommended when compute resource is limited and each trial can consume all the resources):

 ```python
 # require: pip install flaml[blendsearch]
 from flaml import tune
 import time

+
 def evaluate_config(config):
-    '''evaluate a hyperparameter configuration'''
+    """evaluate a hyperparameter configuration"""
    # we uss a toy example with 2 hyperparameters
-    metric = (round(config['x'])-85000)**2 - config['x']/config['y']
+    metric = (round(config["x"]) - 85000) ** 2 - config["x"] / config["y"]
    # usually the evaluation takes an non-neglible cost
    # and the cost could be related to certain hyperparameters
    # in this example, we assume it's proportional to x
-    time.sleep(config['x']/100000)
+    time.sleep(config["x"] / 100000)
    # use tune.report to report the metric to optimize
    tune.report(metric=metric)

+
 analysis = tune.run(
-    evaluate_config,    # the function to evaluate a config
+    evaluate_config,  # the function to evaluate a config
    config={
-        'x': tune.lograndint(lower=1, upper=100000),
-        'y': tune.randint(lower=1, upper=100000)
-    }, # the search space
-    low_cost_partial_config={'x':1},    # a initial (partial) config with low cost
-    metric='metric',    # the name of the metric used for optimization
-    mode='min',         # the optimization mode, 'min' or 'max'
-    num_samples=-1,    # the maximal number of configs to try, -1 means infinite
-    time_budget_s=60,   # the time budget in seconds
-    local_dir='logs/',  # the local directory to store logs
+        "x": tune.lograndint(lower=1, upper=100000),
+        "y": tune.randint(lower=1, upper=100000),
+    },  # the search space
+    low_cost_partial_config={"x": 1},  # an initial (partial) config with low cost
+    metric="metric",  # the name of the metric used for optimization
+    mode="min",  # the optimization mode, 'min' or 'max'
+    num_samples=-1,  # the maximal number of configs to try, -1 means infinite
+    time_budget_s=60,  # the time budget in seconds
+    local_dir="logs/",  # the local directory to store logs
    # verbose=0,          # verbosity
    # use_ray=True, # uncomment when performing parallel tuning using ray
-    )
+)

 print(analysis.best_trial.last_result)  # the best trial's result
-print(analysis.best_config) # the best config
+print(analysis.best_config)  # the best config
 ```

-* Example for using ray tune's API:
+- Example for using ray tune's API:

 ```python
 # require: pip install flaml[blendsearch,ray]
@@ -51,36 +53,39 @@ from ray import tune as raytune
 from flaml import CFO, BlendSearch
 import time

+
 def evaluate_config(config):
-    '''evaluate a hyperparameter configuration'''
+    """evaluate a hyperparameter configuration"""
    # we use a toy example with 2 hyperparameters
-    metric = (round(config['x'])-85000)**2 - config['x']/config['y']
+    metric = (round(config["x"]) - 85000) ** 2 - config["x"] / config["y"]
    # usually the evaluation takes a non-neglible cost
    # and the cost could be related to certain hyperparameters
    # in this example, we assume it's proportional to x
-    time.sleep(config['x']/100000)
+    time.sleep(config["x"] / 100000)
    # use tune.report to report the metric to optimize
    tune.report(metric=metric)

+
 # provide a time budget (in seconds) for the tuning process
 time_budget_s = 60
 # provide the search space
 config_search_space = {
-        'x': tune.lograndint(lower=1, upper=100000),
-        'y': tune.randint(lower=1, upper=100000)
-    }
+    "x": tune.lograndint(lower=1, upper=100000),
+    "y": tune.randint(lower=1, upper=100000),
+}
 # provide the low cost partial config
-low_cost_partial_config={'x':1}
+low_cost_partial_config = {"x": 1}

 # set up CFO
 cfo = CFO(low_cost_partial_config=low_cost_partial_config)

 # set up BlendSearch
 blendsearch = BlendSearch(
-    metric="metric", mode="min",
+    metric="metric",
+    mode="min",
    space=config_search_space,
    low_cost_partial_config=low_cost_partial_config,
-    time_budget_s=time_budget_s
+    time_budget_s=time_budget_s,
 )
 # NOTE: when using BlendSearch as a search_alg in ray tune, you need to
 # configure the 'time_budget_s' for BlendSearch accordingly such that
@@ -89,28 +94,28 @@ blendsearch = BlendSearch(
 # automatically in flaml.

 analysis = raytune.run(
-    evaluate_config,    # the function to evaluate a config
+    evaluate_config,  # the function to evaluate a config
    config=config_search_space,
-    metric='metric',    # the name of the metric used for optimization
-    mode='min',         # the optimization mode, 'min' or 'max'
-    num_samples=-1,     # the maximal number of configs to try, -1 means infinite
-    time_budget_s=time_budget_s,   # the time budget in seconds
-    local_dir='logs/',  # the local directory to store logs
-    search_alg=blendsearch  # or cfo
+    metric="metric",  # the name of the metric used for optimization
+    mode="min",  # the optimization mode, 'min' or 'max'
+    num_samples=-1,  # the maximal number of configs to try, -1 means infinite
+    time_budget_s=time_budget_s,  # the time budget in seconds
+    local_dir="logs/",  # the local directory to store logs
+    search_alg=blendsearch,  # or cfo
 )

 print(analysis.best_trial.last_result)  # the best trial's result
 print(analysis.best_config)  # the best config
 ```

-* Example for using NNI: An example of using BlendSearch with NNI can be seen in [test](https://github.com/microsoft/FLAML/tree/main/test/nni). CFO can be used as well in a similar manner. To run the example, first make sure you have [NNI](https://nni.readthedocs.io/en/stable/) installed, then run:
+- Example for using NNI: An example of using BlendSearch with NNI can be seen in [test](https://github.com/microsoft/FLAML/tree/main/test/nni). CFO can be used as well in a similar manner. To run the example, first make sure you have [NNI](https://nni.readthedocs.io/en/stable/) installed, then run:

 ```shell
 $nnictl create --config ./config.yml
 ```

-* For more examples, please check out
-[notebooks](https://github.com/microsoft/FLAML/tree/main/notebook/).
+- For more examples, please check out
+  [notebooks](https://github.com/microsoft/FLAML/tree/main/notebook/).

 `flaml` offers two HPO methods: CFO and BlendSearch.
 `flaml.tune` uses BlendSearch by default.
@@ -185,16 +190,16 @@ tune.run(...
 )
 ```

-* Recommended scenario: cost-related hyperparameters exist, a low-cost
-initial point is known, and the search space is complex such that local search
-is prone to be stuck at local optima.
+- Recommended scenario: cost-related hyperparameters exist, a low-cost
+  initial point is known, and the search space is complex such that local search
+  is prone to be stuck at local optima.

-* Suggestion about using larger search space in BlendSearch:
-In hyperparameter optimization, a larger search space is desirable because it is more likely to include the optimal configuration (or one of the optimal configurations) in hindsight. However the performance (especially anytime performance) of most existing HPO methods is undesirable if the cost of the configurations in the search space has a large variation. Thus hand-crafted small search spaces (with relatively homogeneous cost) are often used in practice for these methods, which is subject to idiosyncrasy. BlendSearch combines the benefits of local search and global search, which enables a smart (economical) way of deciding where to explore in the search space even though it is larger than necessary. This allows users to specify a larger search space in BlendSearch, which is often easier and a better practice than narrowing down the search space by hand.
+- Suggestion about using larger search space in BlendSearch:
+  In hyperparameter optimization, a larger search space is desirable because it is more likely to include the optimal configuration (or one of the optimal configurations) in hindsight. However the performance (especially anytime performance) of most existing HPO methods is undesirable if the cost of the configurations in the search space has a large variation. Thus hand-crafted small search spaces (with relatively homogeneous cost) are often used in practice for these methods, which is subject to idiosyncrasy. BlendSearch combines the benefits of local search and global search, which enables a smart (economical) way of deciding where to explore in the search space even though it is larger than necessary. This allows users to specify a larger search space in BlendSearch, which is often easier and a better practice than narrowing down the search space by hand.

 For more technical details, please check our papers.

-* [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
+- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.

 ```bibtex
@inproceedings{wu2021cfo,
@@ -205,7 +210,7 @@ For more technical details, please check our papers.
 }
 ```

-* [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
+- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.

 ```bibtex
@inproceedings{wang2021blendsearch,
--- a/flaml/tune/searcher/flow2.py
+++ b/flaml/tune/searcher/flow2.py
@@ -109,7 +109,7 @@ class FLOW2(Searcher):
        else:
            mode = "min"

-        super(FLOW2, self).__init__(metric=metric, mode=mode)
+        super().__init__(metric=metric, mode=mode)
        # internally minimizes, so "max" => -1
        if mode == "max":
            self.metric_op = -1.0
@@ -350,7 +350,7 @@ class FLOW2(Searcher):
            else:
                assert (
                    self.lexico_objectives["tolerances"][k_metric][-1] == "%"
-                ), "String tolerance of {} should use %% as the suffix".format(k_metric)
+                ), f"String tolerance of {k_metric} should use % as the suffix"
                tolerance_bound = self._f_best[k_metric] * (
                    1 + 0.01 * float(self.lexico_objectives["tolerances"][k_metric].replace("%", ""))
                )
@@ -385,7 +385,7 @@ class FLOW2(Searcher):
                else:
                    assert (
                        self.lexico_objectives["tolerances"][k_metric][-1] == "%"
-                    ), "String tolerance of {} should use %% as the suffix".format(k_metric)
+                    ), f"String tolerance of {k_metric} should use % as the suffix"
                    tolerance_bound = self._f_best[k_metric] * (
                        1 + 0.01 * float(self.lexico_objectives["tolerances"][k_metric].replace("%", ""))
                    )
--- a/flaml/tune/searcher/online_searcher.py
+++ b/flaml/tune/searcher/online_searcher.py
@@ -66,7 +66,7 @@ class ChampionFrontierSearcher(BaseSearcher):
    POLY_EXPANSION_ADDITION_NUM = 1
    # the order of polynomial expansions to add based on the given seed interactions
    EXPANSION_ORDER = 2
-    # the number of new challengers with new numerical hyperparamter configs
+    # the number of new challengers with new numerical hyperparameter configs
    NUMERICAL_NUM = 2

    # In order to use CFO, a loss name and loss values of configs are need
@@ -80,7 +80,7 @@ class ChampionFrontierSearcher(BaseSearcher):
    CFO_SEARCHER_METRIC_NAME = "pseudo_loss"
    CFO_SEARCHER_LARGE_LOSS = 1e6

-    # the random seed used in generating numerical hyperparamter configs (when CFO is not used)
+    # the random seed used in generating numerical hyperparameter configs (when CFO is not used)
    NUM_RANDOM_SEED = 111

    CHAMPION_TRIAL_NAME = "champion_trial"
@@ -319,7 +319,7 @@ class ChampionFrontierSearcher(BaseSearcher):
        candidate_configs = [set(seed_interactions) | set(item) for item in space]
        final_candidate_configs = []
        for c in candidate_configs:
-            new_c = set([e for e in c if len(e) > 1])
+            new_c = {e for e in c if len(e) > 1}
            final_candidate_configs.append(new_c)
        return final_candidate_configs

--- a/flaml/tune/searcher/suggestion.py
+++ b/flaml/tune/searcher/suggestion.py
@@ -191,7 +191,7 @@ class ConcurrencyLimiter(Searcher):
        self.batch = batch
        self.live_trials = set()
        self.cached_results = {}
-        super(ConcurrencyLimiter, self).__init__(metric=self.searcher.metric, mode=self.searcher.mode)
+        super().__init__(metric=self.searcher.metric, mode=self.searcher.mode)

    def suggest(self, trial_id: str) -> Optional[Dict]:
        assert trial_id not in self.live_trials, f"Trial ID {trial_id} must be unique: already found in set."
@@ -285,25 +285,21 @@ def validate_warmstart(
    """
    if points_to_evaluate:
        if not isinstance(points_to_evaluate, list):
-            raise TypeError("points_to_evaluate expected to be a list, got {}.".format(type(points_to_evaluate)))
+            raise TypeError(f"points_to_evaluate expected to be a list, got {type(points_to_evaluate)}.")
        for point in points_to_evaluate:
            if not isinstance(point, (dict, list)):
                raise TypeError(f"points_to_evaluate expected to include list or dict, " f"got {point}.")

            if validate_point_name_lengths and (not len(point) == len(parameter_names)):
-                raise ValueError(
-                    "Dim of point {}".format(point)
-                    + " and parameter_names {}".format(parameter_names)
-                    + " do not match."
-                )
+                raise ValueError(f"Dim of point {point} and parameter_names {parameter_names} do not match.")

    if points_to_evaluate and evaluated_rewards:
        if not isinstance(evaluated_rewards, list):
-            raise TypeError("evaluated_rewards expected to be a list, got {}.".format(type(evaluated_rewards)))
+            raise TypeError(f"evaluated_rewards expected to be a list, got {type(evaluated_rewards)}.")
        if not len(evaluated_rewards) == len(points_to_evaluate):
            raise ValueError(
-                "Dim of evaluated_rewards {}".format(evaluated_rewards)
-                + " and points_to_evaluate {}".format(points_to_evaluate)
+                f"Dim of evaluated_rewards {evaluated_rewards}"
+                + f" and points_to_evaluate {points_to_evaluate}"
                + " do not match."
            )

@@ -547,7 +543,7 @@ class OptunaSearch(Searcher):
        evaluated_rewards: Optional[List] = None,
    ):
        assert ot is not None, "Optuna must be installed! Run `pip install optuna`."
-        super(OptunaSearch, self).__init__(metric=metric, mode=mode)
+        super().__init__(metric=metric, mode=mode)

        if isinstance(space, dict) and space:
            resolved_vars, domain_vars, grid_vars = parse_spec_vars(space)
@@ -561,7 +557,15 @@ class OptunaSearch(Searcher):
        self._space = space

        self._points_to_evaluate = points_to_evaluate or []
-        self._evaluated_rewards = evaluated_rewards
+        # rewards should be a list of floats, not a list of dicts.
+        # Optuna > 3.5.0 checks the list for NaN with "any(math.isnan(x) for x in self._values)",
+        # which raises an error when an element is a dict.
+        if evaluated_rewards is not None:
+            self._evaluated_rewards = [
+                list(item.values())[0] if isinstance(item, dict) else item for item in evaluated_rewards
+            ]
+        else:
+            self._evaluated_rewards = evaluated_rewards

        self._study_name = "optuna"  # Fixed study name for in-memory storage

@@ -873,9 +877,9 @@ class OptunaSearch(Searcher):

            elif isinstance(domain, Integer):
                if isinstance(sampler, LogUniform):
-                    return ot.distributions.IntLogUniformDistribution(
-                        domain.lower, domain.upper - 1, step=quantize or 1
-                    )
+                    # The ``step`` argument is deprecated since v2.0.0 and should be 1 in a log distribution.
+                    # Its removal is currently scheduled for v4.0.0.
+                    return ot.distributions.IntLogUniformDistribution(domain.lower, domain.upper - 1, step=1)
                elif isinstance(sampler, Uniform):
                    # Upper bound should be inclusive for quantization and
                    # exclusive otherwise
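
The reward normalization added to `OptunaSearch.__init__` can be tried in isolation. This is a minimal sketch, assuming each dict-valued reward carries a single metric. `normalize_rewards` is a hypothetical name used only for illustration, and the reward values are made up.

```python
from typing import List, Optional, Union


def normalize_rewards(evaluated_rewards: Optional[List[Union[float, dict]]]) -> Optional[List[float]]:
    """Replace each dict reward with its first value; leave plain numbers and None untouched."""
    if evaluated_rewards is None:
        return None
    return [list(item.values())[0] if isinstance(item, dict) else item for item in evaluated_rewards]


rewards = normalize_rewards([{"loss": 0.25}, 0.5, {"loss": 0.125}])
print(rewards)
```

Taking the first value of a dict is only well defined when the dict holds one metric. With several metrics, the value picked depends on insertion order.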
--- a/flaml/tune/searcher/variant_generator.py
+++ b/flaml/tune/searcher/variant_generator.py
@@ -252,7 +252,7 @@ def _try_resolve(v) -> Tuple[bool, Any]:
        # Grid search values
        grid_values = v["grid_search"]
        if not isinstance(grid_values, list):
-            raise TuneError("Grid search expected list of values, got: {}".format(grid_values))
+            raise TuneError(f"Grid search expected list of values, got: {grid_values}")
        return False, Categorical(grid_values).grid()
    return True, v

@@ -302,13 +302,13 @@ def has_unresolved_values(spec: Dict) -> bool:

 class _UnresolvedAccessGuard(dict):
    def __init__(self, *args, **kwds):
-        super(_UnresolvedAccessGuard, self).__init__(*args, **kwds)
+        super().__init__(*args, **kwds)
        self.__dict__ = self

    def __getattribute__(self, item):
        value = dict.__getattribute__(self, item)
        if not _is_resolved(value):
-            raise RecursiveDependencyError("`{}` recursively depends on {}".format(item, value))
+            raise RecursiveDependencyError(f"`{item}` recursively depends on {value}")
        elif isinstance(value, dict):
            return _UnresolvedAccessGuard(value)
        else:
--- a/flaml/tune/trial.py
+++ b/flaml/tune/trial.py
@@ -110,7 +110,7 @@ class Trial:
                    }
                    self.metric_n_steps[metric] = {}
                    for n in self.n_steps:
-                        key = "last-{:d}-avg".format(n)
+                        key = f"last-{n:d}-avg"
                        self.metric_analysis[metric][key] = value
                        # Store n as string for correct restore.
                        self.metric_n_steps[metric][str(n)] = deque([value], maxlen=n)
@@ -124,7 +124,7 @@ class Trial:
                    self.metric_analysis[metric]["last"] = value

                    for n in self.n_steps:
-                        key = "last-{:d}-avg".format(n)
+                        key = f"last-{n:d}-avg"
                        self.metric_n_steps[metric][str(n)].append(value)
                        self.metric_analysis[metric][key] = sum(self.metric_n_steps[metric][str(n)]) / len(
                            self.metric_n_steps[metric][str(n)]
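
The `last-{n}-avg` bookkeeping in `Trial` relies on bounded deques. Here is a minimal standalone sketch of the same idea, with a hypothetical `update_last_n_avg` helper and made-up metric values. It is not the `Trial` implementation itself.

```python
from collections import deque


def update_last_n_avg(windows: dict, analysis: dict, value: float, n_steps=(5, 10)) -> None:
    """Append `value` to each bounded window and refresh the matching `last-{n}-avg` entry."""
    for n in n_steps:
        # Keys are strings, mirroring how the trial stores n for restore.
        window = windows.setdefault(str(n), deque(maxlen=n))
        window.append(value)  # deque(maxlen=n) drops the oldest value automatically
        analysis[f"last-{n:d}-avg"] = sum(window) / len(window)


windows, analysis = {}, {}
for v in [1.0, 2.0, 3.0]:
    update_last_n_avg(windows, analysis, v, n_steps=(2, 3))
```

Because each deque is created with `maxlen=n`, the average always covers at most the last `n` reported values without any explicit trimming.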
--- a/flaml/tune/tune.py
+++ b/flaml/tune/tune.py
@@ -29,6 +29,18 @@ from flaml.tune.spark.utils import PySparkOvertimeMonitor, check_spark
 from .result import DEFAULT_METRIC
 from .trial import Trial

+try:
+    import mlflow
+except ImportError:
+    mlflow = None
+try:
+    from flaml.fabric.mlflow import MLflowIntegration, is_autolog_enabled
+
+    internal_mlflow = True
+except ImportError:
+    internal_mlflow = False
+
+
 logger = logging.getLogger(__name__)
 logger.propagate = False
 _use_ray = True
@@ -44,6 +56,7 @@ class ExperimentAnalysis(EA):
    """Class for storing the experiment results."""

    def __init__(self, trials, metric, mode, lexico_objectives=None):
+        self.best_run_id = None
        try:
            super().__init__(self, None, trials, metric, mode)
            self.lexico_objectives = lexico_objectives
@@ -128,6 +141,16 @@ class ExperimentAnalysis(EA):
        else:
            return self.best_trial.last_result

+    @property
+    def best_iteration(self) -> Optional[int]:
+        """Index of the best trial in `self.trials`, or None if it is not found."""
+        best_trial = self.best_trial
+        best_trial_id = best_trial.trial_id
+        for i, trial in enumerate(self.trials):
+            if trial.trial_id == best_trial_id:
+                return i
+        return None
+

 def report(_metric=None, **kwargs):
    """A function called by the HPO application to report final or intermediate
@@ -234,6 +257,9 @@ def run(
    lexico_objectives: Optional[dict] = None,
    force_cancel: Optional[bool] = False,
    n_concurrent_trials: Optional[int] = 0,
+    mlflow_exp_name: Optional[str] = None,
+    automl_info: Optional[Tuple[float]] = None,
+    extra_tag: Optional[dict] = None,
    **ray_args,
 ):
    """The function-based way of performing HPO.
@@ -424,6 +450,10 @@ def run(
    }
    ```
        force_cancel: boolean, default=False | Whether to forcely cancel the PySpark job if overtime.
+        mlflow_exp_name: str, default=None | The name of the mlflow experiment. This should be specified if
+            mlflow autologging is enabled on Spark. Otherwise, all results are logged to the experiment
+            named after the basename of the main entry file.
+        automl_info: tuple, default=None | The information of the automl run. It should be a tuple of (mlflow_log_latency,).
        n_concurrent_trials: int, default=0 | The number of concurrent trials when perform hyperparameter
            tuning with Spark. Only valid when use_spark=True and spark is required:
            `pip install flaml[spark]`. Please check
@@ -431,6 +461,7 @@ def run(
            for more details about installing Spark. When tune.run() is called from AutoML, it will be
            overwritten by the value of `n_concurrent_trials` in AutoML. When <= 0, the concurrent trials
            will be set to the number of executors.
+        extra_tag: dict, default=None | Extra tags to be added to the mlflow runs created by autologging.
        **ray_args: keyword arguments to pass to ray.tune.run().
            Only valid when use_ray=True.
    """
@@ -438,10 +469,12 @@ def run(
    global _verbose
    global _running_trial
    global _training_iteration
+    global internal_mlflow
    old_use_ray = _use_ray
    old_verbose = _verbose
    old_running_trial = _running_trial
    old_training_iteration = _training_iteration
+
    if log_file_name:
        dir_name = os.path.dirname(log_file_name)
        if dir_name:
@@ -486,6 +519,13 @@ def run(
        else:
            logger.setLevel(logging.CRITICAL)

+    if internal_mlflow and not automl_info and (mlflow.active_run() or is_autolog_enabled()):
+        mlflow_integration = MLflowIntegration("tune", mlflow_exp_name, extra_tag)
+        evaluation_function = mlflow_integration.wrap_evaluation_function(evaluation_function)
+        _internal_mlflow = True  # mlflow_integration will be used for logging
+    else:
+        _internal_mlflow = False
+
    from .searcher.blendsearch import CFO, BlendSearch, RandomSearch

    if lexico_objectives is not None:
@@ -531,7 +571,7 @@ def run(
                    import optuna as _

                    SearchAlgorithm = BlendSearch
-                    logger.info("Using search algorithm {}.".format(SearchAlgorithm.__name__))
+                    logger.info(f"Using search algorithm {SearchAlgorithm.__name__}.")
                except ImportError:
                    if search_alg == "BlendSearch":
                        raise ValueError("To use BlendSearch, run: pip install flaml[blendsearch]")
@@ -540,7 +580,7 @@ def run(
                        logger.warning("Using CFO for search. To use BlendSearch, run: pip install flaml[blendsearch]")
            else:
                SearchAlgorithm = locals()[search_alg]
-                logger.info("Using search algorithm {}.".format(SearchAlgorithm.__name__))
+                logger.info(f"Using search algorithm {SearchAlgorithm.__name__}.")
            metric = metric or DEFAULT_METRIC
        search_alg = SearchAlgorithm(
            metric=metric,
@@ -713,11 +753,15 @@ def run(
                        time_budget_s = np.inf
                    num_failures = 0
                    upperbound_num_failures = (len(evaluated_rewards) if evaluated_rewards else 0) + max_failure
+                    logger.debug(f"automl_info: {automl_info}")
                    while (
                        time.time() - time_start < time_budget_s
                        and (num_samples < 0 or num_trials < num_samples)
                        and num_failures < upperbound_num_failures
                    ):
+                        if automl_info and automl_info[0] > 0 and time_budget_s < np.inf:
+                            time_budget_s -= automl_info[0]
+                            logger.debug(f"Remaining time budget with mlflow log latency: {time_budget_s} seconds.")
                        while len(_runner.running_trials) < n_concurrent_trials:
                            # suggest trials for spark
                            trial_next = _runner.step()
@@ -750,6 +794,9 @@ def run(
                            trial_to_run = trials_to_run[0]
                            _runner.running_trial = trial_to_run
                            if result is not None:
+                                if _internal_mlflow:
+                                    mlflow_integration.record_trial(result, trial_to_run, metric)
+
                                if isinstance(result, dict):
                                    if result:
                                        logger.info(f"Brief result: {result}")
@@ -758,7 +805,7 @@ def run(
                                        # When the result returned is an empty dict, set the trial status to error
                                        trial_to_run.set_status(Trial.ERROR)
                                else:
-                                    logger.info("Brief result: {}".format({metric: result}))
+                                    logger.info("Brief result: %s", {metric: result})
                                    report(_metric=result)
                            _runner.stop_trial(trial_to_run)
                        num_failures = 0
@@ -768,6 +815,20 @@ def run(
                        mode=mode,
                        lexico_objectives=lexico_objectives,
                    )
+                    analysis.search_space = config
+
+                    if _internal_mlflow:
+                        mlflow_integration.log_tune(analysis, metric)
+                        # try:
+                        #     _best_config = analysis.best_config
+                        # except Exception:
+                        #     _best_config = None
+                        # if _best_config:
+                        #     parallel(
+                        #         delayed(mlflow_integration.retrain)(evaluation_function, analysis.best_config)
+                        #         for dummy in [0]
+                        #     )
+
                    return analysis
                finally:
                    # recover the global variables in case of nested run
@@ -779,6 +840,8 @@ def run(
                        _runner = old_runner
                        logger.handlers = old_handlers
                        logger.setLevel(old_level)
+                    if _internal_mlflow:
+                        mlflow_integration.adopt_children()

    # simple sequential run without using tune.run() from ray
    time_start = time.time()
@@ -812,7 +875,11 @@ def run(
                result = None
                with PySparkOvertimeMonitor(time_start, time_budget_s, force_cancel):
                    result = evaluation_function(trial_to_run.config)
+                logger.debug(f"result in tune: {trial_to_run}, {result}")
                if result is not None:
+                    if _internal_mlflow:
+                        mlflow_integration.record_trial(result, trial_to_run, metric)
+
                    if isinstance(result, dict):
                        if result:
                            report(**result)
@@ -838,6 +905,19 @@ def run(
            mode=mode,
            lexico_objectives=lexico_objectives,
        )
+        analysis.search_space = config
+        if _internal_mlflow:
+            mlflow_integration.log_tune(analysis, metric)
+            if analysis.best_run_id is not None:
+                logger.info(f"Best MLflow run name: {analysis.best_run_name}")
+                logger.info(f"Best MLflow run id: {analysis.best_run_id}")
+            # try:
+            #     _best_config = analysis.best_config
+            # except Exception:
+            #     _best_config = None
+            # if _best_config:
+            #     mlflow_integration.retrain(evaluation_function, analysis.best_config)
+
        return analysis
    finally:
        # recover the global variables in case of nested run
@@ -849,6 +929,8 @@ def run(
            _runner = old_runner
            logger.handlers = old_handlers
            logger.setLevel(old_level)
+        if _internal_mlflow:
+            mlflow_integration.adopt_children()


 class Tuner:
--- a/flaml/version.py
+++ b/flaml/version.py
@@ -1 +1 @@
-__version__ = "2.1.2"
+__version__ = "2.3.0"
--- a/notebook/autogen_agentchat_RetrieveChat.ipynb
+++ b/notebook/autogen_agentchat_RetrieveChat.ipynb
@@ -2604,9 +2604,9 @@
      "  - if \"data:path\" use data-dependent defaults which are stored at path;\n",
      "  - if \"static\", use data-independent defaults.\n",
      "  If dict, keys are the name of the estimators, and values are the starting\n",
-      "  hyperparamter configurations for the corresponding estimators.\n",
-      "  The value can be a single hyperparamter configuration dict or a list\n",
-      "  of hyperparamter configuration dicts.\n",
+      "  hyperparameter configurations for the corresponding estimators.\n",
+      "  The value can be a single hyperparameter configuration dict or a list\n",
+      "  of hyperparameter configuration dicts.\n",
      "  In the following code example, we get starting_points from the\n",
      "  `automl` object and use them in the `new_automl` object.\n",
      "  e.g.,\n",
--- a/notebook/autogen_chatgpt_gpt4.ipynb
+++ b/notebook/autogen_chatgpt_gpt4.ipynb
@@ -174,7 +174,7 @@
    "import datasets\n",
    "\n",
    "seed = 41\n",
-    "data = datasets.load_dataset(\"competition_math\")\n",
+    "data = datasets.load_dataset(\"competition_math\", trust_remote_code=True)\n",
    "train_data = data[\"train\"].shuffle(seed=seed)\n",
    "test_data = data[\"test\"].shuffle(seed=seed)\n",
    "n_tune_data = 20\n",
@@ -390,7 +390,7 @@
     "name": "stderr",
     "output_type": "stream",
     "text": [
-      "\u001b[32m[I 2023-08-01 22:38:01,549]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
+      "\u001B[32m[I 2023-08-01 22:38:01,549]\u001B[0m A new study created in memory with name: optuna\u001B[0m\n"
     ]
    },
    {
--- a/notebook/autogen_openai_completion.ipynb
+++ b/notebook/autogen_openai_completion.ipynb
@@ -196,7 +196,7 @@
    "import datasets\n",
    "\n",
    "seed = 41\n",
-    "data = datasets.load_dataset(\"openai_humaneval\")[\"test\"].shuffle(seed=seed)\n",
+    "data = datasets.load_dataset(\"openai_humaneval\", trust_remote_code=True)[\"test\"].shuffle(seed=seed)\n",
    "n_tune_data = 20\n",
    "tune_data = [\n",
    "    {\n",
@@ -444,8 +444,8 @@
     "name": "stderr",
     "output_type": "stream",
     "text": [
-      "\u001b[32m[I 2023-07-30 04:19:08,150]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n",
-      "\u001b[32m[I 2023-07-30 04:19:08,153]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
+      "\u001B[32m[I 2023-07-30 04:19:08,150]\u001B[0m A new study created in memory with name: optuna\u001B[0m\n",
+      "\u001B[32m[I 2023-07-30 04:19:08,153]\u001B[0m A new study created in memory with name: optuna\u001B[0m\n"
     ]
    },
    {
--- a/notebook/autovw.ipynb
+++ b/notebook/autovw.ipynb
@@ -240,7 +240,7 @@
    "from flaml import AutoVW\n",
    "\n",
    "'''create an AutoVW instance for tuning namespace interactions'''\n",
-    "# configure both hyperparamters to tune, e.g., 'interactions', and fixed arguments about the online learner,\n",
+    "# configure both hyperparameters to tune, e.g., 'interactions', and fixed arguments about the online learner,\n",
    "# e.g., 'quiet' in the search_space argument.\n",
    "autovw_ni = AutoVW(max_live_model_num=5, search_space={'interactions': AutoVW.AUTOMATIC, 'quiet': ''})\n",
    "\n",
--- a/notebook/research/autogen_code.ipynb
+++ b/notebook/research/autogen_code.ipynb
@@ -152,7 +152,7 @@
    "import datasets\n",
    "\n",
    "seed = 41\n",
-    "data = datasets.load_dataset(\"openai_humaneval\")[\"test\"].shuffle(seed=seed)\n",
+    "data = datasets.load_dataset(\"openai_humaneval\", trust_remote_code=True)[\"test\"].shuffle(seed=seed)\n",
    "data = data.select(range(len(data))).rename_column(\"prompt\", \"definition\").remove_columns([\"task_id\", \"canonical_solution\"])"
   ]
  },
--- a/notebook/research/math_level5counting.ipynb
+++ b/notebook/research/math_level5counting.ipynb
@@ -121,7 +121,7 @@
    "import datasets\n",
    "\n",
    "seed = 41\n",
-    "data = datasets.load_dataset(\"competition_math\")\n",
+    "data = datasets.load_dataset(\"competition_math\", trust_remote_code=True)\n",
    "train_data = data[\"train\"].shuffle(seed=seed)\n",
    "test_data = data[\"test\"].shuffle(seed=seed)\n",
    "n_tune_data = 20\n",
--- a/notebook/tune_huggingface.ipynb
+++ b/notebook/tune_huggingface.ipynb
@@ -112,9 +112,7 @@
     ]
    }
   ],
-   "source": [
-    "raw_dataset = datasets.load_dataset(\"glue\", TASK)"
-   ]
+   "source": "raw_dataset = datasets.load_dataset(\"glue\", TASK, trust_remote_code=True)"
  },
  {
   "cell_type": "code",
@@ -425,9 +423,7 @@
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "metric = datasets.load_metric(\"glue\", TASK)"
-   ]
+   "source": "metric = datasets.load_metric(\"glue\", TASK, trust_remote_code=True)"
  },
  {
   "cell_type": "code",
@@ -646,7 +642,7 @@
    "def train_distilbert(config: dict):\n",
    "\n",
    "    # Load CoLA dataset and apply tokenizer\n",
-    "    cola_raw = datasets.load_dataset(\"glue\", TASK)\n",
+    "    cola_raw = datasets.load_dataset(\"glue\", TASK, trust_remote_code=True)\n",
    "    cola_encoded = cola_raw.map(tokenize, batched=True)\n",
    "    train_dataset, eval_dataset = cola_encoded[\"train\"], cola_encoded[\"validation\"]\n",
    "\n",
@@ -654,7 +650,7 @@
    "        MODEL_CHECKPOINT, num_labels=NUM_LABELS\n",
    "    )\n",
    "\n",
-    "    metric = datasets.load_metric(\"glue\", TASK)\n",
+    "    metric = datasets.load_metric(\"glue\", TASK, trust_remote_code=True)\n",
    "    def compute_metrics(eval_pred):\n",
    "        predictions, labels = eval_pred\n",
    "        predictions = np.argmax(predictions, axis=1)\n",
@@ -847,7 +843,7 @@
     "name": "stderr",
     "output_type": "stream",
     "text": [
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m Reusing dataset glue (/home/ec2-user/.cache/huggingface/datasets/glue/cola/1.0.0/7c99657241149a24692c402a5c3f34d4c9f1df5ac2e4c3759fadea38f6cb29c4)\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m Reusing dataset glue (/home/ec2-user/.cache/huggingface/datasets/glue/cola/1.0.0/7c99657241149a24692c402a5c3f34d4c9f1df5ac2e4c3759fadea38f6cb29c4)\n",
      "  0%|          | 0/9 [00:00<?, ?ba/s]\n",
      " 22%|██▏       | 2/9 [00:00<00:00, 19.41ba/s]\n",
      " 56%|█████▌    | 5/9 [00:00<00:00, 20.98ba/s]\n",
@@ -856,25 +852,25 @@
      "100%|██████████| 2/2 [00:00<00:00, 42.79ba/s]\n",
      "  0%|          | 0/2 [00:00<?, ?ba/s]\n",
      "100%|██████████| 2/2 [00:00<00:00, 41.48ba/s]\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_transform.weight', 'vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_layer_norm.bias', 'vocab_projector.weight', 'vocab_projector.bias']\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m - This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m - This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_transform.weight', 'vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_layer_norm.bias', 'vocab_projector.weight', 'vocab_projector.bias']\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m - This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m - This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m To disable this warning, you can either:\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m \t- Avoid using `tokenizers` before the fork if possible\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m \t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m To disable this warning, you can either:\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m \t- Avoid using `tokenizers` before the fork if possible\n",
-      "\u001b[2m\u001b[36m(pid=11344)\u001b[0m \t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m To disable this warning, you can either:\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m \t- Avoid using `tokenizers` before the fork if possible\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m \t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m To disable this warning, you can either:\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m \t- Avoid using `tokenizers` before the fork if possible\n",
+      "\u001B[2m\u001B[36m(pid=11344)\u001B[0m \t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    }
   ],
--- a/notebook/tune_synapseml.ipynb
+++ b/notebook/tune_synapseml.ipynb
@@ -1032,7 +1032,7 @@
      },
      "source": [
        "## 5. Check results\n",
-        "In this step, we retrain the model using the \"best\" hyperparamters on the full training dataset, and use the test dataset to compare evaluation metrics for the initial and \"best\" model."
+        "In this step, we retrain the model using the \"best\" hyperparameters on the full training dataset, and use the test dataset to compare evaluation metrics for the initial and \"best\" model."
      ]
    },
    {
--- a/setup.py
+++ b/setup.py
@@ -4,7 +4,7 @@ import setuptools

 here = os.path.abspath(os.path.dirname(__file__))

-with open("README.md", "r", encoding="UTF-8") as fh:
+with open("README.md", encoding="UTF-8") as fh:
    long_description = fh.read()


@@ -37,10 +37,10 @@ setuptools.setup(
    extras_require={
        "automl": [
            "lightgbm>=2.3.1",
-            "xgboost>=0.90",
+            "xgboost>=0.90,<3.0.0",
            "scipy>=1.4.1",
            "pandas>=1.1.4",
-            "scikit-learn>=0.24",
+            "scikit-learn>=1.0.0",
        ],
        "notebook": [
            "jupyter",
@@ -48,36 +48,41 @@ setuptools.setup(
        "spark": [
            "pyspark>=3.2.0",
            "joblibspark>=0.5.0",
+            "joblib<=1.3.2",
        ],
        "test": [
+            "jupyter",
            "lightgbm>=2.3.1",
-            "xgboost>=0.90",
+            "xgboost>=0.90,<2.0.0",
            "scipy>=1.4.1",
-            "pandas>=1.1.4",
-            "scikit-learn>=0.24",
+            "pandas>=1.1.4,<2.0.0; python_version<'3.10'",
+            "pandas>=1.1.4; python_version>='3.10'",
+            "scikit-learn>=1.0.0",
            "thop",
            "pytest>=6.1.1",
            "coverage>=5.3",
            "pre-commit",
            "torch",
            "torchvision",
-            "catboost>=0.26,<1.2",
+            "catboost>=0.26,<1.2; python_version<'3.11'",
+            "catboost>=0.26; python_version>='3.11'",
            "rgf-python",
-            "optuna==2.8.0",
+            "optuna>=2.8.0,<=3.6.1",
            "openml",
            "statsmodels>=0.12.2",
            "psutil==5.8.0",
            "dataclasses",
            "transformers[torch]==4.26",
            "datasets",
-            "nltk",
+            "nltk<=3.8.1",  # 3.8.2 doesn't work with mlflow
            "rouge_score",
            "hcrystalball==0.1.10",
            "seqeval",
-            "pytorch-forecasting>=0.9.0,<=0.10.1",
-            "mlflow",
-            "pyspark>=3.2.0",
+            "pytorch-forecasting>=0.9.0,<=0.10.1; python_version<'3.11'",
+            # "pytorch-forecasting==0.10.1; python_version=='3.11'",
+            "mlflow==2.15.1",
            "joblibspark>=0.5.0",
+            "joblib<=1.3.2",
            "nbconvert",
            "nbformat",
            "ipykernel",
@@ -88,10 +93,14 @@ setuptools.setup(
            "pydantic==1.10.9",
            "sympy",
            "wolframalpha",
+            "dill",  # a drop-in replacement for pickle
+        ],
+        "catboost": [
+            "catboost>=0.26,<1.2; python_version<'3.11'",
+            "catboost>=0.26,<=1.2.5; python_version>='3.11'",
        ],
-        "catboost": ["catboost>=0.26"],
        "blendsearch": [
-            "optuna==2.8.0",
+            "optuna>=2.8.0,<=3.6.1",
            "packaging",
        ],
        "ray": [
@@ -110,14 +119,14 @@ setuptools.setup(
        "hf": [
            "transformers[torch]==4.26",
            "datasets",
-            "nltk",
+            "nltk<=3.8.1",
            "rouge_score",
            "seqeval",
        ],
        "nlp": [  # for backward compatibility; hf is the new option name
            "transformers[torch]==4.26",
            "datasets",
-            "nltk",
+            "nltk<=3.8.1",
            "rouge_score",
            "seqeval",
        ],
@@ -132,7 +141,8 @@ setuptools.setup(
            "prophet>=1.0.1",
            "statsmodels>=0.12.2",
            "hcrystalball==0.1.10",
-            "pytorch-forecasting>=0.9.0",
+            "pytorch-forecasting>=0.9.0; python_version<'3.11'",
+            # "pytorch-forecasting==0.10.1; python_version=='3.11'",
            "pytorch-lightning==1.9.0",
            "tensorboardX==2.6",
        ],
@@ -150,15 +160,20 @@ setuptools.setup(
        ],
        "synapse": [
            "joblibspark>=0.5.0",
-            "optuna==2.8.0",
+            "optuna>=2.8.0,<=3.6.1",
            "pyspark>=3.2.0",
        ],
        "autozero": ["scikit-learn", "pandas", "packaging"],
    },
    classifiers=[
-        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
+        # Specify the Python versions you support here.
+        "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.8",
+        "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
    ],
-    python_requires=">=3.6",
+    python_requires=">=3.8",
 )
--- a/test/autogen/agentchat/test_assistant_agent.py
+++ b/test/autogen/agentchat/test_assistant_agent.py
@@ -178,7 +178,7 @@ def test_tsp(human_input_mode="NEVER", max_consecutive_auto_reply=10):
    class TSPUserProxyAgent(UserProxyAgent):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
-            with open(f"{here}/tsp_prompt.txt", "r") as f:
+            with open(f"{here}/tsp_prompt.txt") as f:
                self._prompt = f.read()

        def generate_init_message(self, question) -> str:
--- a/test/autogen/oai/test_completion.py
+++ b/test/autogen/oai/test_completion.py
@@ -187,7 +187,7 @@ def test_humaneval(num_samples=1):
    )

    seed = 41
-    data = datasets.load_dataset("openai_humaneval")["test"].shuffle(seed=seed)
+    data = datasets.load_dataset("openai_humaneval", trust_remote_code=True)["test"].shuffle(seed=seed)
    n_tune_data = 20
    tune_data = [
        {
@@ -334,7 +334,7 @@ def test_math(num_samples=-1):
        return

    seed = 41
-    data = datasets.load_dataset("competition_math")
+    data = datasets.load_dataset("competition_math", trust_remote_code=True)
    train_data = data["train"].shuffle(seed=seed)
    test_data = data["test"].shuffle(seed=seed)
    n_tune_data = 20
@@ -356,7 +356,7 @@ def test_math(num_samples=-1):
    ]
    print(
        "max tokens in tuning data's canonical solutions",
-        max([len(x["solution"].split()) for x in tune_data]),
+        max(len(x["solution"].split()) for x in tune_data),
    )
    print(len(tune_data), len(test_data))
    # prompt template
--- a/test/automl/test_classification.py
+++ b/test/automl/test_classification.py
@@ -295,7 +295,10 @@ class TestClassification(unittest.TestCase):
        import sys

        current_xgboost_version = xgb.__version__
-        subprocess.check_call([sys.executable, "-m", "pip", "install", "xgboost==1.3.3", "--user"])
+        try:
+            subprocess.check_call([sys.executable, "-m", "pip", "install", "xgboost==1.3.3", "--user"])
+        except subprocess.CalledProcessError:
+            return
        automl = AutoML()
        automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
        print(automl.feature_names_in_)
--- a/test/automl/test_constraints.py
+++ b/test/automl/test_constraints.py
@@ -23,7 +23,7 @@ def test_metric_constraints():
        "log_type": "all",
        "retrain_full": "budget",
        "keep_search_state": True,
-        "time_budget": 2,
+        "time_budget": 5,
        "pred_time_limit": 5.1e-05,
    }

@@ -125,14 +125,12 @@ def test_metric_constraints_custom():
    print(automl.estimator_list)
    print(automl.search_space)
    print(automl.points_to_evaluate)
-    print("Best minimization objective on validation data: {0:.4g}".format(automl.best_loss))
+    print(f"Best minimization objective on validation data: {automl.best_loss:.4g}")
    print(
-        "pred_time of the best config on validation data: {0:.4g}".format(
-            automl.metrics_for_best_config[1]["pred_time"]
-        )
+        "pred_time of the best config on validation data: {:.4g}".format(automl.metrics_for_best_config[1]["pred_time"])
    )
    print(
-        "val_train_loss_gap of the best config on validation data: {0:.4g}".format(
+        "val_train_loss_gap of the best config on validation data: {:.4g}".format(
            automl.metrics_for_best_config[1]["val_train_loss_gap"]
        )
    )
--- a/test/automl/test_extra_models.py
+++ b/test/automl/test_extra_models.py
@@ -0,0 +1,310 @@
+import os
+import sys
+import unittest
+import warnings
+from collections import defaultdict
+
+import mlflow
+import numpy as np
+import pandas as pd
+import pytest
+import scipy
+from packaging.version import Version
+from sklearn.datasets import load_breast_cancer, load_diabetes, load_iris
+from sklearn.model_selection import train_test_split
+
+from flaml import AutoML
+from flaml.automl.ml import sklearn_metric_loss_score
+from flaml.tune.spark.utils import check_spark
+
+leaderboard = defaultdict(dict)
+
+warnings.simplefilter(action="ignore")
+if sys.platform == "darwin" or "nt" in os.name:
+    # Skip the Spark tests on macOS and Windows.
+    skip_spark = True
+else:
+    try:
+        import pyspark
+        from pyspark.ml.evaluation import MulticlassClassificationEvaluator, RegressionEvaluator
+        from pyspark.ml.feature import VectorAssembler
+
+        from flaml.automl.spark.utils import to_pandas_on_spark
+
+        spark = (
+            pyspark.sql.SparkSession.builder.appName("MyApp")
+            .master("local[2]")
+            .config(
+                "spark.jars.packages",
+                (
+                    "com.microsoft.azure:synapseml_2.12:1.0.2,"
+                    "org.apache.hadoop:hadoop-azure:3.3.5,"
+                    "com.microsoft.azure:azure-storage:8.6.6,"
+                    "org.mlflow:mlflow-spark"
+                    + ("_2.12" if Version(mlflow.__version__) >= Version("2.9.0") else "")
+                    + f":{mlflow.__version__}"
+                ),
+            )
+            .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
+            .config("spark.sql.debug.maxToStringFields", "100")
+            .config("spark.driver.extraJavaOptions", "-Xss1m")
+            .config("spark.executor.extraJavaOptions", "-Xss1m")
+            .getOrCreate()
+        )
+        spark.sparkContext._conf.set(
+            "spark.mlflow.pysparkml.autolog.logModelAllowlistFile",
+            "https://mmlspark.blob.core.windows.net/publicwasb/log_model_allowlist.txt",
+        )
+        # spark.sparkContext.setLogLevel("ERROR")
+        spark_available, _ = check_spark()
+        skip_spark = not spark_available
+    except ImportError:
+        skip_spark = True
+
+
+def _test_regular_models(estimator_list, task):
+    if isinstance(estimator_list, str):
+        estimator_list = [estimator_list]
+    if task == "classification":
+        load_dataset_func = load_iris
+        metric = "accuracy"
+    else:
+        load_dataset_func = load_diabetes
+        metric = "r2"
+
+    x, y = load_dataset_func(return_X_y=True, as_frame=True)
+    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=7654321)
+
+    automl_experiment = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "task": task,
+        "estimator_list": estimator_list,
+        "metric": metric,
+    }
+    automl_experiment.fit(X_train=x_train, y_train=y_train, **automl_settings)
+    predictions = automl_experiment.predict(x_test)
+    score = sklearn_metric_loss_score(metric, predictions, y_test)
+    for estimator_name in estimator_list:
+        leaderboard[task][estimator_name] = score
+
+
+def _test_spark_models(estimator_list, task):
+    if isinstance(estimator_list, str):
+        estimator_list = [estimator_list]
+    if task == "classification":
+        load_dataset_func = load_iris
+        evaluator = MulticlassClassificationEvaluator(
+            labelCol="target", predictionCol="prediction", metricName="accuracy"
+        )
+        metric = "accuracy"
+
+    elif task == "regression":
+        load_dataset_func = load_diabetes
+        evaluator = RegressionEvaluator(labelCol="target", predictionCol="prediction", metricName="r2")
+        metric = "r2"
+
+    elif task == "binary":
+        load_dataset_func = load_breast_cancer
+        evaluator = MulticlassClassificationEvaluator(
+            labelCol="target", predictionCol="prediction", metricName="accuracy"
+        )
+        metric = "accuracy"
+
+    final_cols = ["target", "features"]
+    extra_args = {}
+
+    if estimator_list is not None and "aft_spark" in estimator_list:
+        # survival analysis task
+        pd_df = pd.read_csv(
+            "https://raw.githubusercontent.com/CamDavidsonPilon/lifelines/master/lifelines/datasets/rossi.csv"
+        )
+        pd_df.rename(columns={"week": "target"}, inplace=True)
+        final_cols += ["arrest"]
+        extra_args["censorCol"] = "arrest"
+    else:
+        pd_df = load_dataset_func(as_frame=True).frame
+
+    rename = {}
+    for attr in pd_df.columns:
+        rename[attr] = attr.replace(" ", "_")
+    pd_df = pd_df.rename(columns=rename)
+    df = spark.createDataFrame(pd_df)
+    df = df.repartition(4)
+    train, test = df.randomSplit([0.8, 0.2], seed=7654321)
+    feature_cols = [col for col in df.columns if col not in ["target", "arrest"]]
+    featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
+    train_data = featurizer.transform(train)[final_cols]
+    test_data = featurizer.transform(test)[final_cols]
+    automl = AutoML()
+    settings = {
+        "max_iter": 1,
+        "estimator_list": estimator_list,  # ML learner we intend to test
+        "task": task,  # task type
+        "metric": metric,  # metric to optimize
+    }
+    settings.update(extra_args)
+    df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
+
+    automl.fit(
+        dataframe=df,
+        label="target",
+        **settings,
+    )
+
+    model = automl.model.estimator
+    predictions = model.transform(test_data)
+    predictions.show(5)
+
+    score = evaluator.evaluate(predictions)
+    if estimator_list is not None:
+        for estimator_name in estimator_list:
+            leaderboard[task][estimator_name] = score
+
+
+def _test_sparse_matrix_classification(estimator):
+    automl_experiment = AutoML()
+    automl_settings = {
+        "estimator_list": [estimator],
+        "time_budget": 2,
+        "metric": "auto",
+        "task": "classification",
+        "log_file_name": "test/sparse_classification.log",
+        "split_type": "uniform",
+        "n_jobs": 1,
+        "model_history": True,
+    }
+    X_train = scipy.sparse.random(1554, 21, dtype=int)
+    y_train = np.random.randint(3, size=1554)
+    automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
+
+
+def load_multi_dataset():
+    """multivariate time series forecasting dataset"""
+    import pandas as pd
+
+    # pd.set_option("display.max_rows", None, "display.max_columns", None)
+    df = pd.read_csv(
+        "https://raw.githubusercontent.com/srivatsan88/YouTubeLI/master/dataset/nyc_energy_consumption.csv"
+    )
+    # preprocessing data
+    df["timeStamp"] = pd.to_datetime(df["timeStamp"])
+    df = df.set_index("timeStamp")
+    df = df.resample("D").mean()
+    df["temp"] = df["temp"].ffill()
+    df["precip"] = df["precip"].ffill()
+    df = df[:-2]  # last two rows are NaN for 'demand' column so remove them
+    df = df.reset_index()
+
+    return df
+
+
+def _test_forecast(estimator_list, budget=10):
+    if isinstance(estimator_list, str):
+        estimator_list = [estimator_list]
+    df = load_multi_dataset()
+    # split data into train and test
+    time_horizon = 180
+    num_samples = df.shape[0]
+    split_idx = num_samples - time_horizon
+    train_df = df[:split_idx]
+    test_df = df[split_idx:]
+    # test dataframe must contain values for the regressors / multivariate variables
+    X_test = test_df[["timeStamp", "precip", "temp"]]
+    y_test = test_df["demand"]
+    # return
+    automl = AutoML()
+    settings = {
+        "time_budget": budget,  # total running time in seconds
+        "metric": "mape",  # primary metric
+        "task": "ts_forecast",  # task type
+        "log_file_name": "test/energy_forecast_numerical.log",  # flaml log file
+        "log_dir": "logs/forecast_logs",  # tcn/tft log folder
+        "eval_method": "holdout",
+        "log_type": "all",
+        "label": "demand",
+        "estimator_list": estimator_list,
+    }
+    """The main flaml automl API"""
+    automl.fit(dataframe=train_df, **settings, period=time_horizon)
+    print(automl.best_config)
+    pred_y = automl.predict(X_test)
+    mape = sklearn_metric_loss_score("mape", pred_y, y_test)
+    for estimator_name in estimator_list:
+        leaderboard["forecast"][estimator_name] = mape
+
+
+class TestExtraModel(unittest.TestCase):
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_rf_spark(self):
+        tasks = ["classification", "regression"]
+        for task in tasks:
+            _test_spark_models("rf_spark", task)
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_nb_spark(self):
+        _test_spark_models("nb_spark", "classification")
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_glr(self):
+        _test_spark_models("glr_spark", "regression")
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_lr(self):
+        _test_spark_models("lr_spark", "regression")
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_svc_spark(self):
+        _test_spark_models("svc_spark", "binary")
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_gbt_spark(self):
+        tasks = ["binary", "regression"]
+        for task in tasks:
+            _test_spark_models("gbt_spark", task)
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_aft(self):
+        _test_spark_models("aft_spark", "regression")
+
+    @unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+    def test_default_spark(self):
+        _test_spark_models(None, "classification")
+
+    def test_svc(self):
+        _test_regular_models("svc", "classification")
+        _test_sparse_matrix_classification("svc")
+
+    def test_sgd(self):
+        tasks = ["classification", "regression"]
+        for task in tasks:
+            _test_regular_models("sgd", task)
+        _test_sparse_matrix_classification("sgd")
+
+    def test_enet(self):
+        _test_regular_models("enet", "regression")
+
+    def test_lassolars(self):
+        _test_regular_models("lassolars", "regression")
+        _test_forecast("lassolars")
+
+    def test_seasonal_naive(self):
+        _test_forecast("snaive")
+
+    def test_naive(self):
+        _test_forecast("naive")
+
+    def test_seasonal_avg(self):
+        _test_forecast("savg")
+
+    def test_avg(self):
+        _test_forecast("avg")
+
+    @unittest.skipIf(skip_spark, reason="Skip on Mac or Windows, or when Spark is unavailable")
+    def test_tcn(self):
+        _test_forecast("tcn")
+
+
+if __name__ == "__main__":
+    unittest.main(exit=False)  # exit=False so the leaderboard below is still printed
+    print(leaderboard)
--- a/test/automl/test_forecast.py
+++ b/test/automl/test_forecast.py
@@ -1,7 +1,10 @@
 import datetime
+import os
+import sys

 import numpy as np
 import pandas as pd
+import pytest

 from flaml import AutoML
 from flaml.automl.task.time_series_task import TimeSeriesTask
@@ -93,8 +96,9 @@ def test_forecast_automl(budget=10, estimators_when_no_prophet=["arima", "sarima
        )


+@pytest.mark.skipif(sys.platform == "darwin" or "nt" in os.name, reason="skip on mac or windows")
 def test_models(budget=3):
-    n = 100
+    n = 200
    X = pd.DataFrame(
        {
            "A": pd.date_range(start="1900-01-01", periods=n, freq="D"),
@@ -109,14 +113,14 @@ def test_models(budget=3):
            continue  # TFT is covered by its own test
        automl = AutoML()
        automl.fit(
-            X_train=X[:72],  # a single column of timestamp
-            y_train=y[:72],  # value for each timestamp
+            X_train=X[:144],  # a single column of timestamp
+            y_train=y[:144],  # value for each timestamp
            estimator_list=[est],
            period=12,  # time horizon to forecast, e.g., 12 months
            task="ts_forecast",
            time_budget=budget,  # time budget in seconds
        )
-        automl.predict(X[72:])
+        automl.predict(X[144:])


 def test_numpy():
@@ -149,6 +153,10 @@ def test_numpy():
    print(automl.predict(12))


+@pytest.mark.skipif(
+    sys.platform in ["darwin"],
+    reason="do not run on mac os",
+)
 def test_numpy_large():
    import numpy as np
    import pandas as pd
@@ -495,6 +503,10 @@ def get_stalliion_data():
    return data, special_days


+@pytest.mark.skipif(
+    sys.version_info[:2] == (3, 11),
+    reason="do not run on py 3.11",
+)
 def test_forecast_panel(budget=5):
    data, special_days = get_stalliion_data()
    time_horizon = 6  # predict six months
@@ -561,7 +573,7 @@ def test_forecast_panel(budget=5):
    print(f"Training duration of best run: {automl.best_config_train_time}s")
    print(automl.model.estimator)
    """ pickle and save the automl object """
-    import pickle
+    import dill as pickle

    with open("automl.pkl", "wb") as f:
        pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)
@@ -666,7 +678,7 @@ if __name__ == "__main__":
    # test_forecast_automl(60)
    # test_multivariate_forecast_num(5)
    # test_multivariate_forecast_cat(5)
-    # test_numpy()
+    test_numpy()
    # test_forecast_classification(5)
-    test_forecast_panel(5)
+    # test_forecast_panel(5)
    # test_cv_step()
--- a/test/automl/test_mlflow.py
+++ b/test/automl/test_mlflow.py
@@ -1,3 +1,5 @@
+import pickle
+
 import mlflow
 import mlflow.entities
 import pytest
@@ -9,57 +11,98 @@ from flaml import AutoML

 class TestMLFlowLoggingParam:
    def test_should_start_new_run_by_default(self, automl_settings):
-        with mlflow.start_run():
-            parent = mlflow.last_active_run()
+        with mlflow.start_run() as parent_run:
            automl = AutoML()
            X_train, y_train = load_iris(return_X_y=True)
            automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
+            try:
+                self._check_mlflow_parameters(automl, parent_run.info)
+            except FileNotFoundError:
+                print("[WARNING]: No file found")

-        children = self._get_child_runs(parent)
-        assert len(children) >= 1, "Expected at least 1 child run, got {}".format(len(children))
+        children = self._get_child_runs(parent_run)
+        assert len(children) >= 1, f"Expected at least 1 child run, got {len(children)}"

    def test_should_not_start_new_run_when_mlflow_logging_set_to_false_in_init(self, automl_settings):
-        with mlflow.start_run():
-            parent = mlflow.last_active_run()
+        with mlflow.start_run() as parent_run:
            automl = AutoML(mlflow_logging=False)
            X_train, y_train = load_iris(return_X_y=True)
            automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
+            try:
+                self._check_mlflow_parameters(automl, parent_run.info)
+            except FileNotFoundError:
+                print("[WARNING]: No file found")

-        children = self._get_child_runs(parent)
-        assert len(children) == 0, "Expected 0 child runs, got {}".format(len(children))
+        children = self._get_child_runs(parent_run)
+        assert len(children) == 0, f"Expected 0 child runs, got {len(children)}"

    def test_should_not_start_new_run_when_mlflow_logging_set_to_false_in_fit(self, automl_settings):
-        with mlflow.start_run():
-            parent = mlflow.last_active_run()
+        with mlflow.start_run() as parent_run:
            automl = AutoML()
            X_train, y_train = load_iris(return_X_y=True)
            automl.fit(X_train=X_train, y_train=y_train, mlflow_logging=False, **automl_settings)
+            try:
+                self._check_mlflow_parameters(automl, parent_run.info)
+            except FileNotFoundError:
+                print("[WARNING]: No file found")

-        children = self._get_child_runs(parent)
-        assert len(children) == 0, "Expected 0 child runs, got {}".format(len(children))
+        children = self._get_child_runs(parent_run)
+        assert len(children) == 0, f"Expected 0 child runs, got {len(children)}"

    def test_should_start_new_run_when_mlflow_logging_set_to_true_in_fit(self, automl_settings):
-        with mlflow.start_run():
-            parent = mlflow.last_active_run()
+        with mlflow.start_run() as parent_run:
            automl = AutoML(mlflow_logging=False)
            X_train, y_train = load_iris(return_X_y=True)
            automl.fit(X_train=X_train, y_train=y_train, mlflow_logging=True, **automl_settings)
+            try:
+                self._check_mlflow_parameters(automl, parent_run.info)
+            except FileNotFoundError:
+                print("[WARNING]: No file found")

-        children = self._get_child_runs(parent)
-        assert len(children) >= 1, "Expected at least 1 child run, got {}".format(len(children))
+        children = self._get_child_runs(parent_run)
+        assert len(children) >= 1, f"Expected at least 1 child run, got {len(children)}"

    @staticmethod
    def _get_child_runs(parent_run: mlflow.entities.Run) -> DataFrame:
        experiment_id = parent_run.info.experiment_id
        return mlflow.search_runs(
-            [experiment_id], filter_string="tags.mlflow.parentRunId = '{}'".format(parent_run.info.run_id)
+            [experiment_id], filter_string=f"tags.mlflow.parentRunId = '{parent_run.info.run_id}'"
        )

+    @staticmethod
+    def _check_mlflow_parameters(automl: AutoML, run_info: mlflow.entities.RunInfo):
+        with open(
+            f"./mlruns/{run_info.experiment_id}/{run_info.run_id}/artifacts/automl_pipeline/model.pkl", "rb"
+        ) as f:
+            t = pickle.load(f)
+            if __name__ == "__main__":
+                print(t)
+            for param in automl.model._model._get_param_names():
+                assert getattr(t._final_estimator._model, param) == getattr(
+                    automl.model._model, param
+                ), "The MLflow-logged model is not consistent with the AutoML model"
+                if __name__ == "__main__":
+                    print(param, "\t", getattr(automl.model._model, param))
+        print("[INFO]: Successfully Logged")
+
    @pytest.fixture(scope="class")
    def automl_settings(self):
+        mlflow.end_run()
        return {
-            "time_budget": 2,  # in seconds
+            "time_budget": 5,  # in seconds
            "metric": "accuracy",
            "task": "classification",
            "log_file_name": "iris.log",
        }
+
+
+if __name__ == "__main__":
+    s = TestMLFlowLoggingParam()
+    automl_settings = {
+        "time_budget": 5,  # in seconds
+        "metric": "accuracy",
+        "task": "classification",
+        "log_file_name": "iris.log",
+    }
+    s.test_should_start_new_run_by_default(automl_settings)
+    s.test_should_start_new_run_when_mlflow_logging_set_to_true_in_fit(automl_settings)
--- a/test/automl/test_multiclass.py
+++ b/test/automl/test_multiclass.py
@@ -438,8 +438,8 @@ class TestMultiClass(unittest.TestCase):
        automl_val_accuracy = 1.0 - automl.best_loss
        print("Best ML leaner:", automl.best_estimator)
        print("Best hyperparmeter config:", automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+        print(f"Best accuracy on validation data: {automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")

        starting_points = automl.best_config_per_estimator
        print("starting_points", starting_points)
@@ -461,8 +461,8 @@ class TestMultiClass(unittest.TestCase):
        new_automl_val_accuracy = 1.0 - new_automl.best_loss
        print("Best ML leaner:", new_automl.best_estimator)
        print("Best hyperparmeter config:", new_automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(new_automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(new_automl.best_config_train_time))
+        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {new_automl.best_config_train_time:.4g} s")

    def test_fit_w_starting_point_2(self, as_frame=True):
        try:
@@ -493,8 +493,8 @@ class TestMultiClass(unittest.TestCase):
        automl_val_accuracy = 1.0 - automl.best_loss
        print("Best ML leaner:", automl.best_estimator)
        print("Best hyperparmeter config:", automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+        print(f"Best accuracy on validation data: {automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")

        starting_points = {}
        log_file_name = settings["log_file_name"]
@@ -508,7 +508,7 @@ class TestMultiClass(unittest.TestCase):
                if learner not in starting_points:
                    starting_points[learner] = []
                starting_points[learner].append(config)
-        max_iter = sum([len(s) for k, s in starting_points.items()])
+        max_iter = sum(len(s) for k, s in starting_points.items())
        settings_resume = {
            "time_budget": 2,
            "metric": "accuracy",
@@ -528,7 +528,7 @@ class TestMultiClass(unittest.TestCase):
        new_automl_val_accuracy = 1.0 - new_automl.best_loss
        # print('Best ML leaner:', new_automl.best_estimator)
        # print('Best hyperparmeter config:', new_automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(new_automl_val_accuracy))
+        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
        # print('Training duration of best run: {0:.4g} s'.format(new_automl_experiment.best_config_train_time))


--- a/test/automl/test_notebook_example.py
+++ b/test/automl/test_notebook_example.py
@@ -1,5 +1,6 @@
 import sys

+import pytest
 from minio.error import ServerError
 from openml.exceptions import OpenMLServerException
 from requests.exceptions import ChunkedEncodingError, SSLError
@@ -64,8 +65,8 @@ def test_automl(budget=5, dataset_format="dataframe", hpo_method=None):
    """ retrieve best config and best learner """
    print("Best ML leaner:", automl.best_estimator)
    print("Best hyperparmeter config:", automl.best_config)
-    print("Best accuracy on validation data: {0:.4g}".format(1 - automl.best_loss))
-    print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+    print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
+    print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
    print(automl.model.estimator)
    print(automl.best_config_per_estimator)
    print("time taken to find best model:", automl.time_to_find_best_model)
@@ -108,6 +109,10 @@ def test_automl(budget=5, dataset_format="dataframe", hpo_method=None):
        automl.fit(X_train=X_train, y_train=y_train, ensemble=True, **settings)


+@pytest.mark.skipif(
+    sys.platform in ["win32"] and sys.version.startswith("3.9"),
+    reason="do not run if windows and python 3.9",
+)
 def test_automl_array():
    test_automl(5, "array", "bs")

--- a/test/automl/test_score.py
+++ b/test/automl/test_score.py
@@ -195,7 +195,7 @@ class TestScore:
            automl_settings = {
                "time_budget": 2,
                "task": "rank",
-                "log_file_name": "test/{}.log".format(dataset),
+                "log_file_name": f"test/{dataset}.log",
                "model_history": True,
                "groups": np.array([0] * 200 + [1] * 200 + [2] * 100),  # group labels
                "learner_selector": "roundrobin",
--- a/test/automl/test_split.py
+++ b/test/automl/test_split.py
@@ -16,7 +16,7 @@ def _test(split_type):
        "time_budget": 2,
        # "metric": 'accuracy',
        "task": "classification",
-        "log_file_name": "test/{}.log".format(dataset),
+        "log_file_name": f"test/{dataset}.log",
        "model_history": True,
        "log_training_metric": True,
        "split_type": split_type,
@@ -64,7 +64,7 @@ def test_groups():
    automl_settings = {
        "time_budget": 2,
        "task": "classification",
-        "log_file_name": "test/{}.log".format(dataset),
+        "log_file_name": f"test/{dataset}.log",
        "model_history": True,
        "eval_method": "cv",
        "groups": np.random.randint(low=0, high=10, size=len(y)),
@@ -136,7 +136,7 @@ def test_rank():
    automl_settings = {
        "time_budget": 2,
        "task": "rank",
-        "log_file_name": "test/{}.log".format(dataset),
+        "log_file_name": f"test/{dataset}.log",
        "model_history": True,
        "eval_method": "cv",
        "groups": np.array([0] * 200 + [1] * 200 + [2] * 200 + [3] * 200 + [4] * 100 + [5] * 100),  # group labels
@@ -149,7 +149,7 @@ def test_rank():
        "time_budget": 2,
        "task": "rank",
        "metric": "ndcg@5",  # 5 can be replaced by any number
-        "log_file_name": "test/{}.log".format(dataset),
+        "log_file_name": f"test/{dataset}.log",
        "model_history": True,
        "groups": [200] * 4 + [100] * 2,  # alternative way: group counts
        # "estimator_list": ['lgbm', 'xgboost'],  # list of ML learners
@@ -188,7 +188,7 @@ def test_object():
    automl_settings = {
        "time_budget": 2,
        "task": "classification",
-        "log_file_name": "test/{}.log".format(dataset),
+        "log_file_name": f"test/{dataset}.log",
        "model_history": True,
        "log_training_metric": True,
        "split_type": TestKFold(5),
--- a/test/automl/test_training_log.py
+++ b/test/automl/test_training_log.py
@@ -98,6 +98,8 @@ class TestTrainingLog(unittest.TestCase):
            print("IsADirectoryError happens as expected in linux.")
        except PermissionError:
            print("PermissionError happens as expected in windows.")
+        except FileExistsError:
+            print("FileExistsError happens as expected in macOS.")

    def test_each_estimator(self):
        try:
--- a/test/automl/test_warmstart.py
+++ b/test/automl/test_warmstart.py
@@ -29,8 +29,8 @@ class TestWarmStart(unittest.TestCase):
        automl_val_accuracy = 1.0 - automl.best_loss
        print("Best ML leaner:", automl.best_estimator)
        print("Best hyperparmeter config:", automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+        print(f"Best accuracy on validation data: {automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
        # 1. Get starting points from previous experiments.
        starting_points = automl.best_config_per_estimator
        print("starting_points", starting_points)
@@ -97,8 +97,8 @@ class TestWarmStart(unittest.TestCase):
        new_automl_val_accuracy = 1.0 - new_automl.best_loss
        print("Best ML leaner:", new_automl.best_estimator)
        print("Best hyperparmeter config:", new_automl.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(new_automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(new_automl.best_config_train_time))
+        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {new_automl.best_config_train_time:.4g} s")

    def test_nobudget(self):
        automl = AutoML()
--- a/test/nlp/test_autohf.py
+++ b/test/nlp/test_autohf.py
@@ -30,7 +30,7 @@ def test_hf_data():

    import json

-    with open("seqclass.log", "r") as fin:
+    with open("seqclass.log") as fin:
        for line in fin:
            each_log = json.loads(line.strip("\n"))
            if "validation_loss" in each_log:
--- a/test/nlp/test_autohf_classificationhead.py
+++ b/test/nlp/test_autohf_classificationhead.py
@@ -21,6 +21,9 @@ model_path_list = [
    "textattack/bert-base-uncased-MNLI",
 ]

+if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
+    pytest.skip("skipping Python 3.11 on macOS", allow_module_level=True)
+

 def test_switch_1_1():
    data_idx, model_path_idx = 0, 0
--- a/test/nlp/test_autohf_tokenclassification.py
+++ b/test/nlp/test_autohf_tokenclassification.py
@@ -44,7 +44,7 @@ def test_tokenclassification_idlabel():
    # perf test
    import json

-    with open("seqclass.log", "r") as fin:
+    with open("seqclass.log") as fin:
        for line in fin:
            each_log = json.loads(line.strip("\n"))
            if "validation_loss" in each_log:
@@ -86,7 +86,7 @@ def test_tokenclassification_tokenlabel():
    # perf test
    import json

-    with open("seqclass.log", "r") as fin:
+    with open("seqclass.log") as fin:
        for line in fin:
            each_log = json.loads(line.strip("\n"))
            if "validation_loss" in each_log:
--- a/test/nlp/test_default.py
+++ b/test/nlp/test_default.py
@@ -7,6 +7,9 @@ from utils import get_automl_settings, get_toy_data_seqclassification

 from flaml.default import portfolio

+if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
+    pytest.skip("skipping Python 3.11 on macOS", allow_module_level=True)
+

 def pop_args(fit_kwargs):
    fit_kwargs.pop("max_iter", None)
--- a/test/nni/mnist.py
+++ b/test/nni/mnist.py
@@ -25,7 +25,7 @@ logger = logging.getLogger("mnist_AutoML")

 class Net(nn.Module):
    def __init__(self, hidden_size):
-        super(Net, self).__init__()
+        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, hidden_size)
--- a/test/spark/test_0sparkml.py
+++ b/test/spark/test_0sparkml.py
@@ -5,6 +5,7 @@ import warnings
 import mlflow
 import pytest
 import sklearn.datasets as skds
+from packaging.version import Version

 from flaml import AutoML
 from flaml.tune.spark.utils import check_spark
@@ -20,23 +21,26 @@ else:

        from flaml.automl.spark.utils import to_pandas_on_spark

-        postfix_version = "-spark3.3," if pyspark.__version__ > "3.2" else ","
        spark = (
            pyspark.sql.SparkSession.builder.appName("MyApp")
            .master("local[2]")
            .config(
                "spark.jars.packages",
                (
-                    f"com.microsoft.azure:synapseml_2.12:0.11.3{postfix_version}"
+                    "com.microsoft.azure:synapseml_2.12:1.0.4,"
                    "org.apache.hadoop:hadoop-azure:3.3.5,"
                    "com.microsoft.azure:azure-storage:8.6.6,"
-                    f"org.mlflow:mlflow-spark:2.6.0"
+                    + (f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
+                       if Version(mlflow.__version__) >= Version("2.9.0")
+                       else f"org.mlflow:mlflow-spark:{mlflow.__version__}")
                ),
            )
            .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
            .config("spark.sql.debug.maxToStringFields", "100")
            .config("spark.driver.extraJavaOptions", "-Xss1m")
            .config("spark.executor.extraJavaOptions", "-Xss1m")
+            # .config("spark.executor.memory", "48G")
+            # .config("spark.driver.memory", "48G")
            .getOrCreate()
        )
        spark.sparkContext._conf.set(
@@ -49,6 +53,10 @@ else:
    except ImportError:
        skip_spark = True

+if sys.version_info >= (3, 11):
+    skip_py311 = True
+else:
+    skip_py311 = False

 pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")

@@ -159,10 +167,11 @@ def test_spark_input_df():
    settings = {
        "time_budget": 30,  # total running time in seconds
        "metric": "roc_auc",
-        "estimator_list": ["lgbm_spark"],  # list of ML learners; we tune lightgbm in this example
+        # "estimator_list": ["lgbm_spark"],  # list of ML learners; we tune lightgbm in this example
        "task": "classification",  # task type
        "log_file_name": "flaml_experiment.log",  # flaml log file
        "seed": 7654321,  # random seed
+        "eval_method": "holdout",
    }
    df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))

@@ -176,17 +185,17 @@ def test_spark_input_df():
    try:
        model = automl.model.estimator
        predictions = model.transform(test_data)
-        predictions.show()

-        # from synapse.ml.train import ComputeModelStatistics
-
-        # metrics = ComputeModelStatistics(
-        #     evaluationMetric="classification",
-        #     labelCol="Bankrupt?",
-        #     scoredLabelsCol="prediction",
-        # ).transform(predictions)
-        # metrics.show()
+        from synapse.ml.train import ComputeModelStatistics

+        if not skip_py311:
+            # ComputeModelStatistics doesn't support python 3.11
+            metrics = ComputeModelStatistics(
+                evaluationMetric="classification",
+                labelCol="Bankrupt?",
+                scoredLabelsCol="prediction",
+            ).transform(predictions)
+            metrics.show()
    except AttributeError:
        print("No fitted model because of too short training time.")

@@ -207,6 +216,86 @@ def test_spark_input_df():
    assert "No estimator is left." in str(excinfo.value)


+def _test_spark_large_df():
+    """Test with large dataframe, should not run in pipeline."""
+    import os
+    import time
+
+    import pandas as pd
+    from pyspark.sql import functions as F
+
+    import flaml
+
+    os.environ["FLAML_MAX_CONCURRENT"] = "8"
+    start_time = time.time()
+
+    def load_higgs():
+        # 11M rows, 29 columns, 1.1GB
+        df = (
+            spark.read.format("csv")
+            .option("header", False)
+            .option("inferSchema", True)
+            .load("/datadrive/datasets/HIGGS.csv")
+            .withColumnRenamed("_c0", "target")
+            .withColumn("target", F.col("target").cast("integer"))
+            .limit(1000000)
+            .fillna(0)
+            .na.drop(how="any")
+            .repartition(64)
+            .cache()
+        )
+        print("Number of rows in data: ", df.count())
+        return df
+
+    def load_bosch():
+        # 1.184M rows, 969 cols, 1.5GB
+        df = (
+            spark.read.format("csv")
+            .option("header", True)
+            .option("inferSchema", True)
+            .load("/datadrive/datasets/train_numeric.csv")
+            .withColumnRenamed("Response", "target")
+            .withColumn("target", F.col("target").cast("integer"))
+            .limit(1000000)
+            .fillna(0)
+            .drop("Id")
+            .repartition(64)
+            .cache()
+        )
+        print("Number of rows in data: ", df.count())
+        return df
+
+    def prepare_data(dataset_name="higgs"):
+        df = load_higgs() if dataset_name == "higgs" else load_bosch()
+        train, test = df.randomSplit([0.75, 0.25], seed=7654321)
+        feature_cols = [col for col in df.columns if col not in ["target", "arrest"]]
+        final_cols = ["target", "features"]
+        featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
+        train_data = featurizer.transform(train)[final_cols]
+        test_data = featurizer.transform(test)[final_cols]
+        train_data = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
+        return train_data, test_data
+
+    train_data, test_data = prepare_data("higgs")
+    end_time = time.time()
+    print("time cost in minutes for prepare data: ", (end_time - start_time) / 60)
+    automl = flaml.AutoML()
+    automl_settings = {
+        "max_iter": 3,
+        "time_budget": 7200,
+        "metric": "accuracy",
+        "task": "classification",
+        "seed": 1234,
+        "eval_method": "holdout",
+    }
+    automl.fit(dataframe=train_data, label="target", ensemble=False, **automl_settings)
+    model = automl.model.estimator
+    predictions = model.transform(test_data)
+    predictions.show(5)
+    end_time = time.time()
+    print("time cost in minutes: ", (end_time - start_time) / 60)
+
+
 if __name__ == "__main__":
    test_spark_synapseml_classification()
    test_spark_synapseml_regression()
@@ -217,6 +306,6 @@ if __name__ == "__main__":
    # import pstats
    # from pstats import SortKey

-    # cProfile.run("test_spark_input_df()", "test_spark_input_df.profile")
-    # p = pstats.Stats("test_spark_input_df.profile")
-    # p.strip_dirs().sort_stats(SortKey.CUMULATIVE).print_stats("utils.py")
+    # cProfile.run("_test_spark_large_df()", "_test_spark_large_df.profile")
+    # p = pstats.Stats("_test_spark_large_df.profile")
+    # p.strip_dirs().sort_stats(SortKey.CUMULATIVE).print_stats(50)
--- a/test/spark/test_exceptions.py
+++ b/test/spark/test_exceptions.py
@@ -41,8 +41,8 @@ def base_automl(n_concurrent_trials=1, use_ray=False, use_spark=False, verbose=0

    print("Best ML leaner:", automl.best_estimator)
    print("Best hyperparmeter config:", automl.best_config)
-    print("Best accuracy on validation data: {0:.4g}".format(1 - automl.best_loss))
-    print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+    print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
+    print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")


 def test_both_ray_spark():
--- a/test/spark/test_mlflow.py
+++ b/test/spark/test_mlflow.py
@@ -0,0 +1,342 @@
+import importlib
+import os
+import sys
+import time
+import warnings
+
+import mlflow
+import pytest
+from packaging.version import Version
+from sklearn.datasets import fetch_california_housing, load_diabetes
+from sklearn.ensemble import RandomForestRegressor
+from sklearn.metrics import r2_score
+from sklearn.model_selection import train_test_split
+
+import flaml
+from flaml.automl.spark.utils import to_pandas_on_spark
+
+try:
+    import pyspark
+    from pyspark.ml.evaluation import RegressionEvaluator
+    from pyspark.ml.feature import VectorAssembler
+except ImportError:
+    pass
+warnings.filterwarnings("ignore")
+
+skip_spark = importlib.util.find_spec("pyspark") is None
+client = mlflow.tracking.MlflowClient()
+
+if (sys.platform.startswith("darwin") or sys.platform.startswith("win")) and (
+    sys.version_info[0] == 3 and sys.version_info[1] >= 10
+):
+    # TODO: remove this block when tests are stable
+    # The tests below will fail, but the functions run without error when run individually.
+    # test_tune_autolog_parentrun_nonparallel()
+    # test_tune_autolog_noparentrun_nonparallel()
+    # test_tune_noautolog_parentrun_nonparallel()
+    # test_tune_noautolog_noparentrun_nonparallel()
+    pytest.skip("skipping macOS and Windows for Python >= 3.10", allow_module_level=True)
+
+"""
+The Spark session used in the tests below should be initialized in test_0sparkml.py when run with pytest.
+"""
+
+
+def _sklearn_tune(config):
+    is_autolog = config.pop("is_autolog")
+    is_parent_run = config.pop("is_parent_run")
+    is_parallel = config.pop("is_parallel")
+    X, y = load_diabetes(return_X_y=True, as_frame=True)
+    train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.25)
+    rf = RandomForestRegressor(**config)
+    rf.fit(train_x, train_y)
+    pred = rf.predict(test_x)
+    r2 = r2_score(test_y, pred)
+    if not is_autolog and not is_parent_run and not is_parallel:
+        with mlflow.start_run(nested=True):
+            mlflow.log_metric("r2", r2)
+    return {"r2": r2}
+
+
+def _test_tune(is_autolog, is_parent_run, is_parallel):
+    mlflow.end_run()
+    mlflow_exp_name = f"test_mlflow_integration_{int(time.time())}"
+    mlflow_experiment = mlflow.set_experiment(mlflow_exp_name)
+    params = {
+        "n_estimators": flaml.tune.randint(100, 1000),
+        "min_samples_leaf": flaml.tune.randint(1, 10),
+        "is_autolog": is_autolog,
+        "is_parent_run": is_parent_run,
+        "is_parallel": is_parallel,
+    }
+    if is_autolog:
+        mlflow.autolog()
+    else:
+        mlflow.autolog(disable=True)
+    if is_parent_run:
+        mlflow.start_run(run_name=f"tune_autolog_{is_autolog}_sparktrial_{is_parallel}")
+    flaml.tune.run(
+        _sklearn_tune,
+        params,
+        metric="r2",
+        mode="max",
+        num_samples=3,
+        use_spark=is_parallel,
+        n_concurrent_trials=2 if is_parallel else 1,
+        mlflow_exp_name=mlflow_exp_name,
+    )
+    mlflow.end_run()  # end current run
+    mlflow.autolog(disable=True)
+    return mlflow_experiment.experiment_id
+
+
+def _check_mlflow_logging(possible_num_runs, metric, is_parent_run, experiment_id, is_automl=False, skip_tags=False):
+    if isinstance(possible_num_runs, int):
+        possible_num_runs = [possible_num_runs]
+    if is_parent_run:
+        parent_run = mlflow.last_active_run()
+        child_runs = client.search_runs(
+            experiment_ids=[experiment_id],
+            filter_string=f"tags.mlflow.parentRunId = '{parent_run.info.run_id}'",
+        )
+    else:
+        child_runs = client.search_runs(experiment_ids=[experiment_id])
+    experiment_name = client.get_experiment(experiment_id).name
+    metrics = [metric in run.data.metrics for run in child_runs]
+    tags = ["flaml.version" in run.data.tags for run in child_runs]
+    params = ["learner" in run.data.params for run in child_runs]
+    assert (
+        len(child_runs) in possible_num_runs
+    ), f"The number of child runs is not correct on experiment {experiment_name}."
+    if possible_num_runs[0] > 0:
+        assert all(metrics), f"The metrics are not logged correctly on experiment {experiment_name}."
+        assert (
+            all(tags) if not skip_tags else True
+        ), f"The tags are not logged correctly on experiment {experiment_name}."
+        assert (
+            all(params) if is_automl else True
+        ), f"The params are not logged correctly on experiment {experiment_name}."
+    # mlflow.delete_experiment(experiment_id)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_tune_autolog_parentrun_parallel():
+    experiment_id = _test_tune(is_autolog=True, is_parent_run=True, is_parallel=True)
+    _check_mlflow_logging([4, 3], "r2", True, experiment_id)
+
+
+def test_tune_autolog_parentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=True, is_parent_run=True, is_parallel=False)
+    _check_mlflow_logging(3, "r2", True, experiment_id)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_tune_autolog_noparentrun_parallel():
+    experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=True)
+    _check_mlflow_logging([4, 3], "r2", False, experiment_id)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_tune_noautolog_parentrun_parallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=True)
+    _check_mlflow_logging([4, 3], "r2", True, experiment_id)
+
+
+def test_tune_autolog_noparentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=False)
+    _check_mlflow_logging(3, "r2", False, experiment_id)
+
+
+def test_tune_noautolog_parentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=False)
+    _check_mlflow_logging(3, "r2", True, experiment_id)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_tune_noautolog_noparentrun_parallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=True)
+    _check_mlflow_logging(0, "r2", False, experiment_id)
+
+
+def test_tune_noautolog_noparentrun_nonparallel():
+    experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=False)
+    _check_mlflow_logging(3, "r2", False, experiment_id, skip_tags=True)
+
+
+def _test_automl_sparkdata(is_autolog, is_parent_run):
+    mlflow.end_run()
+    mlflow_exp_name = f"test_mlflow_integration_{int(time.time())}"
+    mlflow_experiment = mlflow.set_experiment(mlflow_exp_name)
+    if is_autolog:
+        mlflow.autolog()
+    else:
+        mlflow.autolog(disable=True)
+    if is_parent_run:
+        mlflow.start_run(run_name=f"automl_sparkdata_autolog_{is_autolog}")
+    spark = pyspark.sql.SparkSession.builder.getOrCreate()
+    pd_df = load_diabetes(as_frame=True).frame
+    df = spark.createDataFrame(pd_df)
+    df = df.repartition(4).cache()
+    train, test = df.randomSplit([0.8, 0.2], seed=1)
+    feature_cols = df.columns[:-1]
+    featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
+    train_data = featurizer.transform(train)["target", "features"]
+    featurizer.transform(test)["target", "features"]
+    automl = flaml.AutoML()
+    settings = {
+        "max_iter": 3,
+        "metric": "mse",
+        "task": "regression",  # task type
+        "log_file_name": "flaml_experiment.log",  # flaml log file
+        "mlflow_exp_name": mlflow_exp_name,
+        "log_type": "all",
+        "n_splits": 2,
+        "model_history": True,
+    }
+    df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
+    automl.fit(
+        dataframe=df,
+        label="target",
+        **settings,
+    )
+    mlflow.end_run()  # end current run
+    mlflow.autolog(disable=True)
+    return mlflow_experiment.experiment_id
+
+
+def _test_automl_nonsparkdata(is_autolog, is_parent_run):
+    mlflow_exp_name = f"test_mlflow_integration_{int(time.time())}"
+    mlflow_experiment = mlflow.set_experiment(mlflow_exp_name)
+    if is_autolog:
+        mlflow.autolog()
+    else:
+        mlflow.autolog(disable=True)
+    if is_parent_run:
+        mlflow.start_run(run_name=f"automl_nonsparkdata_autolog_{is_autolog}")
+    automl_experiment = flaml.AutoML()
+    automl_settings = {
+        "max_iter": 3,
+        "metric": "r2",
+        "task": "regression",
+        "n_concurrent_trials": 2,
+        "use_spark": True,
+        "mlflow_exp_name": None if is_parent_run else mlflow_exp_name,
+        "log_type": "all",
+        "n_splits": 2,
+        "model_history": True,
+    }
+    X, y = load_diabetes(return_X_y=True, as_frame=True)
+    train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.25)
+    automl_experiment.fit(X_train=train_x, y_train=train_y, **automl_settings)
+    mlflow.end_run()  # end current run
+    mlflow.autolog(disable=True)
+    return mlflow_experiment.experiment_id
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_sparkdata_autolog_parentrun():
+    experiment_id = _test_automl_sparkdata(is_autolog=True, is_parent_run=True)
+    _check_mlflow_logging(3, "mse", True, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_sparkdata_autolog_noparentrun():
+    experiment_id = _test_automl_sparkdata(is_autolog=True, is_parent_run=False)
+    _check_mlflow_logging(3, "mse", False, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_sparkdata_noautolog_parentrun():
+    experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=True)
+    _check_mlflow_logging(3, "mse", True, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_sparkdata_noautolog_noparentrun():
+    experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=False)
+    _check_mlflow_logging(0, "mse", False, experiment_id, is_automl=True)  # no logging
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_nonsparkdata_autolog_parentrun():
+    experiment_id = _test_automl_nonsparkdata(is_autolog=True, is_parent_run=True)
+    _check_mlflow_logging([4, 3], "r2", True, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_nonsparkdata_autolog_noparentrun():
+    experiment_id = _test_automl_nonsparkdata(is_autolog=True, is_parent_run=False)
+    _check_mlflow_logging([4, 3], "r2", False, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_nonsparkdata_noautolog_parentrun():
+    experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=True)
+    _check_mlflow_logging([4, 3], "r2", True, experiment_id, is_automl=True)
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_automl_nonsparkdata_noautolog_noparentrun():
+    experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=False)
+    _check_mlflow_logging(0, "r2", False, experiment_id, is_automl=True)  # no logging
+
+
+@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
+def test_exit_pyspark_autolog():
+    import pyspark
+
+    spark = pyspark.sql.SparkSession.builder.getOrCreate()
+    spark.sparkContext._gateway.shutdown_callback_server()  # shut down the callback server to avoid hanging
+    mlflow.autolog(disable=True)
+
+
+def _init_spark_for_main():
+    import pyspark
+
+    spark = (
+        pyspark.sql.SparkSession.builder.appName("MyApp")
+        .master("local[2]")
+        .config(
+            "spark.jars.packages",
+            "com.microsoft.azure:synapseml_2.12:1.0.4,"
+            "org.apache.hadoop:hadoop-azure:3.3.5,"
+            "com.microsoft.azure:azure-storage:8.6.6,"
+            + (
+                f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
+                if Version(mlflow.__version__) >= Version("2.9.0")
+                else f"org.mlflow:mlflow-spark:{mlflow.__version__}"
+            ),
+        )
+        .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
+        .config("spark.sql.debug.maxToStringFields", "100")
+        .config("spark.driver.extraJavaOptions", "-Xss1m")
+        .config("spark.executor.extraJavaOptions", "-Xss1m")
+        .getOrCreate()
+    )
+    spark.sparkContext._conf.set(
+        "spark.mlflow.pysparkml.autolog.logModelAllowlistFile",
+        "https://mmlspark.blob.core.windows.net/publicwasb/log_model_allowlist.txt",
+    )
+
+
+if __name__ == "__main__":
+    _init_spark_for_main()
+
+    # test_tune_autolog_parentrun_parallel()
+    # test_tune_autolog_parentrun_nonparallel()
+    test_tune_autolog_noparentrun_parallel()  # TODO: runs not removed
+    # test_tune_noautolog_parentrun_parallel()
+    # test_tune_autolog_noparentrun_nonparallel()
+    # test_tune_noautolog_parentrun_nonparallel()
+    # test_tune_noautolog_noparentrun_parallel()
+    # test_tune_noautolog_noparentrun_nonparallel()
+    # test_automl_sparkdata_autolog_parentrun()
+    # test_automl_sparkdata_autolog_noparentrun()
+    # test_automl_sparkdata_noautolog_parentrun()
+    # test_automl_sparkdata_noautolog_noparentrun()
+    # test_automl_nonsparkdata_autolog_parentrun()
+    # test_automl_nonsparkdata_autolog_noparentrun()  # TODO: runs not removed
+    # test_automl_nonsparkdata_noautolog_parentrun()
+    # test_automl_nonsparkdata_noautolog_noparentrun()
+
+    test_exit_pyspark_autolog()
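Reviewer note on building a comma-separated package list such as the `spark.jars.packages` value in `_init_spark_for_main`: adjacent string literals (including f-strings) are joined into a single expression, so a trailing `if ... else ...` selects between the whole joined string and the alternative, not just the last piece. The sketch below illustrates this with two hypothetical helpers and made-up package coordinates; none of these names are part of the test suite.

```python
def packages_ungrouped(new_layout: bool, version: str) -> str:
    # The conditional covers ALL adjacent literals, not only the last one.
    return (
        "group:base-a:1.0,"
        "group:base-b:2.0,"
        f"group:tracker_2.12:{version}"
        if new_layout
        else f"group:tracker:{version}"
    )


def packages_grouped(new_layout: bool, version: str) -> str:
    # Parenthesizing the conditional keeps the base packages in both branches.
    return (
        "group:base-a:1.0,"
        "group:base-b:2.0,"
        + (f"group:tracker_2.12:{version}" if new_layout else f"group:tracker:{version}")
    )


print(packages_ungrouped(False, "2.8.0"))  # base packages are dropped
print(packages_grouped(False, "2.8.0"))  # base packages are kept
```

Grouping the conditional in its own parentheses is the safer form whenever only one entry in the list depends on a version check.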
--- a/test/spark/test_multiclass.py
+++ b/test/spark/test_multiclass.py
@@ -344,8 +344,8 @@ class TestMultiClass(unittest.TestCase):
        automl_val_accuracy = 1.0 - automl_experiment.best_loss
        print("Best ML leaner:", automl_experiment.best_estimator)
        print("Best hyperparmeter config:", automl_experiment.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(automl_experiment.best_config_train_time))
+        print(f"Best accuracy on validation data: {automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {automl_experiment.best_config_train_time:.4g} s")

        starting_points = automl_experiment.best_config_per_estimator
        print("starting_points", starting_points)
@@ -369,8 +369,8 @@ class TestMultiClass(unittest.TestCase):
        new_automl_val_accuracy = 1.0 - new_automl_experiment.best_loss
        print("Best ML leaner:", new_automl_experiment.best_estimator)
        print("Best hyperparmeter config:", new_automl_experiment.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(new_automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(new_automl_experiment.best_config_train_time))
+        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {new_automl_experiment.best_config_train_time:.4g} s")

    def test_fit_w_starting_points_list(self, as_frame=True):
        automl_experiment = AutoML()
@@ -394,8 +394,8 @@ class TestMultiClass(unittest.TestCase):
        automl_val_accuracy = 1.0 - automl_experiment.best_loss
        print("Best ML leaner:", automl_experiment.best_estimator)
        print("Best hyperparmeter config:", automl_experiment.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(automl_val_accuracy))
-        print("Training duration of best run: {0:.4g} s".format(automl_experiment.best_config_train_time))
+        print(f"Best accuracy on validation data: {automl_val_accuracy:.4g}")
+        print(f"Training duration of best run: {automl_experiment.best_config_train_time:.4g} s")

        starting_points = {}
        log_file_name = automl_settings["log_file_name"]
@@ -409,7 +409,7 @@ class TestMultiClass(unittest.TestCase):
                if learner not in starting_points:
                    starting_points[learner] = []
                starting_points[learner].append(config)
-        max_iter = sum([len(s) for k, s in starting_points.items()])
+        max_iter = sum(len(s) for k, s in starting_points.items())
        automl_settings_resume = {
            "time_budget": 2,
            "metric": "accuracy",
@@ -431,7 +431,7 @@ class TestMultiClass(unittest.TestCase):
        new_automl_val_accuracy = 1.0 - new_automl_experiment.best_loss
        # print('Best ML leaner:', new_automl_experiment.best_estimator)
        # print('Best hyperparmeter config:', new_automl_experiment.best_config)
-        print("Best accuracy on validation data: {0:.4g}".format(new_automl_val_accuracy))
+        print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
        # print('Training duration of best run: {0:.4g} s'.format(new_automl_experiment.best_config_train_time))


--- a/test/spark/test_overtime.py
+++ b/test/spark/test_overtime.py
@@ -55,7 +55,7 @@ def test_overtime():
    start_time = time.time()
    automl_experiment.fit(**automl_settings)
    elapsed_time = time.time() - start_time
-    print("time budget: {:.2f}s, actual elapsed time: {:.2f}s".format(time_budget, elapsed_time))
+    print(f"time budget: {time_budget:.2f}s, actual elapsed time: {elapsed_time:.2f}s")
    # assert abs(elapsed_time - time_budget) < 5  # cancel assertion because github VM sometimes is super slow, causing the test to fail
    print(automl_experiment.predict(df))
    print(automl_experiment.model)
--- a/test/spark/test_performance.py
+++ b/test/spark/test_performance.py
@@ -75,8 +75,8 @@ def run_automl(budget=3, dataset_format="dataframe", hpo_method=None):
    """ retrieve best config and best learner """
    print("Best ML leaner:", automl.best_estimator)
    print("Best hyperparmeter config:", automl.best_config)
-    print("Best accuracy on validation data: {0:.4g}".format(1 - automl.best_loss))
-    print("Training duration of best run: {0:.4g} s".format(automl.best_config_train_time))
+    print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
+    print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
    print(automl.model.estimator)
    print(automl.best_config_per_estimator)
    print("time taken to find best model:", automl.time_to_find_best_model)
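Reviewer note on the `str.format` to f-string conversions in these test files: both forms use the same format-specification mini-language, so the printed text should be unchanged. A minimal check, using made-up values rather than real AutoML results:

```python
best_loss = 0.0731  # hypothetical validation loss
train_time = 12.3456789  # hypothetical training time in seconds

# Old style: positional placeholder with a format spec.
old_accuracy = "Best accuracy on validation data: {0:.4g}".format(1 - best_loss)
# New style: the expression and the same format spec inside an f-string.
new_accuracy = f"Best accuracy on validation data: {1 - best_loss:.4g}"
assert old_accuracy == new_accuracy

old_time = "Training duration of best run: {0:.4g} s".format(train_time)
new_time = f"Training duration of best run: {train_time:.4g} s"
assert old_time == new_time

print(new_accuracy)
print(new_time)
```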
--- a/test/spark/test_utils.py
+++ b/test/spark/test_utils.py
@@ -167,7 +167,7 @@ def test_len_labels():
    assert len_labels(y1) == 4
    ll, la = len_labels(y2, return_labels=True)
    assert ll == 4
-    assert set(la.to_numpy()) == set([1, 2, 5, 4])
+    assert set(la.to_numpy()) == {1, 2, 5, 4}


 def test_unique_value_first_index():
--- a/test/test_autovw.py
+++ b/test/test_autovw.py
@@ -50,11 +50,11 @@ def oml_to_vw_w_grouping(X, y, ds_dir, fname, orginal_dim, group_num, grouping_m
                for i in range(len(X)):
                    NS_content = []
                    for zz in range(len(group_indexes)):
-                        ns_features = " ".join("{}:{:.6f}".format(ind, X[i][ind]) for ind in group_indexes[zz])
+                        ns_features = " ".join(f"{ind}:{X[i][ind]:.6f}" for ind in group_indexes[zz])
                        NS_content.append(ns_features)
                    ns_line = "{} |{}".format(
                        str(y[i]),
-                        "|".join("{} {}".format(NS_LIST[j], NS_content[j]) for j in range(len(group_indexes))),
+                        "|".join(f"{NS_LIST[j]} {NS_content[j]}" for j in range(len(group_indexes))),
                    )
                    f.write(ns_line)
                    f.write("\n")
@@ -67,7 +67,7 @@ def save_vw_dataset_w_ns(X, y, did, ds_dir, max_ns_num, is_regression):
    """convert openml dataset to vw example and save to file"""
    print("is_regression", is_regression)
    if is_regression:
-        fname = "ds_{}_{}_{}.vw".format(did, max_ns_num, 0)
+        fname = f"ds_{did}_{max_ns_num}_0.vw"
        print("dataset size", X.shape[0], X.shape[1])
        print("saving data", did, ds_dir, fname)
        dim = X.shape[1]
@@ -91,11 +91,14 @@ def shuffle_data(X, y, seed):
 def get_oml_to_vw(did, max_ns_num, ds_dir=VW_DS_DIR):
    success = False
    print("-----getting oml dataset-------", did)
-    ds = openml.datasets.get_dataset(did)
-    target_attribute = ds.default_target_attribute
-    # if target_attribute is None and did in OML_target_attribute_dict:
-    #     target_attribute = OML_target_attribute_dict[did]
-
+    try:
+        ds = openml.datasets.get_dataset(did)
+        target_attribute = ds.default_target_attribute
+        # if target_attribute is None and did in OML_target_attribute_dict:
+        #     target_attribute = OML_target_attribute_dict[did]
+    except SSLError as e:
+        print(e)
+        return
    print("target=ds.default_target_attribute", target_attribute)
    data = ds.get_data(target=target_attribute, dataset_format="array")
    X, y = data[0], data[1]  # return X: pd DataFrame, y: pd series
@@ -128,7 +131,7 @@ def load_vw_dataset(did, ds_dir, is_regression, max_ns_num):

    if is_regression:
        # the second field specifies the largest number of namespaces using.
-        fname = "ds_{}_{}_{}.vw".format(did, max_ns_num, 0)
+        fname = f"ds_{did}_{max_ns_num}_0.vw"
        vw_dataset_file = os.path.join(ds_dir, fname)
        # if file does not exist, generate and save the datasets
        if not os.path.exists(vw_dataset_file) or os.stat(vw_dataset_file).st_size < 1000:
@@ -136,7 +139,7 @@ def load_vw_dataset(did, ds_dir, is_regression, max_ns_num):
        print(ds_dir, vw_dataset_file)
        if not os.path.exists(ds_dir):
            os.makedirs(ds_dir)
-        with open(os.path.join(ds_dir, fname), "r") as f:
+        with open(os.path.join(ds_dir, fname)) as f:
            vw_content = f.read().splitlines()
            print(type(vw_content), len(vw_content))
        return vw_content
@@ -349,8 +352,8 @@ def get_vw_tuning_problem(tuning_hp="NamesapceInteraction"):


@pytest.mark.skipif(
-    "3.10" in sys.version,
-    reason="do not run on py 3.10",
+    "3.10" in sys.version or "3.11" in sys.version,
+    reason="do not run on py >= 3.10",
 )
 class TestAutoVW(unittest.TestCase):
    def test_vw_oml_problem_and_vanilla_vw(self):
--- a/test/tune/test_lexiflow.py
+++ b/test/tune/test_lexiflow.py
@@ -1,12 +1,18 @@
 import math
+import sys
 from collections import defaultdict

 import numpy as np
+import pytest
 import thop
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
-import torchvision
+
+try:
+    import torchvision
+except ImportError:
+    torchvision = None

 from flaml import tune

@@ -15,6 +21,9 @@ BATCHSIZE = 128
 N_TRAIN_EXAMPLES = BATCHSIZE * 30
 N_VALID_EXAMPLES = BATCHSIZE * 10

+if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
+    pytest.skip("skipping Python 3.11 on MacOS", allow_module_level=True)
+

 def _BraninCurrin(config):
    # Rescale brain
@@ -35,6 +44,9 @@ def _BraninCurrin(config):


 def test_lexiflow():
+    if torchvision is None:
+        pytest.skip("torchvision is not installed")
+
    train_dataset = torchvision.datasets.FashionMNIST(
        "test/data",
        train=True,
@@ -63,10 +75,10 @@ def test_lexiflow():
        layers = []
        in_features = 28 * 28
        for i in range(n_layers):
-            out_features = configuration["n_units_l{}".format(i)]
+            out_features = configuration[f"n_units_l{i}"]
            layers.append(nn.Linear(in_features, out_features))
            layers.append(nn.ReLU())
-            p = configuration["dropout_{}".format(i)]
+            p = configuration[f"dropout_{i}"]
            layers.append(nn.Dropout(p))
            in_features = out_features
        layers.append(nn.Linear(in_features, 10))
--- a/test/tune/test_pytorch_cifar10.py
+++ b/test/tune/test_pytorch_cifar10.py
@@ -24,7 +24,7 @@ try:
    # __net_begin__
    class Net(nn.Module):
        def __init__(self, l1=120, l2=84):
-            super(Net, self).__init__()
+            super().__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
@@ -277,7 +277,7 @@ def cifar10_main(method="BlendSearch", num_samples=10, max_num_epochs=100, gpus_
    logger.info(f"#trials={len(result.trials)}")
    logger.info(f"time={time.time()-start_time}")
    best_trial = result.get_best_trial("loss", "min", "all")
-    logger.info("Best trial config: {}".format(best_trial.config))
+    logger.info(f"Best trial config: {best_trial.config}")
    logger.info("Best trial final validation loss: {}".format(best_trial.metric_analysis["loss"]["min"]))
    logger.info("Best trial final validation accuracy: {}".format(best_trial.metric_analysis["accuracy"]["max"]))

@@ -296,7 +296,7 @@ def cifar10_main(method="BlendSearch", num_samples=10, max_num_epochs=100, gpus_
    best_trained_model.load_state_dict(model_state)

    test_acc = _test_accuracy(best_trained_model, device)
-    logger.info("Best trial test set accuracy: {}".format(test_acc))
+    logger.info(f"Best trial test set accuracy: {test_acc}")


 # __main_end__
--- a/test/tune/test_searcher.py
+++ b/test/tune/test_searcher.py
@@ -310,7 +310,7 @@ def test_searchers():
    print(searcher.suggest("t1"))
    from flaml import tune

-    tune.run(lambda x: 1, config={}, use_ray=use_ray, log_file_name="logs/searcher.log")
+    tune.run(lambda x: 1, config={}, mode="max", use_ray=use_ray, log_file_name="logs/searcher.log")
    searcher = BlendSearch(space=config, cost_attr="cost", cost_budget=10, metric="m", mode="min")
    analysis = tune.run(lambda x: {"cost": 2, "m": x["b"]}, search_alg=searcher, num_samples=10)
    assert len(analysis.trials) == 5
--- a/test/tune/test_tune.py
+++ b/test/tune/test_tune.py
@@ -3,8 +3,10 @@
 import logging
 import math
 import os
+import sys
 import time

+import pytest
 import sklearn.datasets
 import sklearn.metrics
 import xgboost as xgb
@@ -17,6 +19,7 @@ try:
 except ImportError:
    print("skip test_xgboost because ray tune cannot be imported.")

+
 logger = logging.getLogger(__name__)
 os.makedirs("logs", exist_ok=True)
 logger.addHandler(logging.FileHandler("logs/tune.log"))
@@ -496,4 +499,8 @@ def _test_xgboost_bohb():


 if __name__ == "__main__":
+    test_nested_run()
+    test_nested_space()
+    test_run_training_function_return_value()
+    test_passing_search_alg()
    test_xgboost_bs()
--- a/tutorials/README.md
+++ b/tutorials/README.md
@@ -1,4 +1,5 @@
 Please find tutorials on FLAML below:
+
 - [PyData Seattle 2023](flaml-tutorial-pydata-23.md)
 - [A hands-on tutorial on FLAML presented at KDD 2022](flaml-tutorial-kdd-22.md)
 - [A lab forum on FLAML at AAAI 2023](flaml-tutorial-aaai-23.md)
--- a/tutorials/flaml-tutorial-aaai-23.md
+++ b/tutorials/flaml-tutorial-aaai-23.md
@@ -15,8 +15,8 @@ For the most up-to-date information, see the [AAAI'23 Program Agenda](https://aa
 ## What Will You Learn?

 - What FLAML is and how to use FLAML to
-    - find accurate ML models with low computational resources for common ML tasks
-    - tune hyperparameters generically
+  - find accurate ML models with low computational resources for common ML tasks
+  - tune hyperparameters generically
 - How to leverage the flexible and rich customization choices
  - finish the last mile for deployment
  - create new applications
@@ -29,39 +29,43 @@ For the most up-to-date information, see the [AAAI'23 Program Agenda](https://aa

 - Overview of AutoML and FLAML
 - Basic usages of FLAML
-    - Task-oriented AutoML
-        - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML)
-        - [Notebook: A classification task with AutoML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_classification.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_classification.ipynb)
-    - Tune User-Defined-functions with FLAML
-        - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function)
-        - [Notebook: Tune user-defined function](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_demo.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_demo.ipynb)
-    - Zero-shot AutoML
-        - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML)
-        - [Notebook: Zeroshot AutoML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/zeroshot_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/zeroshot_lightgbm.ipynb)
+  - Task-oriented AutoML
+    - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML)
+    - [Notebook: A classification task with AutoML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_classification.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_classification.ipynb)
+  - Tune User-Defined-functions with FLAML
+    - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function)
+    - [Notebook: Tune user-defined function](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_demo.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_demo.ipynb)
+  - Zero-shot AutoML
+    - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML)
+    - [Notebook: Zeroshot AutoML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/zeroshot_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/zeroshot_lightgbm.ipynb)
 - [ML.NET demo](https://learn.microsoft.com/dotnet/machine-learning/tutorials/predict-prices-with-model-builder)

 Break (15m)

 ### **Part 2. Deep Dive into FLAML**
+
 - The Science Behind FLAML’s Success
-    - [Economical hyperparameter optimization methods in FLAML](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
-    - [Other research in FLAML](https://microsoft.github.io/FLAML/docs/Research)
+
+  - [Economical hyperparameter optimization methods in FLAML](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
+  - [Other research in FLAML](https://microsoft.github.io/FLAML/docs/Research)

 - Maximize the Power of FLAML through Customization and Advanced Functionalities
-    - [Notebook: Customize your AutoML with FLAML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/customize_your_automl_with_flaml.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/customize_your_automl_with_flaml.ipynb)
-    - [Notebook: Further acceleration of AutoML with FLAML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/further_acceleration_of_automl_with_flaml.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/further_acceleration_of_automl_with_flaml.ipynb)
-    - [Notebook: Neural network model tuning with FLAML ](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_pytorch.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_pytorch.ipynb)

+  - [Notebook: Customize your AutoML with FLAML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/customize_your_automl_with_flaml.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/customize_your_automl_with_flaml.ipynb)
+  - [Notebook: Further acceleration of AutoML with FLAML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/further_acceleration_of_automl_with_flaml.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/further_acceleration_of_automl_with_flaml.ipynb)
+  - [Notebook: Neural network model tuning with FLAML](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_pytorch.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_pytorch.ipynb)

 ### **Part 3. New features in FLAML**
+
 - Natural language processing
-    - [Notebook: AutoML for NLP tasks](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_nlp.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_nlp.ipynb)
+  - [Notebook: AutoML for NLP tasks](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_nlp.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_nlp.ipynb)
 - Time Series Forecasting
-    - [Notebook: AutoML for Time Series Forecast tasks](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_time_series_forecast.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_time_series_forecast.ipynb)
+  - [Notebook: AutoML for Time Series Forecast tasks](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_time_series_forecast.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/automl_time_series_forecast.ipynb)
 - Targeted Hyperparameter Optimization With Lexicographic Objectives
-    - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#lexicographic-objectives)
-    - [Notebook: Find accurate and fast neural networks with lexicographic objectives](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_lexicographic.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_lexicographic.ipynb)
+  - [Documentation](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#lexicographic-objectives)
+  - [Notebook: Find accurate and fast neural networks with lexicographic objectives](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_lexicographic.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/tune_lexicographic.ipynb)
 - Online AutoML
-    - [Notebook: Online AutoML with Vowpal Wabbit](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/autovw.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/autovw.ipynb)
+  - [Notebook: Online AutoML with Vowpal Wabbit](https://github.com/microsoft/FLAML/blob/tutorial-aaai23/notebook/autovw.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial-aaai23/notebook/autovw.ipynb)
 - Fair AutoML
+
 ### Challenges and open problems
--- a/tutorials/flaml-tutorial-kdd-22.md
+++ b/tutorials/flaml-tutorial-kdd-22.md
@@ -26,23 +26,23 @@ For the most up-to-date information, see the [SIGKDD'22 Program Agenda](https://

 - Overview of AutoML and FLAML
 - Task-oriented AutoML with FLAML
-    - [Notebook: A classification task with AutoML](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_classification.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_classification.ipynb)
-    - [Notebook: A regression task with AuotML using LightGBM as the learner](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_lightgbm.ipynb)
+  - [Notebook: A classification task with AutoML](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_classification.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_classification.ipynb)
+  - [Notebook: A regression task with AutoML using LightGBM as the learner](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_lightgbm.ipynb)
 - [ML.NET demo](https://docs.microsoft.com/dotnet/machine-learning/tutorials/predict-prices-with-model-builder)
 - Tune user defined functions with FLAML
-    - [Notebook: Basic tuning procedures and advanced tuning options](https://github.com/microsoft/FLAML/blob/tutorial/notebook/tune_demo.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/tune_demo.ipynb)
-    - [Notebook: Tune pytorch](https://github.com/microsoft/FLAML/blob/tutorial/notebook/tune_pytorch.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/tune_pytorch.ipynb)
+  - [Notebook: Basic tuning procedures and advanced tuning options](https://github.com/microsoft/FLAML/blob/tutorial/notebook/tune_demo.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/tune_demo.ipynb)
+  - [Notebook: Tune pytorch](https://github.com/microsoft/FLAML/blob/tutorial/notebook/tune_pytorch.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/tune_pytorch.ipynb)
 - Q & A

 ### Part 2

 - Zero-shot AutoML
-    - [Notebook: Zeroshot AutoML](https://github.com/microsoft/FLAML/blob/tutorial/notebook/zeroshot_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/zeroshot_lightgbm.ipynb)
+  - [Notebook: Zeroshot AutoML](https://github.com/microsoft/FLAML/blob/tutorial/notebook/zeroshot_lightgbm.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/zeroshot_lightgbm.ipynb)
 - Time series forecasting
-    - [Notebook: AutoML for Time Series Forecast tasks](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_time_series_forecast.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_time_series_forecast.ipynb)
+  - [Notebook: AutoML for Time Series Forecast tasks](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_time_series_forecast.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_time_series_forecast.ipynb)
 - Natural language processing
-    - [Notebook: AutoML for NLP tasks](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_nlp.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_nlp.ipynb)
+  - [Notebook: AutoML for NLP tasks](https://github.com/microsoft/FLAML/blob/tutorial/notebook/automl_nlp.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/automl_nlp.ipynb)
 - Online AutoML
-    - [Notebook: Online AutoML with Vowpal Wabbit](https://github.com/microsoft/FLAML/blob/tutorial/notebook/autovw.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/autovw.ipynb)
+  - [Notebook: Online AutoML with Vowpal Wabbit](https://github.com/microsoft/FLAML/blob/tutorial/notebook/autovw.ipynb); [Open In Colab](https://colab.research.google.com/github/microsoft/FLAML/blob/tutorial/notebook/autovw.ipynb)
 - Fair AutoML
 - Challenges and open problems
--- a/tutorials/flaml-tutorial-pydata-23.md
+++ b/tutorials/flaml-tutorial-pydata-23.md
@@ -19,22 +19,26 @@ In this session, we will provide an in-depth and hands-on tutorial on Automated
 ## Tutorial Outline

 ### **Part 1. Overview**
+
 - Overview of AutoML & Hyperparameter Tuning

 ### **Part 2. Introduction to FLAML**
+
 - Introduction to FLAML
 - AutoML and Hyperparameter Tuning with FLAML
-    - [Notebook: AutoML with FLAML Library](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/automl_flight_delays.ipynb)
-    - [Notebook: Hyperparameter Tuning with FLAML](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/tune_synapseml.ipynb)
+  - [Notebook: AutoML with FLAML Library](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/automl_flight_delays.ipynb)
+  - [Notebook: Hyperparameter Tuning with FLAML](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/tune_synapseml.ipynb)

 ### **Part 3. Deep Dive into FLAML**
+
 - Advanced Functionalities
 - Parallelization with Apache Spark
-    - [Notebook: FLAML AutoML on Apache Spark](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/automl_bankrupt_synapseml.ipynb)
+  - [Notebook: FLAML AutoML on Apache Spark](https://github.com/microsoft/FLAML/blob/d047c79352a2b5d32b72f4323dadfa2be0db8a45/notebook/automl_bankrupt_synapseml.ipynb)

 ### **Part 4. New features in FLAML**
+
 - Targeted Hyperparameter Optimization With Lexicographic Objectives
-    - [Notebook: Tune models with lexicographic preference across objectives](https://github.com/microsoft/FLAML/blob/7ae410c8eb967e2084b2e7dbe7d5fa2145a44b79/notebook/tune_lexicographic.ipynb)
+  - [Notebook: Tune models with lexicographic preference across objectives](https://github.com/microsoft/FLAML/blob/7ae410c8eb967e2084b2e7dbe7d5fa2145a44b79/notebook/tune_lexicographic.ipynb)
 - OpenAI GPT-3, GPT-4 and ChatGPT tuning
-    - [Notebook: Use FLAML to Tune OpenAI Models](https://github.com/microsoft/FLAML/blob/a0b318b12ee8288db54b674904655307f9e201c2/notebook/autogen_openai_completion.ipynb)
-    - [Notebook: Use FLAML to Tune ChatGPT](https://github.com/microsoft/FLAML/blob/a0b318b12ee8288db54b674904655307f9e201c2/notebook/autogen_chatgpt_gpt4.ipynb)
+  - [Notebook: Use FLAML to Tune OpenAI Models](https://github.com/microsoft/FLAML/blob/a0b318b12ee8288db54b674904655307f9e201c2/notebook/autogen_openai_completion.ipynb)
+  - [Notebook: Use FLAML to Tune ChatGPT](https://github.com/microsoft/FLAML/blob/a0b318b12ee8288db54b674904655307f9e201c2/notebook/autogen_chatgpt_gpt4.ipynb)
--- a/website/docs/Contribute.md
+++ b/website/docs/Contribute.md
@@ -2,13 +2,13 @@

 This project welcomes and encourages all forms of contributions, including but not limited to:

-  Pushing patches.
-  Code review of pull requests.
-  Documentation, examples and test cases.
-  Readability improvement, e.g., improvement on docstr and comments.
-  Community participation in [issues](https://github.com/microsoft/FLAML/issues), [discussions](https://github.com/microsoft/FLAML/discussions), and [discord](https://discord.gg/7ZVfhbTQZ5).
-  Tutorials, blog posts, talks that promote the project.
-  Sharing application scenarios and/or related research.
+- Pushing patches.
+- Code review of pull requests.
+- Documentation, examples and test cases.
+- Readability improvement, e.g., improvement on docstr and comments.
+- Community participation in [issues](https://github.com/microsoft/FLAML/issues), [discussions](https://github.com/microsoft/FLAML/discussions), and [discord](https://discord.gg/7ZVfhbTQZ5).
+- Tutorials, blog posts, talks that promote the project.
+- Sharing application scenarios and/or related research.

 You can take a look at the [Roadmap for Upcoming Features](https://github.com/microsoft/FLAML/wiki/Roadmap-for-Upcoming-Features) to identify potential things to work on.

@@ -41,8 +41,10 @@ feedback:
 - Please include your **operating system type and version number**, as well as
  your **Python, flaml, scikit-learn versions**. The version of flaml
  can be found by running the following code snippet:
+
 ```python
 import flaml
+
 print(flaml.__version__)
 ```

@@ -50,7 +52,6 @@ print(flaml.__version__)
  appropriate code blocks**.  See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks)
  for more details.

-
 ## Becoming a Reviewer

 There is currently no formal reviewer solicitation process. Current reviewers identify reviewers from active contributors. If you are willing to become a reviewer, you are welcome to let us know on discord.
@@ -87,7 +88,7 @@ Run `pre-commit install` to install pre-commit into your git hooks. Before you c

 ### Coverage

-Any code you commit should not decrease coverage. To run all unit tests, install the [test] option under FLAML/:
+Any code you commit should not decrease coverage. To run all unit tests, install the \[test\] option under FLAML/:

 ```bash
 pip install -e."[test]"
--- a/website/docs/Examples/AutoGen-AgentChat.md
+++ b/website/docs/Examples/AutoGen-AgentChat.md
@@ -1,3 +1,3 @@
 # AutoGen - Automated Multi Agent Chat

-Please refer to https://microsoft.github.io/autogen/docs/Examples/AutoGen-AgentChat.
+Please refer to https://microsoft.github.io/autogen/docs/Examples/#AutoGen-AgentChat.
--- a/website/docs/Examples/AutoGen-OpenAI.md
+++ b/website/docs/Examples/AutoGen-OpenAI.md
@@ -4,5 +4,6 @@
 Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/#enhanced-inference).

 Links to notebook examples:
-* [Optimize for Code Generation](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
-* [Optimize for Math](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
+
+- [Optimize for Code Generation](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
+- [Optimize for Math](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
--- a/website/docs/Examples/AutoML-Classification.md
+++ b/website/docs/Examples/AutoML-Classification.md
@@ -2,7 +2,8 @@

 ### Prerequisites

-Install the [automl] option.
+Install the \[automl\] option.
+
 ```bash
 pip install "flaml[automl]"
 ```
@@ -18,14 +19,13 @@ automl = AutoML()
 # Specify automl goal and constraint
 automl_settings = {
    "time_budget": 1,  # in seconds
-    "metric": 'accuracy',
-    "task": 'classification',
+    "metric": "accuracy",
+    "task": "classification",
    "log_file_name": "iris.log",
 }
 X_train, y_train = load_iris(return_X_y=True)
 # Train with labeled input data
-automl.fit(X_train=X_train, y_train=y_train,
-           **automl_settings)
+automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
 # Predict
 print(automl.predict_proba(X_train))
 # Print the best model
@@ -33,6 +33,7 @@ print(automl.model.estimator)
 ```

 #### Sample of output
+
 ```
 [flaml.automl: 11-12 18:21:44] {1485} INFO - Data split method: stratified
 [flaml.automl: 11-12 18:21:44] {1489} INFO - Evaluation method: cv
--- a/website/docs/Examples/AutoML-NLP.md
+++ b/website/docs/Examples/AutoML-NLP.md
@@ -2,7 +2,8 @@

 ### Requirements

-This example requires GPU. Install the [automl,hf] option:
+This example requires GPU. Install the \[automl,hf\] option:
+
 ```python
 pip install "flaml[automl,hf]"
 ```
@@ -31,9 +32,11 @@ automl_settings = {
            "output_dir": "data/output/"  # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
        }
    },  # setting the huggingface arguments: output directory
-    "gpu_per_trial": 1,                         # set to 0 if no GPU is available
+    "gpu_per_trial": 1,  # set to 0 if no GPU is available
 }
-automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings)
+automl.fit(
+    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
+)
 automl.predict(X_test)
 ```

@@ -68,12 +71,8 @@ if os.path.exists("data/output/"):
 from flaml import AutoML
 from datasets import load_dataset

-train_dataset = (
-    load_dataset("glue", "stsb", split="train").to_pandas()
-)
-dev_dataset = (
-    load_dataset("glue", "stsb", split="train").to_pandas()
-)
+train_dataset = load_dataset("glue", "stsb", split="train").to_pandas()
+dev_dataset = load_dataset("glue", "stsb", split="validation").to_pandas()
 custom_sent_keys = ["sentence1", "sentence2"]
 label_key = "label"
 X_train = train_dataset[custom_sent_keys]
@@ -90,10 +89,10 @@ automl_settings = {
 }
 automl_settings["fit_kwargs_by_estimator"] = {  # setting the huggingface arguments
    "transformer": {
-        "model_path": "google/electra-small-discriminator", # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
-        "output_dir": "data/output/",                       # setting the output directory
+        "model_path": "google/electra-small-discriminator",  # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
+        "output_dir": "data/output/",  # setting the output directory
        "fp16": False,
-    }   # setting whether to use FP16
+    }  # setting whether to use FP16
 }
 automl.fit(
    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
@@ -117,12 +116,8 @@ automl.fit(
 from flaml import AutoML
 from datasets import load_dataset

-train_dataset = (
-    load_dataset("xsum", split="train").to_pandas()
-)
-dev_dataset = (
-    load_dataset("xsum", split="validation").to_pandas()
-)
+train_dataset = load_dataset("xsum", split="train").to_pandas()
+dev_dataset = load_dataset("xsum", split="validation").to_pandas()
 custom_sent_keys = ["document"]
 label_key = "summary"

@@ -139,17 +134,18 @@ automl_settings = {
    "task": "summarization",
    "metric": "rouge1",
 }
-automl_settings["fit_kwargs_by_estimator"] = {      # setting the huggingface arguments
+automl_settings["fit_kwargs_by_estimator"] = {  # setting the huggingface arguments
    "transformer": {
-        "model_path": "t5-small",             # if model_path is not set, the default model is t5-small: https://huggingface.co/t5-small
-        "output_dir": "data/output/",         # setting the output directory
+        "model_path": "t5-small",  # if model_path is not set, the default model is t5-small: https://huggingface.co/t5-small
+        "output_dir": "data/output/",  # setting the output directory
        "fp16": False,
-    } # setting whether to use FP16
+    }  # setting whether to use FP16
 }
 automl.fit(
    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
 )
 ```
+
 #### Sample Output

 ```
@@ -234,7 +230,15 @@ train_dataset = {
    ],
    "tokens": [
        [
-            "EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", ".",
+            "EU",
+            "rejects",
+            "German",
+            "call",
+            "to",
+            "boycott",
+            "British",
+            "lamb",
+            ".",
        ],
        ["Peter", "Blackburn"],
    ],
@@ -244,18 +248,14 @@ dev_dataset = {
    "ner_tags": [
        ["O"],
    ],
-    "tokens": [
-        ["1996-08-22"]
-    ],
+    "tokens": [["1996-08-22"]],
 }
 test_dataset = {
    "id": ["0"],
    "ner_tags": [
        ["O"],
    ],
-    "tokens": [
-        ['.']
-    ],
+    "tokens": [["."]],
 }
 custom_sent_keys = ["tokens"]
 label_key = "ner_tags"
@@ -273,17 +273,18 @@ automl_settings = {
    "time_budget": 10,
    "task": "token-classification",
    "fit_kwargs_by_estimator": {
-        "transformer":
-            {
-                "output_dir": "data/output/"
-                # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
-            }
+        "transformer": {
+            "output_dir": "data/output/"
+            # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
+        }
    },  # setting the huggingface arguments: output directory
    "gpu_per_trial": 1,  # set to 0 if no GPU is available
-    "metric": "seqeval:overall_f1"
+    "metric": "seqeval:overall_f1",
 }

-automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings)
+automl.fit(
+    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
+)
 automl.predict(X_test)
 ```

@@ -294,35 +295,39 @@ from flaml import AutoML
 import pandas as pd

 train_dataset = {
-        "id": ["0", "1"],
-        "ner_tags": [
-            [3, 0, 7, 0, 0, 0, 7, 0, 0],
-            [1, 2],
+    "id": ["0", "1"],
+    "ner_tags": [
+        [3, 0, 7, 0, 0, 0, 7, 0, 0],
+        [1, 2],
+    ],
+    "tokens": [
+        [
+            "EU",
+            "rejects",
+            "German",
+            "call",
+            "to",
+            "boycott",
+            "British",
+            "lamb",
+            ".",
        ],
-        "tokens": [
-            [
-                "EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", ".",
-            ],
-            ["Peter", "Blackburn"],
-        ],
-    }
+        ["Peter", "Blackburn"],
+    ],
+}
 dev_dataset = {
    "id": ["0"],
    "ner_tags": [
        [0],
    ],
-    "tokens": [
-        ["1996-08-22"]
-    ],
+    "tokens": [["1996-08-22"]],
 }
 test_dataset = {
    "id": ["0"],
    "ner_tags": [
        [0],
    ],
-    "tokens": [
-        ['.']
-    ],
+    "tokens": [["."]],
 }
 custom_sent_keys = ["tokens"]
 label_key = "ner_tags"
@@ -340,18 +345,29 @@ automl_settings = {
    "time_budget": 10,
    "task": "token-classification",
    "fit_kwargs_by_estimator": {
-        "transformer":
-            {
-                "output_dir": "data/output/",
-                # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
-                "label_list": [ "O","B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC" ]
-            }
+        "transformer": {
+            "output_dir": "data/output/",
+            # if model_path is not set, the default model is facebook/muppet-roberta-base: https://huggingface.co/facebook/muppet-roberta-base
+            "label_list": [
+                "O",
+                "B-PER",
+                "I-PER",
+                "B-ORG",
+                "I-ORG",
+                "B-LOC",
+                "I-LOC",
+                "B-MISC",
+                "I-MISC",
+            ],
+        }
    },  # setting the huggingface arguments: output directory
    "gpu_per_trial": 1,  # set to 0 if no GPU is available
-    "metric": "seqeval:overall_f1"
+    "metric": "seqeval:overall_f1",
 }

-automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings)
+automl.fit(
+    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
+)
 automl.predict(X_test)
 ```

--- a/website/docs/Examples/AutoML-Rank.md
+++ b/website/docs/Examples/AutoML-Rank.md
@@ -2,7 +2,8 @@

 ### Prerequisites

-Install the [automl] option.
+Install the \[automl\] option.
+
 ```bash
 pip install "flaml[automl]"
 ```
@@ -16,11 +17,14 @@ from flaml import AutoML
 X_train, y_train = fetch_openml(name="credit-g", return_X_y=True, as_frame=False)
 y_train = y_train.cat.codes
 # not a real learning to rank dataaset
-groups = [200] * 4 + [100] * 2    # group counts
+groups = [200] * 4 + [100] * 2  # group counts
 automl = AutoML()
 automl.fit(
-    X_train, y_train, groups=groups,
-    task='rank', time_budget=10,    # in seconds
+    X_train,
+    y_train,
+    groups=groups,
+    task="rank",
+    time_budget=10,  # in seconds
 )
 ```

--- a/website/docs/Examples/AutoML-Regression.md
+++ b/website/docs/Examples/AutoML-Regression.md
@@ -2,7 +2,8 @@

 ### Prerequisites

-Install the [automl] option.
+Install the \[automl\] option.
+
 ```bash
 pip install "flaml[automl]"
 ```
@@ -18,14 +19,13 @@ automl = AutoML()
 # Specify automl goal and constraint
 automl_settings = {
    "time_budget": 1,  # in seconds
-    "metric": 'r2',
-    "task": 'regression',
+    "metric": "r2",
+    "task": "regression",
    "log_file_name": "california.log",
 }
 X_train, y_train = fetch_california_housing(return_X_y=True)
 # Train with labeled input data
-automl.fit(X_train=X_train, y_train=y_train,
-           **automl_settings)
+automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
 # Predict
 print(automl.predict(X_train))
 # Print the best model
@@ -95,7 +95,9 @@ from sklearn.multioutput import MultiOutputRegressor
 X, y = make_regression(n_targets=3)

 # split into train and test data
-X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.30, random_state=42
+)

 # train the model
 model = MultiOutputRegressor(AutoML(task="regression", time_budget=60))
--- a/website/docs/Examples/AutoML-Time
+++ b/website/docs/Examples/AutoML-Time
@@ -2,7 +2,8 @@

 ### Prerequisites

-Install the [automl,ts_forecast] option.
+Install the \[automl,ts_forecast\] option.
+
 ```bash
 pip install "flaml[automl,ts_forecast]"
 ```
@@ -13,16 +14,18 @@ pip install "flaml[automl,ts_forecast]"
 import numpy as np
 from flaml import AutoML

-X_train = np.arange('2014-01', '2022-01', dtype='datetime64[M]')
+X_train = np.arange("2014-01", "2022-01", dtype="datetime64[M]")
 y_train = np.random.random(size=84)
 automl = AutoML()
-automl.fit(X_train=X_train[:84],  # a single column of timestamp
-           y_train=y_train,  # value for each timestamp
-           period=12,  # time horizon to forecast, e.g., 12 months
-           task='ts_forecast', time_budget=15,  # time budget in seconds
-           log_file_name="ts_forecast.log",
-           eval_method="holdout",
-          )
+automl.fit(
+    X_train=X_train[:84],  # a single column of timestamp
+    y_train=y_train,  # value for each timestamp
+    period=12,  # time horizon to forecast, e.g., 12 months
+    task="ts_forecast",
+    time_budget=15,  # time budget in seconds
+    log_file_name="ts_forecast.log",
+    eval_method="holdout",
+)
 print(automl.predict(X_train[84:]))
 ```

@@ -246,32 +249,40 @@ import statsmodels.api as sm

 data = sm.datasets.co2.load_pandas().data
 # data is given in weeks, but the task is to predict monthly, so use monthly averages instead
-data = data['co2'].resample('MS').mean()
+data = data["co2"].resample("MS").mean()
 data = data.bfill().ffill()  # makes sure there are no missing values
 data = data.to_frame().reset_index()
 num_samples = data.shape[0]
 time_horizon = 12
 split_idx = num_samples - time_horizon
-train_df = data[:split_idx]  # train_df is a dataframe with two columns: timestamp and label
-X_test = data[split_idx:]['index'].to_frame()  # X_test is a dataframe with dates for prediction
-y_test = data[split_idx:]['co2']  # y_test is a series of the values corresponding to the dates for prediction
+train_df = data[
+    :split_idx
+]  # train_df is a dataframe with two columns: timestamp and label
+X_test = data[split_idx:][
+    "index"
+].to_frame()  # X_test is a dataframe with dates for prediction
+y_test = data[split_idx:][
+    "co2"
+]  # y_test is a series of the values corresponding to the dates for prediction

 from flaml import AutoML

 automl = AutoML()
 settings = {
    "time_budget": 10,  # total running time in seconds
-    "metric": 'mape',  # primary metric for validation: 'mape' is generally used for forecast tasks
-    "task": 'ts_forecast',  # task type
-    "log_file_name": 'CO2_forecast.log',  # flaml log file
+    "metric": "mape",  # primary metric for validation: 'mape' is generally used for forecast tasks
+    "task": "ts_forecast",  # task type
+    "log_file_name": "CO2_forecast.log",  # flaml log file
    "eval_method": "holdout",  # validation method can be chosen from ['auto', 'holdout', 'cv']
    "seed": 7654321,  # random seed
 }

-automl.fit(dataframe=train_df,  # training data
-           label='co2',  # label column
-           period=time_horizon,  # key word argument 'period' must be included for forecast task)
-           **settings)
+automl.fit(
+    dataframe=train_df,  # training data
+    label="co2",  # label column
+    period=time_horizon,  # key word argument 'period' must be included for forecast task)
+    **settings
+)
 ```

 #### Sample output
@@ -417,16 +428,17 @@ The example plotting code requires matplotlib.
 flaml_y_pred = automl.predict(X_test)
 import matplotlib.pyplot as plt

-plt.plot(X_test, y_test, label='Actual level')
-plt.plot(X_test, flaml_y_pred, label='FLAML forecast')
-plt.xlabel('Date')
-plt.ylabel('CO2 Levels')
+plt.plot(X_test, y_test, label="Actual level")
+plt.plot(X_test, flaml_y_pred, label="FLAML forecast")
+plt.xlabel("Date")
+plt.ylabel("CO2 Levels")
 plt.legend()
 ```

 ![png](images/CO2.png)

 ### Multivariate Time Series (Forecasting with Exogenous Variables)
+
 ```python
 import pandas as pd

@@ -444,6 +456,7 @@ multi_df["precip"] = multi_df["precip"].fillna(method="ffill")
 multi_df = multi_df[:-2]  # last two rows are NaN for 'demand' column so remove them
 multi_df = multi_df.reset_index()

+
 # Using temperature values create categorical values
 # where 1 denotes daily tempurature is above monthly average and 0 is below.
 def get_monthly_avg(data):
@@ -452,8 +465,10 @@ def get_monthly_avg(data):
    data = data.agg({"temp": "mean"})
    return data

+
 monthly_avg = get_monthly_avg(multi_df).to_dict().get("temp")

+
 def above_monthly_avg(date, temp):
    month = date.month
    if temp > monthly_avg.get(month):
@@ -461,6 +476,7 @@ def above_monthly_avg(date, temp):
    else:
        return 0

+
 multi_df["temp_above_monthly_avg"] = multi_df.apply(
    lambda x: above_monthly_avg(x["timeStamp"], x["temp"]), axis=1
 )
@@ -536,6 +552,7 @@ print(automl.predict(multi_X_test))
 ```

 ### Forecasting Discrete Variables
+
 ```python
 from hcrystalball.utils import get_sales_data
 import numpy as np
@@ -557,7 +574,10 @@ discrete_X_train, discrete_X_test = (
    discrete_train_df[["Date", "Open", "Promo", "Promo2"]],
    discrete_test_df[["Date", "Open", "Promo", "Promo2"]],
 )
-discrete_y_train, discrete_y_test = discrete_train_df["above_mean_sales"], discrete_test_df["above_mean_sales"]
+discrete_y_train, discrete_y_test = (
+    discrete_train_df["above_mean_sales"],
+    discrete_test_df["above_mean_sales"],
+)

 # initialize AutoML instance
 automl = AutoML()
@@ -572,10 +592,9 @@ settings = {
 }

 # train the model
-automl.fit(X_train=discrete_X_train,
-           y_train=discrete_y_train,
-           **settings,
-           period=time_horizon)
+automl.fit(
+    X_train=discrete_X_train, y_train=discrete_y_train, **settings, period=time_horizon
+)

 # make predictions
 discrete_y_pred = automl.predict(discrete_X_test)
@@ -713,6 +732,7 @@ def get_stalliion_data():
    )
    return data, special_days

+
 data, special_days = get_stalliion_data()
 time_horizon = 6  # predict six months
 training_cutoff = data["time_idx"].max() - time_horizon
Author	SHA1	Message	Date
Li Jiang	efaba26d2e	Update version and readme (#1338 ) * Update version and readme * Update pr template	2024-08-22 22:33:23 +00:00
Li Jiang	62194f321d	Update issue templates (#1337 )	2024-08-21 10:00:48 +00:00
Li Jiang	5bfa0b1cd3	Improve mlflow integration and add more models (#1331 ) * Add more spark models and improved mlflow integration * Update test_extra_models, setup and gitignore * Remove autofe * Remove autofe * Remove autofe * Sync changes in internal * Fix test for env without pyspark * Fix import errors * Fix tests * Fix typos * Fix pytorch-forecasting version * Remove internal funcs, rename _mlflow.py * Fix import error * Fix dependency * Fix experiment name setting * Fix dependency * Update pandas version * Update pytorch-forecasting version * Add warning message for not has_automl * Fix test errors with nltk 3.8.2 * Don't enable mlflow logging w/o an active run * Fix pytorch-forecasting can't be pickled issue * Update pyspark tests condition * Update synapseml * Update synapseml * No parent run, no logging for OSS * Log when autolog is enabled * upgrade code * Enable autolog for tune * Increase time budget for test * End run before start a new run * Update parent run * Fix import error * clean up * skip macos and win * Update notes * Update default value of model_history	2024-08-13 07:53:47 +00:00
dependabot[bot]	bd34b4e75a	Bump express from 4.18.2 to 4.19.2 in /website (#1293 ) Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2. - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/master/History.md) - [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2) --- updated-dependencies: - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 12:55:25 +00:00
dependabot[bot]	7670945298	Bump follow-redirects from 1.15.4 to 1.15.6 in /website (#1291 ) Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6. - [Release notes](https://github.com/follow-redirects/follow-redirects/releases) - [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6) --- updated-dependencies: - dependency-name: follow-redirects dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 12:52:11 +00:00
dependabot[bot]	43537cb539	Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website (#1292 ) Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4. - [Release notes](https://github.com/webpack/webpack-dev-middleware/releases) - [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md) - [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4) --- updated-dependencies: - dependency-name: webpack-dev-middleware dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 12:50:17 +00:00
Gökhan Geyik	f913b79225	Fix(doc): Page Not Found (#1296 ) - Fix the redirect link that received a page not found error. Co-authored-by: Li Jiang <bnujli@gmail.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>	2024-08-12 12:01:46 +00:00
dependabot[bot]	a092a39b5e	Bump braces from 3.0.2 to 3.0.3 in /website (#1336 ) Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3. - [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md) - [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3) --- updated-dependencies: - dependency-name: braces dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 08:37:56 +00:00
Jirka Borovec	04bf1b8741	update py versions, sourced from PyPI (#1332 ) * update py versions, sourced from PyPI * lint --------- Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 04:53:48 +00:00
Jirka Borovec	b348cb1136	configure & apply pyupgrade with `py3.8+` (#1333 ) * configure pyupgrade with `py3.8+` * apply update --------- Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 02:54:18 +00:00
Jirka Borovec	cd0e88e383	fix missing req. arg for new `datasets` package (#1334 ) Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-12 02:19:11 +00:00
Li Jiang	a17c6e392e	Fix test errors of nltk and numpy (#1335 ) * Fix test errors with nltk 3.8.2 * Fix test errors with numpy large * Fix test errors with numpy large	2024-08-12 00:14:21 +00:00
Li Jiang	52627ff14b	Add 3.11 icon (#1330 )	2024-08-08 06:18:49 +00:00
Li Jiang	7729855f49	Bump version to 2.2.0 (#1329 )	2024-08-08 01:05:53 +00:00
Noël Barron	0fe284b21f	Doc and comment typos improvements (#1319 ) * typographical corrections in the descriptions, comment improvements, general formatting for consistency * consistent indentation for better readability, improved comments, typographical corrections * updated docstrings for better clarity, added type hint for *kwargs, typographical corrections (no functionality changes) Fix format --------- Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-06 15:29:37 +00:00
Yang, Bo	853c9501bc	Keep searching hyperparameters when `r2_score` raises an error (#1325 ) * Keep searching hyperparameters when `r2_score` raises an error * Add log info --------- Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-06 15:01:10 +00:00
Yang, Bo	8e63dd417b	Don't pass `callbacks=None` to `XGBoostSklearnEstimator._fit` (#1322 ) * Don't pass `callbacks=None` to `XGBoostSklearnEstimator._fit` The original implmentation would pass `callbacks=None` to `XGBoostSklearnEstimator._fit` and eventually lead to a `TypeError` of `XGBModel.fit() got an unexpected keyword argument 'callbacks'`. This PR instead does not pass the `callbacks=None` parameter to avoid the error. * Update setup.py to allow for xgboost 2.x --------- Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-08-06 09:24:11 +00:00
Li Jiang	f27f98c6d7	Fix test mac os python 3.11 (#1328 ) * add test * Skip test_autohf_classificationhead.py for MacOS py311 * Skip test/nlp/test_default.py for MacOS py311 * Check test_tune * Check test_lexiflow * Check test_tune * Remove checks * Skip test_nested_run for macos py311) * Skip test_nested_space for macos py311 * Test tune on MacOS Python 3.11 w/o pytest * Split tests by folder * Skip test lexiflow for MacOS py311 * Enable test_tune for MacOS py311 * Clean up	2024-08-06 05:50:44 +00:00
Li Jiang	a68d073ccf	Add support to python 3.11 (#1326 ) * Add support to python 3.11 * Fix workflow python version comparison * Ray is not supported in python 3.11 * Fix test_numpy	2024-07-31 00:18:41 +00:00
Li Jiang	15fda2206b	Add example of how to get best config and convert it to parameters (#1323 )	2024-07-24 08:20:36 +00:00
leafy-lee	a9d7b7f971	Handle IntLogUniformDistribution Deprecation before Optuna<=v4.0.0 (#1324 ) Co-authored-by: Yifei Li <v-liyifei@microsoft.com>	2024-07-24 07:02:06 +00:00
Li Jiang	d24d2e0088	Upgrade Optuna (#1321 )	2024-07-23 01:21:20 +00:00
Ranuga	67f4048667	Update ts_model.py (#1312 ) Co-authored-by: Li Jiang <bnujli@gmail.com>	2024-07-22 05:32:51 +00:00
Li Jiang	d8129b9211	Fix typos, upgrade yarn packages, add some improvements (#1290 ) * Fix typos, upgrade yarn packages, add some improvements * Fix joblib 1.4.0 breaks joblib-spark * Fix xgboost test error * Pin xgboost<2.0.0 * Try update prophet to 1.5.1 * Update github workflow * Revert prophet version * Update github workflow * Update install libomp * Fix test errors * Fix test errors * Add retry to test and coverage * Revert "Add retry to test and coverage" This reverts commit `ce13097cd5`. * Increase test budget * Add more data to test_models, try fixing ValueError: Found array with 0 sample(s) (shape=(0, 252)) while a minimum of 1 is required.	2024-07-19 13:40:04 +00:00
Jirka Borovec	165d7467f9	precommit: introduce `mdformat` (#1276 ) * precommit: introduce `mdformat` * precommit: apply	2024-03-19 22:46:56 +00:00
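
Commit 8e63dd417b above describes a `TypeError` raised when `callbacks=None` is forwarded to a `fit` method that does not accept that keyword, and fixes it by not passing the parameter. The general pattern can be sketched as below. This is a minimal illustration, not the actual FLAML change: `call_without_none_kwargs` and `strict_fit` are hypothetical names introduced only for this example.

```python
def call_without_none_kwargs(fn, *args, **kwargs):
    """Call fn, dropping every keyword argument whose value is None.

    Hypothetical helper: it only illustrates the "do not forward
    callbacks=None" idea from the commit message.
    """
    filtered = {key: value for key, value in kwargs.items() if value is not None}
    return fn(*args, **filtered)


def strict_fit(X, y, sample_weight=None):
    """Hypothetical stand-in for a fit method with no `callbacks` parameter."""
    return len(X), sample_weight


# Forwarding callbacks=None directly fails, because strict_fit does not know it.
try:
    strict_fit([1, 2], [0, 1], callbacks=None)
    direct_call_failed = False
except TypeError:
    direct_call_failed = True

# Dropping the None-valued keyword first lets the same call succeed.
result = call_without_none_kwargs(strict_fit, [1, 2], [0, 1], callbacks=None)
```

One caveat with this approach: it drops every `None`-valued keyword, so it would be unsuitable where `None` is a meaningful explicit value for the callee.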