Compare commits


63 Commits
v2.3.4 ... main

Author SHA1 Message Date
dependabot[bot]
bc1e4dc5ea Bump webpack from 5.94.0 to 5.105.0 in /website (#1515) 2026-02-08 16:29:18 +08:00
Copilot
158ff7d99e Fix transformers API compatibility: support v4.26+ and v5.0+ with version-aware parameter selection (#1514)
* Initial plan

* Fix transformers API compatibility issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add backward compatibility for transformers v4.26+ by version check

Support both tokenizer (v4.26-4.43) and processing_class (v4.44+) parameters based on installed transformers version. Fallback to tokenizer if version check fails.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve exception handling specificity

Use specific exception types (ImportError, AttributeError, ValueError) instead of broad Exception catch for better error handling.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit formatting on all files

Applied black formatting to fix code style across the repository.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
2026-01-28 09:00:21 +08:00
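
A minimal sketch of the version-aware parameter selection described in commit 158ff7d99e above, assuming a Hugging Face Trainer-style constructor; the helper name `build_trainer_kwargs` is illustrative and not part of FLAML's API.

```python
# Sketch only: pick the keyword accepted by the installed transformers version:
# `tokenizer` for v4.26-4.43, `processing_class` for v4.44+, falling back to
# `tokenizer` if the version check fails for any reason.
from packaging import version


def build_trainer_kwargs(tokenizer):  # hypothetical helper for illustration
    try:
        import transformers

        if version.parse(transformers.__version__) >= version.parse("4.44.0"):
            return {"processing_class": tokenizer}
    except (ImportError, AttributeError, ValueError):
        pass  # version unknown; use the older parameter name
    return {"tokenizer": tokenizer}


# trainer = transformers.Trainer(model=model, args=args, **build_trainer_kwargs(tok))
```
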
Li Jiang
a5021152d2 ci: skip pre-commit workflow on main (#1513)
* ci: skip pre-commit workflow on main

* ci: run pre-commit only on pull requests
2026-01-25 21:10:05 +08:00
Copilot
fc4efe3510 Fix sklearn 1.7+ compatibility: BaseEstimator type detection for ensemble (#1512)
* Initial plan

* Fix ExtraTreesEstimator regression ensemble error with sklearn 1.7+

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: improve __sklearn_tags__ implementation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Emphasize pre-commit

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-23 10:20:59 +08:00
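
The `__sklearn_tags__` mechanism referenced in commit fc4efe3510 can be illustrated with a sketch like the one below (requires scikit-learn >= 1.6); `WrappedRegressor` is an illustrative stand-in, not FLAML's actual estimator class.

```python
# Sketch only: scikit-learn 1.6+ detects estimator type (classifier/regressor)
# through __sklearn_tags__ instead of the legacy _estimator_type attribute, so
# wrappers used inside ensembles should declare the correct tags.
from sklearn.base import BaseEstimator, RegressorMixin


class WrappedRegressor(RegressorMixin, BaseEstimator):  # illustrative stand-in
    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.estimator_type = "regressor"  # make type detection explicit
        return tags
```
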
Li Jiang
cd0e9fb0d2 Only run save dependencies on main branch (#1510) 2026-01-22 11:07:40 +08:00
dependabot[bot]
a9c0a9e30a Bump lodash from 4.17.21 to 4.17.23 in /website (#1509)
Bumps [lodash](https://github.com/lodash/lodash) from 4.17.21 to 4.17.23.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.21...4.17.23)

---
updated-dependencies:
- dependency-name: lodash
  dependency-version: 4.17.23
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-22 08:47:33 +08:00
Li Jiang
a05b669de3 Update Python version support and pre-commit in documentation (#1505) 2026-01-21 16:39:54 +08:00
Copilot
6e59103e86 Add hierarchical search space documentation (#1496)
* Initial plan

* Add hierarchical search space documentation to Tune-User-Defined-Function.md

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add clarifying comments to hierarchical search space examples

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix formatting issues with pre-commit

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 14:40:56 +08:00
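
A minimal sketch of the kind of hierarchical (conditional) search space that commit 6e59103e86 documents for flaml.tune; the parameter names and the toy objective are illustrative.

```python
# Sketch only: the nested hyperparameters that get sampled depend on which
# branch of the outer tune.choice is selected.
from flaml import tune


def evaluate(config):  # toy objective for illustration
    model = config["model"]
    score = model["n_estimators"] if model["name"] == "lgbm" else model["max_leaves"]
    return {"score": float(score)}


search_space = {
    "model": tune.choice(
        [
            {"name": "lgbm", "n_estimators": tune.lograndint(4, 1000)},
            {"name": "rf", "max_leaves": tune.lograndint(4, 1000)},
        ]
    ),
}

analysis = tune.run(evaluate, config=search_space, metric="score", mode="max", num_samples=10)
print(analysis.best_config)
```
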
Copilot
d9e74031e0 Expose task-level and estimator-level preprocessors as public API (#1497)
* Initial plan

* Add public preprocess() API methods for AutoML and estimators

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add documentation for preprocess() API methods

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add example script demonstrating preprocess() API usage

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback - fix type hints and simplify test logic

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix formatting issues with pre-commit hooks

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Remove example.py, make tests faster

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:38:25 +08:00
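
Based only on the commit description in d9e74031e0, usage of the newly exposed preprocessing hooks might look like the sketch below; the exact method signatures are assumptions, not verified against the released API.

```python
# Sketch only (signatures assumed from the commit description): apply FLAML's
# own preprocessing to new data instead of re-implementing it by hand.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from flaml import AutoML

X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

automl = AutoML()
automl.fit(X_train, y_train, task="classification", time_budget=30)

X_task = automl.preprocess(X_test)         # task-level preprocessing (assumed)
X_ready = automl.model.preprocess(X_task)  # estimator-level preprocessing (assumed)
```
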
Copilot
7ec1414e9b Clarify period parameter and automatic label lagging in time series forecasting (#1495)
* Initial plan

* Add comprehensive documentation for period parameter and automatic label lagging

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback on docstring clarity

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clarify period vs prediction output length per @thinkall's feedback

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refine terminology per code review feedback

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 14:19:23 +08:00
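
A minimal sketch of the behavior documented in commit 7ec1414e9b, assuming FLAML's standard time-series interface; the synthetic data and budget are illustrative.

```python
# Sketch only: `period` is the number of future steps the forecaster is tuned
# for, and FLAML lags the label internally, so y is passed unshifted.
import numpy as np
import pandas as pd
from flaml import AutoML

df = pd.DataFrame(
    {
        "ds": pd.date_range("2024-01-01", periods=120, freq="D"),
        "y": np.sin(np.arange(120) / 7.0),
    }
)
train, test = df.iloc[:108], df.iloc[108:]

automl = AutoML()
automl.fit(dataframe=train, label="y", task="ts_forecast", period=12, time_budget=30)
pred = automl.predict(test[["ds"]])  # returns `period` (= 12) forecasted values
```
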
Copilot
9233a52736 Add configurable label overlap handling for classification holdout strategy (#1491)
* Initial plan

* Fix training/test set overlap in holdout classification by only adding missing labels when needed

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: add bounds checking and fix edge cases

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix bounds checking: use correct comparison operator for array indexing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix potential ValueError with max() on empty lists and simplify test assertions

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add extra bounds checking for label_matches indices

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix pandas_on_spark compatibility by using iloc_pandas_on_spark util method

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve missing label handling to avoid overlap when possible

For classes with multiple instances that end up in one set, properly
re-split them between train and val instead of duplicating. Only add
to both sets when the class has exactly 1 instance (unavoidable overlap).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix: ensure at least 1 instance remains in original set when re-splitting

Addresses comments on lines 580 and 724 - prevents moving ALL instances
from validation to training or vice versa by using min(num_instances - 1, ...).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refactor: Extract missing label handling into two methods (fast/slow)

- Created _handle_missing_labels_fast(): Fast version that adds first instance to both sets (may overlap)
- Created _handle_missing_labels_no_overlap(): Precise version that avoids overlap when possible
- Added allow_label_overlap parameter to AutoML.fit() (default=True for fast version)
- Updated documentation with new parameter
- Both versions maintain label completeness while offering speed/precision tradeoff

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add usage example for allow_label_overlap to Best-Practices.md

- Added comprehensive documentation in Classification section
- Included code examples showing both fast and precise versions
- Explained tradeoffs between speed and precision
- Noted that parameter only affects holdout evaluation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: update documentation and tests

- Updated docstrings to clarify fast version only adds instances to missing sets
- Fixed documentation to reflect actual behavior (not "both sets" but "set with missing label")
- Completely rewrote test_no_overlap.py to test both allow_label_overlap modes
- Added tests with sample_weights for better code coverage
- Added test for single-instance class handling
- All 5 tests passing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix AttributeError: initialize _allow_label_overlap in settings and retrain_from_log

- Added allow_label_overlap to settings initialization with default=True
- Added parameter defaulting in fit() method to use settings value if not provided
- Added _allow_label_overlap initialization in retrain_from_log method
- Fixes test failures in test_multiclass, test_regression, and spark tests

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add docstring to fit()

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:03:48 +08:00
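
Going only by the commit messages in 9233a52736, the new switch might be used as in the sketch below; the parameter semantics are taken from the PR description and everything else is illustrative.

```python
# Sketch only: allow_label_overlap (per PR #1491) controls how holdout
# splitting handles classes that would otherwise be missing from one split.
# True (default) takes the fast path that may place one instance in both sets;
# False re-splits such classes to avoid train/validation overlap when possible.
from sklearn.datasets import load_iris
from flaml import AutoML

X, y = load_iris(return_X_y=True)
automl = AutoML()
automl.fit(
    X,
    y,
    task="classification",
    eval_method="holdout",
    allow_label_overlap=False,
    time_budget=30,
)
```
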
Copilot
7ac076d544 Use scientific notation for best error in logger output (#1498)
* Initial plan

* Change best error format from .4f to .4e for scientific notation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 09:06:19 +08:00
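
The formatting change in commit 7ac076d544 amounts to swapping a fixed-point format specifier for a scientific-notation one, e.g.:

```python
best_error = 0.000123456
print(f"best error: {best_error:.4f}")  # old: 'best error: 0.0001' (loses precision)
print(f"best error: {best_error:.4e}")  # new: 'best error: 1.2346e-04'
```
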
Copilot
3d489f1aaa Add validation and clear error messages for custom_metric parameter (#1500)
* Initial plan

* Add validation and documentation for custom_metric parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refactor validation into reusable method and improve error handling

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 08:58:11 +08:00
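
A minimal sketch of the callable shape that FLAML's `metric` argument expects, which is what the validation in commit 3d489f1aaa is meant to check; the metric body is illustrative.

```python
# Sketch only: a custom metric is a callable returning
# (value_to_minimize, dict_of_metrics_to_log).
from sklearn.metrics import log_loss


def custom_metric(
    X_val, y_val, estimator, labels, X_train, y_train,
    weight_val=None, weight_train=None, *args,
):
    y_pred = estimator.predict_proba(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
    return val_loss, {"val_log_loss": val_loss}


# automl.fit(X, y, task="classification", metric=custom_metric, time_budget=30)
```
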
Copilot
c64eeb5e8d Document that final_estimator parameters in ensemble are not auto-tuned (#1499)
* Initial plan

* Document final_estimator parameter behavior in ensemble configuration

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: fix syntax in examples and use float comparison

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 21:59:31 +08:00
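
A sketch of the ensemble configuration whose behavior commit c64eeb5e8d documents: the `final_estimator` is used as provided, so its own hyperparameters should be set explicitly rather than expected to be tuned. Dataset and values are illustrative.

```python
# Sketch only: FLAML tunes the base learners, not the stacker's final_estimator.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from flaml import AutoML

X, y = load_breast_cancer(return_X_y=True)
automl = AutoML()
automl.fit(
    X,
    y,
    task="classification",
    time_budget=60,
    ensemble={
        "final_estimator": LogisticRegression(C=0.5, max_iter=1000),  # fixed, not tuned
        "passthrough": False,
    },
)
```
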
Copilot
bf35f98a24 Document missing value handling behavior for AutoML estimators (#1473)
* Initial plan

* Add comprehensive documentation on missing value handling in FAQ

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply mdformat to FAQ.md

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Correct FAQ: FLAML does preprocess missing values with SimpleImputer

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 21:53:10 +08:00
Copilot
1687ca9a94 Fix eval_set preprocessing for XGBoost estimators with categorical features (#1470)
* Initial plan

* Initial analysis - reproduced eval_set preprocessing bug

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix eval_set preprocessing for XGBoost estimators with categorical features

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add eval_set tests to test_xgboost function

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix linting issues with ruff and black

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 20:41:21 +08:00
Copilot
7a597adcc9 Add GitHub Copilot instructions for FLAML repository (#1502)
* Initial plan

* Add comprehensive Copilot instructions for FLAML repository

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Update forecast dependencies list to be complete

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clarify Python version support details

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
2026-01-20 18:06:47 +08:00
Copilot
4ea9650f99 Fix nested dictionary merge in SearchThread losing sampled hyperparameters (#1494)
* Initial plan

* Add recursive dict update to fix nested config merge

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 15:50:18 +08:00
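
The fix in commit 4ea9650f99 boils down to merging nested configs recursively instead of with a flat `dict.update()`; a generic sketch (not FLAML's exact code) looks like this:

```python
# Sketch only: a flat update() replaces nested dicts wholesale, dropping
# sibling keys that were sampled earlier; a recursive merge keeps them.
def deep_update(base: dict, updates: dict) -> dict:
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base


config = {"model": {"name": "lgbm", "n_estimators": 100}}
sampled = {"model": {"n_estimators": 500}}
deep_update(config, sampled)
# -> {'model': {'name': 'lgbm', 'n_estimators': 500}}
# a flat config.update(sampled) would have dropped 'name'
```
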
Li Jiang
fa1a32afb6 Fix indents (#1493) 2026-01-20 11:18:58 +08:00
Copilot
5eb7d623b0 Expand docs to include all flamlized estimators (#1472)
* Initial plan

* Add documentation for all flamlized estimators (RandomForest, ExtraTrees, LGBMClassifier, XGBRegressor)

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix markdown formatting per pre-commit

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 10:59:48 +08:00
Copilot
22dcfcd3c0 Add comprehensive metric documentation and URL reference to AutoML docstrings (#1471)
* Initial plan

* Update AutoML metric documentation with full list and documentation link

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply black and mdformat formatting to code and documentation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 10:34:54 +08:00
Li Jiang
d7208b32d0 Bump version to 2.5.0 (#1492) 2026-01-20 10:30:39 +08:00
Copilot
5f1aa2dda8 Fix: Preserve FLAML_sample_size in best_config_per_estimator (#1475)
* Initial plan

* Fix: Preserve FLAML_sample_size in best_config_per_estimator

Modified best_config_per_estimator property to keep FLAML_sample_size when returning best configurations. Previously, AutoMLState.sanitize() was removing this key, which caused the sample size information to be lost when using starting_points from a previous run.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add a test to verify the improvement of starting_points

* Update documentation to reflect FLAML_sample_size preservation

Updated Task-Oriented-AutoML.md to document that best_config_per_estimator now preserves FLAML_sample_size:
- Added note in "Warm start" section explaining that FLAML_sample_size is preserved for effective warm-starting
- Added note in "Get best configuration" section with example showing FLAML_sample_size in output
- Explains importance of sample size preservation for continuing optimization with correct sample sizes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix unintended code change

* Improve docstrings and docs

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 07:42:31 +08:00
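
A sketch of the warm-start flow that commit 5f1aa2dda8 is about: `best_config_per_estimator` now keeps `FLAML_sample_size`, so feeding it back as `starting_points` resumes with the same sample sizes. Dataset and budgets are illustrative.

```python
# Sketch only: warm-start a second AutoML run from the first run's best configs.
from sklearn.datasets import fetch_california_housing
from flaml import AutoML

X, y = fetch_california_housing(return_X_y=True)

automl1 = AutoML()
automl1.fit(X, y, task="regression", time_budget=60)
starting_points = automl1.best_config_per_estimator  # now includes FLAML_sample_size

automl2 = AutoML()
automl2.fit(X, y, task="regression", time_budget=60, starting_points=starting_points)
```
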
Copilot
67bdcde4d5 Fix BlendSearch OptunaSearch warning for non-hierarchical spaces with Ray Tune domains (#1477)
* Initial plan

* Fix BlendSearch OptunaSearch warning for non-hierarchical spaces

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clean up test file

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add regression test for BlendSearch UDF mode warning fix

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve the fix and tests

* Fix Define-by-run function passed in  argument is not yet supported when using

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 00:01:41 +08:00
Copilot
46a406edd4 Add objective parameter to LGBMEstimator search space (#1474)
* Initial plan

* Add objective parameter to LGBMEstimator search_space

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add test for LGBMEstimator objective parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Remove changes, just add a test to verify the current supported usage

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-19 21:10:21 +08:00
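
The usage that commit 46a406edd4 ends up verifying (rather than changing the default search space) is the custom-estimator route; a hedged sketch, with the class name and the specific `objective` choices as assumptions:

```python
# Sketch only: extend the default LGBM search space with an `objective`
# hyperparameter via a custom learner and register it with add_learner().
from flaml import AutoML, tune
from flaml.automl.model import LGBMEstimator


class MyLGBM(LGBMEstimator):  # illustrative custom learner
    @classmethod
    def search_space(cls, data_size, task):
        space = super().search_space(data_size=data_size, task=task)
        space["objective"] = {
            "domain": tune.choice(["regression", "huber"]),
            "init_value": "regression",
        }
        return space


automl = AutoML()
automl.add_learner("my_lgbm", MyLGBM)
# automl.fit(X, y, task="regression", estimator_list=["my_lgbm"], time_budget=30)
```
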
Li Jiang
f1817ea7b1 Add support to python 3.13 (#1486) 2026-01-19 18:31:43 +08:00
Li Jiang
f6a5163e6a Fix isinstance usage issues (#1488)
* Fix isinstance usage issues

* Pin python version to 3.12 for pre-commit

* Update mdformat to 0.7.22
2026-01-19 15:19:05 +08:00
Li Jiang
e64b486528 Fix Best Practices not shown (#1483)
* Simplify automl.fit calls in Best Practices

Removed 'retrain_full' and 'eval_method' parameters from automl.fit calls.

* Fix best practices not shown
2026-01-13 14:25:28 +08:00
Li Jiang
a74354f7a9 Update documents, Bump version to 2.4.1, Sync Fabric till 088cfb98 (#1482)
* Add best practices

* Update docs to reflect on the recent changes

* Improve model persisting best practices

* Bump version to 2.4.1

* List all estimators

* Remove autogen

* Update dependencies
2026-01-13 12:49:36 +08:00
Li Jiang
ced1d6f331 Support pickling the whole AutoML instance, Sync Fabric till 0d4ab16f (#1481) 2026-01-12 23:04:38 +08:00
Li Jiang
bb213e7ebd Add timeout for tests and remove macos test envs (#1479) 2026-01-10 22:48:54 +08:00
Li Jiang
d241e8de90 Update readme, enable all python versions for macos tests (#1478)
* Fix macOS hang with running coverage

* Run coverage only in ubuntu

* Fix syntax error

* Fix run tests logic

* Update readme

* Don't test python 3.10 on macos as it's stuck

* Enable all python versions for macos
2026-01-10 20:03:24 +08:00
Copilot
0b138d9193 Fix log_training_metric causing IndexError for time series models (#1469)
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-10 18:07:17 +08:00
Li Jiang
1c9835dc0a Add support to Python 3.12, Sync Fabric till dc382961 (#1467)
* Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models

- Fix license-file
- Bump version to 2.3.5.post2
- Distribute source and wheel
- Log better models only
- Add artifact_path to register_automl_pipeline
- Improve logging of _automl_user_configurations

----
This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5.

The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields.
- `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`.
- `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4252053

* Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover

- Add warning message when feature_importances_ is None (#3982120)
- Catch RuntimeError and wait for spark cluster to recover (#3982133)

----
Bug fix.

This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message.
- `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early.
- `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #3982120, #3982133

* Removed deprecated metadata section

* Fix log_params, log_artifact doesn't support run_id in mlflow 2.6.0

* Remove autogen

* Remove autogen

* Remove autogen

* Merged PR 1776547: Fix flaky test test_automl

Don't throw error when time budget is not enough

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing a failing test in the AutoML notebook example.

#### PR Summary
This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures.
- `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4573514

* Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed

Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file'

----
Bug fix addressing package metadata configuration.

This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration.
- In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4560034

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Disable openai workflow

* Add python 3.12 to test envs

* Manually trigger openai

* Support markdown files with underscore-prefixed file names

* Improve save dependencies

* SynapseML is not installed

* Fix syntax error: Module !flaml/autogen was never imported

* macos 3.12 also hangs

* fix syntax error

* Update python version in actions

* Install setuptools for using pkg_resources

* Fix test_automl_performance in Github actions

* Fix test_nested_run
2026-01-10 12:17:21 +08:00
Li Jiang
1285700d7a Update readme, bump version to 2.4.0, fix CI errors (#1466)
* Update gitignore

* Bump version to 2.4.0

* Update readme

* Pre-download california housing data

* Use pre-downloaded california housing data

* Pin lightning<=2.5.6

* Fix typo in find and replace

* Fix estimators has no attribute __sklearn_tags__

* Pin torch to 2.2.2 in tests

* Fix conflict

* Update pytorch-forecasting

* Update pytorch-forecasting

* Update pytorch-forecasting

* Use numpy<2 for testing

* Update scikit-learn

* Run Build and UT every other day

* Pin pip<24.1

* Pin pip<24.1 in pipeline

* Loosen pip, install pytorch_forecasting only in py311

* Add support to new versions of NLP dependencies

* Fix formats

* Remove redefinition

* Update mlflow versions

* Fix mlflow version syntax

* Update gitignore

* Clean up cache to free space

* Remove clean up action cache

* Fix blendsearch

* Update test workflow

* Update setup.py

* Fix catboost version

* Update workflow

* Prepare for python 3.14

* Support no catboost

* Fix tests

* Fix python_requires

* Update test workflow

* Fix vw tests

* Remove python 3.9

* Fix nlp tests

* Fix prophet

* Print pip freeze for better debugging

* Fix Optuna search does not support parameters of type Float with samplers of type Quantized

* Save dependencies for later inspection

* Fix coverage.xml not exists

* Fix github action permission

* Handle python 3.13

* Address openml is not installed

* Check dependencies before run tests

* Update dependencies

* Fix syntax error

* Use bash

* Update dependencies

* Fix git error

* Loose mlflow constraints

* Add rerun, use mlflow-skinny

* Fix git error

* Remove ray tests

* Update xgboost versions

* Fix automl pickle error

* Don't test python 3.10 on macos as it's stuck

* Rebase before push

* Reduce number of branches
2026-01-09 13:40:52 +08:00
dependabot[bot]
7f42bece89 Bump algoliasearch-helper from 3.11.1 to 3.26.0 in /website (#1461)
* Bump algoliasearch-helper from 3.11.1 to 3.26.0 in /website

Bumps [algoliasearch-helper](https://github.com/algolia/instantsearch) from 3.11.1 to 3.26.0.
- [Release notes](https://github.com/algolia/instantsearch/releases)
- [Commits](https://github.com/algolia/instantsearch/commits/algoliasearch-helper@3.26.0)

---
updated-dependencies:
- dependency-name: algoliasearch-helper
  dependency-version: 3.26.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix format error

* Fix format error

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2025-10-09 14:37:31 +08:00
Keita Onabuta
e19107407b update loc second args - column (#1458)
Configure the second argument of the loc function to time_col instead of the dataframe X.
2025-08-30 11:07:19 +08:00
Li Jiang
f5d6693253 Bump version to 2.3.7 (#1457) 2025-08-26 14:59:32 +08:00
Azamatkhan Arifkhanov
d4e43c50a2 Fix OSError: [Errno 24] Too many open files: 'nul' (#1455)
* Update model.py

Added closing of save_fds.

* Updated model.py for pre-commit requirements
2025-08-26 12:50:22 +08:00
dependabot[bot]
13aec414ea Bump brace-expansion from 1.1.11 to 1.1.12 in /website (#1453)
Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 1.1.11 to 1.1.12.
- [Release notes](https://github.com/juliangruber/brace-expansion/releases)
- [Commits](https://github.com/juliangruber/brace-expansion/compare/1.1.11...v1.1.12)

---
updated-dependencies:
- dependency-name: brace-expansion
  dependency-version: 1.1.12
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-08-14 10:50:51 +08:00
Li Jiang
bb16dcde93 Bump version to 2.3.6 (#1451) 2025-08-05 14:29:36 +08:00
Li Jiang
be81a76da9 Fix TypeError of customized kfold method which needs 'y' (#1450) 2025-08-02 08:05:50 +08:00
Li Jiang
2d16089529 Improve FAQ docs (#1448)
* Fix settings usage error

* Add new code example
2025-07-09 18:33:10 +08:00
Li Jiang
01c3c83653 Install wheel and setuptools (#1443) 2025-05-28 12:56:48 +08:00
Li Jiang
9b66103f7c Fix typo, add quotes to python-version (#1442) 2025-05-28 12:24:00 +08:00
Li Jiang
48dfd72e64 Fix CD actions (#1441)
* Fix CD actions

* Skip Build if no relevant changes
2025-05-28 10:45:27 +08:00
Li Jiang
dec92e5b02 Upgrade python 3.8 to 3.10 in github actions (#1440) 2025-05-27 21:34:21 +08:00
Li Jiang
22911ea1ef Merged PR 1685054: Add more logs and function wait_futures for easier post analysis (#1438)
- Add function wait_futures for easier post analysis
- Use logger instead of print

----
#### AI description  (iteration 1)
#### PR Classification
A code enhancement for debugging asynchronous mlflow logging and improving post-run analysis.

#### PR Summary
This PR adds detailed debug logging to the mlflow integration and introduces a new `wait_futures` function to streamline the collection of asynchronous task results for improved analysis.
- `flaml/fabric/mlflow.py`: Added debug log statements around starting and ending mlflow runs to trace run IDs and execution flow.
- `flaml/automl/automl.py`: Implemented the `wait_futures` function to handle asynchronous task results and replaced a print call with `logger.info` for consistent logging.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4029592
2025-05-27 15:32:56 +08:00
murunlin
12183e5f73 Add the detailed info for parameter 'verbose' (#1435)
* explain-verbose-parameter

* concise-verbose-docstring

* explain-verbose-parameter

* explain-verbose-parameter

* test-ignore

* test-ignore

* sklearn-version-califonia

* submit-0526

---------

Co-authored-by: Runlin Mu (FESCO Adecco Human Resources) <v-runlinmu@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-05-27 10:01:01 +08:00
Li Jiang
c2b25310fc Sync Fabric till 2cd1c3da (#1433)
* Sync Fabric till 2cd1c3da

* Remove synapseml from tag names

* Fix 'NoneType' object has no attribute 'DataFrame'

* Deprecated 3.8 support

* Fix 'NoneType' object has no attribute 'DataFrame'

* Still use python 3.8 for pydoc

* Don't run tests in parallel

* Remove autofe and lowcode
2025-05-23 10:19:31 +08:00
murunlin
0f9420590d fix: best_model_for_estimator returns inconsistent feature_importances_ compared to automl.model (#1429)
* mrl-issue1422-0513

* fix version dependency

* fix datasets version

* test completion

---------

Co-authored-by: Runlin Mu (FESCO Adecco Human Resources) <v-runlinmu@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-05-15 09:37:34 +08:00
hexiang-x
5107c506b4 fix: When use_spark = True and mlflow_logging = True are set, an error is reported when logging the best model: 'NoneType' object has no attribute 'save' (#1432) 2025-05-14 19:34:06 +08:00
dependabot[bot]
9e219ef8dc Bump http-proxy-middleware from 2.0.7 to 2.0.9 in /website (#1425)
Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.7 to 2.0.9.
- [Release notes](https://github.com/chimurai/http-proxy-middleware/releases)
- [Changelog](https://github.com/chimurai/http-proxy-middleware/blob/v2.0.9/CHANGELOG.md)
- [Commits](https://github.com/chimurai/http-proxy-middleware/compare/v2.0.7...v2.0.9)

---
updated-dependencies:
- dependency-name: http-proxy-middleware
  dependency-version: 2.0.9
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-04-23 14:22:12 +08:00
Li Jiang
6e4083743b Revert "Numpy 2.x is not supported yet. (#1424)" (#1426)
This reverts commit 17e95edd9e.
2025-04-22 21:31:44 +08:00
Li Jiang
17e95edd9e Numpy 2.x is not supported yet. (#1424) 2025-04-22 12:11:27 +08:00
Stickic-cyber
468bc62d27 Fix issue with "list index out of range" when max_iter=1 (#1419) 2025-04-09 21:54:17 +08:00
dependabot[bot]
437c239c11 Bump @babel/helpers from 7.20.1 to 7.26.10 in /website (#1413)
Bumps [@babel/helpers](https://github.com/babel/babel/tree/HEAD/packages/babel-helpers) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-helpers)

---
updated-dependencies:
- dependency-name: "@babel/helpers"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-14 15:51:06 +08:00
dependabot[bot]
8e753f1092 Bump @babel/runtime from 7.20.1 to 7.26.10 in /website (#1414)
Bumps [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-runtime)

---
updated-dependencies:
- dependency-name: "@babel/runtime"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-13 21:34:02 +08:00
dependabot[bot]
a3b57e11d4 Bump prismjs from 1.29.0 to 1.30.0 in /website (#1411)
Bumps [prismjs](https://github.com/PrismJS/prism) from 1.29.0 to 1.30.0.
- [Release notes](https://github.com/PrismJS/prism/releases)
- [Changelog](https://github.com/PrismJS/prism/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PrismJS/prism/compare/v1.29.0...v1.30.0)

---
updated-dependencies:
- dependency-name: prismjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-13 14:06:41 +08:00
dependabot[bot]
a80dcf9925 Bump @babel/runtime-corejs3 from 7.20.1 to 7.26.10 in /website (#1412)
Bumps [@babel/runtime-corejs3](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime-corejs3) from 7.20.1 to 7.26.10.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.10/packages/babel-runtime-corejs3)

---
updated-dependencies:
- dependency-name: "@babel/runtime-corejs3"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-13 10:04:03 +08:00
SkBlaz
7157af44e0 Improved error handling in case no scikit present (#1402)
* Improved error handling in case no scikit present

Currently there is no description for when this error is thrown. Being explicit seems of value.

* Update histgb.py

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2025-03-03 15:39:43 +08:00
Li Jiang
1798c4591e Upgrade setuptools (#1410) 2025-03-01 08:05:51 +08:00
Li Jiang
dd26263330 Bump version to 2.3.5 (#1409) 2025-02-17 22:26:59 +08:00
118 changed files with 6204 additions and 1019 deletions


@@ -1,5 +1,7 @@
[run]
branch = True
source = flaml
source =
flaml
omit =
*test*
*/test/*
*/flaml/autogen/*

.github/copilot-instructions.md (new file, 243 lines added)

@@ -0,0 +1,243 @@
# GitHub Copilot Instructions for FLAML
## Project Overview
FLAML (Fast Library for Automated Machine Learning & Tuning) is a lightweight Python library for efficient automation of machine learning and AI operations. It automates workflow based on large language models, machine learning models, etc. and optimizes their performance.
**Key Components:**
- `flaml/automl/`: AutoML functionality for classification and regression
- `flaml/tune/`: Generic hyperparameter tuning
- `flaml/default/`: Zero-shot AutoML with default configurations
- `flaml/autogen/`: Legacy autogen code (note: AutoGen has moved to a separate repository)
- `flaml/fabric/`: Microsoft Fabric integration
- `test/`: Comprehensive test suite
## Build and Test Commands
### Installation
```bash
# Basic installation
pip install -e .
# Install with test dependencies
pip install -e .[test]
# Install with automl dependencies
pip install -e .[automl]
# Install with forecast dependencies (Linux only)
pip install -e .[forecast]
```
### Running Tests
```bash
# Run all tests (excluding autogen)
pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
# Run tests with coverage
coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
coverage xml
# Check dependencies
python test/check_dependency.py
```
### Linting and Formatting
```bash
# Run pre-commit hooks
pre-commit run --all-files
# Format with black (line length: 120)
black . --line-length 120
# Run ruff for linting and auto-fix
ruff check . --fix
```
## Code Style and Formatting
### Python Style
- **Line length:** 120 characters (configured in both Black and Ruff)
- **Formatter:** Black (v23.3.0+)
- **Linter:** Ruff with Pyflakes and pycodestyle rules
- **Import sorting:** Use isort (via Ruff)
- **Python version:** Supports Python >= 3.10 (full support for 3.10, 3.11, 3.12 and 3.13)
### Code Quality Rules
- Follow Black formatting conventions
- Keep imports sorted and organized
- Avoid unused imports (F401) - these are flagged but not auto-fixed
- Avoid wildcard imports (F403) where possible
- Complexity: Max McCabe complexity of 10
- Use type hints where appropriate
- Write clear docstrings for public APIs
### Pre-commit Hooks
The repository uses pre-commit hooks for:
- Checking for large files, AST syntax, YAML/TOML/JSON validity
- Detecting merge conflicts and private keys
- Trailing whitespace and end-of-file fixes
- pyupgrade for Python 3.8+ syntax
- Black formatting
- Markdown formatting (mdformat with GFM and frontmatter support)
- Ruff linting with auto-fix
## Testing Strategy
### Test Organization
- Tests are in the `test/` directory, organized by module
- `test/automl/`: AutoML feature tests
- `test/tune/`: Hyperparameter tuning tests
- `test/default/`: Zero-shot AutoML tests
- `test/nlp/`: NLP-related tests
- `test/spark/`: Spark integration tests
### Test Requirements
- Write tests for new functionality
- Ensure tests pass on multiple Python versions (3.10, 3.11, 3.12 and 3.13)
- Tests should work on both Ubuntu and Windows
- Use pytest markers for platform-specific tests (e.g., `@pytest.mark.spark`)
- Tests should be idempotent and not depend on external state
- Use `--reruns 2 --reruns-delay 10` for flaky tests
### Coverage
- Aim for good test coverage on new code
- Coverage reports are generated for Python 3.11 builds
- Coverage reports are uploaded to Codecov
## Git Workflow and Best Practices
### Branching
- Main branch: `main`
- Create feature branches from `main`
- PR reviews are required before merging
### Commit Messages
- Use clear, descriptive commit messages
- Reference issue numbers when applicable
- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
### Pull Requests
- Ensure all tests pass before requesting review
- Update documentation if adding new features
- Follow the PR template in `.github/PULL_REQUEST_TEMPLATE.md`
- ALWAYS run `pre-commit run --all-files` before each commit to avoid formatting issues
## Project Structure
```
flaml/
├── automl/ # AutoML functionality
├── tune/ # Hyperparameter tuning
├── default/ # Zero-shot AutoML
├── autogen/ # Legacy autogen (deprecated, moved to separate repo)
├── fabric/ # Microsoft Fabric integration
├── onlineml/ # Online learning
└── version.py # Version information
test/ # Test suite
├── automl/
├── tune/
├── default/
├── nlp/
└── spark/
notebook/ # Example notebooks
website/ # Documentation website
```
## Dependencies and Package Management
### Core Dependencies
- NumPy >= 1.17
- Python >= 3.10 (officially supported: 3.10, 3.11, 3.12 and 3.13)
### Optional Dependencies
- `[automl]`: lightgbm, xgboost, scipy, pandas, scikit-learn
- `[test]`: Full test suite dependencies
- `[spark]`: PySpark and joblib dependencies
- `[forecast]`: holidays, prophet, statsmodels, hcrystalball, pytorch-forecasting, pytorch-lightning, tensorboardX
- `[hf]`: Hugging Face transformers and datasets
- See `setup.py` for complete list
### Version Constraints
- Be mindful of Python version-specific dependencies (check setup.py)
- XGBoost versions differ based on Python version
- NumPy 2.0+ only for Python >= 3.13
- Some features (like vowpalwabbit) only work with older Python versions
## Boundaries and Restrictions
### Do NOT Modify
- `.git/` directory and Git configuration
- `LICENSE` file
- Version information in `flaml/version.py` (unless explicitly updating version)
- GitHub Actions workflows without careful consideration
- Existing test files unless fixing bugs or adding coverage
### Be Cautious With
- `setup.py`: Changes to dependencies should be carefully reviewed
- `pyproject.toml`: Linting and testing configuration
- `.pre-commit-config.yaml`: Pre-commit hook configuration
- Backward compatibility: FLAML is a library with external users
### Security Considerations
- Never commit secrets or API keys
- Be careful with external data sources in tests
- Validate user inputs in public APIs
- Follow secure coding practices for ML operations
## Special Notes
### AutoGen Migration
- AutoGen has moved to a separate repository: https://github.com/microsoft/autogen
- The `flaml/autogen/` directory contains legacy code
- Tests in `test/autogen/` are ignored in the main test suite
- Direct users to the new AutoGen repository for AutoGen-related issues
### Platform-Specific Considerations
- Some tests only run on Linux (e.g., forecast tests with prophet)
- Windows and Ubuntu are the primary supported platforms
- macOS support exists but requires special libomp setup for lgbm/xgboost
### Performance
- FLAML focuses on efficient automation and tuning
- Consider computational cost when adding new features
- Optimize for low resource usage where possible
## Documentation
- Main documentation: https://microsoft.github.io/FLAML/
- Update documentation when adding new features
- Provide clear examples in docstrings
- Add notebook examples for significant new features
## Contributing
- Follow the contributing guide: https://microsoft.github.io/FLAML/docs/Contribute
- Sign the Microsoft CLA when making your first contribution
- Be respectful and follow the Microsoft Open Source Code of Conduct
- Join the Discord community for discussions: https://discord.gg/Cppx2vSPVP


@@ -12,26 +12,17 @@ jobs:
deploy:
strategy:
matrix:
os: ['ubuntu-latest']
python-version: [3.8]
os: ["ubuntu-latest"]
python-version: ["3.12"]
runs-on: ${{ matrix.os }}
environment: package
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Cache conda
uses: actions/cache@v3
uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
path: ~/conda_pkgs_dir
key: conda-${{ matrix.os }}-python-${{ matrix.python-version }}-${{ hashFiles('environment.yml') }}
- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
auto-activate-base: false
activate-environment: hcrystalball
python-version: ${{ matrix.python-version }}
use-only-tar-bz2: true
- name: Install from source
# This is required for the pre-commit tests
shell: pwsh
@@ -42,7 +33,7 @@ jobs:
- name: Build
shell: pwsh
run: |
pip install twine
pip install twine wheel setuptools
python setup.py sdist bdist_wheel
- name: Publish to PyPI
env:


@@ -37,11 +37,11 @@ jobs:
- name: setup python
uses: actions/setup-python@v4
with:
python-version: "3.8"
python-version: "3.12"
- name: pydoc-markdown install
run: |
python -m pip install --upgrade pip
pip install pydoc-markdown==4.5.0
pip install pydoc-markdown==4.7.0 setuptools
- name: pydoc-markdown run
run: |
pydoc-markdown
@@ -73,11 +73,11 @@ jobs:
- name: setup python
uses: actions/setup-python@v4
with:
python-version: "3.8"
python-version: "3.12"
- name: pydoc-markdown install
run: |
python -m pip install --upgrade pip
pip install pydoc-markdown==4.5.0
pip install pydoc-markdown==4.7.0 setuptools
- name: pydoc-markdown run
run: |
pydoc-markdown


@@ -4,14 +4,15 @@
name: OpenAI
on:
pull_request:
branches: ['main']
paths:
- 'flaml/autogen/**'
- 'test/autogen/**'
- 'notebook/autogen_openai_completion.ipynb'
- 'notebook/autogen_chatgpt_gpt4.ipynb'
- '.github/workflows/openai.yml'
workflow_dispatch:
# pull_request:
# branches: ['main']
# paths:
# - 'flaml/autogen/**'
# - 'test/autogen/**'
# - 'notebook/autogen_openai_completion.ipynb'
# - 'notebook/autogen_chatgpt_gpt4.ipynb'
# - '.github/workflows/openai.yml'
permissions: {}


@@ -1,9 +1,7 @@
name: Code formatting
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on: # Trigger the workflow on push or pull request, but only for the main branch
push:
branches: [main]
on:
pull_request: {}
defaults:


@@ -14,10 +14,20 @@ on:
- 'setup.py'
pull_request:
branches: ['main']
paths:
- 'flaml/**'
- 'test/**'
- 'notebook/**'
- '.github/workflows/python-package.yml'
- 'setup.py'
merge_group:
types: [checks_requested]
schedule:
# Every other day at 02:00 UTC
- cron: '0 2 */2 * *'
permissions: {}
permissions:
contents: write
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
@@ -29,8 +39,8 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-2019]
python-version: ["3.8", "3.9", "3.10", "3.11"]
os: [ubuntu-latest, windows-latest]
python-version: ["3.10", "3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
@@ -38,7 +48,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: On mac, install libomp to facilitate lgbm and xgboost install
if: matrix.os == 'macOS-latest'
if: matrix.os == 'macos-latest'
run: |
brew update
brew install libomp
@@ -50,76 +60,82 @@ jobs:
export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/opt/libomp/lib -L/usr/local/opt/libomp/lib -lomp"
- name: Install packages and dependencies
run: |
python -m pip install --upgrade pip wheel
python -m pip install --upgrade pip wheel setuptools
pip install -e .
python -c "import flaml"
pip install -e .[test]
- name: On Ubuntu python 3.10, install pyspark 3.4.1
if: matrix.python-version == '3.10' && matrix.os == 'ubuntu-latest'
run: |
pip install pyspark==3.4.1
pip list | grep "pyspark"
- name: On Ubuntu python 3.11, install pyspark 3.5.1
if: matrix.python-version == '3.11' && matrix.os == 'ubuntu-latest'
run: |
pip install pyspark==3.5.1
pip list | grep "pyspark"
- name: If linux and python<3.11, install ray 2
if: matrix.os == 'ubuntu-latest' && matrix.python-version != '3.11'
- name: On Ubuntu python 3.12, install pyspark 4.0.1
if: matrix.python-version == '3.12' && matrix.os == 'ubuntu-latest'
run: |
pip install "ray[tune]<2.5.0"
- name: If mac and python 3.10, install ray and xgboost 1
if: matrix.os == 'macOS-latest' && matrix.python-version == '3.10'
pip install pyspark==4.0.1
pip list | grep "pyspark"
- name: On Ubuntu python 3.13, install pyspark 4.1.0
if: matrix.python-version == '3.13' && matrix.os == 'ubuntu-latest'
run: |
pip install -e .[ray]
# use macOS to test xgboost 1, but macOS also supports xgboost 2
pip install "xgboost<2"
- name: If linux, install prophet on python < 3.9
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.8'
pip install pyspark==4.1.0
pip list | grep "pyspark"
# # TODO: support ray
# - name: If linux and python<3.11, install ray 2
# if: matrix.os == 'ubuntu-latest' && matrix.python-version < '3.11'
# run: |
# pip install "ray[tune]<2.5.0"
- name: Install prophet when on linux
if: matrix.os == 'ubuntu-latest'
run: |
pip install -e .[forecast]
- name: Install vw on python < 3.10
if: matrix.python-version == '3.8' || matrix.python-version == '3.9'
# TODO: support vw for python 3.10+
- name: If linux and python<3.10, install vw
if: matrix.os == 'ubuntu-latest' && matrix.python-version < '3.10'
run: |
pip install -e .[vw]
- name: Test with pytest
if: matrix.python-version != '3.10'
- name: Pip freeze
run: |
pytest test/
pip freeze
- name: Check dependencies
run: |
python test/check_dependency.py
- name: Clear pip cache
run: |
pip cache purge
- name: Test with pytest
timeout-minutes: 120
if: matrix.python-version != '3.11'
run: |
pytest test/ --ignore=test/autogen --reruns 2 --reruns-delay 10
- name: Coverage
if: matrix.python-version == '3.10'
timeout-minutes: 120
if: matrix.python-version == '3.11'
run: |
pip install coverage
coverage run -a -m pytest test
coverage run -a -m pytest test --ignore=test/autogen --reruns 2 --reruns-delay 10
coverage xml
- name: Upload coverage to Codecov
if: matrix.python-version == '3.10'
if: matrix.python-version == '3.11'
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: unittests
- name: Save dependencies
if: github.ref == 'refs/heads/main'
shell: bash
run: |
git config --global user.name 'github-actions[bot]'
git config --global user.email 'github-actions[bot]@users.noreply.github.com'
git config advice.addIgnoredFile false
# docs:
BRANCH=unit-tests-installed-dependencies
git fetch origin
git checkout -B "$BRANCH" "origin/$BRANCH"
# runs-on: ubuntu-latest
# steps:
# - uses: actions/checkout@v3
# - name: Setup Python
# uses: actions/setup-python@v4
# with:
# python-version: '3.8'
# - name: Compile documentation
# run: |
# pip install -e .
# python -m pip install sphinx sphinx_rtd_theme
# cd docs
# make html
# - name: Deploy to GitHub pages
# if: ${{ github.ref == 'refs/heads/main' }}
# uses: JamesIves/github-pages-deploy-action@3.6.2
# with:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# BRANCH: gh-pages
# FOLDER: docs/_build/html
# CLEAN: true
pip freeze > installed_all_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
python test/check_dependency.py > installed_first_tier_dependencies_${{ matrix.python-version }}_${{ matrix.os }}.txt
git add installed_*dependencies*.txt
mv coverage.xml ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
git add -f ./coverage_${{ matrix.python-version }}_${{ matrix.os }}.xml || true
git commit -m "Update installed dependencies for Python ${{ matrix.python-version }} on ${{ matrix.os }}" || exit 0
git push origin "$BRANCH" --force

.gitignore (7 changed lines)

@@ -60,6 +60,7 @@ coverage.xml
.hypothesis/
.pytest_cache/
cover/
junit
# Translations
*.mo
@@ -172,7 +173,7 @@ test/default
test/housing.json
test/nlp/default/transformer_ms/seq-classification.json
flaml/fabric/fanova/_fanova.c
flaml/fabric/fanova/*fanova.c
# local config files
*.config.local
@@ -184,3 +185,7 @@ notebook/lightning_logs/
lightning_logs/
flaml/autogen/extensions/tmp/
test/autogen/my_tmp/
catboost_*
# Internal configs
.pypirc


@@ -36,7 +36,7 @@ repos:
- id: black
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.17
rev: 0.7.22
hooks:
- id: mdformat
additional_dependencies:


@@ -1,5 +1,5 @@
# basic setup
FROM mcr.microsoft.com/devcontainers/python:3.8
FROM mcr.microsoft.com/devcontainers/python:3.10
RUN apt-get update && apt-get -y update
RUN apt-get install -y sudo git npm


@@ -4,8 +4,8 @@ This repository incorporates material as listed below or described in the code.
## Component. Ray.
Code in tune/\[analysis.py, sample.py, trial.py, result.py\],
searcher/\[suggestion.py, variant_generator.py\], and scheduler/trial_scheduler.py is adapted from
Code in tune/[analysis.py, sample.py, trial.py, result.py],
searcher/[suggestion.py, variant_generator.py], and scheduler/trial_scheduler.py is adapted from
https://github.com/ray-project/ray/blob/master/python/ray/tune/
## Open Source License/Copyright Notice.


@@ -14,15 +14,9 @@
<br>
</p>
:fire: FLAML supports AutoML and Hyperparameter Tuning in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/automated-machine-learning-fabric). In addition, we've introduced Python 3.11 support, along with a range of new estimators, and comprehensive integration with MLflow—thanks to contributions from the Microsoft Fabric product team.
:fire: FLAML supports AutoML and Hyperparameter Tuning in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/automated-machine-learning-fabric). In addition, we've introduced Python 3.11 and 3.12 support, along with a range of new estimators, and comprehensive integration with MLflow—thanks to contributions from the Microsoft Fabric product team.
:fire: Heads-up: We have migrated [AutoGen](https://microsoft.github.io/autogen/) into a dedicated [github repository](https://github.com/microsoft/autogen). Alongside this move, we have also launched a dedicated [Discord](https://discord.gg/pAbnFJrkgZ) server and a [website](https://microsoft.github.io/autogen/) for comprehensive documentation.
:fire: The automated multi-agent chat framework in [AutoGen](https://microsoft.github.io/autogen/) is in preview from v2.0.0.
:fire: FLAML is highlighted in OpenAI's [cookbook](https://github.com/openai/openai-cookbook#related-resources-from-around-the-web).
:fire: [autogen](https://microsoft.github.io/autogen/) is released with support for ChatGPT and GPT-4, based on [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673).
:fire: Heads-up: [AutoGen](https://microsoft.github.io/autogen/) has moved to a dedicated [GitHub repository](https://github.com/microsoft/autogen). FLAML no longer includes the `autogen` module—please use AutoGen directly.
## What is FLAML
@@ -30,7 +24,7 @@ FLAML is a lightweight Python library for efficient automation of machine
learning and AI operations. It automates workflow based on large language models, machine learning models, etc.
and optimizes their performance.
- FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
- FLAML enables economical automation and tuning for ML/AI workflows, including model selection and hyperparameter optimization under resource constraints.
- For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range.
- It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
@@ -40,16 +34,16 @@ FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source,
## Installation
FLAML requires **Python version >= 3.8**. It can be installed from pip:
The latest version of FLAML requires **Python >= 3.10 and < 3.14**. While other Python versions may work for core components, full model support is not guaranteed. FLAML can be installed via `pip`:
```bash
pip install flaml
```
Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`autogen`](https://microsoft.github.io/autogen/) package.
Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`automl`](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML) module.
```bash
pip install "flaml[autogen]"
pip install "flaml[automl]"
```
Find more options in [Installation](https://microsoft.github.io/FLAML/docs/Installation).
@@ -57,39 +51,6 @@ Each of the [`notebook examples`](https://github.com/microsoft/FLAML/tree/main/n
## Quickstart
- (New) The [autogen](https://microsoft.github.io/autogen/) package enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
It offers customizable and conversable agents that integrate LLMs, tools, and humans.
By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
```python
from flaml import autogen
assistant = autogen.AssistantAgent("assistant")
user_proxy = autogen.UserProxyAgent("user_proxy")
user_proxy.initiate_chat(
assistant,
message="Show me the YTD gain of 10 largest technology companies as of today.",
)
# This initiates an automated chat between the two agents to solve the task
```
Autogen also helps maximize the utility of expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement for `openai.Completion` or `openai.ChatCompletion` with powerful functionalities like tuning, caching, templating, and filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics, and budgets.
```python
# perform tuning
config, analysis = autogen.Completion.tune(
data=tune_data,
metric="success",
mode="max",
eval_func=eval_func,
inference_budget=0.05,
optimization_budget=3,
num_samples=-1,
)
# perform inference for a test instance
response = autogen.Completion.create(context=test_instance, **config)
```
- With three lines of code, you can start using this economical and fast
AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).
@@ -111,7 +72,10 @@ automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
```python
from flaml import tune
tune.run(evaluation_function, config={}, low_cost_partial_config={}, time_budget_s=3600)
tune.run(
evaluation_function, config={}, low_cost_partial_config={}, time_budget_s=3600
)
```
- [Zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.
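A minimal sketch of the zero-shot workflow, assuming the `flaml.default` drop-in estimators described in the linked docs and a working `lightgbm` install; the dataset below is only illustrative.
```python
# Zero-shot AutoML sketch: the "flamlized" estimator picks data-dependent default
# hyperparameters at fit time, so there is no tuning loop to run.
from flaml.default import LGBMClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
clf = LGBMClassifier()
clf.fit(X, y)  # defaults are chosen based on the training data's meta-features
print(clf.predict(X[:5]))
```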

View File

@@ -12,7 +12,7 @@ If you believe you have found a security vulnerability in any Microsoft-owned re
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).

View File

@@ -1,3 +1,12 @@
import warnings
from .agentchat import *
from .code_utils import DEFAULT_MODEL, FAST_MODEL
from .oai import *
warnings.warn(
"The `flaml.autogen` module is deprecated and will be removed in a future release. "
"Please refer to `https://github.com/microsoft/autogen` for latest usage.",
DeprecationWarning,
stacklevel=2,
)
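For downstream code that still imports `flaml.autogen` while migrating, the warning can be filtered explicitly; a minimal sketch using only the standard `warnings` module:
```python
# Temporarily silence the deprecation warning emitted above while migrating
# imports to the standalone autogen package.
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    from flaml import autogen  # would otherwise emit DeprecationWarning on first import
```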

View File

@@ -156,7 +156,7 @@ class MathUserProxyAgent(UserProxyAgent):
when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True.
default_auto_reply (str or dict or None): the default auto reply message when no code execution or llm based reply is generated.
max_invalid_q_per_step (int): (ADDED) the maximum number of invalid queries per step.
**kwargs (dict): other kwargs in [UserProxyAgent](user_proxy_agent#__init__).
**kwargs (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).
"""
super().__init__(
name=name,

View File

@@ -123,7 +123,7 @@ class RetrieveUserProxyAgent(UserProxyAgent):
can be found at `https://www.sbert.net/docs/pretrained_models.html`. The default model is a
fast model. If you want to use a high performance model, `all-mpnet-base-v2` is recommended.
- customized_prompt (Optional, str): the customized prompt for the retrieve chat. Default is None.
**kwargs (dict): other kwargs in [UserProxyAgent](user_proxy_agent#__init__).
**kwargs (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).
"""
super().__init__(
name=name,

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
try:
from sklearn.ensemble import HistGradientBoostingClassifier, HistGradientBoostingRegressor
except ImportError:
pass
except ImportError as e:
print(f"scikit-learn is required for HistGradientBoostingEstimator. Please install it; error: {e}")
from flaml import tune
from flaml.automl.model import SKLearnEstimator

View File

@@ -2,13 +2,18 @@
# * Copyright (c) Microsoft Corporation. All rights reserved.
# * Licensed under the MIT License. See LICENSE file in the
# * project root for license information.
import json
import os
from datetime import datetime
import random
import re
import uuid
from datetime import datetime, timedelta
from decimal import ROUND_HALF_UP, Decimal
from typing import TYPE_CHECKING, Union
import numpy as np
from flaml.automl.spark import DataFrame, Series, pd, ps, psDataFrame, psSeries
from flaml.automl.spark import DataFrame, F, Series, T, pd, ps, psDataFrame, psSeries
from flaml.automl.training_log import training_log_reader
try:
@@ -19,6 +24,7 @@ except ImportError:
if TYPE_CHECKING:
from flaml.automl.task import Task
TS_TIMESTAMP_COL = "ds"
TS_VALUE_COL = "y"
@@ -45,7 +51,10 @@ def load_openml_dataset(dataset_id, data_dir=None, random_state=0, dataset_forma
"""
import pickle
import openml
try:
import openml
except ImportError:
openml = None
from sklearn.model_selection import train_test_split
filename = "openml_ds" + str(dataset_id) + ".pkl"
@@ -56,15 +65,15 @@ def load_openml_dataset(dataset_id, data_dir=None, random_state=0, dataset_forma
dataset = pickle.load(f)
else:
print("download dataset from openml")
dataset = openml.datasets.get_dataset(dataset_id)
dataset = openml.datasets.get_dataset(dataset_id) if openml else None
if not os.path.exists(data_dir):
os.makedirs(data_dir)
with open(filepath, "wb") as f:
pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
print("Dataset name:", dataset.name)
print("Dataset name:", dataset.name) if dataset else None
try:
X, y, *__ = dataset.get_data(target=dataset.default_target_attribute, dataset_format=dataset_format)
except ValueError:
except (ValueError, AttributeError, TypeError):
from sklearn.datasets import fetch_openml
X, y = fetch_openml(data_id=dataset_id, return_X_y=True)
@@ -445,3 +454,343 @@ class DataTransformer:
def group_counts(groups):
_, i, c = np.unique(groups, return_counts=True, return_index=True)
return c[np.argsort(i)]
def get_random_dataframe(n_rows: int = 200, ratio_none: float = 0.1, seed: int = 42) -> DataFrame:
"""Generate a random pandas DataFrame with various data types for testing.
This function creates a DataFrame with multiple column types including:
- Timestamps
- Integers
- Floats
- Categorical values
- Booleans
- Lists (tags)
- Decimal strings
- UUIDs
- Binary data (as hex strings)
- JSON blobs
- Nullable text fields
Parameters
----------
n_rows : int, default=200
Number of rows in the generated DataFrame
ratio_none : float, default=0.1
Probability of generating None values in applicable columns
seed : int, default=42
Random seed for reproducibility
Returns
-------
pd.DataFrame
A DataFrame with 14 columns of various data types
Examples
--------
>>> df = get_random_dataframe(100, 0.05, 123)
>>> df.shape
(100, 14)
>>> df.dtypes
timestamp datetime64[ns]
id int64
score float64
status object
flag object
count object
value object
tags object
rating object
uuid object
binary object
json_blob object
category category
nullable_text object
dtype: object
"""
np.random.seed(seed)
random.seed(seed)
def random_tags():
tags = ["AI", "ML", "data", "robotics", "vision"]
return random.sample(tags, k=random.randint(1, 3)) if random.random() > ratio_none else None
def random_decimal():
return (
str(Decimal(random.uniform(1, 5)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
if random.random() > ratio_none
else None
)
def random_json_blob():
blob = {"a": random.randint(1, 10), "b": random.random()}
return json.dumps(blob) if random.random() > ratio_none else None
def random_binary():
return bytes(random.randint(0, 255) for _ in range(4)).hex() if random.random() > ratio_none else None
data = {
"timestamp": [
datetime(2020, 1, 1) + timedelta(days=np.random.randint(0, 1000)) if np.random.rand() > ratio_none else None
for _ in range(n_rows)
],
"id": range(1, n_rows + 1),
"score": np.random.uniform(0, 100, n_rows),
"status": np.random.choice(
["active", "inactive", "pending", None],
size=n_rows,
p=[(1 - ratio_none) / 3, (1 - ratio_none) / 3, (1 - ratio_none) / 3, ratio_none],
),
"flag": np.random.choice(
[True, False, None], size=n_rows, p=[(1 - ratio_none) / 2, (1 - ratio_none) / 2, ratio_none]
),
"count": [np.random.randint(0, 100) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"value": [round(np.random.normal(50, 15), 2) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"tags": [random_tags() for _ in range(n_rows)],
"rating": [random_decimal() for _ in range(n_rows)],
"uuid": [str(uuid.uuid4()) if np.random.rand() > ratio_none else None for _ in range(n_rows)],
"binary": [random_binary() for _ in range(n_rows)],
"json_blob": [random_json_blob() for _ in range(n_rows)],
"category": pd.Categorical(
np.random.choice(
["A", "B", "C", None],
size=n_rows,
p=[(1 - ratio_none) / 3, (1 - ratio_none) / 3, (1 - ratio_none) / 3, ratio_none],
)
),
"nullable_text": [random.choice(["Good", "Bad", "Average", None]) for _ in range(n_rows)],
}
return pd.DataFrame(data)
def auto_convert_dtypes_spark(
df: psDataFrame,
na_values: list = None,
category_threshold: float = 0.3,
convert_threshold: float = 0.6,
sample_ratio: float = 0.1,
) -> tuple[psDataFrame, dict]:
"""Automatically convert data types in a PySpark DataFrame using heuristics.
This function analyzes a sample of the DataFrame to infer appropriate data types
and applies the conversions. It handles timestamps, numeric values, booleans,
and categorical fields.
Args:
df: A PySpark DataFrame to convert.
na_values: List of strings to be considered as NA/NaN. Defaults to
['NA', 'na', 'NULL', 'null', ''].
category_threshold: Maximum ratio of unique values to total values
to consider a column categorical. Defaults to 0.3.
convert_threshold: Minimum ratio of successfully converted values required
to apply a type conversion. Defaults to 0.6.
sample_ratio: Fraction of data to sample for type inference. Defaults to 0.1.
Returns:
tuple: (The DataFrame with converted types, A dictionary mapping column names to
their inferred types as strings)
Note:
- 'category' in the schema dict is conceptual as PySpark doesn't have a true
category type like pandas
- The function uses sampling for efficiency with large datasets
"""
n_rows = df.count()
if na_values is None:
na_values = ["NA", "na", "NULL", "null", ""]
# Normalize NA-like values
for colname, coltype in df.dtypes:
if coltype == "string":
df = df.withColumn(
colname,
F.when(F.trim(F.lower(F.col(colname))).isin([v.lower() for v in na_values]), None).otherwise(
F.col(colname)
),
)
schema = {}
for colname in df.columns:
# Sample once at an appropriate ratio
sample_ratio_to_use = min(1.0, sample_ratio if n_rows * sample_ratio > 100 else 100 / n_rows)
col_sample = df.select(colname).sample(withReplacement=False, fraction=sample_ratio_to_use).dropna()
sample_count = col_sample.count()
inferred_type = "string" # Default
if col_sample.dtypes[0][1] != "string":
schema[colname] = col_sample.dtypes[0][1]
continue
if sample_count == 0:
schema[colname] = "string"
continue
# Check if timestamp
ts_col = col_sample.withColumn("parsed", F.to_timestamp(F.col(colname)))
# Check numeric
if (
col_sample.withColumn("n", F.col(colname).cast("double")).filter("n is not null").count()
>= sample_count * convert_threshold
):
# All whole numbers?
all_whole = (
col_sample.withColumn("n", F.col(colname).cast("double"))
.filter("n is not null")
.withColumn("frac", F.abs(F.col("n") % 1))
.filter("frac > 0.000001")
.count()
== 0
)
inferred_type = "int" if all_whole else "double"
# Check low-cardinality (category-like)
elif (
sample_count > 0
and col_sample.select(F.countDistinct(F.col(colname))).collect()[0][0] / sample_count <= category_threshold
):
inferred_type = "category" # Will just be string, but marked as such
# Check if timestamp
elif ts_col.filter(F.col("parsed").isNotNull()).count() >= sample_count * convert_threshold:
inferred_type = "timestamp"
schema[colname] = inferred_type
# Apply inferred schema
for colname, inferred_type in schema.items():
if inferred_type == "int":
df = df.withColumn(colname, F.col(colname).cast(T.IntegerType()))
elif inferred_type == "double":
df = df.withColumn(colname, F.col(colname).cast(T.DoubleType()))
elif inferred_type == "boolean":
df = df.withColumn(
colname,
F.when(F.lower(F.col(colname)).isin("true", "yes", "1"), True)
.when(F.lower(F.col(colname)).isin("false", "no", "0"), False)
.otherwise(None),
)
elif inferred_type == "timestamp":
df = df.withColumn(colname, F.to_timestamp(F.col(colname)))
elif inferred_type == "category":
df = df.withColumn(colname, F.col(colname).cast(T.StringType())) # Marked conceptually
# otherwise keep as string (or original type)
return df, schema
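A minimal usage sketch for the helper above, assuming a local PySpark installation and that it is importable from `flaml.automl.data`; the toy data and the explicit ANSI setting (mirroring the handling added in `flaml.automl.spark` later in this changeset) are illustrative:
```python
# Exercise auto_convert_dtypes_spark on a tiny Spark DataFrame of strings.
from pyspark.sql import SparkSession

from flaml.automl.data import auto_convert_dtypes_spark

spark = SparkSession.builder.master("local[1]").getOrCreate()
spark.conf.set("spark.sql.ansi.enabled", "false")  # invalid string-to-number casts should yield null, not fail

sdf = spark.createDataFrame(
    [("1", "2024-01-01"), ("2", "2024-01-02"), ("NA", "null")],
    ["count", "when"],
)
converted, inferred = auto_convert_dtypes_spark(sdf)
print(inferred)  # mapping of column name -> inferred type string, e.g. {'count': 'int', 'when': 'timestamp'}
converted.printSchema()
```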
def auto_convert_dtypes_pandas(
df: DataFrame,
na_values: list = None,
category_threshold: float = 0.3,
convert_threshold: float = 0.6,
sample_ratio: float = 1.0,
) -> tuple[DataFrame, dict]:
"""Automatically convert data types in a pandas DataFrame using heuristics.
This function analyzes the DataFrame to infer appropriate data types
and applies the conversions. It handles timestamps, timedeltas, numeric values,
and categorical fields.
Args:
df: A pandas DataFrame to convert.
na_values: List of strings to be considered as NA/NaN. Defaults to
['NA', 'na', 'NULL', 'null', ''].
category_threshold: Maximum ratio of unique values to total values
to consider a column categorical. Defaults to 0.3.
convert_threshold: Minimum ratio of successfully converted values required
to apply a type conversion. Defaults to 0.6.
sample_ratio: Fraction of data to sample for type inference; values below 1.0 infer
types from a sample of the rows. Defaults to 1.0 (use all rows).
Returns:
tuple: (The DataFrame with converted types, A dictionary mapping column names to
their inferred types as strings)
"""
if na_values is None:
na_values = {"NA", "na", "NULL", "null", ""}
# Remove the empty string separately (handled by the regex `^\s*$`)
vals = [re.escape(v) for v in na_values if v != ""]
# Build inner alternation group
inner = "|".join(vals) if vals else ""
if inner:
pattern = re.compile(rf"^\s*(?:{inner})?\s*$")
else:
pattern = re.compile(r"^\s*$")
df_converted = df.convert_dtypes()
schema = {}
# Sample if needed (for API compatibility)
if sample_ratio < 1.0:
df = df.sample(frac=sample_ratio)
n_rows = len(df)
for col in df.columns:
series = df[col]
# Replace NA-like values if string
if series.dtype == object:
mask = series.astype(str).str.match(pattern)
series_cleaned = series.where(~mask, np.nan)
else:
series_cleaned = series
# Skip conversion if already non-object data type, except bool which can potentially be categorical
if (
not isinstance(series_cleaned.dtype, pd.BooleanDtype)
and not isinstance(series_cleaned.dtype, pd.StringDtype)
and series_cleaned.dtype != "object"
):
# Keep the original data type for non-object dtypes
df_converted[col] = series
schema[col] = str(series_cleaned.dtype)
continue
# print(f"type: {series_cleaned.dtype}, column: {series_cleaned.name}")
if not isinstance(series_cleaned.dtype, pd.BooleanDtype):
# Try numeric (int or float)
numeric = pd.to_numeric(series_cleaned, errors="coerce")
if numeric.notna().sum() >= n_rows * convert_threshold:
if (numeric.dropna() % 1 == 0).all():
try:
df_converted[col] = numeric.astype("int")  # plain int cast; raises on NaN and falls back to double below
schema[col] = "int"
continue
except Exception:
pass
df_converted[col] = numeric.astype("double")
schema[col] = "double"
continue
# Try datetime
datetime_converted = pd.to_datetime(series_cleaned, errors="coerce")
if datetime_converted.notna().sum() >= n_rows * convert_threshold:
df_converted[col] = datetime_converted
schema[col] = "timestamp"
continue
# Try timedelta
try:
timedelta_converted = pd.to_timedelta(series_cleaned, errors="coerce")
if timedelta_converted.notna().sum() >= n_rows * convert_threshold:
df_converted[col] = timedelta_converted
schema[col] = "timedelta"
continue
except TypeError:
pass
# Try category
try:
unique_ratio = series_cleaned.nunique(dropna=True) / n_rows if n_rows > 0 else 1.0
if unique_ratio <= category_threshold:
df_converted[col] = series_cleaned.astype("category")
schema[col] = "category"
continue
except Exception:
pass
df_converted[col] = series_cleaned.astype("string")
schema[col] = "string"
return df_converted, schema
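A companion sketch for the pandas variant, again assuming it is importable from `flaml.automl.data` (the richer `get_random_dataframe` above can be used for more thorough testing); the expected schema values are indicative and may vary slightly across pandas versions:
```python
# Exercise auto_convert_dtypes_pandas on a small all-string DataFrame.
import pandas as pd

from flaml.automl.data import auto_convert_dtypes_pandas

df = pd.DataFrame(
    {
        "count": [str(i) for i in range(10)],
        "when": [f"2024-01-0{i + 1}" for i in range(9)] + ["null"],
        "status": ["active", "inactive"] * 5,
    }
)
converted, schema = auto_convert_dtypes_pandas(df)
print(schema)  # e.g. {'count': 'int', 'when': 'timestamp', 'status': 'category'}
print(converted.dtypes)
```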

View File

@@ -1,7 +1,37 @@
import logging
import os
class ColoredFormatter(logging.Formatter):
# ANSI escape codes for colors
COLORS = {
# logging.DEBUG: "\033[36m", # Cyan
# logging.INFO: "\033[32m", # Green
logging.WARNING: "\033[33m", # Yellow
logging.ERROR: "\033[31m", # Red
logging.CRITICAL: "\033[1;31m", # Bright Red
}
RESET = "\033[0m" # Reset to default
def __init__(self, fmt, datefmt, use_color=True):
super().__init__(fmt, datefmt)
self.use_color = use_color
def format(self, record):
formatted = super().format(record)
if self.use_color:
color = self.COLORS.get(record.levelno, "")
if color:
return f"{color}{formatted}{self.RESET}"
return formatted
logger = logging.getLogger(__name__)
logger_formatter = logging.Formatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S"
use_color = True
if os.getenv("FLAML_LOG_NO_COLOR"):
use_color = False
logger_formatter = ColoredFormatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S", use_color
)
logger.propagate = False
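A minimal sketch of opting out of the colored output added above, assuming the environment variable is set before FLAML's logging module is first imported:
```python
# Disable ANSI colors in FLAML's log output; the module-level formatter above
# reads FLAML_LOG_NO_COLOR once, at import time.
import os

os.environ["FLAML_LOG_NO_COLOR"] = "1"

import flaml  # noqa: E402  (imported after the environment variable is set)
```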

View File

@@ -127,9 +127,21 @@ def metric_loss_score(
import datasets
datasets_metric_name = huggingface_submetric_to_metric.get(metric_name, metric_name.split(":")[0])
metric = datasets.load_metric(datasets_metric_name, trust_remote_code=True)
metric_mode = huggingface_metric_to_mode[datasets_metric_name]
# datasets>=3 removed load_metric; prefer evaluate if available
try:
import evaluate
metric = evaluate.load(datasets_metric_name, trust_remote_code=True)
except Exception:
if hasattr(datasets, "load_metric"):
metric = datasets.load_metric(datasets_metric_name, trust_remote_code=True)
else:
from datasets import load_metric as _load_metric # older datasets
metric = _load_metric(datasets_metric_name, trust_remote_code=True)
if metric_name.startswith("seqeval"):
y_processed_true = [[labels[tr] for tr in each_list] for each_list in y_processed_true]
elif metric in ("pearsonr", "spearmanr"):
@@ -299,14 +311,14 @@ def get_y_pred(estimator, X, eval_metric, task: Task):
else:
y_pred = estimator.predict(X)
if isinstance(y_pred, Series) or isinstance(y_pred, DataFrame):
if isinstance(y_pred, (Series, DataFrame)):
y_pred = y_pred.values
return y_pred
def to_numpy(x):
if isinstance(x, Series or isinstance(x, DataFrame)):
if isinstance(x, (Series, DataFrame)):
x = x.values
else:
x = np.ndarray(x)
@@ -574,7 +586,7 @@ def _eval_estimator(
# TODO: why are integer labels being cast to str in the first place?
if isinstance(val_pred_y, Series) or isinstance(val_pred_y, DataFrame) or isinstance(val_pred_y, np.ndarray):
if isinstance(val_pred_y, (Series, DataFrame, np.ndarray)):
test = val_pred_y if isinstance(val_pred_y, np.ndarray) else val_pred_y.values
if not np.issubdtype(test.dtype, np.number):
# some NLP models return a list
@@ -604,7 +616,12 @@ def _eval_estimator(
logger.warning(f"ValueError {e} happened in `metric_loss_score`, set `val_loss` to `np.inf`")
metric_for_logging = {"pred_time": pred_time}
if log_training_metric:
train_pred_y = get_y_pred(estimator, X_train, eval_metric, task)
# For time series forecasting, X_train may be a sampled dataset whose
# test partition can be empty. Use the training partition from X_val
# (which is the dataset used to define y_train above) to keep shapes
# aligned and avoid empty prediction inputs.
X_train_for_metric = X_val.X_train if isinstance(X_val, TimeSeriesDataset) else X_train
train_pred_y = get_y_pred(estimator, X_train_for_metric, eval_metric, task)
metric_for_logging["train_loss"] = metric_loss_score(
eval_metric,
train_pred_y,

View File

@@ -26,6 +26,13 @@ from sklearn.preprocessing import Normalizer
from sklearn.svm import LinearSVC
from xgboost import __version__ as xgboost_version
try:
from sklearn.utils._tags import ClassifierTags, RegressorTags
SKLEARN_TAGS_AVAILABLE = True
except ImportError:
SKLEARN_TAGS_AVAILABLE = False
from flaml import tune
from flaml.automl.data import group_counts
from flaml.automl.spark import ERROR as SPARK_ERROR
@@ -111,7 +118,7 @@ def limit_resource(memory_limit, time_limit):
pass
class BaseEstimator:
class BaseEstimator(sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator):
"""The abstract class for all learners.
Typical examples:
@@ -135,6 +142,7 @@ class BaseEstimator:
self._task = task if isinstance(task, Task) else task_factory(task, None, None)
self.params = self.config2params(config)
self.estimator_class = self._model = None
self.estimator_baseclass = "sklearn"
if "_estimator_type" in self.params:
self._estimator_type = self.params.pop("_estimator_type")
else:
@@ -147,6 +155,25 @@ class BaseEstimator:
params["_estimator_type"] = self._estimator_type
return params
def __sklearn_tags__(self):
"""Override sklearn tags to respect the _estimator_type attribute.
This is needed for sklearn 1.7+ which uses get_tags() instead of
checking _estimator_type directly. Since BaseEstimator inherits from
ClassifierMixin, it would otherwise always be tagged as a classifier.
"""
tags = super().__sklearn_tags__()
if hasattr(self, "_estimator_type") and SKLEARN_TAGS_AVAILABLE:
if self._estimator_type == "regressor":
tags.estimator_type = "regressor"
tags.regressor_tags = RegressorTags()
tags.classifier_tags = None
elif self._estimator_type == "classifier":
tags.estimator_type = "classifier"
tags.classifier_tags = ClassifierTags()
tags.regressor_tags = None
return tags
@property
def classes_(self):
return self._model.classes_
@@ -294,6 +321,35 @@ class BaseEstimator:
train_time = self._fit(X_train, y_train, **kwargs)
return train_time
def preprocess(self, X):
"""Preprocess data using estimator-level preprocessing.
This method applies estimator-specific preprocessing transformations to the input data.
This is the second level of preprocessing that should be applied after task-level
preprocessing (automl.preprocess()). Different estimator types may apply different
preprocessing steps (e.g., sparse matrix conversion, dataframe handling).
Args:
X: A numpy array or a dataframe of featurized instances, shape n*m.
Returns:
Preprocessed data ready for the estimator's predict/fit methods.
Example:
```python
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
# First apply task-level preprocessing
X_test_task = automl.preprocess(X_test)
# Then apply estimator-level preprocessing
estimator = automl.model
X_test_estimator = estimator.preprocess(X_test_task)
```
"""
return self._preprocess(X)
def predict(self, X, **kwargs):
"""Predict label from features.
@@ -439,6 +495,7 @@ class SparkEstimator(BaseEstimator):
raise SPARK_ERROR
super().__init__(task, **config)
self.df_train = None
self.estimator_baseclass = "spark"
def _preprocess(
self,
@@ -974,7 +1031,7 @@ class TransformersEstimator(BaseEstimator):
from .nlp.huggingface.utils import tokenize_text
from .nlp.utils import is_a_list_of_str
is_str = str(X.dtypes[0]) in ("string", "str")
is_str = str(X.dtypes.iloc[0]) in ("string", "str")
is_list_of_str = is_a_list_of_str(X[list(X.keys())[0]].to_list()[0])
if is_str or is_list_of_str:
@@ -1139,16 +1196,31 @@ class TransformersEstimator(BaseEstimator):
control.should_save = True
control.should_evaluate = True
self._trainer = TrainerForAuto(
args=self._training_args,
model_init=self._model_init,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=self.tokenizer,
data_collator=self.data_collator,
compute_metrics=self._compute_metrics_by_dataset_name,
callbacks=[EarlyStoppingCallbackForAuto],
)
# Use processing_class for transformers >= 4.44.0, tokenizer for older versions
trainer_kwargs = {
"args": self._training_args,
"model_init": self._model_init,
"train_dataset": train_dataset,
"eval_dataset": eval_dataset,
"data_collator": self.data_collator,
"compute_metrics": self._compute_metrics_by_dataset_name,
"callbacks": [EarlyStoppingCallbackForAuto],
}
# Check if processing_class parameter is supported (transformers >= 4.44.0)
try:
import transformers
from packaging import version
if version.parse(transformers.__version__) >= version.parse("4.44.0"):
trainer_kwargs["processing_class"] = self.tokenizer
else:
trainer_kwargs["tokenizer"] = self.tokenizer
except (ImportError, AttributeError, ValueError):
# Fallback to tokenizer if version check fails
trainer_kwargs["tokenizer"] = self.tokenizer
self._trainer = TrainerForAuto(**trainer_kwargs)
if self._task in NLG_TASKS:
setattr(self._trainer, "_is_seq2seq", True)
@@ -2347,8 +2419,11 @@ class SGDEstimator(SKLearnEstimator):
params = super().config2params(config)
params["tol"] = params.get("tol", 0.0001)
params["loss"] = params.get("loss", None)
if params["loss"] is None and self._task.is_classification():
params["loss"] = "log_loss" if SKLEARN_VERSION >= "1.1" else "log"
if params["loss"] is None:
if self._task.is_classification():
params["loss"] = "log_loss" if SKLEARN_VERSION >= "1.1" else "log"
else:
params["loss"] = "squared_error"
if not self._task.is_classification() and "n_jobs" in params:
params.pop("n_jobs")
@@ -2820,7 +2895,7 @@ class suppress_stdout_stderr:
# Open a pair of null files
self.null_fds = [os.open(os.devnull, os.O_RDWR) for x in range(2)]
# Save the actual stdout (1) and stderr (2) file descriptors.
self.save_fds = (os.dup(1), os.dup(2))
self.save_fds = [os.dup(1), os.dup(2)]
def __enter__(self):
# Assign the null pointers to stdout and stderr.
@@ -2832,5 +2907,5 @@ class suppress_stdout_stderr:
os.dup2(self.save_fds[0], 1)
os.dup2(self.save_fds[1], 2)
# Close the null files
os.close(self.null_fds[0])
os.close(self.null_fds[1])
for fd in self.null_fds + self.save_fds:
os.close(fd)

View File

@@ -5,7 +5,7 @@ from typing import List, Optional
from flaml.automl.task.task import NLG_TASKS
try:
from transformers import TrainingArguments
from transformers import Seq2SeqTrainingArguments as TrainingArguments
except ImportError:
TrainingArguments = object
@@ -77,6 +77,14 @@ class TrainingArgumentsForAuto(TrainingArguments):
logging_steps: int = field(default=500, metadata={"help": "Log every X updates steps."})
# Newer versions of HuggingFace Transformers may access `TrainingArguments.generation_config`
# (e.g., in generation-aware trainers/callbacks). Keep this attribute to remain compatible
# while defaulting to None for non-generation tasks.
generation_config: Optional[object] = field(
default=None,
metadata={"help": "Optional generation config (or path) used by generation-aware trainers."},
)
@staticmethod
def load_args_from_console():
from dataclasses import fields

View File

@@ -211,29 +211,28 @@ def tokenize_onedataframe(
hf_args=None,
prefix_str=None,
):
with tokenizer.as_target_tokenizer():
_, tokenized_column_names = tokenize_row(
dict(X.iloc[0]),
_, tokenized_column_names = tokenize_row(
dict(X.iloc[0]),
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
return_column_name=True,
)
d = X.apply(
lambda x: tokenize_row(
x,
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
return_column_name=True,
)
d = X.apply(
lambda x: tokenize_row(
x,
tokenizer,
prefix=(prefix_str,) if task is SUMMARIZATION else None,
task=task,
hf_args=hf_args,
),
axis=1,
result_type="expand",
)
X_tokenized = pd.DataFrame(columns=tokenized_column_names)
X_tokenized[tokenized_column_names] = d
return X_tokenized
),
axis=1,
result_type="expand",
)
X_tokenized = pd.DataFrame(columns=tokenized_column_names)
X_tokenized[tokenized_column_names] = d
return X_tokenized
def tokenize_row(
@@ -396,7 +395,7 @@ def load_model(checkpoint_path, task, num_labels=None):
if task in (SEQCLASSIFICATION, SEQREGRESSION):
return AutoModelForSequenceClassification.from_pretrained(
checkpoint_path, config=model_config, ignore_mismatched_sizes=True
checkpoint_path, config=model_config, ignore_mismatched_sizes=True, trust_remote_code=True
)
elif task == TOKENCLASSIFICATION:
return AutoModelForTokenClassification.from_pretrained(checkpoint_path, config=model_config)

View File

@@ -25,9 +25,7 @@ def load_default_huggingface_metric_for_task(task):
def is_a_list_of_str(this_obj):
return (isinstance(this_obj, list) or isinstance(this_obj, np.ndarray)) and all(
isinstance(x, str) for x in this_obj
)
return isinstance(this_obj, (list, np.ndarray)) and all(isinstance(x, str) for x in this_obj)
def _clean_value(value: Any) -> str:

View File

@@ -1,3 +1,5 @@
import atexit
import logging
import os
os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
@@ -10,13 +12,14 @@ try:
from pyspark.pandas import Series as psSeries
from pyspark.pandas import set_option
from pyspark.sql import DataFrame as sparkDataFrame
from pyspark.sql import SparkSession
from pyspark.util import VersionUtils
except ImportError:
class psDataFrame:
pass
F = T = ps = sparkDataFrame = psSeries = psDataFrame
F = T = ps = sparkDataFrame = SparkSession = psSeries = psDataFrame
_spark_major_minor_version = set_option = None
ERROR = ImportError(
"""Please run pip install flaml[spark]
@@ -32,3 +35,60 @@ try:
from pandas import DataFrame, Series
except ImportError:
DataFrame = Series = pd = None
logger = logging.getLogger(__name__)
def disable_spark_ansi_mode():
"""Disable Spark ANSI mode if it is enabled."""
spark = SparkSession.getActiveSession() if hasattr(SparkSession, "getActiveSession") else None
adjusted = False
try:
ps_conf = ps.get_option("compute.fail_on_ansi_mode")
except Exception:
ps_conf = None
ansi_conf = [None, ps_conf] # ansi_conf and ps_conf original values
# Spark may store the config as string 'true'/'false' (or boolean in some contexts)
if spark is not None:
ansi_conf[0] = spark.conf.get("spark.sql.ansi.enabled")
ansi_enabled = (
(isinstance(ansi_conf[0], str) and ansi_conf[0].lower() == "true")
or (isinstance(ansi_conf[0], bool) and ansi_conf[0] is True)
or ansi_conf[0] is None
)
try:
if ansi_enabled:
logger.debug("Adjusting spark.sql.ansi.enabled to false")
spark.conf.set("spark.sql.ansi.enabled", "false")
adjusted = True
except Exception:
# If reading/setting options fail for some reason, keep going and let
# pandas-on-Spark raise a meaningful error later.
logger.exception("Failed to set spark.sql.ansi.enabled")
if ansi_conf[1]:
logger.debug("Adjusting pandas-on-Spark compute.fail_on_ansi_mode to False")
ps.set_option("compute.fail_on_ansi_mode", False)
adjusted = True
return spark, ansi_conf, adjusted
def restore_spark_ansi_mode(spark, ansi_conf, adjusted):
"""Restore Spark ANSI mode to its original setting."""
# Restore the original spark.sql.ansi.enabled to avoid persistent side-effects.
if adjusted and spark and ansi_conf[0] is not None:
try:
logger.debug(f"Restoring spark.sql.ansi.enabled to {ansi_conf[0]}")
spark.conf.set("spark.sql.ansi.enabled", ansi_conf[0])
except Exception:
logger.exception("Failed to restore spark.sql.ansi.enabled")
if adjusted and ansi_conf[1]:
logger.debug(f"Restoring pandas-on-Spark compute.fail_on_ansi_mode to {ansi_conf[1]}")
ps.set_option("compute.fail_on_ansi_mode", ansi_conf[1])
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)

View File

@@ -59,17 +59,29 @@ def to_pandas_on_spark(
```
"""
set_option("compute.default_index_type", default_index_type)
if isinstance(df, (DataFrame, Series)):
return ps.from_pandas(df)
elif isinstance(df, sparkDataFrame):
if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
return df.to_pandas_on_spark(index_col=index_col)
try:
orig_ps_conf = ps.get_option("compute.fail_on_ansi_mode")
except Exception:
orig_ps_conf = None
if orig_ps_conf:
ps.set_option("compute.fail_on_ansi_mode", False)
try:
if isinstance(df, (DataFrame, Series)):
return ps.from_pandas(df)
elif isinstance(df, sparkDataFrame):
if _spark_major_minor_version[0] == 3 and _spark_major_minor_version[1] < 3:
return df.to_pandas_on_spark(index_col=index_col)
else:
return df.pandas_api(index_col=index_col)
elif isinstance(df, (psDataFrame, psSeries)):
return df
else:
return df.pandas_api(index_col=index_col)
elif isinstance(df, (psDataFrame, psSeries)):
return df
else:
raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
raise TypeError(f"{type(df)} is not one of pandas.DataFrame, pandas.Series and pyspark.sql.DataFrame")
finally:
# Restore original config
if orig_ps_conf:
ps.set_option("compute.fail_on_ansi_mode", orig_ps_conf)
def train_test_split_pyspark(

View File

@@ -37,10 +37,9 @@ class SearchState:
if isinstance(domain_one_dim, sample.Domain):
renamed_type = list(inspect.signature(domain_one_dim.is_valid).parameters.values())[0].annotation
type_match = (
renamed_type == Any
renamed_type is Any
or isinstance(value_one_dim, renamed_type)
or isinstance(value_one_dim, int)
and renamed_type is float
or (renamed_type is float and isinstance(value_one_dim, int))
)
if not (type_match and domain_one_dim.is_valid(value_one_dim)):
return False

View File

@@ -365,6 +365,465 @@ class GenericTask(Task):
X_train, X_val, y_train, y_val = GenericTask._split_pyspark(state, X, y, split_ratio, stratify)
return X_train, X_val, y_train, y_val
def _handle_missing_labels_fast(
self,
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
):
"""Handle missing labels by adding first instance to the set with missing label.
This is the faster version that may create some overlap but ensures all labels
are present in both sets. If a label is missing from train, it adds the first
instance to train. If a label is missing from val, it adds the first instance to val.
If no labels are missing, no instances are duplicated.
Args:
state: The state object containing fit parameters
X_train, X_val: Training and validation features
y_train, y_val: Training and validation labels
X_train_all, y_train_all: Complete dataset
is_spark_dataframe: Whether data is pandas_on_spark
data_is_df: Whether data is DataFrame/Series
Returns:
Tuple of (X_train, X_val, y_train, y_val) with missing labels added
"""
# Check which labels are present in train and val sets
if is_spark_dataframe:
label_set_train, _ = unique_pandas_on_spark(y_train)
label_set_val, _ = unique_pandas_on_spark(y_val)
label_set_all, first = unique_value_first_index(y_train_all)
else:
label_set_all, first = unique_value_first_index(y_train_all)
label_set_train = np.unique(y_train)
label_set_val = np.unique(y_val)
# Find missing labels
missing_in_train = np.setdiff1d(label_set_all, label_set_train)
missing_in_val = np.setdiff1d(label_set_all, label_set_val)
# Add first instance of missing labels to train set
if len(missing_in_train) > 0:
missing_train_indices = []
for label in missing_in_train:
label_matches = np.where(label_set_all == label)[0]
if len(label_matches) > 0 and label_matches[0] < len(first):
missing_train_indices.append(first[label_matches[0]])
if len(missing_train_indices) > 0:
X_missing_train = (
iloc_pandas_on_spark(X_train_all, missing_train_indices)
if is_spark_dataframe
else X_train_all.iloc[missing_train_indices]
if data_is_df
else X_train_all[missing_train_indices]
)
y_missing_train = (
iloc_pandas_on_spark(y_train_all, missing_train_indices)
if is_spark_dataframe
else y_train_all.iloc[missing_train_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[missing_train_indices]
)
X_train = concat(X_missing_train, X_train)
y_train = concat(y_missing_train, y_train) if data_is_df else np.concatenate([y_missing_train, y_train])
# Handle sample_weight if present
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(missing_train_indices) < len(sample_weight_source):
missing_weights = (
sample_weight_source[missing_train_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[missing_train_indices]
)
state.fit_kwargs["sample_weight"] = concat(missing_weights, state.fit_kwargs["sample_weight"])
# Add first instance of missing labels to val set
if len(missing_in_val) > 0:
missing_val_indices = []
for label in missing_in_val:
label_matches = np.where(label_set_all == label)[0]
if len(label_matches) > 0 and label_matches[0] < len(first):
missing_val_indices.append(first[label_matches[0]])
if len(missing_val_indices) > 0:
X_missing_val = (
iloc_pandas_on_spark(X_train_all, missing_val_indices)
if is_spark_dataframe
else X_train_all.iloc[missing_val_indices]
if data_is_df
else X_train_all[missing_val_indices]
)
y_missing_val = (
iloc_pandas_on_spark(y_train_all, missing_val_indices)
if is_spark_dataframe
else y_train_all.iloc[missing_val_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[missing_val_indices]
)
X_val = concat(X_missing_val, X_val)
y_val = concat(y_missing_val, y_val) if data_is_df else np.concatenate([y_missing_val, y_val])
# Handle sample_weight if present
if (
"sample_weight" in state.fit_kwargs
and hasattr(state, "weight_val")
and state.weight_val is not None
):
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(missing_val_indices) < len(sample_weight_source):
missing_weights = (
sample_weight_source[missing_val_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[missing_val_indices]
)
state.weight_val = concat(missing_weights, state.weight_val)
return X_train, X_val, y_train, y_val
def _handle_missing_labels_no_overlap(
self,
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
split_ratio,
):
"""Handle missing labels intelligently to avoid overlap when possible.
This is the slower but more precise version that:
- For single-instance classes: Adds to both sets (unavoidable overlap)
- For multi-instance classes: Re-splits them properly to avoid overlap
Args:
state: The state object containing fit parameters
X_train, X_val: Training and validation features
y_train, y_val: Training and validation labels
X_train_all, y_train_all: Complete dataset
is_spark_dataframe: Whether data is pandas_on_spark
data_is_df: Whether data is DataFrame/Series
split_ratio: The ratio for splitting
Returns:
Tuple of (X_train, X_val, y_train, y_val) with missing labels handled
"""
# Check which labels are present in train and val sets
if is_spark_dataframe:
label_set_train, _ = unique_pandas_on_spark(y_train)
label_set_val, _ = unique_pandas_on_spark(y_val)
label_set_all, first = unique_value_first_index(y_train_all)
else:
label_set_all, first = unique_value_first_index(y_train_all)
label_set_train = np.unique(y_train)
label_set_val = np.unique(y_val)
# Find missing labels
missing_in_train = np.setdiff1d(label_set_all, label_set_train)
missing_in_val = np.setdiff1d(label_set_all, label_set_val)
# Handle missing labels intelligently
# For classes with only 1 instance: add to both sets (unavoidable overlap)
# For classes with multiple instances: move/split them properly to avoid overlap
if len(missing_in_train) > 0:
# Process missing labels in training set
for label in missing_in_train:
# Find all indices for this label in the original data
if is_spark_dataframe:
label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
else:
label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
num_instances = len(label_indices)
if num_instances == 1:
# Single instance: must add to both train and val (unavoidable overlap)
X_single = (
iloc_pandas_on_spark(X_train_all, label_indices)
if is_spark_dataframe
else X_train_all.iloc[label_indices]
if data_is_df
else X_train_all[label_indices]
)
y_single = (
iloc_pandas_on_spark(y_train_all, label_indices)
if is_spark_dataframe
else y_train_all.iloc[label_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[label_indices]
)
X_train = concat(X_single, X_train)
y_train = concat(y_single, y_train) if data_is_df else np.concatenate([y_single, y_train])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
single_weight = (
sample_weight_source[label_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[label_indices]
)
state.fit_kwargs["sample_weight"] = concat(single_weight, state.fit_kwargs["sample_weight"])
else:
# Multiple instances: move some from val to train (no overlap needed)
# Calculate how many to move to train (leave at least 1 in val)
num_to_train = max(1, min(num_instances - 1, int(num_instances * (1 - split_ratio))))
indices_to_move = label_indices[:num_to_train]
X_to_move = (
iloc_pandas_on_spark(X_train_all, indices_to_move)
if is_spark_dataframe
else X_train_all.iloc[indices_to_move]
if data_is_df
else X_train_all[indices_to_move]
)
y_to_move = (
iloc_pandas_on_spark(y_train_all, indices_to_move)
if is_spark_dataframe
else y_train_all.iloc[indices_to_move]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[indices_to_move]
)
# Add to train
X_train = concat(X_to_move, X_train)
y_train = concat(y_to_move, y_train) if data_is_df else np.concatenate([y_to_move, y_train])
# Remove from val (they are currently all in val)
if is_spark_dataframe:
val_mask = ~y_val.isin([label])
X_val = X_val[val_mask]
y_val = y_val[val_mask]
else:
val_mask = np.asarray(y_val) != label
if data_is_df:
X_val = X_val[val_mask]
y_val = y_val[val_mask]
else:
X_val = X_val[val_mask]
y_val = y_val[val_mask]
# Add remaining instances back to val
remaining_indices = label_indices[num_to_train:]
if len(remaining_indices) > 0:
X_remaining = (
iloc_pandas_on_spark(X_train_all, remaining_indices)
if is_spark_dataframe
else X_train_all.iloc[remaining_indices]
if data_is_df
else X_train_all[remaining_indices]
)
y_remaining = (
iloc_pandas_on_spark(y_train_all, remaining_indices)
if is_spark_dataframe
else y_train_all.iloc[remaining_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[remaining_indices]
)
X_val = concat(X_remaining, X_val)
y_val = concat(y_remaining, y_val) if data_is_df else np.concatenate([y_remaining, y_val])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
weights_to_move = (
sample_weight_source[indices_to_move]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[indices_to_move]
)
state.fit_kwargs["sample_weight"] = concat(
weights_to_move, state.fit_kwargs["sample_weight"]
)
if (
len(remaining_indices) > 0
and hasattr(state, "weight_val")
and state.weight_val is not None
):
# Remove and re-add weights for val
if isinstance(state.weight_val, np.ndarray):
state.weight_val = state.weight_val[val_mask]
else:
state.weight_val = state.weight_val[val_mask]
if max(remaining_indices) < len(sample_weight_source):
remaining_weights = (
sample_weight_source[remaining_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[remaining_indices]
)
state.weight_val = concat(remaining_weights, state.weight_val)
if len(missing_in_val) > 0:
# Process missing labels in validation set
for label in missing_in_val:
# Find all indices for this label in the original data
if is_spark_dataframe:
label_indices = np.where(y_train_all.to_numpy() == label)[0].tolist()
else:
label_indices = np.where(np.asarray(y_train_all) == label)[0].tolist()
num_instances = len(label_indices)
if num_instances == 1:
# Single instance: must add to both train and val (unavoidable overlap)
X_single = (
iloc_pandas_on_spark(X_train_all, label_indices)
if is_spark_dataframe
else X_train_all.iloc[label_indices]
if data_is_df
else X_train_all[label_indices]
)
y_single = (
iloc_pandas_on_spark(y_train_all, label_indices)
if is_spark_dataframe
else y_train_all.iloc[label_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[label_indices]
)
X_val = concat(X_single, X_val)
y_val = concat(y_single, y_val) if data_is_df else np.concatenate([y_single, y_val])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs and hasattr(state, "weight_val"):
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and label_indices[0] < len(sample_weight_source):
single_weight = (
sample_weight_source[label_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[label_indices]
)
if state.weight_val is not None:
state.weight_val = concat(single_weight, state.weight_val)
else:
# Multiple instances: move some from train to val (no overlap needed)
# Calculate how many to move to val (leave at least 1 in train)
num_to_val = max(1, min(num_instances - 1, int(num_instances * split_ratio)))
indices_to_move = label_indices[:num_to_val]
X_to_move = (
iloc_pandas_on_spark(X_train_all, indices_to_move)
if is_spark_dataframe
else X_train_all.iloc[indices_to_move]
if data_is_df
else X_train_all[indices_to_move]
)
y_to_move = (
iloc_pandas_on_spark(y_train_all, indices_to_move)
if is_spark_dataframe
else y_train_all.iloc[indices_to_move]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[indices_to_move]
)
# Add to val
X_val = concat(X_to_move, X_val)
y_val = concat(y_to_move, y_val) if data_is_df else np.concatenate([y_to_move, y_val])
# Remove from train (they are currently all in train)
if is_spark_dataframe:
train_mask = ~y_train.isin([label])
X_train = X_train[train_mask]
y_train = y_train[train_mask]
else:
train_mask = np.asarray(y_train) != label
if data_is_df:
X_train = X_train[train_mask]
y_train = y_train[train_mask]
else:
X_train = X_train[train_mask]
y_train = y_train[train_mask]
# Add remaining instances back to train
remaining_indices = label_indices[num_to_val:]
if len(remaining_indices) > 0:
X_remaining = (
iloc_pandas_on_spark(X_train_all, remaining_indices)
if is_spark_dataframe
else X_train_all.iloc[remaining_indices]
if data_is_df
else X_train_all[remaining_indices]
)
y_remaining = (
iloc_pandas_on_spark(y_train_all, remaining_indices)
if is_spark_dataframe
else y_train_all.iloc[remaining_indices]
if isinstance(y_train_all, (pd.Series, psSeries))
else y_train_all[remaining_indices]
)
X_train = concat(X_remaining, X_train)
y_train = concat(y_remaining, y_train) if data_is_df else np.concatenate([y_remaining, y_train])
# Handle sample_weight
if "sample_weight" in state.fit_kwargs:
sample_weight_source = (
state.sample_weight_all
if hasattr(state, "sample_weight_all")
else state.fit_kwargs.get("sample_weight")
)
if sample_weight_source is not None and max(indices_to_move) < len(sample_weight_source):
weights_to_move = (
sample_weight_source[indices_to_move]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[indices_to_move]
)
if hasattr(state, "weight_val") and state.weight_val is not None:
state.weight_val = concat(weights_to_move, state.weight_val)
if len(remaining_indices) > 0:
# Remove and re-add weights for train
if isinstance(state.fit_kwargs["sample_weight"], np.ndarray):
state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
else:
state.fit_kwargs["sample_weight"] = state.fit_kwargs["sample_weight"][train_mask]
if max(remaining_indices) < len(sample_weight_source):
remaining_weights = (
sample_weight_source[remaining_indices]
if isinstance(sample_weight_source, np.ndarray)
else sample_weight_source.iloc[remaining_indices]
)
state.fit_kwargs["sample_weight"] = concat(
remaining_weights, state.fit_kwargs["sample_weight"]
)
return X_train, X_val, y_train, y_val
def prepare_data(
self,
state,
@@ -377,6 +836,7 @@ class GenericTask(Task):
n_splits,
data_is_df,
sample_weight_full,
allow_label_overlap=True,
) -> int:
X_val, y_val = state.X_val, state.y_val
if issparse(X_val):
@@ -505,59 +965,46 @@ class GenericTask(Task):
elif self.is_classification():
# for classification, make sure the labels are complete in both
# training and validation data
label_set, first = unique_value_first_index(y_train_all)
rest = []
last = 0
first.sort()
for i in range(len(first)):
rest.extend(range(last, first[i]))
last = first[i] + 1
rest.extend(range(last, len(y_train_all)))
X_first = X_train_all.iloc[first] if data_is_df else X_train_all[first]
if len(first) < len(y_train_all) / 2:
# Get X_rest and y_rest with drop, sparse matrix can't apply np.delete
X_rest = (
np.delete(X_train_all, first, axis=0)
if isinstance(X_train_all, np.ndarray)
else X_train_all.drop(first.tolist())
if data_is_df
else X_train_all[rest]
)
y_rest = (
np.delete(y_train_all, first, axis=0)
if isinstance(y_train_all, np.ndarray)
else y_train_all.drop(first.tolist())
if data_is_df
else y_train_all[rest]
stratify = y_train_all if split_type == "stratified" else None
X_train, X_val, y_train, y_val = self._train_test_split(
state, X_train_all, y_train_all, split_ratio=split_ratio, stratify=stratify
)
# Handle missing labels using the appropriate strategy
if allow_label_overlap:
# Fast version: adds first instance to set with missing label (may create overlap)
X_train, X_val, y_train, y_val = self._handle_missing_labels_fast(
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
)
else:
X_rest = (
iloc_pandas_on_spark(X_train_all, rest)
if is_spark_dataframe
else X_train_all.iloc[rest]
if data_is_df
else X_train_all[rest]
# Precise version: avoids overlap when possible (slower)
X_train, X_val, y_train, y_val = self._handle_missing_labels_no_overlap(
state,
X_train,
X_val,
y_train,
y_val,
X_train_all,
y_train_all,
is_spark_dataframe,
data_is_df,
split_ratio,
)
y_rest = (
iloc_pandas_on_spark(y_train_all, rest)
if is_spark_dataframe
else y_train_all.iloc[rest]
if data_is_df
else y_train_all[rest]
)
stratify = y_rest if split_type == "stratified" else None
X_train, X_val, y_train, y_val = self._train_test_split(
state, X_rest, y_rest, first, rest, split_ratio, stratify
)
X_train = concat(X_first, X_train)
y_train = concat(label_set, y_train) if data_is_df else np.concatenate([label_set, y_train])
X_val = concat(X_first, X_val)
y_val = concat(label_set, y_val) if data_is_df else np.concatenate([label_set, y_val])
if isinstance(y_train, (psDataFrame, pd.DataFrame)) and y_train.shape[1] == 1:
y_train = y_train[y_train.columns[0]]
y_val = y_val[y_val.columns[0]]
y_train.name = y_val.name = y_rest.name
# Only set name if y_train_all is a Series (not a DataFrame)
if isinstance(y_train_all, (pd.Series, psSeries)):
y_train.name = y_val.name = y_train_all.name
elif self.is_regression():
X_train, X_val, y_train, y_val = self._train_test_split(
@@ -746,7 +1193,10 @@ class GenericTask(Task):
elif isinstance(kf, TimeSeriesSplit):
kf = kf.split(X_train_split, y_train_split)
else:
kf = kf.split(X_train_split)
try:
kf = kf.split(X_train_split)
except TypeError:
kf = kf.split(X_train_split, y_train_split)
for train_index, val_index in kf:
if shuffle:

View File

@@ -151,7 +151,7 @@ class TimeSeriesTask(Task):
raise ValueError("Must supply either X_train_all and y_train_all, or dataframe and label")
try:
dataframe[self.time_col] = pd.to_datetime(dataframe[self.time_col])
dataframe.loc[:, self.time_col] = pd.to_datetime(dataframe[self.time_col])
except Exception:
raise ValueError(
f"For '{TS_FORECAST}' task, time column {self.time_col} must contain timestamp values."
@@ -386,9 +386,8 @@ class TimeSeriesTask(Task):
return X
def preprocess(self, X, transformer=None):
if isinstance(X, pd.DataFrame) or isinstance(X, np.ndarray) or isinstance(X, pd.Series):
X = X.copy()
X = normalize_ts_data(X, self.target_names, self.time_col)
if isinstance(X, (pd.DataFrame, np.ndarray, pd.Series)):
X = normalize_ts_data(X.copy(), self.target_names, self.time_col)
return self._preprocess(X, transformer)
elif isinstance(X, int):
return X
@@ -529,7 +528,7 @@ def remove_ts_duplicates(
duplicates = X.duplicated()
if any(duplicates):
logger.warning("Duplicate timestamp values found in timestamp column. " f"\n{X.loc[duplicates, X][time_col]}")
logger.warning("Duplicate timestamp values found in timestamp column. " f"\n{X.loc[duplicates, time_col]}")
X = X.drop_duplicates()
logger.warning("Removed duplicate rows based on all columns")
assert (

View File

@@ -17,24 +17,30 @@ from sklearn.preprocessing import StandardScaler
def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):
"""Transform input data X, y into autoregressive form - shift
them appropriately based on horizon and create `lags` columns.
"""Transform input data X, y into autoregressive form by creating `lags` columns.
This function is called automatically by FLAML during the training process
to convert time series data into a format suitable for sklearn-based regression
models (e.g., lgbm, rf, xgboost). Users do NOT need to manually call this function
or create lagged features themselves.
Parameters
----------
X : pandas.DataFrame
Input features.
Input feature DataFrame, which may contain temporal features and/or exogenous variables.
y : array_like, (1d)
Target vector.
Target vector (time series values to forecast).
horizon : int
length of X for `predict` method
lags : int
Number of lagged time steps to use as features.
Returns
-------
pandas.DataFrame
shifted dataframe with `lags` columns
Shifted dataframe with `lags` columns for each original feature.
The target variable y is also lagged to prevent data leakage
(i.e., we use y(t-1), y(t-2), ..., y(t-lags) to predict y(t)).
"""
lag_features = []
@@ -55,6 +61,17 @@ def make_lag_features(X: pd.DataFrame, y: pd.Series, lags: int):
class SklearnWrapper:
"""Wrapper class for using sklearn-based models for time series forecasting.
This wrapper automatically handles the transformation of time series data into
a supervised learning format by creating lagged features. It trains separate
models for each step in the forecast horizon.
Users typically don't interact with this class directly - it's used internally
by FLAML when sklearn-based estimators (lgbm, rf, xgboost, etc.) are selected
for time series forecasting tasks.
"""
def __init__(
self,
model_class: type,
@@ -76,6 +93,8 @@ class SklearnWrapper:
self.pca = None
def fit(self, X: pd.DataFrame, y: pd.Series, **kwargs):
if "is_retrain" in kwargs:
kwargs.pop("is_retrain")
self._X = X
self._y = y
@@ -92,7 +111,14 @@ class SklearnWrapper:
for i, model in enumerate(self.models):
offset = i + self.lags
model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
if len(X) - offset > 2:
# Series with only 2 usable rows can hit "All features are either constant or ignored" in some estimators.
# TODO: see why the non-constant features are ignored. Selector?
model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
elif len(X) > offset and "catboost" not in str(model).lower():
model.fit(X_trans[: len(X) - offset], y[offset:], **fit_params)
else:
print("[INFO]: Length of data should be longer than period + lags.")
return self
def predict(self, X, X_train=None, y_train=None):
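To make the lagging described in `make_lag_features` above concrete, here is a minimal, self-contained pandas sketch (not FLAML's exact implementation):
```python
# Toy illustration of autoregressive lagging: y(t) is predicted from the
# previous `lags` values of every feature and of y itself, so y(t) never leaks.
import pandas as pd


def simple_lag_features(X: pd.DataFrame, y: pd.Series, lags: int) -> pd.DataFrame:
    frames = []
    for lag in range(1, lags + 1):
        shifted = pd.concat([X, y.rename("y")], axis=1).shift(lag)
        shifted.columns = [f"{c}_lag{lag}" for c in shifted.columns]
        frames.append(shifted)
    return pd.concat(frames, axis=1).dropna()


X = pd.DataFrame({"temp": [20.0, 21.0, 19.5, 22.0, 23.0]})
y = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(simple_lag_features(X, y, lags=2))
```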

View File

@@ -264,7 +264,8 @@ class TCNEstimator(TimeSeriesEstimator):
def predict(self, X):
X = self.enrich(X)
if isinstance(X, TimeSeriesDataset):
df = X.X_val
# Use X_train if X_val is empty (e.g., when computing training metrics)
df = X.X_val if len(X.test_data) > 0 else X.X_train
else:
df = X
dataset = DataframeDataset(

View File

@@ -1,3 +1,4 @@
import inspect
import time
try:
@@ -106,12 +107,17 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
def fit(self, X_train, y_train, budget=None, **kwargs):
import warnings
import pytorch_lightning as pl
try:
import lightning.pytorch as pl
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
from lightning.pytorch.loggers import TensorBoardLogger
except ImportError:
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
import torch
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
# a bit of monkey patching to fix the MacOS test
# all the log_prediction method appears to do is plot stuff, which seems to break GitHub tests
@@ -132,12 +138,26 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
lr_logger = LearningRateMonitor() # log the learning rate
logger = TensorBoardLogger(kwargs.get("log_dir", "lightning_logs")) # logging results to a tensorboard
default_trainer_kwargs = dict(
gpus=self._kwargs.get("gpu_per_trial", [0]) if torch.cuda.is_available() else None,
max_epochs=max_epochs,
gradient_clip_val=gradient_clip_val,
callbacks=[lr_logger, early_stop_callback],
logger=logger,
)
# PyTorch Lightning >=2.0 replaced `gpus` with `accelerator`/`devices`.
# Also, passing `gpus=None` is not accepted on newer versions.
trainer_sig_params = inspect.signature(pl.Trainer.__init__).parameters
if torch.cuda.is_available() and "gpus" in trainer_sig_params:
gpus = self._kwargs.get("gpu_per_trial", None)
if gpus is not None:
default_trainer_kwargs["gpus"] = gpus
elif torch.cuda.is_available() and "devices" in trainer_sig_params:
devices = self._kwargs.get("gpu_per_trial", None)
if devices == -1:
devices = "auto"
if devices is not None:
default_trainer_kwargs["accelerator"] = "gpu"
default_trainer_kwargs["devices"] = devices
trainer = pl.Trainer(
**default_trainer_kwargs,
)
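The version handling above keys off the Trainer signature rather than a version string; a minimal standalone sketch of that pattern (the trainer class here is a hypothetical stand-in, not pytorch_lightning):

import inspect

# Hypothetical stand-in trainer used only to demonstrate signature-based
# keyword selection; real code would inspect pl.Trainer.__init__ instead.
class FakeTrainer:
    def __init__(self, accelerator="auto", devices="auto"):
        self.accelerator, self.devices = accelerator, devices

kwargs = {}
params = inspect.signature(FakeTrainer.__init__).parameters
if "gpus" in params:        # legacy API (Lightning < 2.0)
    kwargs["gpus"] = 1
elif "devices" in params:   # current API
    kwargs.update(accelerator="gpu", devices=1)
trainer = FakeTrainer(**kwargs)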
@@ -157,7 +177,14 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
val_dataloaders=val_dataloader,
)
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
# PyTorch 2.6 changed `torch.load` default `weights_only` from False -> True.
# Some Lightning checkpoints (including those produced here) can require full unpickling.
# This path is generated locally during training, so it's trusted.
load_sig_params = inspect.signature(TemporalFusionTransformer.load_from_checkpoint).parameters
if "weights_only" in load_sig_params:
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path, weights_only=False)
else:
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
train_time = time.time() - current_time
self._model = best_tft
return train_time
@@ -170,7 +197,11 @@ class TemporalFusionTransformerEstimator(TimeSeriesEstimator):
last_data_cols = self.group_ids.copy()
last_data_cols.append(self.target_names[0])
last_data = self.data[lambda x: x.time_idx == x.time_idx.max()][last_data_cols]
decoder_data = X.X_val if isinstance(X, TimeSeriesDataset) else X
# Use X_train if test_data is empty (e.g., when computing training metrics)
if isinstance(X, TimeSeriesDataset):
decoder_data = X.X_val if len(X.test_data) > 0 else X.X_train
else:
decoder_data = X
if "time_idx" not in decoder_data:
decoder_data = add_time_idx_col(decoder_data)
decoder_data["time_idx"] += encoder_data["time_idx"].max() + 1 - decoder_data["time_idx"].min()

View File

@@ -9,6 +9,7 @@ import numpy as np
try:
import pandas as pd
from pandas import DataFrame, Series, to_datetime
from pandas.api.types import is_datetime64_any_dtype
from scipy.sparse import issparse
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
@@ -120,7 +121,12 @@ class TimeSeriesDataset:
@property
def X_all(self) -> pd.DataFrame:
return pd.concat([self.X_train, self.X_val], axis=0)
# Remove empty or all-NA columns before concatenation
X_train_filtered = self.X_train.dropna(axis=1, how="all")
X_val_filtered = self.X_val.dropna(axis=1, how="all")
# Concatenate the filtered DataFrames
return pd.concat([X_train_filtered, X_val_filtered], axis=0)
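Filtering out all-NA columns first likely sidesteps pandas' FutureWarning about concatenating empty or all-NA entries; a small self-contained sketch of the pattern (toy frames, illustration only):

import numpy as np
import pandas as pd

a = pd.DataFrame({"x": [1, 2], "unused": [np.nan, np.nan]})
b = pd.DataFrame({"x": [3, 4], "unused": [np.nan, np.nan]})
# Drop columns that carry no information before concatenating.
combined = pd.concat([a.dropna(axis=1, how="all"), b.dropna(axis=1, how="all")], axis=0)
print(combined)  # only column "x" remains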
@property
def y_train(self) -> pd.DataFrame:
@@ -392,6 +398,15 @@ class DataTransformerTS:
assert len(self.num_columns) == 0, "Trying to call fit() twice, something is wrong"
for column in X.columns:
# Never treat the time column as a feature for sklearn preprocessing
if column == self.time_col:
continue
# Robust datetime detection (covers datetime64[ms/us/ns], tz-aware, etc.)
if is_datetime64_any_dtype(X[column]):
self.datetime_columns.append(column)
continue
# sklearn/utils/validation.py needs int/float values
if X[column].dtype.name in ("object", "category", "string"):
if (
@@ -462,7 +477,7 @@ class DataTransformerTS:
if "__NAN__" not in X[col].cat.categories:
X[col] = X[col].cat.add_categories("__NAN__").fillna("__NAN__")
else:
X[col] = X[col].fillna("__NAN__")
X[col] = X[col].fillna("__NAN__").infer_objects(copy=False)
X[col] = X[col].astype("category")
for column in self.num_columns:
@@ -531,14 +546,12 @@ def normalize_ts_data(X_train_all, target_names, time_col, y_train_all=None):
def validate_data_basic(X_train_all, y_train_all):
assert isinstance(X_train_all, np.ndarray) or issparse(X_train_all) or isinstance(X_train_all, pd.DataFrame), (
"X_train_all must be a numpy array, a pandas dataframe, " "or Scipy sparse matrix."
)
assert isinstance(X_train_all, (np.ndarray, DataFrame)) or issparse(
X_train_all
), "X_train_all must be a numpy array, a pandas dataframe, or Scipy sparse matrix."
assert (
isinstance(y_train_all, np.ndarray)
or isinstance(y_train_all, pd.Series)
or isinstance(y_train_all, pd.DataFrame)
assert isinstance(
y_train_all, (np.ndarray, pd.Series, pd.DataFrame)
), "y_train_all must be a numpy array or a pandas series or DataFrame."
assert X_train_all.size != 0 and y_train_all.size != 0, "Input data must not be empty, use None if no data"

View File

@@ -194,7 +194,13 @@ class Orbit(TimeSeriesEstimator):
elif isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[[self.time_col] + X.regressors]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
if self._model is not None:
forecast = self._model.predict(X, **kwargs)
@@ -301,7 +307,13 @@ class Prophet(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[data.regressors + [data.time_col]]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
X = X.rename(columns={self.time_col: "ds"})
if self._model is not None:
@@ -327,11 +339,19 @@ class StatsModelsEstimator(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data[data.regressors + [data.time_col]]
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data[data.regressors + [data.time_col]]
else:
X = data.train_data[data.regressors + [data.time_col]]
else:
X = X[self.regressors + [self.time_col]]
if isinstance(X, DataFrame):
if X.shape[0] == 0:
return pd.Series([], name=self.target_names[0], dtype=float)
start = X[self.time_col].iloc[0]
end = X[self.time_col].iloc[-1]
if len(self.regressors):
@@ -829,6 +849,13 @@ class TS_SKLearn(TimeSeriesEstimator):
if isinstance(X, TimeSeriesDataset):
data = X
X = data.test_data
# By default we predict on the dataset's test partition.
# Some internal call paths (e.g., training-metric logging) may pass a
# dataset whose test partition is empty; fall back to train partition.
if data.test_data is not None and len(data.test_data):
X = data.test_data
else:
X = data.train_data
if self._model is not None:
X = X[self.regressors]

View File

@@ -95,6 +95,27 @@ def flamlize_estimator(super_class, name: str, task: str, alternatives=None):
def fit(self, X, y, *args, **params):
hyperparams, estimator_name, X, y_transformed = self.suggest_hyperparams(X, y)
self.set_params(**hyperparams)
# Transform eval_set if present
if "eval_set" in params and params["eval_set"] is not None:
transformed_eval_set = []
for eval_X, eval_y in params["eval_set"]:
# Transform features
eval_X_transformed = self._feature_transformer.transform(eval_X)
# Transform labels if applicable
if self._label_transformer and estimator_name in [
"rf",
"extra_tree",
"xgboost",
"xgb_limitdepth",
"choose_xgb",
]:
eval_y_transformed = self._label_transformer.transform(eval_y)
transformed_eval_set.append((eval_X_transformed, eval_y_transformed))
else:
transformed_eval_set.append((eval_X_transformed, eval_y))
params["eval_set"] = transformed_eval_set
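A hedged usage sketch for the eval_set handling above, assuming flaml.default exposes a flamlized LGBMClassifier (as produced by flamlize_estimator); the validation split is transformed with the same feature/label transformers as the training data:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from flaml.default import LGBMClassifier  # assumed flamlized estimator

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
clf = LGBMClassifier()
clf.fit(X_tr, y_tr, eval_set=[(X_val, y_val)])  # eval_set is transformed before fitting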
if self._label_transformer and estimator_name in [
"rf",
"extra_tree",

View File

@@ -32,6 +32,7 @@ def construct_portfolio(regret_matrix, meta_features, regret_bound):
if meta_features is not None:
scaler = RobustScaler()
meta_features = meta_features.loc[tasks]
meta_features = meta_features.astype(float)
meta_features.loc[:, :] = scaler.fit_transform(meta_features)
nearest_task = {}
for t in tasks:

View File

@@ -26,6 +26,7 @@ def config_predictor_tuple(tasks, configs, meta_features, regret_matrix):
# pre-processing
scaler = RobustScaler()
meta_features_norm = meta_features.loc[tasks] # this makes a copy
meta_features_norm = meta_features_norm.astype(float)
meta_features_norm.loc[:, :] = scaler.fit_transform(meta_features_norm)
proc = {

View File

@@ -1,10 +1,14 @@
import atexit
import functools
import json
import logging
import os
import pickle
import random
import sys
import tempfile
import time
import warnings
from concurrent.futures import ThreadPoolExecutor, wait
from typing import MutableMapping
import mlflow
@@ -12,14 +16,15 @@ import pandas as pd
from mlflow.entities import Metric, Param, RunTag
from mlflow.exceptions import MlflowException
from mlflow.utils.autologging_utils import AUTOLOGGING_INTEGRATIONS, autologging_is_disabled
from packaging.requirements import Requirement
from scipy.sparse import issparse
from sklearn import tree
try:
from pyspark.ml import Pipeline as SparkPipeline
from pyspark.ml import PipelineModel as SparkPipelineModel
except ImportError:
class SparkPipeline:
class SparkPipelineModel:
pass
@@ -32,6 +37,84 @@ from flaml.version import __version__
SEARCH_MAX_RESULTS = 5000 # Each training run should not have more than 5000 trials
IS_RENAME_CHILD_RUN = os.environ.get("FLAML_IS_RENAME_CHILD_RUN", "false").lower() == "true"
REMOVE_REQUIREMENT_LIST = [
"synapseml-cognitive",
"synapseml-core",
"synapseml-deep-learning",
"synapseml-internal",
"synapseml-mlflow",
"synapseml-opencv",
"synapseml-vw",
"synapseml-lightgbm",
"synapseml-utils",
"nni",
"optuna",
]
OPTIONAL_REMOVE_REQUIREMENT_LIST = ["pytorch-lightning", "transformers"]
os.environ["MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR"] = os.environ.get("MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR", "false")
MLFLOW_NUM_WORKERS = int(os.environ.get("FLAML_MLFLOW_NUM_WORKERS", os.cpu_count() * 4 if os.cpu_count() else 2))
executor = ThreadPoolExecutor(max_workers=MLFLOW_NUM_WORKERS)
atexit.register(lambda: executor.shutdown(wait=True))
IS_CLEAN_LOGS = os.environ.get("FLAML_IS_CLEAN_LOGS", "1")
if IS_CLEAN_LOGS == "1":
logging.getLogger("synapse.ml").setLevel(logging.CRITICAL)
logging.getLogger("mlflow.utils").setLevel(logging.CRITICAL)
logging.getLogger("mlflow.utils.environment").setLevel(logging.CRITICAL)
logging.getLogger("mlflow.models.model").setLevel(logging.CRITICAL)
warnings.simplefilter("ignore", category=FutureWarning)
warnings.simplefilter("ignore", category=UserWarning)
def convert_requirement(requirement_list: list[str]):
ret = (
[Requirement(s.strip().lower()) for s in requirement_list]
if mlflow.__version__ <= "2.17.0"
else requirement_list
)
return ret
def time_it(func_or_code=None):
"""
Decorator or function that measures execution time.
Can be used in three ways:
1. As a decorator with no arguments: @time_it
2. As a decorator with arguments: @time_it()
3. As a function call with a string of code to execute and time: time_it("some_code()")
Args:
func_or_code (callable or str, optional): Either a function to decorate or
a string of code to execute and time.
Returns:
callable or None: Returns a decorated function if used as a decorator,
or None if used to execute a string of code.
"""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
start_time = time.time()
result = func(*args, **kwargs)
end_time = time.time()
logger.debug(f"Execution of {func.__name__} took {end_time - start_time:.4f} seconds")
return result
return wrapper
if callable(func_or_code):
return decorator(func_or_code)
elif func_or_code is None:
return decorator
else:
start_time = time.time()
exec(func_or_code)
end_time = time.time()
logger.debug(f"Execution\n```\n{func_or_code}\n```\ntook {end_time - start_time:.4f} seconds")
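A short usage sketch covering the three call styles documented above (assumes time_it and its logger are importable from this module):

@time_it
def slow_add(a, b):
    return a + b

@time_it()
def slow_mul(a, b):
    return a * b

slow_add(1, 2)
slow_mul(3, 4)
time_it("sum(range(1000))")  # executes the string via exec and logs the elapsed time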
def flatten_dict(d: MutableMapping, sep: str = ".") -> MutableMapping:
@@ -49,23 +132,28 @@ def is_autolog_enabled():
return not all(autologging_is_disabled(k) for k in AUTOLOGGING_INTEGRATIONS.keys())
def get_mlflow_log_latency(model_history=False):
def get_mlflow_log_latency(model_history=False, delete_run=True):
try:
FLAML_MLFLOW_LOG_LATENCY = float(os.getenv("FLAML_MLFLOW_LOG_LATENCY", 0))
except ValueError:
FLAML_MLFLOW_LOG_LATENCY = 0
if FLAML_MLFLOW_LOG_LATENCY >= 0.1:
return FLAML_MLFLOW_LOG_LATENCY
st = time.time()
with mlflow.start_run(nested=True, run_name="get_mlflow_log_latency") as run:
if model_history:
sk_model = tree.DecisionTreeClassifier()
mlflow.sklearn.log_model(sk_model, "sk_models")
mlflow.sklearn.log_model(Pipeline([("estimator", sk_model)]), "sk_pipeline")
mlflow.sklearn.log_model(sk_model, "model")
with tempfile.TemporaryDirectory() as tmpdir:
pickle_fpath = os.path.join(tmpdir, f"tmp_{int(time.time()*1000)}")
pickle_fpath = os.path.join(tmpdir, f"tmp_{int(time.time() * 1000)}")
with open(pickle_fpath, "wb") as f:
pickle.dump(sk_model, f)
mlflow.log_artifact(pickle_fpath, "sk_model1")
mlflow.log_artifact(pickle_fpath, "sk_model2")
mlflow.log_artifact(pickle_fpath, "sk_model")
mlflow.set_tag("synapseml.ui.visible", "false") # not shown inline in fabric
mlflow.delete_run(run.info.run_id)
if delete_run:
mlflow.delete_run(run.info.run_id)
et = time.time()
return et - st
return 3 * (et - st)
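Because the function honors FLAML_MLFLOW_LOG_LATENCY when it is at least 0.1 seconds, callers can skip the measurement entirely; a one-line sketch:

import os

# Supply a fixed latency estimate (in seconds) and bypass the dummy logging run.
os.environ["FLAML_MLFLOW_LOG_LATENCY"] = "2"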
def infer_signature(X_train=None, y_train=None, dataframe=None, label=None):
@@ -98,12 +186,76 @@ def infer_signature(X_train=None, y_train=None, dataframe=None, label=None):
)
def update_and_install_requirements(
run_id=None,
model_name=None,
model_version=None,
remove_list=None,
artifact_path="model",
dst_path=None,
install_with_ipython=False,
):
if not (run_id or (model_name and model_version)):
raise ValueError(
"Please provide `run_id` or both `model_name` and `model_version`. If all three are provided, `run_id` will be used."
)
if install_with_ipython:
from IPython import get_ipython
if not remove_list:
remove_list = [
"synapseml-cognitive",
"synapseml-core",
"synapseml-deep-learning",
"synapseml-internal",
"synapseml-mlflow",
"synapseml-opencv",
"synapseml-vw",
"synapseml-lightgbm",
"synapseml-utils",
"flaml", # flaml is needed for AutoML models, should be pre-installed in the runtime
"pyspark", # fabric internal pyspark should be pre-installed in the runtime
]
# Download model artifacts
client = mlflow.MlflowClient()
if not run_id:
run_id = client.get_model_version(model_name, model_version).run_id
if not dst_path:
dst_path = os.path.join(tempfile.gettempdir(), "model_artifacts")
os.makedirs(dst_path, exist_ok=True)
client.download_artifacts(run_id, artifact_path, dst_path)
requirements_path = os.path.join(dst_path, artifact_path, "requirements.txt")
with open(requirements_path) as f:
reqs = f.read().splitlines()
old_reqs = [Requirement(req) for req in reqs if req]
old_reqs_dict = {req.name: str(req) for req in old_reqs}
for req in remove_list:
req = Requirement(req)
if req.name in old_reqs_dict:
old_reqs_dict.pop(req.name, None)
new_reqs_list = list(old_reqs_dict.values())
with open(requirements_path, "w") as f:
f.write("\n".join(new_reqs_list))
if install_with_ipython:
get_ipython().run_line_magic("pip", f"install -r {requirements_path} -q")
else:
logger.info(f"You can run `pip install -r {requirements_path}` to install dependencies.")
return requirements_path
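A hypothetical usage sketch for the helper above; the model name and version are placeholders, and installation is left to the caller:

# Strip runtime-provided packages from a logged model's requirements.txt
# and get back the path to the edited file.
req_path = update_and_install_requirements(
    model_name="my_registered_model",  # placeholder registered-model name
    model_version=1,
    install_with_ipython=False,
)
print(f"pip install -r {req_path}")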
def _mlflow_wrapper(evaluation_func, mlflow_exp_id, mlflow_config=None, extra_tags=None, autolog=False):
def wrapped(*args, **kwargs):
if mlflow_config is not None:
from synapse.ml.mlflow import set_mlflow_env_config
try:
from synapse.ml.mlflow import set_mlflow_env_config
set_mlflow_env_config(mlflow_config)
set_mlflow_env_config(mlflow_config)
except Exception:
pass
import mlflow
if mlflow_exp_id is not None:
@@ -124,7 +276,20 @@ def _mlflow_wrapper(evaluation_func, mlflow_exp_id, mlflow_config=None, extra_ta
def _get_notebook_name():
return None
try:
import re
from synapse.ml.mlflow import get_mlflow_env_config
from synapse.ml.mlflow.shared_platform_utils import get_artifact
notebook_id = get_mlflow_env_config(False).artifact_id
current_notebook = get_artifact(notebook_id)
notebook_name = re.sub("\\W+", "-", current_notebook.displayName).strip()
return notebook_name
except Exception as e:
logger.debug(f"Failed to get notebook name: {e}")
return None
def safe_json_dumps(obj):
@@ -163,6 +328,8 @@ class MLflowIntegration:
self.has_model = False
self.only_history = False
self._do_log_model = True
self.futures = {}
self.futures_log_model = {}
self.extra_tag = (
extra_tag
@@ -170,6 +337,9 @@ class MLflowIntegration:
else {"extra_tag.sid": f"flaml_{__version__}_{int(time.time())}_{random.randint(1001, 9999)}"}
)
self.start_time = time.time()
self.experiment_type = experiment_type
self.update_autolog_state()
self.mlflow_client = mlflow.tracking.MlflowClient()
parent_run_info = mlflow.active_run().info if mlflow.active_run() is not None else None
if parent_run_info:
@@ -188,8 +358,6 @@ class MLflowIntegration:
mlflow.set_experiment(experiment_name=mlflow_exp_name)
self.experiment_id = mlflow.tracking.fluent._active_experiment_id
self.experiment_name = mlflow.get_experiment(self.experiment_id).name
self.experiment_type = experiment_type
self.update_autolog_state()
if self.autolog:
# only end user created parent run in autolog scenario
@@ -197,9 +365,12 @@ class MLflowIntegration:
def set_mlflow_config(self):
if self.driver_mlflow_env_config is not None:
from synapse.ml.mlflow import set_mlflow_env_config
try:
from synapse.ml.mlflow import set_mlflow_env_config
set_mlflow_env_config(self.driver_mlflow_env_config)
set_mlflow_env_config(self.driver_mlflow_env_config)
except Exception:
pass
def wrap_evaluation_function(self, evaluation_function):
wrapped_evaluation_function = _mlflow_wrapper(
@@ -267,6 +438,7 @@ class MLflowIntegration:
else:
_tags = []
self.mlflow_client.log_batch(run_id=target_id, metrics=_metrics, params=[], tags=_tags)
return f"Successfully copy_mlflow_run run_id {src_id} to run_id {target_id}"
def record_trial(self, result, trial, metric):
if isinstance(result, dict):
@@ -334,12 +506,31 @@ class MLflowIntegration:
self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id)
self.has_summary = True
def log_model(self, model, estimator, signature=None):
def log_model(self, model, estimator, signature=None, run_id=None):
if not self._do_log_model:
return
logger.debug(f"logging model {estimator}")
ret_message = f"Successfully log_model {estimator} to run_id {run_id}"
optional_remove_list = (
[] if estimator in ["transformer", "transformer_ms", "tcn", "tft"] else OPTIONAL_REMOVE_REQUIREMENT_LIST
)
run = mlflow.active_run()
if run and run.info.run_id == self.parent_run_id:
logger.debug(
f"Current active run_id {run.info.run_id} == parent_run_id {self.parent_run_id}, Starting run_id {run_id}"
)
mlflow.start_run(run_id=run_id, nested=True)
elif run and run.info.run_id != run_id:
ret_message = (
f"Error: Should log_model {estimator} to run_id {run_id}, but logged to run_id {run.info.run_id}"
)
logger.error(ret_message)
else:
logger.debug(f"No active run, start run_id {run_id}")
mlflow.start_run(run_id=run_id)
logger.debug(f"logged model {estimator} to run_id {mlflow.active_run().info.run_id}")
if estimator.endswith("_spark"):
mlflow.spark.log_model(model, estimator, signature=signature)
# mlflow.spark.log_model(model, estimator, signature=signature)
mlflow.spark.log_model(model, "model", signature=signature)
elif estimator in ["lgbm"]:
mlflow.lightgbm.log_model(model, estimator, signature=signature)
@@ -352,42 +543,93 @@ class MLflowIntegration:
elif estimator in ["prophet"]:
mlflow.prophet.log_model(model, estimator, signature=signature)
elif estimator in ["orbit"]:
pass
logger.warning(f"Unsupported model: {estimator}. No model logged.")
else:
mlflow.sklearn.log_model(model, estimator, signature=signature)
future = executor.submit(
lambda: mlflow.models.model.update_model_requirements(
model_uri=f"runs:/{run_id}/{'model' if estimator.endswith('_spark') else estimator}",
operation="remove",
requirement_list=convert_requirement(REMOVE_REQUIREMENT_LIST + optional_remove_list),
)
)
self.futures[future] = f"run_{run_id}_requirements_updated"
if not run or run.info.run_id == self.parent_run_id:
logger.debug(f"Ending current run_id {mlflow.active_run().info.run_id}")
mlflow.end_run()
return ret_message
def _pickle_and_log_artifact(self, obj, artifact_name, pickle_fname="temp_.pkl"):
def _pickle_and_log_artifact(self, obj, artifact_name, pickle_fname="temp_.pkl", run_id=None):
if not self._do_log_model:
return
return True
with tempfile.TemporaryDirectory() as tmpdir:
pickle_fpath = os.path.join(tmpdir, pickle_fname)
try:
with open(pickle_fpath, "wb") as f:
pickle.dump(obj, f)
mlflow.log_artifact(pickle_fpath, artifact_name)
self.mlflow_client.log_artifact(run_id, pickle_fpath, artifact_name)
return True
except Exception as e:
logger.debug(f"Failed to pickle and log artifact {artifact_name}, error: {e}")
logger.debug(f"Failed to pickle and log {artifact_name}, error: {e}")
return False
def pickle_and_log_automl_artifacts(self, automl, model, estimator, signature=None):
def _log_pipeline(self, pipeline, flavor_name, pipeline_name, signature, run_id, estimator=None):
logger.debug(f"logging pipeline {flavor_name}:{pipeline_name}:{estimator}")
ret_message = f"Successfully _log_pipeline {flavor_name}:{pipeline_name}:{estimator} to run_id {run_id}"
optional_remove_list = (
[] if estimator in ["transformer", "transformer_ms", "tcn", "tft"] else OPTIONAL_REMOVE_REQUIREMENT_LIST
)
run = mlflow.active_run()
if run and run.info.run_id == self.parent_run_id:
logger.debug(
f"Current active run_id {run.info.run_id} == parent_run_id {self.parent_run_id}, Starting run_id {run_id}"
)
mlflow.start_run(run_id=run_id, nested=True)
elif run and run.info.run_id != run_id:
ret_message = f"Error: Should _log_pipeline {flavor_name}:{pipeline_name}:{estimator} model to run_id {run_id}, but logged to run_id {run.info.run_id}"
logger.error(ret_message)
else:
logger.debug(f"No active run, start run_id {run_id}")
mlflow.start_run(run_id=run_id)
logger.debug(
f"logging pipeline {flavor_name}:{pipeline_name}:{estimator} to run_id {mlflow.active_run().info.run_id}"
)
if flavor_name == "sklearn":
mlflow.sklearn.log_model(pipeline, pipeline_name, signature=signature)
elif flavor_name == "spark":
mlflow.spark.log_model(pipeline, pipeline_name, signature=signature)
else:
logger.warning(f"Unsupported pipeline flavor: {flavor_name}. No model logged.")
future = executor.submit(
lambda: mlflow.models.model.update_model_requirements(
model_uri=f"runs:/{run_id}/{pipeline_name}",
operation="remove",
requirement_list=convert_requirement(REMOVE_REQUIREMENT_LIST + optional_remove_list),
)
)
self.futures[future] = f"run_{run_id}_requirements_updated"
if not run or run.info.run_id == self.parent_run_id:
logger.debug(f"Ending current run_id {mlflow.active_run().info.run_id}")
mlflow.end_run()
return ret_message
def pickle_and_log_automl_artifacts(self, automl, model, estimator, signature=None, run_id=None):
"""log automl artifacts to mlflow
load back with `automl = mlflow.pyfunc.load_model(model_run_id_or_uri)`, then do prediction with `automl.predict(X)`
"""
logger.debug(f"logging automl artifacts {estimator}")
self._pickle_and_log_artifact(automl.feature_transformer, "feature_transformer", "feature_transformer.pkl")
self._pickle_and_log_artifact(automl.label_transformer, "label_transformer", "label_transformer.pkl")
# Test test_mlflow 1 and 4 will get error: TypeError: cannot pickle '_io.TextIOWrapper' object
# try:
# self._pickle_and_log_artifact(automl, "automl", "automl.pkl")
# except TypeError:
# pass
logger.debug(f"logging automl estimator {estimator}")
# self._pickle_and_log_artifact(
# automl.feature_transformer, "feature_transformer", "feature_transformer.pkl", run_id
# )
# self._pickle_and_log_artifact(automl.label_transformer, "label_transformer", "label_transformer.pkl", run_id)
if estimator.endswith("_spark"):
# spark pipeline is not supported yet
return
feature_transformer = automl.feature_transformer
if isinstance(feature_transformer, Pipeline):
if isinstance(feature_transformer, Pipeline) and not estimator.endswith("_spark"):
pipeline = feature_transformer
pipeline.steps.append(("estimator", model))
elif isinstance(feature_transformer, SparkPipeline):
elif isinstance(feature_transformer, SparkPipelineModel) and estimator.endswith("_spark"):
pipeline = feature_transformer
pipeline.stages.append(model)
elif not estimator.endswith("_spark"):
@@ -395,24 +637,26 @@ class MLflowIntegration:
steps.append(("estimator", model))
pipeline = Pipeline(steps)
else:
stages = [feature_transformer]
stages = []
if feature_transformer is not None:
stages.append(feature_transformer)
stages.append(model)
pipeline = SparkPipeline(stages=stages)
if isinstance(pipeline, SparkPipeline):
pipeline = SparkPipelineModel(stages=stages)
if isinstance(pipeline, SparkPipelineModel):
logger.debug(f"logging spark pipeline {estimator}")
mlflow.spark.log_model(pipeline, "automl_pipeline", signature=signature)
self._log_pipeline(pipeline, "spark", "model", signature, run_id, estimator)
else:
# Add a log named "model" to match the default settings
logger.debug(f"logging sklearn pipeline {estimator}")
mlflow.sklearn.log_model(pipeline, "automl_pipeline", signature=signature)
mlflow.sklearn.log_model(pipeline, "model", signature=signature)
self._log_pipeline(pipeline, "sklearn", "model", signature, run_id, estimator)
return f"Successfully pickle_and_log_automl_artifacts {estimator} to run_id {run_id}"
def record_state(self, automl, search_state, estimator):
@time_it
def record_state(self, automl, search_state, estimator, is_log_model=True):
_st = time.time()
automl_metric_name = (
automl._state.metric if isinstance(automl._state.metric, str) else automl._state.error_metric
)
if automl._state.error_metric.startswith("1-"):
automl_metric_value = 1 - search_state.val_loss
elif automl._state.error_metric.startswith("-"):
@@ -425,6 +669,8 @@ class MLflowIntegration:
else:
config = search_state.config
self.automl_user_configurations = safe_json_dumps(automl._automl_user_configurations)
info = {
"metrics": {
"iter_counter": automl._track_iter,
@@ -445,7 +691,7 @@ class MLflowIntegration:
"flaml.meric": automl_metric_name,
"flaml.run_source": "flaml-automl",
"flaml.log_type": self.log_type,
"flaml.automl_user_configurations": safe_json_dumps(automl._automl_user_configurations),
"flaml.automl_user_configurations": self.automl_user_configurations,
},
"params": {
"sample_size": search_state.sample_size,
@@ -472,33 +718,70 @@ class MLflowIntegration:
run_name = f"{self.parent_run_name}_child_{self.child_counter}"
else:
run_name = None
_t1 = time.time()
wait(self.futures_log_model)
_t2 = time.time() - _t1
logger.debug(f"wait futures_log_model in record_state took {_t2} seconds")
with mlflow.start_run(nested=True, run_name=run_name) as child_run:
self._log_info_to_run(info, child_run.info.run_id, log_params=True)
if automl._state.model_history:
self.log_model(
search_state.trained_estimator._model, estimator, signature=automl.estimator_signature
)
self.pickle_and_log_automl_artifacts(
automl, search_state.trained_estimator, estimator, signature=automl.pipeline_signature
)
future = executor.submit(lambda: self._log_info_to_run(info, child_run.info.run_id, log_params=True))
self.futures[future] = f"iter_{automl._track_iter}_log_info_to_run"
future = executor.submit(lambda: self._log_automl_configurations(child_run.info.run_id))
self.futures[future] = f"iter_{automl._track_iter}_log_automl_configurations"
if automl._state.model_history and is_log_model:
if estimator.endswith("_spark"):
future = executor.submit(
lambda: self.log_model(
search_state.trained_estimator._model,
estimator,
automl.estimator_signature,
child_run.info.run_id,
)
)
self.futures_log_model[future] = f"record_state-log_model_{estimator}"
else:
future = executor.submit(
lambda: self.pickle_and_log_automl_artifacts(
automl,
search_state.trained_estimator,
estimator,
automl.pipeline_signature,
child_run.info.run_id,
)
)
self.futures_log_model[future] = f"record_state-pickle_and_log_automl_artifacts_{estimator}"
self.manual_run_ids.append(child_run.info.run_id)
self.child_counter += 1
return f"Successfully record_state iteration {automl._track_iter}"
@time_it
def log_automl(self, automl):
self.set_best_iter(automl)
if self.autolog:
if self.parent_run_id is not None:
mlflow.start_run(run_id=self.parent_run_id, experiment_id=self.experiment_id)
mlflow.log_metric("best_validation_loss", automl._state.best_loss)
mlflow.log_metric("best_iteration", automl._best_iteration)
mlflow.log_metric("num_child_runs", len(self.infos))
if automl._trained_estimator is not None and not self.has_model:
self.log_model(
automl._trained_estimator._model, automl.best_estimator, signature=automl.estimator_signature
)
self.pickle_and_log_automl_artifacts(
automl, automl.model, automl.best_estimator, signature=automl.pipeline_signature
)
mlflow.log_metrics(
{
"best_validation_loss": automl._state.best_loss,
"best_iteration": automl._best_iteration,
"num_child_runs": len(self.infos),
}
)
if (
automl._trained_estimator is not None
and not self.has_model
and automl._trained_estimator._model is not None
):
if automl.best_estimator.endswith("_spark"):
self.log_model(
automl._trained_estimator._model,
automl.best_estimator,
automl.estimator_signature,
self.parent_run_id,
)
else:
self.pickle_and_log_automl_artifacts(
automl, automl.model, automl.best_estimator, automl.pipeline_signature, self.parent_run_id
)
self.has_model = True
self.adopt_children(automl)
@@ -514,31 +797,68 @@ class MLflowIntegration:
conf = automl._config_history[automl._best_iteration][1].copy()
if "ml" in conf.keys():
conf = conf["ml"]
mlflow.log_params(conf)
mlflow.log_param("best_learner", automl._best_estimator)
params_arr = [
Param(key, str(value)) for key, value in {**conf, "best_learner": automl._best_estimator}.items()
]
self.mlflow_client.log_batch(run_id=self.parent_run_id, metrics=[], params=params_arr, tags=[])
if not self.has_summary:
logger.info(f"logging best model {automl.best_estimator}")
self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id)
future = executor.submit(lambda: self.copy_mlflow_run(best_mlflow_run_id, self.parent_run_id))
self.futures[future] = "log_automl_copy_mlflow_run"
future = executor.submit(lambda: self._log_automl_configurations(self.parent_run_id))
self.futures[future] = "log_automl_log_automl_configurations"
self.has_summary = True
if automl._trained_estimator is not None and not self.has_model:
self.log_model(
automl._trained_estimator._model,
automl.best_estimator,
signature=automl.estimator_signature,
)
self.pickle_and_log_automl_artifacts(
automl, automl.model, automl.best_estimator, signature=automl.pipeline_signature
)
_t1 = time.time()
wait(self.futures_log_model)
_t2 = time.time() - _t1
logger.debug(f"wait futures_log_model in log_automl took {_t2} seconds")
if (
automl._trained_estimator is not None
and not self.has_model
and automl._trained_estimator._model is not None
):
if automl.best_estimator.endswith("_spark"):
future = executor.submit(
lambda: self.log_model(
automl._trained_estimator._model,
automl.best_estimator,
signature=automl.estimator_signature,
run_id=self.parent_run_id,
)
)
self.futures_log_model[future] = f"log_automl-log_model_{automl.best_estimator}"
else:
future = executor.submit(
lambda: self.pickle_and_log_automl_artifacts(
automl,
automl.model,
automl.best_estimator,
signature=automl.pipeline_signature,
run_id=self.parent_run_id,
)
)
self.futures_log_model[
future
] = f"log_automl-pickle_and_log_automl_artifacts_{automl.best_estimator}"
self.has_model = True
def resume_mlflow(self):
if len(self.resume_params) > 0:
mlflow.autolog(**self.resume_params)
def _log_automl_configurations(self, run_id):
self.mlflow_client.log_text(
run_id=run_id,
text=self.automl_user_configurations,
artifact_file="automl_configurations/automl_user_configurations.json",
)
return f"Successfully _log_automl_configurations to run_id {run_id}"
def _log_info_to_run(self, info, run_id, log_params=False):
_metrics = [Metric(key, value, int(time.time() * 1000), 0) for key, value in info["metrics"].items()]
_tags = [RunTag(key, str(value)) for key, value in info["tags"].items()]
_tags = [
RunTag(key, str(value)[:5000]) for key, value in info["tags"].items()
] # AML will raise an error if value length > 5000
_params = [
Param(key, str(value))
for key, value in info["params"].items()
@@ -554,6 +874,7 @@ class MLflowIntegration:
_tags = [RunTag("mlflow.parentRunId", run_id)]
self.mlflow_client.log_batch(run_id=run.info.run_id, metrics=_metrics, params=[], tags=_tags)
del info["submetrics"]["values"]
return f"Successfully _log_info_to_run to run_id {run_id}"
def adopt_children(self, result=None):
"""
@@ -575,6 +896,7 @@ class MLflowIntegration:
),
)
self.child_counter = 0
num_infos = len(self.infos)
# From latest to earliest, remove duplicate cross-validation runs
_exist_child_run_params = [] # for deduplication of cross-validation child runs
@@ -639,22 +961,37 @@ class MLflowIntegration:
)
self.mlflow_client.set_tag(child_run_id, "flaml.child_counter", self.child_counter)
# merge autolog child run and corresponding manual run
flaml_info = self.infos[self.child_counter]
child_run = self.mlflow_client.get_run(child_run_id)
self._log_info_to_run(flaml_info, child_run_id, log_params=False)
# Merge autolog child run and corresponding FLAML trial info (if available).
# In nested scenarios (e.g., Tune -> AutoML -> MLflow autolog), MLflow can create
# more child runs than the number of FLAML trials recorded in self.infos.
# TODO: need more tests in nested scenarios.
flaml_info = None
child_run = None
if self.child_counter < num_infos:
flaml_info = self.infos[self.child_counter]
child_run = self.mlflow_client.get_run(child_run_id)
self._log_info_to_run(flaml_info, child_run_id, log_params=False)
if self.experiment_type == "automl":
if "learner" not in child_run.data.params:
self.mlflow_client.log_param(child_run_id, "learner", flaml_info["params"]["learner"])
if "sample_size" not in child_run.data.params:
self.mlflow_client.log_param(
child_run_id, "sample_size", flaml_info["params"]["sample_size"]
)
if self.experiment_type == "automl":
if "learner" not in child_run.data.params:
self.mlflow_client.log_param(child_run_id, "learner", flaml_info["params"]["learner"])
if "sample_size" not in child_run.data.params:
self.mlflow_client.log_param(
child_run_id, "sample_size", flaml_info["params"]["sample_size"]
)
else:
logger.debug(
"No corresponding FLAML info for MLflow child run %s (child_counter=%s, infos=%s); skipping merge.",
child_run_id,
self.child_counter,
num_infos,
)
if self.child_counter == best_iteration:
if flaml_info is not None and self.child_counter == best_iteration:
self.mlflow_client.set_tag(child_run_id, "flaml.best_run", True)
if result is not None:
if child_run is None:
child_run = self.mlflow_client.get_run(child_run_id)
result.best_run_id = child_run_id
result.best_run_name = child_run.info.run_name
self.best_run_id = child_run_id
@@ -678,7 +1015,7 @@ class MLflowIntegration:
self.resume_mlflow()
def register_automl_pipeline(automl, model_name=None, signature=None):
def register_automl_pipeline(automl, model_name=None, signature=None, artifact_path="model"):
pipeline = automl.automl_pipeline
if pipeline is None:
logger.warning("pipeline not found, cannot register it")
@@ -688,7 +1025,7 @@ def register_automl_pipeline(automl, model_name=None, signature=None):
if automl.best_run_id is None:
mlflow.sklearn.log_model(
pipeline,
"automl_pipeline",
artifact_path,
registered_model_name=model_name,
signature=automl.pipeline_signature if signature is None else signature,
)
@@ -698,5 +1035,5 @@ def register_automl_pipeline(automl, model_name=None, signature=None):
return mvs[0]
else:
best_run = mlflow.get_run(automl.best_run_id)
model_uri = f"runs:/{best_run.info.run_id}/automl_pipeline"
model_uri = f"runs:/{best_run.info.run_id}/{artifact_path}"
return mlflow.register_model(model_uri, model_name)
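A hypothetical usage sketch (a fitted automl object with pipeline logging enabled and an active MLflow experiment are assumed):

# Register the best pipeline under a chosen model name; returns a ModelVersion.
mv = register_automl_pipeline(automl, model_name="my_automl_pipeline")
if mv is not None:
    print(mv.name, mv.version)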

View File

@@ -1,6 +1,6 @@
# ChaCha for Online AutoML
FLAML includes *ChaCha* which is an automatic hyperparameter tuning solution for online machine learning. Online machine learning has the following properties: (1) data comes in sequential order; and (2) the performance of the machine learning model is evaluated online, i.e., at every iteration. *ChaCha* performs online AutoML respecting the aforementioned properties of online learning, and at the same time respecting the following constraints: (1) only a small constant number of 'live' models are allowed to perform online learning at the same time; and (2) no model persistence or offline training is allowed, which means that once we decide to replace a 'live' model with a new one, the replaced model can no longer be retrieved.
For more technical details about *ChaCha*, please check our paper.

37
flaml/tune/logger.py Normal file
View File

@@ -0,0 +1,37 @@
import logging
import os
class ColoredFormatter(logging.Formatter):
# ANSI escape codes for colors
COLORS = {
# logging.DEBUG: "\033[36m", # Cyan
# logging.INFO: "\033[32m", # Green
logging.WARNING: "\033[33m", # Yellow
logging.ERROR: "\033[31m", # Red
logging.CRITICAL: "\033[1;31m", # Bright Red
}
RESET = "\033[0m" # Reset to default
def __init__(self, fmt, datefmt, use_color=True):
super().__init__(fmt, datefmt)
self.use_color = use_color
def format(self, record):
formatted = super().format(record)
if self.use_color:
color = self.COLORS.get(record.levelno, "")
if color:
return f"{color}{formatted}{self.RESET}"
return formatted
logger = logging.getLogger(__name__)
use_color = True
if os.getenv("FLAML_LOG_NO_COLOR"):
use_color = False
logger_formatter = ColoredFormatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S", use_color
)
logger.propagate = False
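A short usage sketch that wires the formatter above into a console handler (assumes this module's logger and logger_formatter):

import sys

_ch = logging.StreamHandler(stream=sys.stdout)
_ch.setFormatter(logger_formatter)
logger.addHandler(_ch)
logger.warning("rendered in yellow when FLAML_LOG_NO_COLOR is not set")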

View File

@@ -217,7 +217,24 @@ class BlendSearch(Searcher):
if global_search_alg is not None:
self._gs = global_search_alg
elif getattr(self, "__name__", None) != "CFO":
if space and self._ls.hierarchical:
# Use define-by-run for OptunaSearch when needed:
# - Hierarchical/conditional spaces are best supported via define-by-run.
# - Ray Tune domain/grid specs can trigger an "unresolved search space" warning
# unless we switch to define-by-run.
use_define_by_run = bool(getattr(self._ls, "hierarchical", False))
if (not use_define_by_run) and isinstance(space, dict) and space:
try:
from .variant_generator import parse_spec_vars
_, domain_vars, grid_vars = parse_spec_vars(space)
use_define_by_run = bool(domain_vars or grid_vars)
except Exception:
# Be conservative: if we can't determine whether the space is
# unresolved, fall back to the original behavior.
use_define_by_run = False
self._use_define_by_run = use_define_by_run
if use_define_by_run:
from functools import partial
gs_space = partial(define_by_run_func, space=space)
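For context, a minimal example of the kind of hierarchical (conditional) space that takes the define-by-run branch above, written with flaml.tune primitives; the parameter names are illustrative:

from flaml import tune

# The tunable keys depend on which sub-dict "model" resolves to, so the space
# is hierarchical and is passed to Optuna via define-by-run.
space = {
    "model": tune.choice(
        [
            {"name": "lgbm", "n_estimators": tune.lograndint(4, 1000)},
            {"name": "rf", "max_leaves": tune.randint(4, 256)},
        ]
    ),
}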
@@ -244,13 +261,32 @@ class BlendSearch(Searcher):
evaluated_rewards=evaluated_rewards,
)
except (AssertionError, ValueError):
self._gs = GlobalSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
try:
self._gs = GlobalSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
except ValueError:
# Ray Tune's OptunaSearch converts Tune domains into Optuna
# distributions. Optuna disallows integer log distributions
# with step != 1 (e.g., qlograndint with q>1), which can
# raise here. Fall back to FLAML's OptunaSearch wrapper,
# which handles these spaces more permissively.
if getattr(GlobalSearch, "__module__", "").startswith("ray.tune"):
from .suggestion import OptunaSearch as _FallbackOptunaSearch
self._gs = _FallbackOptunaSearch(
space=gs_space,
metric=metric,
mode=mode,
seed=gs_seed,
sampler=sampler,
)
else:
raise
self._gs.space = space
else:
self._gs = None
@@ -468,7 +504,7 @@ class BlendSearch(Searcher):
self._ls_bound_max,
self._subspace.get(trial_id, self._ls.space),
)
if self._gs is not None and self._experimental and (not self._ls.hierarchical):
if self._gs is not None and self._experimental and (not getattr(self, "_use_define_by_run", False)):
self._gs.add_evaluated_point(flatten_dict(config), objective)
# TODO: recover when supported
# converted = convert_key(config, self._gs.space)

View File

@@ -641,8 +641,10 @@ class FLOW2(Searcher):
else:
# key must be in space
domain = space[key]
if self.hierarchical and not (
domain is None or type(domain) in (str, int, float) or isinstance(domain, sample.Domain)
if (
self.hierarchical
and domain is not None
and not isinstance(domain, (str, int, float, sample.Domain))
):
# not domain or hashable
# get rid of list type for hierarchical search space.

View File

@@ -207,7 +207,7 @@ class ChampionFrontierSearcher(BaseSearcher):
hyperparameter_config_groups.append(partial_new_configs)
# does not have searcher_trial_ids
searcher_trial_ids_groups.append([])
elif isinstance(config_domain, Float) or isinstance(config_domain, Categorical):
elif isinstance(config_domain, (Float, Categorical)):
# otherwise we need to deal with them in group
nonpoly_config[k] = v
if k not in self._space_of_nonpoly_hp:

View File

@@ -25,6 +25,31 @@ from .flow2 import FLOW2
logger = logging.getLogger(__name__)
def _recursive_dict_update(target: Dict, source: Dict) -> None:
"""Recursively update target dictionary with source dictionary.
Unlike dict.update(), this function merges nested dictionaries instead of
replacing them entirely. This is crucial for configurations with nested
structures (e.g., XGBoost params).
Args:
target: The dictionary to be updated (modified in place).
source: The dictionary containing values to merge into target.
Example:
>>> target = {'params': {'eta': 0.1, 'max_depth': 3}}
>>> source = {'params': {'verbosity': 0}}
>>> _recursive_dict_update(target, source)
>>> target
{'params': {'eta': 0.1, 'max_depth': 3, 'verbosity': 0}}
"""
for key, value in source.items():
if isinstance(value, dict) and key in target and isinstance(target[key], dict):
_recursive_dict_update(target[key], value)
else:
target[key] = value
class SearchThread:
"""Class of global or local search thread."""
@@ -65,7 +90,7 @@ class SearchThread:
try:
config = self._search_alg.suggest(trial_id)
if isinstance(self._search_alg._space, dict):
config.update(self._const)
_recursive_dict_update(config, self._const)
else:
# define by run
config, self.space = unflatten_hierarchical(config, self._space)

View File

@@ -35,6 +35,73 @@ from ..sample import (
Quantized,
Uniform,
)
# If Ray is installed, flaml.tune may re-export Ray Tune sampling functions.
# In that case, the search space contains Ray Tune Domain/Sampler objects,
# which should be accepted by our Optuna search-space conversion.
try:
from ray import __version__ as _ray_version # type: ignore
if str(_ray_version).startswith("1."):
from ray.tune.sample import ( # type: ignore
Categorical as _RayCategorical,
)
from ray.tune.sample import (
Domain as _RayDomain,
)
from ray.tune.sample import (
Float as _RayFloat,
)
from ray.tune.sample import (
Integer as _RayInteger,
)
from ray.tune.sample import (
LogUniform as _RayLogUniform,
)
from ray.tune.sample import (
Quantized as _RayQuantized,
)
from ray.tune.sample import (
Uniform as _RayUniform,
)
else:
from ray.tune.search.sample import ( # type: ignore
Categorical as _RayCategorical,
)
from ray.tune.search.sample import (
Domain as _RayDomain,
)
from ray.tune.search.sample import (
Float as _RayFloat,
)
from ray.tune.search.sample import (
Integer as _RayInteger,
)
from ray.tune.search.sample import (
LogUniform as _RayLogUniform,
)
from ray.tune.search.sample import (
Quantized as _RayQuantized,
)
from ray.tune.search.sample import (
Uniform as _RayUniform,
)
_FLOAT_TYPES = (Float, _RayFloat)
_INTEGER_TYPES = (Integer, _RayInteger)
_CATEGORICAL_TYPES = (Categorical, _RayCategorical)
_DOMAIN_TYPES = (Domain, _RayDomain)
_QUANTIZED_TYPES = (Quantized, _RayQuantized)
_UNIFORM_TYPES = (Uniform, _RayUniform)
_LOGUNIFORM_TYPES = (LogUniform, _RayLogUniform)
except Exception: # pragma: no cover
_FLOAT_TYPES = (Float,)
_INTEGER_TYPES = (Integer,)
_CATEGORICAL_TYPES = (Categorical,)
_DOMAIN_TYPES = (Domain,)
_QUANTIZED_TYPES = (Quantized,)
_UNIFORM_TYPES = (Uniform,)
_LOGUNIFORM_TYPES = (LogUniform,)
from ..trial import flatten_dict, unflatten_dict
from .variant_generator import parse_spec_vars
@@ -850,19 +917,22 @@ class OptunaSearch(Searcher):
def resolve_value(domain: Domain) -> ot.distributions.BaseDistribution:
quantize = None
sampler = domain.get_sampler()
if isinstance(sampler, Quantized):
# Ray Tune Domains and FLAML Domains both provide get_sampler(), but
# fall back to the .sampler attribute for robustness.
sampler = domain.get_sampler() if hasattr(domain, "get_sampler") else getattr(domain, "sampler", None)
if isinstance(sampler, _QUANTIZED_TYPES) or type(sampler).__name__ == "Quantized":
quantize = sampler.q
sampler = sampler.sampler
if isinstance(sampler, LogUniform):
sampler = getattr(sampler, "sampler", None) or sampler.get_sampler()
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
logger.warning(
"Optuna does not handle quantization in loguniform "
"sampling. The parameter will be passed but it will "
"probably be ignored."
)
if isinstance(domain, Float):
if isinstance(sampler, LogUniform):
if isinstance(domain, _FLOAT_TYPES) or type(domain).__name__ == "Float":
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
if quantize:
logger.warning(
"Optuna does not support both quantization and "
@@ -870,17 +940,17 @@ class OptunaSearch(Searcher):
)
return ot.distributions.LogUniformDistribution(domain.lower, domain.upper)
elif isinstance(sampler, Uniform):
elif isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
if quantize:
return ot.distributions.DiscreteUniformDistribution(domain.lower, domain.upper, quantize)
return ot.distributions.UniformDistribution(domain.lower, domain.upper)
elif isinstance(domain, Integer):
if isinstance(sampler, LogUniform):
elif isinstance(domain, _INTEGER_TYPES) or type(domain).__name__ == "Integer":
if isinstance(sampler, _LOGUNIFORM_TYPES) or type(sampler).__name__ == "LogUniform":
# ``step`` argument Deprecated in v2.0.0. ``step`` argument should be 1 in Log Distribution
# The removal of this feature is currently scheduled for v4.0.0,
return ot.distributions.IntLogUniformDistribution(domain.lower, domain.upper - 1, step=1)
elif isinstance(sampler, Uniform):
elif isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
# Upper bound should be inclusive for quantization and
# exclusive otherwise
return ot.distributions.IntUniformDistribution(
@@ -888,16 +958,16 @@ class OptunaSearch(Searcher):
domain.upper - int(bool(not quantize)),
step=quantize or 1,
)
elif isinstance(domain, Categorical):
if isinstance(sampler, Uniform):
elif isinstance(domain, _CATEGORICAL_TYPES) or type(domain).__name__ == "Categorical":
if isinstance(sampler, _UNIFORM_TYPES) or type(sampler).__name__ == "Uniform":
return ot.distributions.CategoricalDistribution(domain.categories)
raise ValueError(
"Optuna search does not support parameters of type "
"`{}` with samplers of type `{}`".format(type(domain).__name__, type(domain.sampler).__name__)
"`{}` with samplers of type `{}`".format(type(domain).__name__, type(sampler).__name__)
)
# Parameter name is e.g. "a/b/c" for nested dicts
values = {"/".join(path): resolve_value(domain) for path, domain in domain_vars}
return values

View File

@@ -261,7 +261,7 @@ def add_cost_to_space(space: Dict, low_cost_point: Dict, choice_cost: Dict):
low_cost[i] = point
if len(low_cost) > len(domain.categories):
if domain.ordered:
low_cost[-1] = int(np.where(ind == low_cost[-1])[0])
low_cost[-1] = int(np.where(ind == low_cost[-1])[0].item())
domain.low_cost_point = low_cost[-1]
return
if low_cost:

View File

@@ -162,6 +162,10 @@ def broadcast_code(custom_code="", file_name="mylearner"):
assert isinstance(MyLargeLGBM(), LGBMEstimator)
```
"""
# Check if Spark is available
spark_available, _ = check_spark()
# Write to local driver file system
flaml_path = os.path.dirname(os.path.abspath(__file__))
custom_code = textwrap.dedent(custom_code)
custom_path = os.path.join(flaml_path, file_name + ".py")
@@ -169,6 +173,24 @@ def broadcast_code(custom_code="", file_name="mylearner"):
with open(custom_path, "w") as f:
f.write(custom_code)
# If using Spark, broadcast the code content to executors
if spark_available:
spark = SparkSession.builder.getOrCreate()
bc_code = spark.sparkContext.broadcast(custom_code)
# Execute a job to ensure the code is distributed to all executors
def _write_code(bc):
code = bc.value
import os
module_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), file_name + ".py")
os.makedirs(os.path.dirname(module_path), exist_ok=True)
with open(module_path, "w") as f:
f.write(code)
return True
spark.sparkContext.parallelize(range(1)).map(lambda _: _write_code(bc_code)).collect()
return custom_path

View File

@@ -21,11 +21,11 @@ except (ImportError, AssertionError):
from .analysis import ExperimentAnalysis as EA
else:
ray_available = True
import logging
from flaml.tune.spark.utils import PySparkOvertimeMonitor, check_spark
from .logger import logger, logger_formatter
from .result import DEFAULT_METRIC
from .trial import Trial
@@ -41,8 +41,6 @@ except ImportError:
internal_mlflow = False
logger = logging.getLogger(__name__)
logger.propagate = False
_use_ray = True
_runner = None
_verbose = 0
@@ -521,10 +519,6 @@ def run(
elif not logger.hasHandlers():
# Add the console handler.
_ch = logging.StreamHandler(stream=sys.stdout)
logger_formatter = logging.Formatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s",
"%m-%d %H:%M:%S",
)
_ch.setFormatter(logger_formatter)
logger.addHandler(_ch)
if verbose <= 2:
@@ -752,10 +746,16 @@ def run(
max_concurrent = max(1, search_alg.max_concurrent)
else:
max_concurrent = max(1, max_spark_parallelism)
passed_in_n_concurrent_trials = max(n_concurrent_trials, max_concurrent)
n_concurrent_trials = min(
n_concurrent_trials if n_concurrent_trials > 0 else num_executors,
max_concurrent,
)
if n_concurrent_trials < passed_in_n_concurrent_trials:
logger.warning(
f"The actual number of concurrent trials is {n_concurrent_trials}. You can set the environment "
f"variable `FLAML_MAX_CONCURRENT` to '{passed_in_n_concurrent_trials}' to override the detected number of executors."
)
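As the warning suggests, the cap can be raised explicitly via an environment variable before calling tune.run; a one-line sketch:

import os

os.environ["FLAML_MAX_CONCURRENT"] = "8"  # override the detected executor count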
with parallel_backend("spark"):
with Parallel(n_jobs=n_concurrent_trials, verbose=max(0, (verbose - 1) * 50)) as parallel:
try:
@@ -776,8 +776,8 @@ def run(
and (num_samples < 0 or num_trials < num_samples)
and num_failures < upperbound_num_failures
):
if automl_info and automl_info[0] > 0 and time_budget_s < np.inf:
time_budget_s -= automl_info[0]
if automl_info and automl_info[1] == "all" and automl_info[0] > 0 and time_budget_s < np.inf:
time_budget_s -= automl_info[0] * n_concurrent_trials
logger.debug(f"Remaining time budget with mlflow log latency: {time_budget_s} seconds.")
while len(_runner.running_trials) < n_concurrent_trials:
# suggest trials for spark
@@ -802,9 +802,17 @@ def run(
)
results = None
with PySparkOvertimeMonitor(time_start, time_budget_s, force_cancel, parallel=parallel):
results = parallel(
delayed(evaluation_function)(trial_to_run.config) for trial_to_run in trials_to_run
)
try:
results = parallel(
delayed(evaluation_function)(trial_to_run.config) for trial_to_run in trials_to_run
)
except RuntimeError as e:
logger.warning(f"RuntimeError: {e}")
results = None
logger.info(
"Encountered RuntimeError. Waiting 10 seconds for Spark cluster to recover before retrying."
)
time.sleep(10)
# results = [evaluation_function(trial_to_run.config) for trial_to_run in trials_to_run]
while results:
result = results.pop(0)

View File

@@ -1 +1 @@
__version__ = "2.3.4"
__version__ = "2.5.0"

View File

@@ -2,7 +2,6 @@
license_file = "LICENSE"
description-file = "README.md"
[tool.pytest.ini_options]
addopts = '-m "not conda"'
markers = [

3
pytest.ini Normal file
View File

@@ -0,0 +1,3 @@
[pytest]
markers =
spark: mark a test as requiring Spark

View File

@@ -51,60 +51,59 @@ setuptools.setup(
"joblib<=1.3.2",
],
"test": [
"numpy>=1.17,<2.0.0; python_version<'3.13'",
"numpy>=1.17; python_version>='3.13'",
"jupyter",
"lightgbm>=2.3.1",
"xgboost>=0.90,<2.0.0",
"xgboost>=0.90,<2.0.0; python_version<'3.11'",
"xgboost>=2.0.0; python_version>='3.11'",
"scipy>=1.4.1",
"pandas>=1.1.4,<2.0.0; python_version<'3.10'",
"pandas>=1.1.4; python_version>='3.10'",
"scikit-learn>=1.0.0",
"scikit-learn>=1.2.0",
"thop",
"pytest>=6.1.1",
"pytest-rerunfailures>=13.0",
"coverage>=5.3",
"pre-commit",
"torch",
"torchvision",
"catboost>=0.26,<1.2; python_version<'3.11'",
"catboost>=0.26; python_version>='3.11'",
"catboost>=0.26",
"rgf-python",
"optuna>=2.8.0,<=3.6.1",
"openml",
"statsmodels>=0.12.2",
"psutil==5.8.0",
"psutil",
"dataclasses",
"transformers[torch]==4.26",
"transformers[torch]",
"datasets",
"nltk<=3.8.1", # 3.8.2 doesn't work with mlflow
"evaluate",
"nltk!=3.8.2", # 3.8.2 doesn't work with mlflow
"rouge_score",
"hcrystalball==0.1.10",
"hcrystalball",
"seqeval",
"pytorch-forecasting>=0.9.0,<=0.10.1; python_version<'3.11'",
# "pytorch-forecasting==0.10.1; python_version=='3.11'",
"mlflow==2.15.1",
"pytorch-forecasting",
"mlflow-skinny<=2.22.1", # Refer to https://mvnrepository.com/artifact/org.mlflow/mlflow-spark
"joblibspark>=0.5.0",
"joblib<=1.3.2",
"nbconvert",
"nbformat",
"ipykernel",
"pytorch-lightning<1.9.1", # test_forecast_panel
"tensorboardX==2.6", # test_forecast_panel
"requests<2.29.0", # https://github.com/docker/docker-py/issues/3113
"pytorch-lightning", # test_forecast_panel
"tensorboardX", # test_forecast_panel
"requests", # https://github.com/docker/docker-py/issues/3113
"packaging",
"pydantic==1.10.9",
"sympy",
"wolframalpha",
"dill", # a drop in replacement of pickle
],
"catboost": [
"catboost>=0.26,<1.2; python_version<'3.11'",
"catboost>=0.26,<=1.2.5; python_version>='3.11'",
"catboost>=0.26",
],
"blendsearch": [
"optuna>=2.8.0,<=3.6.1",
"packaging",
],
"ray": [
"ray[tune]~=1.13",
"ray[tune]>=1.13,<2.5.0",
],
"azureml": [
"azureml-mlflow",
@@ -117,47 +116,35 @@ setuptools.setup(
"scikit-learn",
],
"hf": [
"transformers[torch]==4.26",
"transformers[torch]>=4.26",
"datasets",
"nltk<=3.8.1",
"rouge_score",
"seqeval",
],
"nlp": [ # for backward compatibility; hf is the new option name
"transformers[torch]==4.26",
"transformers[torch]>=4.26",
"datasets",
"nltk<=3.8.1",
"rouge_score",
"seqeval",
],
"ts_forecast": [
"holidays<0.14", # to prevent installation error for prophet
"prophet>=1.0.1",
"holidays",
"prophet>=1.1.5",
"statsmodels>=0.12.2",
"hcrystalball==0.1.10",
"hcrystalball>=0.1.10",
],
"forecast": [
"holidays<0.14", # to prevent installation error for prophet
"prophet>=1.0.1",
"holidays",
"prophet>=1.1.5",
"statsmodels>=0.12.2",
"hcrystalball==0.1.10",
"pytorch-forecasting>=0.9.0; python_version<'3.11'",
# "pytorch-forecasting==0.10.1; python_version=='3.11'",
"pytorch-lightning==1.9.0",
"tensorboardX==2.6",
"hcrystalball>=0.1.10",
"pytorch-forecasting>=0.10.4",
"pytorch-lightning>=1.9.0",
"tensorboardX>=2.6",
],
"benchmark": ["catboost>=0.26", "psutil==5.8.0", "xgboost==1.3.3", "pandas==1.1.4"],
"openai": ["openai==0.27.8", "diskcache"],
"autogen": ["openai==0.27.8", "diskcache", "termcolor"],
"mathchat": ["openai==0.27.8", "diskcache", "termcolor", "sympy", "pydantic==1.10.9", "wolframalpha"],
"retrievechat": [
"openai==0.27.8",
"diskcache",
"termcolor",
"chromadb",
"tiktoken",
"sentence_transformers",
],
"synapse": [
"joblibspark>=0.5.0",
"optuna>=2.8.0,<=3.6.1",
@@ -170,10 +157,9 @@ setuptools.setup(
"Operating System :: OS Independent",
# Specify the Python versions you support here.
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
],
python_requires=">=3.8",
python_requires=">=3.10",
)
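The pins above rely on PEP 508 environment markers so that pip resolves a different requirement per interpreter version. A minimal sketch (not the project's actual setup.py; the package name is hypothetical):

import setuptools

setuptools.setup(
    name="example-package",  # hypothetical, for illustration only
    extras_require={
        "test": [
            # One line per interpreter range; pip evaluates the marker at install time.
            "xgboost>=0.90,<2.0.0; python_version<'3.11'",
            "xgboost>=2.0.0; python_version>='3.11'",
            "numpy>=1.17,<2.0.0; python_version<'3.13'",
            "numpy>=1.17; python_version>='3.13'",
        ],
    },
)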

View File

@@ -4,8 +4,17 @@ import pytest
from flaml import AutoML, tune
try:
import transformers
@pytest.mark.skipif(sys.platform == "darwin", reason="do not run on mac os")
_transformers_installed = True
except ImportError:
_transformers_installed = False
@pytest.mark.skipif(
sys.platform == "darwin" or not _transformers_installed, reason="do not run on mac os or transformers not installed"
)
def test_custom_hp_nlp():
from test.nlp.utils import get_automl_settings, get_toy_data_seqclassification
@@ -63,5 +72,39 @@ def test_custom_hp():
print(automl.best_config_per_estimator)
def test_lgbm_objective():
"""Test that objective parameter can be set via custom_hp for LGBMEstimator"""
import numpy as np
# Create a simple regression dataset
np.random.seed(42)
X_train = np.random.rand(100, 5)
y_train = np.random.rand(100) * 100 # Scale to avoid division issues with MAPE
automl = AutoML()
settings = {
"time_budget": 3,
"metric": "mape",
"task": "regression",
"estimator_list": ["lgbm"],
"verbose": 0,
"custom_hp": {"lgbm": {"objective": {"domain": "mape"}}}, # Fixed value, not tuned
}
automl.fit(X_train, y_train, **settings)
# Verify that objective was set correctly
assert "objective" in automl.best_config, "objective should be in best_config"
assert automl.best_config["objective"] == "mape", "objective should be 'mape'"
# Verify the model has the correct objective
if hasattr(automl.model, "estimator") and hasattr(automl.model.estimator, "get_params"):
model_params = automl.model.estimator.get_params()
assert model_params.get("objective") == "mape", "Model should use 'mape' objective"
print("Test passed: objective parameter works correctly with LGBMEstimator")
if __name__ == "__main__":
test_custom_hp()
test_lgbm_objective()
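A hedged sketch of the custom_hp shape exercised by test_lgbm_objective above: a plain value under "domain" pins the hyperparameter, while a tune domain keeps it searchable. The num_leaves entry is illustrative, not a recommendation:

from flaml import tune

custom_hp = {
    "lgbm": {
        "objective": {"domain": "mape"},  # fixed value: passed through, never tuned
        "num_leaves": {"domain": tune.randint(4, 64), "init_value": 4},  # still tuned
    }
}
# Passed as automl.fit(..., custom_hp=custom_hp); see the test above for a full call.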

View File

@@ -1,3 +1,4 @@
import atexit
import os
import sys
import unittest
@@ -15,8 +16,18 @@ from sklearn.model_selection import train_test_split
from flaml import AutoML
from flaml.automl.ml import sklearn_metric_loss_score
from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
from flaml.tune.spark.utils import check_spark
try:
import pytorch_lightning
_pl_installed = True
except ImportError:
_pl_installed = False
pytestmark = pytest.mark.spark
leaderboard = defaultdict(dict)
warnings.simplefilter(action="ignore")
@@ -37,7 +48,7 @@ else:
.config(
"spark.jars.packages",
(
"com.microsoft.azure:synapseml_2.12:1.0.2,"
"com.microsoft.azure:synapseml_2.12:1.1.0,"
"org.apache.hadoop:hadoop-azure:3.3.5,"
"com.microsoft.azure:azure-storage:8.6.6,"
f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
@@ -61,6 +72,9 @@ else:
except ImportError:
skip_spark = True
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
def _test_regular_models(estimator_list, task):
if isinstance(estimator_list, str):
@@ -174,7 +188,11 @@ def _test_sparse_matrix_classification(estimator):
"n_jobs": 1,
"model_history": True,
}
X_train = scipy.sparse.random(1554, 21, dtype=int)
# NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
# Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
# integer sampling paths which raise "low is out of bounds for int32".
# A float sparse matrix is sufficient to validate sparse-input support.
X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
y_train = np.random.randint(3, size=1554)
automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
@@ -269,7 +287,11 @@ class TestExtraModel(unittest.TestCase):
@unittest.skipIf(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_default_spark(self):
_test_spark_models(None, "classification")
# TODO: remove the estimator assignment once SynapseML supports spark 4+.
from flaml.automl.spark.utils import _spark_major_minor_version
estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
_test_spark_models(estimator_list, "classification")
def test_svc(self):
_test_regular_models("svc", "classification")
@@ -300,7 +322,7 @@ class TestExtraModel(unittest.TestCase):
def test_avg(self):
_test_forecast("avg")
@unittest.skipIf(skip_spark, reason="Skip on Mac or Windows")
@unittest.skipIf(skip_spark or not _pl_installed, reason="Skip on Mac or Windows or no pytorch_lightning.")
def test_tcn(self):
_test_forecast("tcn")

View File

@@ -10,7 +10,7 @@ from flaml import AutoML
from flaml.automl.task.time_series_task import TimeSeriesTask
def test_forecast_automl(budget=10, estimators_when_no_prophet=["arima", "sarimax", "holt-winters"]):
def test_forecast_automl(budget=20, estimators_when_no_prophet=["arima", "sarimax", "holt-winters"]):
# using dataframe
import statsmodels.api as sm
@@ -477,7 +477,10 @@ def test_forecast_classification(budget=5):
def get_stalliion_data():
from pytorch_forecasting.data.examples import get_stallion_data
data = get_stallion_data()
# data = get_stallion_data()
data = pd.read_parquet(
"https://raw.githubusercontent.com/sktime/pytorch-forecasting/refs/heads/main/examples/data/stallion.parquet"
)
# add time index - For datasets with no missing values, FLAML will automate this process
data["time_idx"] = data["date"].dt.year * 12 + data["date"].dt.month
data["time_idx"] -= data["time_idx"].min()
@@ -507,8 +510,12 @@ def get_stalliion_data():
"3.11" in sys.version,
reason="do not run on py 3.11",
)
def test_forecast_panel(budget=5):
data, special_days = get_stalliion_data()
def test_forecast_panel(budget=30):
try:
data, special_days = get_stalliion_data()
except ImportError:
print("pytorch_forecasting not installed")
return
time_horizon = 6 # predict six months
training_cutoff = data["time_idx"].max() - time_horizon
data["time_idx"] = data["time_idx"].astype("int")
@@ -674,11 +681,55 @@ def test_cv_step():
print("yahoo!")
def test_log_training_metric_ts_models():
"""Test that log_training_metric=True works with time series models (arima, sarimax, holt-winters)."""
import statsmodels.api as sm
from flaml.automl.task.time_series_task import TimeSeriesTask
estimators_all = TimeSeriesTask("forecast").estimators.keys()
estimators_to_test = ["xgboost", "arima", "lassolars", "tcn", "snaive", "prophet", "orbit"]
estimators = [
est for est in estimators_to_test if est in estimators_all
] # not all estimators available in current python env
print(f"Testing estimators: {estimators}")
# Prepare data
data = sm.datasets.co2.load_pandas().data["co2"]
data = data.resample("MS").mean()
data = data.bfill().ffill()
data = data.to_frame().reset_index()
data = data.rename(columns={"index": "ds", "co2": "y"})
num_samples = data.shape[0]
time_horizon = 12
split_idx = num_samples - time_horizon
df = data[:split_idx]
# Test each time series model with log_training_metric=True
for estimator in estimators:
print(f"\nTesting {estimator} with log_training_metric=True")
automl = AutoML()
settings = {
"time_budget": 3,
"metric": "mape",
"task": "forecast",
"eval_method": "holdout",
"label": "y",
"log_training_metric": True, # This should not cause errors
"estimator_list": [estimator],
}
automl.fit(dataframe=df, **settings, period=time_horizon, force_cancel=True)
print(f"{estimator} SUCCESS with log_training_metric=True")
if automl.best_estimator:
assert automl.best_estimator == estimator
if __name__ == "__main__":
# test_forecast_automl(60)
# test_multivariate_forecast_num(5)
# test_multivariate_forecast_cat(5)
test_numpy()
# test_numpy()
# test_forecast_classification(5)
# test_forecast_panel(5)
# test_cv_step()
test_log_training_metric_ts_models()

View File

@@ -0,0 +1,51 @@
import mlflow
import numpy as np
import pandas as pd
from flaml import AutoML
def test_max_iter_1():
date_rng = pd.date_range(start="2024-01-01", periods=100, freq="H")
X = pd.DataFrame({"ds": date_rng})
y_train_24h = np.random.rand(len(X)) * 100
# AutoML
settings = {
"max_iter": 1,
"estimator_list": ["xgboost", "lgbm"],
"starting_points": {"xgboost": {}, "lgbm": {}},
"task": "ts_forecast",
"log_file_name": "test_max_iter_1.log",
"seed": 41,
"mlflow_exp_name": "TestExp-max_iter-1",
"use_spark": False,
"n_concurrent_trials": 1,
"verbose": 1,
"featurization": "off",
"metric": "rmse",
"mlflow_logging": True,
}
automl = AutoML(**settings)
with mlflow.start_run(run_name="AutoMLModel-XGBoost-and-LGBM-max_iter_1"):
automl.fit(
X_train=X,
y_train=y_train_24h,
period=24,
X_val=X,
y_val=y_train_24h,
split_ratio=0,
force_cancel=False,
)
assert automl.model is not None, "AutoML failed to return a model"
assert automl.best_run_id is not None, "Best run ID should not be None with mlflow logging"
print("Best model:", automl.model)
print("Best run ID:", automl.best_run_id)
if __name__ == "__main__":
test_max_iter_1()
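A hedged follow-up to the best_run_id assertion above: with mlflow_logging enabled, the best trial's run can be inspected afterwards through the standard MLflow client API. A minimal helper, assuming an AutoML object fitted as in the test:

import mlflow
from flaml import AutoML


def inspect_best_run(automl: AutoML) -> None:
    """Print the metrics MLflow recorded for the best AutoML trial."""
    run = mlflow.get_run(automl.best_run_id)
    print(run.info.run_id, run.data.metrics)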

View File

@@ -10,6 +10,18 @@ from flaml import AutoML
class TestMLFlowLoggingParam:
def test_update_and_install_requirements(self):
import mlflow
from sklearn import tree
from flaml.fabric.mlflow import update_and_install_requirements
with mlflow.start_run(run_name="test") as run:
sk_model = tree.DecisionTreeClassifier()
mlflow.sklearn.log_model(sk_model, "model", registered_model_name="test")
update_and_install_requirements(run_id=run.info.run_id)
def test_should_start_new_run_by_default(self, automl_settings):
with mlflow.start_run() as parent_run:
automl = AutoML()

View File

@@ -181,6 +181,49 @@ class TestMultiClass(unittest.TestCase):
}
automl.fit(X_train=X_train, y_train=y_train, **settings)
def test_ensemble_final_estimator_params_not_tuned(self):
"""Test that final_estimator parameters in ensemble are not automatically tuned.
This test verifies that when a custom final_estimator is provided with specific
parameters, those parameters are used as-is without any hyperparameter tuning.
"""
from sklearn.linear_model import LogisticRegression
automl = AutoML()
X_train, y_train = load_wine(return_X_y=True)
# Create a LogisticRegression with specific non-default parameters
custom_params = {
"C": 0.5, # Non-default value
"max_iter": 50, # Non-default value
"random_state": 42,
}
final_est = LogisticRegression(**custom_params)
settings = {
"time_budget": 5,
"estimator_list": ["rf", "lgbm"],
"task": "classification",
"ensemble": {
"final_estimator": final_est,
"passthrough": False,
},
"n_jobs": 1,
}
automl.fit(X_train=X_train, y_train=y_train, **settings)
# Verify that the final estimator in the stacker uses the exact parameters we specified
if hasattr(automl.model, "final_estimator_"):
# The model is a StackingClassifier
fitted_final_estimator = automl.model.final_estimator_
assert (
abs(fitted_final_estimator.C - custom_params["C"]) < 1e-9
), f"Expected C={custom_params['C']}, but got {fitted_final_estimator.C}"
assert (
fitted_final_estimator.max_iter == custom_params["max_iter"]
), f"Expected max_iter={custom_params['max_iter']}, but got {fitted_final_estimator.max_iter}"
print("✓ Final estimator parameters were preserved (not tuned)")
def test_dataframe(self):
self.test_classification(True)
@@ -235,6 +278,34 @@ class TestMultiClass(unittest.TestCase):
except ImportError:
pass
def test_invalid_custom_metric(self):
"""Test that proper error is raised when custom_metric is called instead of passed."""
from sklearn.datasets import load_iris
X_train, y_train = load_iris(return_X_y=True)
# Test with non-callable metric in __init__
with self.assertRaises(ValueError) as context:
automl = AutoML(metric=123) # passing an int instead of function
self.assertIn("must be either a string or a callable function", str(context.exception))
self.assertIn("but got int", str(context.exception))
# Test with non-callable metric in fit
automl = AutoML()
with self.assertRaises(ValueError) as context:
automl.fit(X_train=X_train, y_train=y_train, metric=[], task="classification", time_budget=1)
self.assertIn("must be either a string or a callable function", str(context.exception))
self.assertIn("but got list", str(context.exception))
# Test with tuple (simulating result of calling a function that returns tuple)
with self.assertRaises(ValueError) as context:
automl = AutoML()
automl.fit(
X_train=X_train, y_train=y_train, metric=(0.5, {"loss": 0.5}), task="classification", time_budget=1
)
self.assertIn("must be either a string or a callable function", str(context.exception))
self.assertIn("but got tuple", str(context.exception))
def test_classification(self, as_frame=False):
automl_experiment = AutoML()
automl_settings = {
@@ -368,7 +439,11 @@ class TestMultiClass(unittest.TestCase):
"n_jobs": 1,
"model_history": True,
}
X_train = scipy.sparse.random(1554, 21, dtype=int)
# NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
# Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
# integer sampling paths which raise "low is out of bounds for int32".
# A float sparse matrix is sufficient to validate sparse-input support.
X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
y_train = np.random.randint(3, size=1554)
automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
print(automl_experiment.classes_)
@@ -531,6 +606,32 @@ class TestMultiClass(unittest.TestCase):
print(f"Best accuracy on validation data: {new_automl_val_accuracy:.4g}")
# print('Training duration of best run: {0:.4g} s'.format(new_automl_experiment.best_config_train_time))
def test_starting_points_should_improve_performance(self):
N = 10000 # a large N is needed to see the improvement
X_train, y_train = load_iris(return_X_y=True)
X_train = np.concatenate([X_train + 0.1 * i for i in range(N)], axis=0)
y_train = np.concatenate([y_train] * N, axis=0)
am1 = AutoML()
am1.fit(X_train, y_train, estimator_list=["lgbm"], time_budget=3, seed=11)
am2 = AutoML()
am2.fit(
X_train,
y_train,
estimator_list=["lgbm"],
time_budget=2,
seed=11,
starting_points=am1.best_config_per_estimator,
)
print(f"am1.best_loss: {am1.best_loss:.4f}")
print(f"am2.best_loss: {am2.best_loss:.4f}")
assert np.round(am2.best_loss, 4) <= np.round(
am1.best_loss, 4
), "Starting points should help improve the performance!"
if __name__ == "__main__":
unittest.main()
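For contrast with the invalid metric values rejected in test_invalid_custom_metric above, a hedged sketch of the callable form FLAML accepts for metric (signature as documented for FLAML custom metrics; the log_loss body is illustrative):

from sklearn.metrics import log_loss


def custom_metric(
    X_val, y_val, estimator, labels, X_train, y_train, weight_val=None, weight_train=None, *args, **kwargs
):
    """Return (loss to minimize, dict of extra metrics to log)."""
    y_pred = estimator.predict_proba(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
    return val_loss, {"val_log_loss": val_loss}


# Used as: automl.fit(X_train=X_train, y_train=y_train, metric=custom_metric, task="classification", ...)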

View File

@@ -0,0 +1,272 @@
"""Test to ensure correct label overlap handling for classification tasks"""
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris, make_classification
from flaml import AutoML
def test_allow_label_overlap_true():
"""Test with allow_label_overlap=True (fast mode, default)"""
# Load iris dataset
dic_data = load_iris(as_frame=True)
iris_data = dic_data["frame"]
# Prepare data
x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
y_train = iris_data["target"]
# Train with fast mode (default)
automl = AutoML()
automl_settings = {
"max_iter": 5,
"metric": "accuracy",
"task": "classification",
"estimator_list": ["lgbm"],
"eval_method": "holdout",
"split_type": "stratified",
"keep_search_state": True,
"retrain_full": False,
"auto_augment": False,
"verbose": 0,
"allow_label_overlap": True, # Fast mode
}
automl.fit(x_train, y_train, **automl_settings)
# Check results
input_size = len(x_train)
train_size = len(automl._state.X_train)
val_size = len(automl._state.X_val)
# With stratified split on balanced data, fast mode may have no overlap
assert (
train_size + val_size >= input_size
), f"Inconsistent sizes. Input: {input_size}, Train: {train_size}, Val: {val_size}"
# Verify all classes are represented in both sets
train_labels = set(np.unique(automl._state.y_train))
val_labels = set(np.unique(automl._state.y_val))
all_labels = set(np.unique(y_train))
assert train_labels == all_labels, f"Not all labels in train. All: {all_labels}, Train: {train_labels}"
assert val_labels == all_labels, f"Not all labels in val. All: {all_labels}, Val: {val_labels}"
print(
f"✓ Test passed (fast mode): Input: {input_size}, Train: {train_size}, Val: {val_size}, "
f"Overlap: {train_size + val_size - input_size}"
)
def test_allow_label_overlap_false():
"""Test with allow_label_overlap=False (precise mode)"""
# Load iris dataset
dic_data = load_iris(as_frame=True)
iris_data = dic_data["frame"]
# Prepare data
x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
y_train = iris_data["target"]
# Train with precise mode
automl = AutoML()
automl_settings = {
"max_iter": 5,
"metric": "accuracy",
"task": "classification",
"estimator_list": ["lgbm"],
"eval_method": "holdout",
"split_type": "stratified",
"keep_search_state": True,
"retrain_full": False,
"auto_augment": False,
"verbose": 0,
"allow_label_overlap": False, # Precise mode
}
automl.fit(x_train, y_train, **automl_settings)
# Check that there's no overlap (or minimal overlap for single-instance classes)
input_size = len(x_train)
train_size = len(automl._state.X_train)
val_size = len(automl._state.X_val)
# Verify all classes are represented
all_labels = set(np.unique(y_train))
# Should have no overlap or minimal overlap
overlap = train_size + val_size - input_size
assert overlap <= len(all_labels), f"Excessive overlap: {overlap}"
# Verify all classes are represented
train_labels = set(np.unique(automl._state.y_train))
val_labels = set(np.unique(automl._state.y_val))
combined_labels = train_labels.union(val_labels)
assert combined_labels == all_labels, f"Not all labels present. All: {all_labels}, Combined: {combined_labels}"
print(
f"✓ Test passed (precise mode): Input: {input_size}, Train: {train_size}, Val: {val_size}, "
f"Overlap: {overlap}"
)
def test_uniform_split_with_overlap_control():
"""Test with uniform split and both overlap modes"""
# Load iris dataset
dic_data = load_iris(as_frame=True)
iris_data = dic_data["frame"]
# Prepare data
x_train = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]].to_numpy()
y_train = iris_data["target"]
# Test precise mode with uniform split
automl = AutoML()
automl_settings = {
"max_iter": 5,
"metric": "accuracy",
"task": "classification",
"estimator_list": ["lgbm"],
"eval_method": "holdout",
"split_type": "uniform",
"keep_search_state": True,
"retrain_full": False,
"auto_augment": False,
"verbose": 0,
"allow_label_overlap": False, # Precise mode
}
automl.fit(x_train, y_train, **automl_settings)
input_size = len(x_train)
train_size = len(automl._state.X_train)
val_size = len(automl._state.X_val)
# Verify all classes are represented
train_labels = set(np.unique(automl._state.y_train))
val_labels = set(np.unique(automl._state.y_val))
all_labels = set(np.unique(y_train))
combined_labels = train_labels.union(val_labels)
assert combined_labels == all_labels, "Not all labels present with uniform split"
print(f"✓ Test passed (uniform split): Input: {input_size}, Train: {train_size}, Val: {val_size}")
def test_with_sample_weights():
"""Test label overlap handling with sample weights"""
# Create a simple dataset
X, y = make_classification(
n_samples=200,
n_features=10,
n_informative=5,
n_redundant=2,
n_classes=3,
n_clusters_per_class=1,
random_state=42,
)
# Create sample weights (giving more weight to some samples)
sample_weight = np.random.uniform(0.5, 2.0, size=len(y))
# Test fast mode with sample weights
automl_fast = AutoML()
automl_fast.fit(
X,
y,
task="classification",
metric="accuracy",
estimator_list=["lgbm"],
eval_method="holdout",
split_type="stratified",
max_iter=3,
keep_search_state=True,
retrain_full=False,
auto_augment=False,
verbose=0,
allow_label_overlap=True, # Fast mode
sample_weight=sample_weight,
)
# Verify all labels present
train_labels_fast = set(np.unique(automl_fast._state.y_train))
val_labels_fast = set(np.unique(automl_fast._state.y_val))
all_labels = set(np.unique(y))
assert train_labels_fast == all_labels, "Not all labels in train (fast mode with weights)"
assert val_labels_fast == all_labels, "Not all labels in val (fast mode with weights)"
# Test precise mode with sample weights
automl_precise = AutoML()
automl_precise.fit(
X,
y,
task="classification",
metric="accuracy",
estimator_list=["lgbm"],
eval_method="holdout",
split_type="stratified",
max_iter=3,
keep_search_state=True,
retrain_full=False,
auto_augment=False,
verbose=0,
allow_label_overlap=False, # Precise mode
sample_weight=sample_weight,
)
# Verify all labels present
train_labels_precise = set(np.unique(automl_precise._state.y_train))
val_labels_precise = set(np.unique(automl_precise._state.y_val))
combined_labels = train_labels_precise.union(val_labels_precise)
assert combined_labels == all_labels, "Not all labels present (precise mode with weights)"
print("✓ Test passed with sample weights (fast and precise modes)")
def test_single_instance_class():
"""Test handling of single-instance classes"""
# Create imbalanced dataset where one class has only 1 instance
X = np.random.randn(50, 4)
y = np.array([0] * 40 + [1] * 9 + [2] * 1) # Class 2 has only 1 instance
# Test precise mode - should add single instance to both sets
automl = AutoML()
automl.fit(
X,
y,
task="classification",
metric="accuracy",
estimator_list=["lgbm"],
eval_method="holdout",
split_type="uniform",
max_iter=3,
keep_search_state=True,
retrain_full=False,
auto_augment=False,
verbose=0,
allow_label_overlap=False, # Precise mode
)
# Verify all labels present
train_labels = set(np.unique(automl._state.y_train))
val_labels = set(np.unique(automl._state.y_val))
all_labels = set(np.unique(y))
# Single-instance class should be in both sets
combined_labels = train_labels.union(val_labels)
assert combined_labels == all_labels, "Not all labels present with single-instance class"
# Check that single-instance class (label 2) is in both sets
assert 2 in train_labels, "Single-instance class not in train"
assert 2 in val_labels, "Single-instance class not in val"
print("✓ Test passed with single-instance class")
if __name__ == "__main__":
test_allow_label_overlap_true()
test_allow_label_overlap_false()
test_uniform_split_with_overlap_control()
test_with_sample_weights()
test_single_instance_class()
print("\n✓ All tests passed!")

View File

@@ -1,8 +1,23 @@
import sys
import pytest
from minio.error import ServerError
from openml.exceptions import OpenMLServerException
try:
from minio.error import ServerError
except ImportError:
class ServerError(Exception):
pass
try:
from openml.exceptions import OpenMLServerException
except ImportError:
class OpenMLServerException(Exception):
pass
from requests.exceptions import ChunkedEncodingError, SSLError
@@ -64,6 +79,9 @@ def test_automl(budget=5, dataset_format="dataframe", hpo_method=None):
automl.fit(X_train=X_train, y_train=y_train, **settings)
""" retrieve best config and best learner """
print("Best ML leaner:", automl.best_estimator)
if not automl.best_estimator:
print("Training budget is not sufficient")
return
print("Best hyperparmeter config:", automl.best_config)
print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")

View File

@@ -0,0 +1,236 @@
"""Tests for the public preprocessor APIs."""
import unittest
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer, load_diabetes
from flaml import AutoML
class TestPreprocessAPI(unittest.TestCase):
"""Test cases for the public preprocess() API methods."""
def test_automl_preprocess_before_fit(self):
"""Test that calling preprocess before fit raises an error."""
automl = AutoML()
X_test = np.array([[1, 2, 3], [4, 5, 6]])
with self.assertRaises(AttributeError) as context:
automl.preprocess(X_test)
# Check that an error is raised about not being fitted
self.assertIn("fit()", str(context.exception))
def test_automl_preprocess_classification(self):
"""Test task-level preprocessing for classification."""
# Load dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, y_train = X[:400], y[:400]
X_test = X[400:450]
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "classification",
"metric": "accuracy",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Test task-level preprocessing
X_preprocessed = automl.preprocess(X_test)
# Verify the output is not None and has the right shape
self.assertIsNotNone(X_preprocessed)
self.assertEqual(X_preprocessed.shape[0], X_test.shape[0])
def test_automl_preprocess_regression(self):
"""Test task-level preprocessing for regression."""
# Load dataset
X, y = load_diabetes(return_X_y=True)
X_train, y_train = X[:300], y[:300]
X_test = X[300:350]
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "regression",
"metric": "r2",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Test task-level preprocessing
X_preprocessed = automl.preprocess(X_test)
# Verify the output
self.assertIsNotNone(X_preprocessed)
self.assertEqual(X_preprocessed.shape[0], X_test.shape[0])
def test_automl_preprocess_with_dataframe(self):
"""Test task-level preprocessing with pandas DataFrame."""
# Create a simple dataset
X_train = pd.DataFrame(
{
"feature1": [1, 2, 3, 4, 5] * 20,
"feature2": [5, 4, 3, 2, 1] * 20,
"category": ["a", "b", "a", "b", "a"] * 20,
}
)
y_train = pd.Series([0, 1, 0, 1, 0] * 20)
X_test = pd.DataFrame(
{
"feature1": [6, 7, 8],
"feature2": [1, 2, 3],
"category": ["a", "b", "a"],
}
)
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "classification",
"metric": "accuracy",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Test preprocessing
X_preprocessed = automl.preprocess(X_test)
# Verify the output - check the number of rows matches
self.assertIsNotNone(X_preprocessed)
preprocessed_len = len(X_preprocessed) if hasattr(X_preprocessed, "__len__") else X_preprocessed.shape[0]
self.assertEqual(preprocessed_len, len(X_test))
def test_estimator_preprocess(self):
"""Test estimator-level preprocessing."""
# Load dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, y_train = X[:400], y[:400]
X_test = X[400:450]
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "classification",
"metric": "accuracy",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Get the trained estimator
estimator = automl.model
self.assertIsNotNone(estimator)
# First apply task-level preprocessing
X_task_preprocessed = automl.preprocess(X_test)
# Then apply estimator-level preprocessing
X_estimator_preprocessed = estimator.preprocess(X_task_preprocessed)
# Verify the output
self.assertIsNotNone(X_estimator_preprocessed)
self.assertEqual(X_estimator_preprocessed.shape[0], X_test.shape[0])
def test_preprocess_pipeline(self):
"""Test the complete preprocessing pipeline (task-level then estimator-level)."""
# Load dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, y_train = X[:400], y[:400]
X_test = X[400:450]
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "classification",
"metric": "accuracy",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Apply the complete preprocessing pipeline
X_task_preprocessed = automl.preprocess(X_test)
X_final = automl.model.preprocess(X_task_preprocessed)
# Verify predictions work with preprocessed data
# The internal predict already does this preprocessing,
# but we verify our manual preprocessing gives consistent results
y_pred_manual = automl.model._model.predict(X_final)
y_pred_auto = automl.predict(X_test)
# Both should give the same predictions
np.testing.assert_array_equal(y_pred_manual, y_pred_auto)
def test_preprocess_with_mixed_types(self):
"""Test preprocessing with mixed data types."""
# Create dataset with mixed types
X_train = pd.DataFrame(
{
"numeric1": np.random.rand(100),
"numeric2": np.random.randint(0, 100, 100),
"categorical": np.random.choice(["cat", "dog", "bird"], 100),
"boolean": np.random.choice([True, False], 100),
}
)
y_train = pd.Series(np.random.randint(0, 2, 100))
X_test = pd.DataFrame(
{
"numeric1": np.random.rand(10),
"numeric2": np.random.randint(0, 100, 10),
"categorical": np.random.choice(["cat", "dog", "bird"], 10),
"boolean": np.random.choice([True, False], 10),
}
)
# Train AutoML
automl = AutoML()
automl_settings = {
"max_iter": 5,
"task": "classification",
"metric": "accuracy",
"estimator_list": ["lgbm"],
"verbose": 0,
}
automl.fit(X_train, y_train, **automl_settings)
# Test preprocessing
X_preprocessed = automl.preprocess(X_test)
# Verify the output
self.assertIsNotNone(X_preprocessed)
def test_estimator_preprocess_without_automl(self):
"""Test that estimator.preprocess() can be used independently."""
from flaml.automl.model import LGBMEstimator
# Create a simple estimator
X_train = np.random.rand(100, 5)
y_train = np.random.randint(0, 2, 100)
estimator = LGBMEstimator(task="classification")
estimator.fit(X_train, y_train)
# Test preprocessing
X_test = np.random.rand(10, 5)
X_preprocessed = estimator.preprocess(X_test)
# Verify the output
self.assertIsNotNone(X_preprocessed)
self.assertEqual(X_preprocessed.shape, X_test.shape)
if __name__ == "__main__":
unittest.main()

View File

@@ -38,7 +38,7 @@ class TestLogging(unittest.TestCase):
"keep_search_state": True,
"learner_selector": "roundrobin",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
n = len(y_train) >> 1
print(automl.model, automl.classes_, automl.predict(X_train))
automl.fit(

View File

@@ -47,7 +47,7 @@ class TestRegression(unittest.TestCase):
"n_jobs": 1,
"model_history": True,
}
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
n = int(len(y_train) * 9 // 10)
automl.fit(X_train=X_train[:n], y_train=y_train[:n], X_val=X_train[n:], y_val=y_train[n:], **automl_settings)
assert automl._state.eval_method == "holdout"
@@ -130,7 +130,7 @@ class TestRegression(unittest.TestCase):
)
automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **settings)
def test_parallel(self, hpo_method=None):
def test_parallel_and_pickle(self, hpo_method=None):
automl_experiment = AutoML()
automl_settings = {
"time_budget": 10,
@@ -141,7 +141,7 @@ class TestRegression(unittest.TestCase):
"n_concurrent_trials": 10,
"hpo_method": hpo_method,
}
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
try:
automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
print(automl_experiment.predict(X_train))
@@ -153,6 +153,18 @@ class TestRegression(unittest.TestCase):
except ImportError:
return
# test pickle and load_pickle, should work for prediction
automl_experiment.pickle("automl_xgboost_spark.pkl")
automl_loaded = AutoML().load_pickle("automl_xgboost_spark.pkl")
assert automl_loaded.best_estimator == automl_experiment.best_estimator
assert automl_loaded.best_loss == automl_experiment.best_loss
automl_loaded.predict(X_train)
import shutil
shutil.rmtree("automl_xgboost_spark.pkl", ignore_errors=True)
shutil.rmtree("automl_xgboost_spark.pkl.flaml_artifacts", ignore_errors=True)
def test_sparse_matrix_regression_holdout(self):
X_train = scipy.sparse.random(8, 100)
y_train = np.random.uniform(size=8)
@@ -268,7 +280,7 @@ def test_reproducibility_of_regression_models(estimator: str):
"skip_transform": True,
"retrain_full": True,
}
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X, y = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
automl.fit(X_train=X, y_train=y, **automl_settings)
best_model = automl.model
assert best_model is not None
@@ -314,7 +326,7 @@ def test_reproducibility_of_catboost_regression_model():
"skip_transform": True,
"retrain_full": True,
}
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X, y = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
automl.fit(X_train=X, y_train=y, **automl_settings)
best_model = automl.model
assert best_model is not None
@@ -360,7 +372,7 @@ def test_reproducibility_of_lgbm_regression_model():
"skip_transform": True,
"retrain_full": True,
}
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X, y = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
automl.fit(X_train=X, y_train=y, **automl_settings)
best_model = automl.model
assert best_model is not None
@@ -424,7 +436,7 @@ def test_reproducibility_of_underlying_regression_models(estimator: str):
"skip_transform": True,
"retrain_full": False,
}
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X, y = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
automl.fit(X_train=X, y_train=y, **automl_settings)
best_model = automl.model
assert best_model is not None

View File

@@ -142,7 +142,7 @@ class TestScore:
def test_regression(self):
automl_experiment = AutoML()
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
n = int(len(y_train) * 9 // 10)
for each_estimator in [

View File

@@ -0,0 +1,89 @@
"""Test sklearn 1.7+ compatibility for estimator type detection.
This test ensures that FLAML estimators are properly recognized as
regressors or classifiers by sklearn's is_regressor() and is_classifier()
functions, which is required for sklearn 1.7+ ensemble methods.
"""
import pytest
from sklearn.base import is_classifier, is_regressor
from flaml.automl.model import (
ExtraTreesEstimator,
LGBMEstimator,
RandomForestEstimator,
XGBoostSklearnEstimator,
)
def test_extra_trees_regressor_type():
"""Test that ExtraTreesEstimator with regression task is recognized as regressor."""
est = ExtraTreesEstimator(task="regression")
assert is_regressor(est), "ExtraTreesEstimator(task='regression') should be recognized as a regressor"
assert not is_classifier(est), "ExtraTreesEstimator(task='regression') should not be recognized as a classifier"
def test_extra_trees_classifier_type():
"""Test that ExtraTreesEstimator with classification task is recognized as classifier."""
est = ExtraTreesEstimator(task="binary")
assert is_classifier(est), "ExtraTreesEstimator(task='binary') should be recognized as a classifier"
assert not is_regressor(est), "ExtraTreesEstimator(task='binary') should not be recognized as a regressor"
est = ExtraTreesEstimator(task="multiclass")
assert is_classifier(est), "ExtraTreesEstimator(task='multiclass') should be recognized as a classifier"
assert not is_regressor(est), "ExtraTreesEstimator(task='multiclass') should not be recognized as a regressor"
def test_random_forest_regressor_type():
"""Test that RandomForestEstimator with regression task is recognized as regressor."""
est = RandomForestEstimator(task="regression")
assert is_regressor(est), "RandomForestEstimator(task='regression') should be recognized as a regressor"
assert not is_classifier(est), "RandomForestEstimator(task='regression') should not be recognized as a classifier"
def test_random_forest_classifier_type():
"""Test that RandomForestEstimator with classification task is recognized as classifier."""
est = RandomForestEstimator(task="binary")
assert is_classifier(est), "RandomForestEstimator(task='binary') should be recognized as a classifier"
assert not is_regressor(est), "RandomForestEstimator(task='binary') should not be recognized as a regressor"
def test_lgbm_regressor_type():
"""Test that LGBMEstimator with regression task is recognized as regressor."""
est = LGBMEstimator(task="regression")
assert is_regressor(est), "LGBMEstimator(task='regression') should be recognized as a regressor"
assert not is_classifier(est), "LGBMEstimator(task='regression') should not be recognized as a classifier"
def test_lgbm_classifier_type():
"""Test that LGBMEstimator with classification task is recognized as classifier."""
est = LGBMEstimator(task="binary")
assert is_classifier(est), "LGBMEstimator(task='binary') should be recognized as a classifier"
assert not is_regressor(est), "LGBMEstimator(task='binary') should not be recognized as a regressor"
def test_xgboost_regressor_type():
"""Test that XGBoostSklearnEstimator with regression task is recognized as regressor."""
est = XGBoostSklearnEstimator(task="regression")
assert is_regressor(est), "XGBoostSklearnEstimator(task='regression') should be recognized as a regressor"
assert not is_classifier(est), "XGBoostSklearnEstimator(task='regression') should not be recognized as a classifier"
def test_xgboost_classifier_type():
"""Test that XGBoostSklearnEstimator with classification task is recognized as classifier."""
est = XGBoostSklearnEstimator(task="binary")
assert is_classifier(est), "XGBoostSklearnEstimator(task='binary') should be recognized as a classifier"
assert not is_regressor(est), "XGBoostSklearnEstimator(task='binary') should not be recognized as a regressor"
if __name__ == "__main__":
# Run all tests
test_extra_trees_regressor_type()
test_extra_trees_classifier_type()
test_random_forest_regressor_type()
test_random_forest_classifier_type()
test_lgbm_regressor_type()
test_lgbm_classifier_type()
test_xgboost_regressor_type()
test_xgboost_classifier_type()
print("All sklearn 1.7+ compatibility tests passed!")

View File

@@ -30,7 +30,7 @@ class TestTrainingLog(unittest.TestCase):
"keep_search_state": True,
"estimator_list": estimator_list,
}
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
# Check if the training log file is populated.
self.assertTrue(os.path.exists(filename))

View File

@@ -108,7 +108,14 @@ class TestWarmStart(unittest.TestCase):
def test_FLAML_sample_size_in_starting_points(self):
from minio.error import ServerError
from openml.exceptions import OpenMLServerException
try:
from openml.exceptions import OpenMLServerException
except ImportError:
class OpenMLServerException(Exception):
pass
from requests.exceptions import ChunkedEncodingError, SSLError
from flaml import AutoML

test/cal_housing_py3.pkz (new binary file, not shown)

test/check_dependency.py (new file, 60 lines)
View File

@@ -0,0 +1,60 @@
import subprocess
from importlib.metadata import distributions
installed_libs = sorted(f"{dist.metadata['Name']}=={dist.version}" for dist in distributions())
first_tier_dependencies = [
"numpy",
"jupyter",
"lightgbm",
"xgboost",
"scipy",
"pandas",
"scikit-learn",
"thop",
"pytest",
"pytest-rerunfailures",
"coverage",
"pre-commit",
"torch",
"torchvision",
"catboost",
"rgf-python",
"optuna",
"openml",
"statsmodels",
"psutil",
"dataclasses",
"transformers[torch]",
"transformers",
"datasets",
"evaluate",
"nltk",
"rouge_score",
"hcrystalball",
"seqeval",
"pytorch-forecasting",
"mlflow-skinny",
"joblibspark",
"joblib",
"nbconvert",
"nbformat",
"ipykernel",
"pytorch-lightning",
"tensorboardX",
"requests",
"packaging",
"dill",
"ray",
"prophet",
]
for lib in installed_libs:
lib_name = lib.split("==")[0]
if lib_name in first_tier_dependencies:
print(lib)
# print current commit hash
commit_hash = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode("utf-8").strip()
print(f"Current commit hash: {commit_hash}")

View File

@@ -2,11 +2,24 @@ from typing import Any, Dict, List, Union
import numpy as np
import pandas as pd
from catboost import CatBoostClassifier, CatBoostRegressor, Pool
import pytest
from sklearn.metrics import f1_score, r2_score
try:
from catboost import CatBoostClassifier, CatBoostRegressor, Pool
except ImportError: # pragma: no cover
CatBoostClassifier = None
CatBoostRegressor = None
Pool = None
def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model: Any, task: str) -> pd.DataFrame:
def _is_catboost_model_type(model_type: type) -> bool:
if CatBoostClassifier is not None and CatBoostRegressor is not None:
return model_type is CatBoostClassifier or model_type is CatBoostRegressor
return getattr(model_type, "__module__", "").startswith("catboost")
def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model: Any, task: str) -> List[float]:
"""Mimic the FLAML CV process to calculate the metrics across each fold.
:param X_train_all: X training data
@@ -17,7 +30,7 @@ def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model:
:return: An array containing the metrics
"""
rng = np.random.RandomState(2020)
all_fold_metrics: List[Dict[str, Union[int, float]]] = []
all_fold_metrics: List[float] = []
for train_index, val_index in kf.split(X_train_all, y_train_all):
X_train_split, y_train_split = X_train_all, y_train_all
train_index = rng.permutation(train_index)
@@ -25,9 +38,11 @@ def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model:
X_val = X_train_split.iloc[val_index]
y_train, y_val = y_train_split[train_index], y_train_split[val_index]
model_type = type(model)
if model_type is not CatBoostClassifier and model_type is not CatBoostRegressor:
if not _is_catboost_model_type(model_type):
model.fit(X_train, y_train)
else:
if Pool is None:
pytest.skip("catboost is not installed")
use_best_model = True
n = max(int(len(y_train) * 0.9), len(y_train) - 1000) if use_best_model else len(y_train)
X_tr, y_tr = (X_train)[:n], y_train[:n]
@@ -38,5 +53,5 @@ def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model:
reproduced_metric = 1 - f1_score(y_val, y_pred_classes)
else:
reproduced_metric = 1 - r2_score(y_val, y_pred_classes)
all_fold_metrics.append(reproduced_metric)
all_fold_metrics.append(float(reproduced_metric))
return all_fold_metrics

View File

@@ -60,7 +60,7 @@ def test_housing(as_frame=True):
"starting_points": "data",
"max_iter": 0,
}
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=as_frame)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=as_frame, data_home="test")
automl.fit(X_train, y_train, **automl_settings)
@@ -115,7 +115,7 @@ def test_suggest_classification():
def test_suggest_regression():
location = "test/default"
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
suggested = suggest_hyperparams("regression", X_train, y_train, "lgbm", location=location)
print(suggested)
suggested = preprocess_and_suggest_hyperparams("regression", X_train, y_train, "xgboost", location=location)
@@ -137,7 +137,7 @@ def test_rf():
print(rf)
location = "test/default"
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
rf = RandomForestRegressor(default_location=location)
rf.fit(X_train[:100], y_train[:100])
rf.predict(X_train)
@@ -155,7 +155,7 @@ def test_extratrees():
print(classifier)
location = "test/default"
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
regressor = ExtraTreesRegressor(default_location=location)
regressor.fit(X_train[:100], y_train[:100])
regressor.predict(X_train)
@@ -175,7 +175,7 @@ def test_lgbm():
print(classifier.classes_)
location = "test/default"
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
regressor = LGBMRegressor(default_location=location)
regressor.fit(X_train, y_train)
regressor.predict(X_train)
@@ -183,6 +183,8 @@ def test_lgbm():
def test_xgboost():
import numpy as np
from flaml.default import XGBClassifier, XGBRegressor
X_train, y_train = load_breast_cancer(return_X_y=True, as_frame=True)
@@ -194,12 +196,71 @@ def test_xgboost():
print(classifier.classes_)
location = "test/default"
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, y_train = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
regressor = XGBRegressor(default_location=location)
regressor.fit(X_train[:100], y_train[:100])
regressor.predict(X_train)
print(regressor)
# Test eval_set with categorical features (Issue: eval_set not preprocessed)
np.random.seed(42)
n = 500
df = pd.DataFrame(
{
"num1": np.random.randn(n),
"num2": np.random.rand(n) * 10,
"cat1": np.random.choice(["A", "B", "C"], size=n),
"cat2": np.random.choice(["X", "Y"], size=n),
"target": np.random.choice([0, 1], size=n),
}
)
X = df.drop(columns="target")
y = df["target"]
X_train_cat, X_valid_cat, y_train_cat, y_valid_cat = train_test_split(X, y, test_size=0.2, random_state=0)
# Convert categorical columns to pandas 'category' dtype
for col in X_train_cat.select_dtypes(include="object").columns:
X_train_cat[col] = X_train_cat[col].astype("category")
X_valid_cat[col] = X_valid_cat[col].astype("category")
# Test XGBClassifier with eval_set
classifier_eval = XGBClassifier(
tree_method="hist",
enable_categorical=True,
eval_metric="logloss",
use_label_encoder=False,
early_stopping_rounds=10,
random_state=0,
n_estimators=10,
)
classifier_eval.fit(X_train_cat, y_train_cat, eval_set=[(X_valid_cat, y_valid_cat)], verbose=False)
y_pred = classifier_eval.predict(X_valid_cat)
assert len(y_pred) == len(y_valid_cat)
# Test XGBRegressor with eval_set
y_reg = df["num1"] # Use num1 as target for regression
X_reg = df.drop(columns=["num1", "target"])
X_train_reg, X_valid_reg, y_train_reg, y_valid_reg = train_test_split(X_reg, y_reg, test_size=0.2, random_state=0)
for col in X_train_reg.select_dtypes(include="object").columns:
X_train_reg[col] = X_train_reg[col].astype("category")
X_valid_reg[col] = X_valid_reg[col].astype("category")
regressor_eval = XGBRegressor(
tree_method="hist",
enable_categorical=True,
eval_metric="rmse",
early_stopping_rounds=10,
random_state=0,
n_estimators=10,
)
regressor_eval.fit(X_train_reg, y_train_reg, eval_set=[(X_valid_reg, y_valid_reg)], verbose=False)
y_pred = regressor_eval.predict(X_valid_reg)
assert len(y_pred) == len(y_valid_reg)
def test_nobudget():
X_train, y_train = load_breast_cancer(return_X_y=True, as_frame=True)

View File

@@ -3,6 +3,12 @@ import shutil
import sys
import pytest
try:
import transformers
except ImportError:
pytest.skip("transformers not installed", allow_module_level=True)
from utils import (
get_automl_settings,
get_toy_data_binclassification,
@@ -24,6 +30,8 @@ model_path_list = [
if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
pytest.skip("skipping Python 3.11 on MacOS", allow_module_level=True)
pytestmark = pytest.mark.spark # set to spark as parallel testing raised RuntimeError
def test_switch_1_1():
data_idx, model_path_idx = 0, 0

View File

@@ -5,8 +5,20 @@ import sys
import pytest
from utils import get_automl_settings, get_toy_data_seqclassification
try:
import transformers
@pytest.mark.skipif(sys.platform in ["darwin", "win32"], reason="do not run on mac os or windows")
_transformers_installed = True
except ImportError:
_transformers_installed = False
pytestmark = pytest.mark.spark # set to spark as parallel testing raised MlflowException of changing parameter
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"] or not _transformers_installed,
reason="do not run on mac os or windows or transformers not installed",
)
def test_cv():
import requests

View File

@@ -5,8 +5,18 @@ import sys
import pytest
from utils import get_automl_settings, get_toy_data_multiplechoiceclassification
try:
import transformers
@pytest.mark.skipif(sys.platform in ["darwin", "win32"], reason="do not run on mac os or windows")
_transformers_installed = True
except ImportError:
_transformers_installed = False
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"] or not _transformers_installed,
reason="do not run on mac os or windows or transformers not installed",
)
def test_mcc():
import requests

View File

@@ -7,8 +7,24 @@ from utils import get_automl_settings, get_toy_data_seqclassification
from flaml.default import portfolio
if sys.platform.startswith("darwin") and sys.version_info[0] == 3 and sys.version_info[1] == 11:
pytest.skip("skipping Python 3.11 on MacOS", allow_module_level=True)
try:
import transformers
_transformers_installed = True
except ImportError:
_transformers_installed = False
if (
sys.platform.startswith("darwin")
and sys.version_info >= (3, 11)
or not _transformers_installed
or sys.platform == "win32"
):
pytest.skip("skipping Python 3.11 on MacOS or without transformers or on Windows", allow_module_level=True)
pytestmark = (
pytest.mark.spark
) # set to spark as parallel testing raised ValueError: Feature NonExisting not implemented.
def pop_args(fit_kwargs):
@@ -24,23 +40,34 @@ def test_build_portfolio(path="./test/nlp/default", strategy="greedy"):
portfolio.main()
@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
def test_starting_point_not_in_search_space():
from flaml import AutoML
"""Regression test for invalid starting points and custom_hp.
This test must not require network access to Hugging Face.
"""
"""
test starting_points located outside of the search space, and custom_hp is not set
"""
from flaml.automl.state import SearchState
from flaml.automl.task.factory import task_factory
this_estimator_name = "transformer"
X_train, y_train, X_val, y_val, _ = get_toy_data_seqclassification()
X_train, y_train, _, _, _ = get_toy_data_seqclassification()
task = task_factory("seq-classification", X_train, y_train)
estimator_class = task.estimator_class_from_str(this_estimator_name)
estimator_class.init()
automl = AutoML()
automl_settings = get_automl_settings(estimator_name=this_estimator_name)
automl_settings["starting_points"] = {this_estimator_name: [{"learning_rate": 2e-3}]}
automl.fit(X_train, y_train, **automl_settings)
assert automl._search_states[this_estimator_name].init_config[0]["learning_rate"] != 2e-3
# SearchState is where invalid starting points are filtered out when max_iter > 1.
search_state = SearchState(
learner_class=estimator_class,
data=X_train,
task=task,
starting_point={"learning_rate": 2e-3},
max_iter=3,
budget=10,
)
assert search_state.init_config and search_state.init_config[0].get("learning_rate") != 2e-3
"""
test starting_points located outside of the search space, and custom_hp is set
@@ -48,39 +75,60 @@ def test_starting_point_not_in_search_space():
from flaml import tune
X_train, y_train, X_val, y_val, _ = get_toy_data_seqclassification()
X_train, y_train, _, _, _ = get_toy_data_seqclassification()
this_estimator_name = "transformer_ms"
automl = AutoML()
automl_settings = get_automl_settings(estimator_name=this_estimator_name)
task = task_factory("seq-classification", X_train, y_train)
estimator_class = task.estimator_class_from_str(this_estimator_name)
estimator_class.init()
automl_settings["custom_hp"] = {
this_estimator_name: {
"model_path": {
"domain": "albert-base-v2",
},
"learning_rate": {
"domain": tune.choice([1e-4, 1e-5]),
},
"per_device_train_batch_size": {
"domain": 2,
},
}
custom_hp = {
"model_path": {
"domain": "albert-base-v2",
},
"learning_rate": {
"domain": tune.choice([1e-4, 1e-5]),
},
"per_device_train_batch_size": {
"domain": 2,
},
}
automl_settings["starting_points"] = "data:test/nlp/default/"
automl.fit(X_train, y_train, **automl_settings)
assert len(automl._search_states[this_estimator_name].init_config[0]) == len(
automl._search_states[this_estimator_name]._search_space_domain
) - len(automl_settings["custom_hp"][this_estimator_name]), (
# Simulate a suggested starting point (e.g. from portfolio) which becomes invalid
# after custom_hp constrains the space.
invalid_starting_points = [
{
"learning_rate": 1e-5,
"num_train_epochs": 1.0,
"per_device_train_batch_size": 8,
"seed": 43,
"global_max_steps": 100,
"model_path": "google/electra-base-discriminator",
}
]
search_state = SearchState(
learner_class=estimator_class,
data=X_train,
task=task,
starting_point=invalid_starting_points,
custom_hp=custom_hp,
max_iter=3,
budget=10,
)
assert search_state.init_config, "Expected a non-empty init_config list"
init_config0 = search_state.init_config[0]
assert init_config0 is not None
assert len(init_config0) == len(search_state._search_space_domain) - len(custom_hp), (
"The search space is updated with the custom_hp on {} hyperparameters of "
"the specified estimator without an initial value. Thus a valid init config "
"should only contain the cardinality of the search space minus {}".format(
len(automl_settings["custom_hp"][this_estimator_name]),
len(automl_settings["custom_hp"][this_estimator_name]),
len(custom_hp),
len(custom_hp),
)
)
assert automl._search_states[this_estimator_name].search_space["model_path"] == "albert-base-v2"
assert search_state.search_space["model_path"] == "albert-base-v2"
if os.path.exists("test/data/output/"):
try:
@@ -89,7 +137,6 @@ def test_starting_point_not_in_search_space():
print("PermissionError when deleting test/data/output/")
@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
def test_points_to_evaluate():
from flaml import AutoML
@@ -102,7 +149,13 @@ def test_points_to_evaluate():
automl_settings["custom_hp"] = {"transformer_ms": {"model_path": {"domain": "google/electra-small-discriminator"}}}
automl.fit(X_train, y_train, **automl_settings)
try:
automl.fit(X_train, y_train, **automl_settings)
except OSError as e:
message = str(e)
if "Too Many Requests" in message or "rate limit" in message.lower():
pytest.skip(f"Skipping HF model load/training: {message}")
raise
if os.path.exists("test/data/output/"):
try:
@@ -112,7 +165,6 @@ def test_points_to_evaluate():
# TODO: implement _test_zero_shot_model
@pytest.mark.skipif(sys.platform == "win32", reason="do not run on windows")
def test_zero_shot_nomodel():
from flaml.default import preprocess_and_suggest_hyperparams
@@ -137,7 +189,14 @@ def test_zero_shot_nomodel():
fit_kwargs = automl_settings.pop("fit_kwargs_by_estimator", {}).get(estimator_name)
fit_kwargs.update(automl_settings)
pop_args(fit_kwargs)
model.fit(X_train, y_train, **fit_kwargs)
try:
model.fit(X_train, y_train, **fit_kwargs)
except OSError as e:
message = str(e)
if "Too Many Requests" in message or "rate limit" in message.lower():
pytest.skip(f"Skipping HF model load/training: {message}")
raise
if os.path.exists("test/data/output/"):
try:

View File

@@ -7,7 +7,7 @@ from sklearn.model_selection import train_test_split
from flaml import tune
from flaml.automl.model import LGBMEstimator
data = fetch_california_housing(return_X_y=False, as_frame=True)
data = fetch_california_housing(return_X_y=False, as_frame=True, data_home="test")
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
X_train_ref = ray.put(X_train)

View File

@@ -11,7 +11,7 @@ automl_settings = {
"task": "regression",
"log_file_name": "test/california.log",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
print(automl.model)

View File

@@ -1,13 +1,17 @@
import atexit
import os
import sys
import warnings
import mlflow
import numpy as np
import pytest
import sklearn.datasets as skds
from packaging.version import Version
from flaml import AutoML
from flaml.automl.data import auto_convert_dtypes_pandas, auto_convert_dtypes_spark, get_random_dataframe
from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
from flaml.tune.spark.utils import check_spark
warnings.simplefilter(action="ignore")
@@ -27,7 +31,7 @@ else:
.config(
"spark.jars.packages",
(
"com.microsoft.azure:synapseml_2.12:1.0.4,"
"com.microsoft.azure:synapseml_2.12:1.1.0,"
"org.apache.hadoop:hadoop-azure:3.3.5,"
"com.microsoft.azure:azure-storage:8.6.6,"
f"org.mlflow:mlflow-spark_2.12:{mlflow.__version__}"
@@ -53,15 +57,25 @@ else:
except ImportError:
skip_spark = True
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
if sys.version_info >= (3, 11):
skip_py311 = True
else:
skip_py311 = False
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
def _test_spark_synapseml_lightgbm(spark=None, task="classification"):
# TODO: remove the estimator assignment once SynapseML supports spark 4+.
from flaml.automl.spark.utils import _spark_major_minor_version
if _spark_major_minor_version[0] >= 4:
# skip synapseml lightgbm test for spark 4+
return
if task == "classification":
metric = "accuracy"
X_train, y_train = skds.load_iris(return_X_y=True, as_frame=True)
@@ -151,27 +165,32 @@ def test_spark_synapseml_rank():
_test_spark_synapseml_lightgbm(spark, "rank")
def test_spark_input_df():
df = (
spark.read.format("csv")
.option("header", True)
.option("inferSchema", True)
.load("wasbs://publicwasb@mmlspark.blob.core.windows.net/company_bankruptcy_prediction_data.csv")
)
def test_spark_input_df_and_pickle():
import pandas as pd
file_url = "https://mmlspark.blob.core.windows.net/publicwasb/company_bankruptcy_prediction_data.csv"
df = pd.read_csv(file_url)
df = spark.createDataFrame(df)
train, test = df.randomSplit([0.8, 0.2], seed=1)
feature_cols = df.columns[1:]
featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
train_data = featurizer.transform(train)["Bankrupt?", "features"]
test_data = featurizer.transform(test)["Bankrupt?", "features"]
automl = AutoML()
# TODO: remove the estimator assignment once SynapseML supports spark 4+.
from flaml.automl.spark.utils import _spark_major_minor_version
estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
settings = {
"time_budget": 30, # total running time in seconds
"metric": "roc_auc",
# "estimator_list": ["lgbm_spark"], # list of ML learners; we tune lightgbm in this example
"task": "classification", # task type
"log_file_name": "flaml_experiment.log", # flaml log file
"seed": 7654321, # random seed
"eval_method": "holdout",
"estimator_list": estimator_list, # TODO: remove once SynapseML supports spark 4+
}
df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
@@ -182,6 +201,22 @@ def test_spark_input_df():
**settings,
)
# test pickle and load_pickle, should work for prediction
automl.pickle("automl_spark.pkl")
automl_loaded = AutoML().load_pickle("automl_spark.pkl")
assert automl_loaded.best_estimator == automl.best_estimator
assert automl_loaded.best_loss == automl.best_loss
automl_loaded.predict(df)
automl_loaded.model.estimator.transform(test_data)
import shutil
shutil.rmtree("automl_spark.pkl", ignore_errors=True)
shutil.rmtree("automl_spark.pkl.flaml_artifacts", ignore_errors=True)
if estimator_list == ["rf_spark"]:
return
try:
model = automl.model.estimator
predictions = model.transform(test_data)
@@ -296,11 +331,88 @@ def _test_spark_large_df():
print("time cost in minutes: ", (end_time - start_time) / 60)
def test_get_random_dataframe():
# Test with an explicit n_rows and the default column layout
df = get_random_dataframe(n_rows=50, ratio_none=0.2, seed=123)
assert df.shape == (50, 14)  # 50 rows requested; 14 columns by default
# Test column types
assert "timestamp" in df.columns and np.issubdtype(df["timestamp"].dtype, np.datetime64)
assert "id" in df.columns and np.issubdtype(df["id"].dtype, np.integer)
assert "score" in df.columns and np.issubdtype(df["score"].dtype, np.floating)
assert "category" in df.columns and df["category"].dtype.name == "category"
def test_auto_convert_dtypes_pandas():
# Create a test DataFrame with various types
import pandas as pd
test_df = pd.DataFrame(
{
"int_col": ["1", "2", "3", "4", "5", "6", "6"],
"float_col": ["1.1", "2.2", "3.3", "NULL", "5.5", "6.6", "6.6"],
"date_col": ["2021-01-01", "2021-02-01", "NA", "2021-04-01", "2021-05-01", "2021-06-01", "2021-06-01"],
"cat_col": ["A", "B", "A", "A", "B", "A", "B"],
"string_col": ["text1", "text2", "text3", "text4", "text5", "text6", "text7"],
}
)
# Convert dtypes
converted_df, schema = auto_convert_dtypes_pandas(test_df)
# Check conversions
assert schema["int_col"] == "int"
assert schema["float_col"] == "double"
assert schema["date_col"] == "timestamp"
assert schema["cat_col"] == "category"
assert schema["string_col"] == "string"
def test_auto_convert_dtypes_spark():
"""Test auto_convert_dtypes_spark function with various data types."""
import pandas as pd
# Create a test DataFrame with various types
test_pdf = pd.DataFrame(
{
"int_col": ["1", "2", "3", "4", "NA"],
"float_col": ["1.1", "2.2", "3.3", "NULL", "5.5"],
"date_col": ["2021-01-01", "2021-02-01", "NA", "2021-04-01", "2021-05-01"],
"cat_col": ["A", "B", "A", "C", "B"],
"string_col": ["text1", "text2", "text3", "text4", "text5"],
}
)
# Convert pandas DataFrame to Spark DataFrame
test_df = spark.createDataFrame(test_pdf)
# Convert dtypes
converted_df, schema = auto_convert_dtypes_spark(test_df)
# Check conversions
assert schema["int_col"] == "int"
assert schema["float_col"] == "double"
assert schema["date_col"] == "timestamp"
assert schema["cat_col"] == "string" # Conceptual category in schema
assert schema["string_col"] == "string"
# Verify the actual data types from the Spark DataFrame
spark_dtypes = dict(converted_df.dtypes)
assert spark_dtypes["int_col"] == "int"
assert spark_dtypes["float_col"] == "double"
assert spark_dtypes["date_col"] == "timestamp"
assert spark_dtypes["cat_col"] == "string" # In Spark, categories are still strings
assert spark_dtypes["string_col"] == "string"
if __name__ == "__main__":
test_spark_synapseml_classification()
test_spark_synapseml_regression()
test_spark_synapseml_rank()
test_spark_input_df()
# test_spark_synapseml_classification()
# test_spark_synapseml_regression()
# test_spark_synapseml_rank()
test_spark_input_df_and_pickle()
# test_get_random_dataframe()
# test_auto_convert_dtypes_pandas()
# test_auto_convert_dtypes_spark()
# import cProfile
# import pstats

View File

@@ -25,13 +25,13 @@ os.environ["FLAML_MAX_CONCURRENT"] = "2"
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
def test_parallel_xgboost(hpo_method=None, data_size=1000):
def test_parallel_xgboost_and_pickle(hpo_method=None, data_size=1000):
automl_experiment = AutoML()
automl_settings = {
"time_budget": 10,
"time_budget": 30,
"metric": "ap",
"task": "classification",
"log_file_name": "test/sparse_classification.log",
@@ -53,15 +53,27 @@ def test_parallel_xgboost(hpo_method=None, data_size=1000):
print(automl_experiment.best_iteration)
print(automl_experiment.best_estimator)
# test pickle and load_pickle, should work for prediction
automl_experiment.pickle("automl_xgboost_spark.pkl")
automl_loaded = AutoML().load_pickle("automl_xgboost_spark.pkl")
assert automl_loaded.best_estimator == automl_experiment.best_estimator
assert automl_loaded.best_loss == automl_experiment.best_loss
automl_loaded.predict(X_train)
import shutil
shutil.rmtree("automl_xgboost_spark.pkl", ignore_errors=True)
shutil.rmtree("automl_xgboost_spark.pkl.flaml_artifacts", ignore_errors=True)
def test_parallel_xgboost_others():
# use random search as the hpo_method
test_parallel_xgboost(hpo_method="random")
test_parallel_xgboost_and_pickle(hpo_method="random")
@pytest.mark.skip(reason="currently not supporting too large data, will support spark dataframe in the future")
def test_large_dataset():
test_parallel_xgboost(data_size=90000000)
test_parallel_xgboost_and_pickle(data_size=90000000)
@pytest.mark.skipif(
@@ -95,10 +107,10 @@ def test_custom_learner(data_size=1000):
if __name__ == "__main__":
test_parallel_xgboost()
test_parallel_xgboost_others()
# test_large_dataset()
if skip_my_learner:
print("please run pytest in the root directory of FLAML, i.e., the directory that contains the setup.py file")
else:
test_custom_learner()
test_parallel_xgboost_and_pickle()
# test_parallel_xgboost_others()
# # test_large_dataset()
# if skip_my_learner:
# print("please run pytest in the root directory of FLAML, i.e., the directory that contains the setup.py file")
# else:
# test_custom_learner()

View File

@@ -1,6 +1,7 @@
import os
import unittest
import pytest
from sklearn.datasets import load_wine
from flaml import AutoML
@@ -24,6 +25,8 @@ if os.path.exists(os.path.join(os.getcwd(), "test", "spark", "custom_mylearner.p
else:
skip_my_learner = True
pytestmark = pytest.mark.spark
class TestEnsemble(unittest.TestCase):
def setUp(self) -> None:

View File

@@ -9,7 +9,7 @@ from flaml.tune.spark.utils import check_spark
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
os.environ["FLAML_MAX_CONCURRENT"] = "2"
@@ -22,7 +22,7 @@ def base_automl(n_concurrent_trials=1, use_ray=False, use_spark=False, verbose=0
except (ServerError, Exception):
from sklearn.datasets import fetch_california_housing
X_train, y_train = fetch_california_housing(return_X_y=True)
X_train, y_train = fetch_california_housing(return_X_y=True, data_home="test")
automl = AutoML()
settings = {
"time_budget": 3, # total running time in seconds

View File

@@ -1,3 +1,4 @@
import atexit
import importlib
import os
import sys
@@ -13,6 +14,7 @@ from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
import flaml
from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
from flaml.automl.spark.utils import to_pandas_on_spark
try:
@@ -21,6 +23,7 @@ try:
from pyspark.ml.feature import VectorAssembler
except ImportError:
pass
pytestmark = pytest.mark.spark
warnings.filterwarnings("ignore")
skip_spark = importlib.util.find_spec("pyspark") is None
@@ -119,6 +122,29 @@ def _check_mlflow_logging(possible_num_runs, metric, is_parent_run, experiment_i
# mlflow.delete_experiment(experiment_id)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_automl_nonsparkdata_noautolog_noparentrun():
experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=False)
_check_mlflow_logging(0, "r2", False, experiment_id, is_automl=True) # no logging
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_automl_sparkdata_noautolog_noparentrun():
experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=False)
_check_mlflow_logging(0, "mse", False, experiment_id, is_automl=True) # no logging
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_tune_noautolog_noparentrun_parallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=True)
_check_mlflow_logging(0, "r2", False, experiment_id)
def test_tune_noautolog_noparentrun_nonparallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=False)
_check_mlflow_logging(3, "r2", False, experiment_id, skip_tags=True)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_tune_autolog_parentrun_parallel():
experiment_id = _test_tune(is_autolog=True, is_parent_run=True, is_parallel=True)
@@ -130,6 +156,16 @@ def test_tune_autolog_parentrun_nonparallel():
_check_mlflow_logging(3, "r2", True, experiment_id)
def test_tune_autolog_noparentrun_nonparallel():
experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=False)
_check_mlflow_logging(3, "r2", False, experiment_id)
def test_tune_noautolog_parentrun_nonparallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=False)
_check_mlflow_logging(3, "r2", True, experiment_id)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_tune_autolog_noparentrun_parallel():
experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=True)
@@ -142,28 +178,12 @@ def test_tune_noautolog_parentrun_parallel():
_check_mlflow_logging([4, 3], "r2", True, experiment_id)
def test_tune_autolog_noparentrun_nonparallel():
experiment_id = _test_tune(is_autolog=True, is_parent_run=False, is_parallel=False)
_check_mlflow_logging(3, "r2", False, experiment_id)
def test_tune_noautolog_parentrun_nonparallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=True, is_parallel=False)
_check_mlflow_logging(3, "r2", True, experiment_id)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_tune_noautolog_noparentrun_parallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=True)
_check_mlflow_logging(0, "r2", False, experiment_id)
def test_tune_noautolog_noparentrun_nonparallel():
experiment_id = _test_tune(is_autolog=False, is_parent_run=False, is_parallel=False)
_check_mlflow_logging(3, "r2", False, experiment_id, skip_tags=True)
def _test_automl_sparkdata(is_autolog, is_parent_run):
# TODO: remove the estimator assignment once SynapseML supports spark 4+.
from flaml.automl.spark.utils import _spark_major_minor_version
estimator_list = ["rf_spark"] if _spark_major_minor_version[0] >= 4 else None
mlflow.end_run()
mlflow_exp_name = f"test_mlflow_integration_{int(time.time())}"
mlflow_experiment = mlflow.set_experiment(mlflow_exp_name)
@@ -174,6 +194,9 @@ def _test_automl_sparkdata(is_autolog, is_parent_run):
if is_parent_run:
mlflow.start_run(run_name=f"automl_sparkdata_autolog_{is_autolog}")
spark = pyspark.sql.SparkSession.builder.getOrCreate()
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
pd_df = load_diabetes(as_frame=True).frame
df = spark.createDataFrame(pd_df)
df = df.repartition(4).cache()
@@ -192,6 +215,7 @@ def _test_automl_sparkdata(is_autolog, is_parent_run):
"log_type": "all",
"n_splits": 2,
"model_history": True,
"estimator_list": estimator_list,
}
df = to_pandas_on_spark(to_pandas_on_spark(train_data).to_spark(index_col="index"))
automl.fit(
@@ -251,12 +275,6 @@ def test_automl_sparkdata_noautolog_parentrun():
_check_mlflow_logging(3, "mse", True, experiment_id, is_automl=True)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_automl_sparkdata_noautolog_noparentrun():
experiment_id = _test_automl_sparkdata(is_autolog=False, is_parent_run=False)
_check_mlflow_logging(0, "mse", False, experiment_id, is_automl=True) # no logging
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_automl_nonsparkdata_autolog_parentrun():
experiment_id = _test_automl_nonsparkdata(is_autolog=True, is_parent_run=True)
@@ -275,12 +293,6 @@ def test_automl_nonsparkdata_noautolog_parentrun():
_check_mlflow_logging([4, 3], "r2", True, experiment_id, is_automl=True)
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_automl_nonsparkdata_noautolog_noparentrun():
experiment_id = _test_automl_nonsparkdata(is_autolog=False, is_parent_run=False)
_check_mlflow_logging(0, "r2", False, experiment_id, is_automl=True) # no logging
@pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
def test_exit_pyspark_autolog():
import pyspark
@@ -318,6 +330,9 @@ def _init_spark_for_main():
"https://mmlspark.blob.core.windows.net/publicwasb/log_model_allowlist.txt",
)
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
if __name__ == "__main__":
_init_spark_for_main()

View File

@@ -2,6 +2,7 @@ import os
import unittest
import numpy as np
import pytest
import scipy.sparse
from sklearn.datasets import load_iris, load_wine
@@ -12,6 +13,7 @@ from flaml.tune.spark.utils import check_spark
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.spark
os.environ["FLAML_MAX_CONCURRENT"] = "2"
@@ -260,7 +262,11 @@ class TestMultiClass(unittest.TestCase):
"n_concurrent_trials": 2,
"use_spark": True,
}
X_train = scipy.sparse.random(1554, 21, dtype=int)
# NOTE: Avoid `dtype=int` here. On some NumPy/SciPy combinations (notably
# Windows + Python 3.13), `scipy.sparse.random(..., dtype=int)` may trigger
# integer sampling paths which raise "low is out of bounds for int32".
# A float sparse matrix is sufficient to validate sparse-input support.
X_train = scipy.sparse.random(1554, 21, dtype=np.float32)
y_train = np.random.randint(3, size=1554)
automl_experiment.fit(X_train=X_train, y_train=y_train, **automl_settings)
print(automl_experiment.classes_)

View File

@@ -9,7 +9,7 @@ from flaml.tune.spark.utils import check_spark
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
here = os.path.abspath(os.path.dirname(__file__))
os.environ["FLAML_MAX_CONCURRENT"] = "2"

View File

@@ -25,7 +25,7 @@ try:
except ImportError:
skip_spark = True
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
def test_overtime():

View File

@@ -2,8 +2,23 @@ import os
import sys
import pytest
from minio.error import ServerError
from openml.exceptions import OpenMLServerException
try:
from minio.error import ServerError
except ImportError:
class ServerError(Exception):
pass
try:
from openml.exceptions import OpenMLServerException
except ImportError:
class OpenMLServerException(Exception):
pass
from requests.exceptions import ChunkedEncodingError, SSLError
from flaml.tune.spark.utils import check_spark
@@ -11,19 +26,19 @@ from flaml.tune.spark.utils import check_spark
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
os.environ["FLAML_MAX_CONCURRENT"] = "2"
def run_automl(budget=3, dataset_format="dataframe", hpo_method=None):
def run_automl(budget=30, dataset_format="dataframe", hpo_method=None):
import urllib3
from flaml.automl.data import load_openml_dataset
performance_check_budget = 3600
if sys.platform == "darwin" or "nt" in os.name or "3.10" not in sys.version:
budget = 3  # revise the budget if the platform is not linux + python 3.10
budget = 30  # revise the budget if the platform is not linux + python 3.10
if budget >= performance_check_budget:
max_iter = 60
performance_check_budget = None
@@ -76,6 +91,11 @@ def run_automl(budget=3, dataset_format="dataframe", hpo_method=None):
print("Best ML leaner:", automl.best_estimator)
print("Best hyperparmeter config:", automl.best_config)
print(f"Best accuracy on validation data: {1 - automl.best_loss:.4g}")
if performance_check_budget is not None and automl.best_estimator is None:
# skip the performance check if no model is trained
# this happens sometimes in github actions ubuntu python 3.12 environment
print("Warning: no model is trained, skip performance check")
return
print(f"Training duration of best run: {automl.best_config_train_time:.4g} s")
print(automl.model.estimator)
print(automl.best_config_per_estimator)

View File

@@ -14,7 +14,7 @@ from flaml.tune.spark.utils import check_spark
spark_available, _ = check_spark()
skip_spark = not spark_available
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
os.environ["FLAML_MAX_CONCURRENT"] = "2"
X, y = load_breast_cancer(return_X_y=True)

View File

@@ -1,3 +1,4 @@
import atexit
import os
from functools import partial
from timeit import timeit
@@ -14,6 +15,7 @@ try:
from pyspark.sql import SparkSession
from flaml.automl.ml import sklearn_metric_loss_score
from flaml.automl.spark import disable_spark_ansi_mode, restore_spark_ansi_mode
from flaml.automl.spark.metrics import spark_metric_loss_score
from flaml.automl.spark.utils import (
iloc_pandas_on_spark,
@@ -24,6 +26,7 @@ try:
unique_value_first_index,
)
from flaml.tune.spark.utils import (
_spark_major_minor_version,
check_spark,
get_broadcast_data,
get_n_cpus,
@@ -35,8 +38,39 @@ try:
except ImportError:
print("Spark is not installed. Skip all spark tests.")
skip_spark = True
_spark_major_minor_version = (0, 0)
pytestmark = pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests.")
pytestmark = [pytest.mark.skipif(skip_spark, reason="Spark is not installed. Skip all spark tests."), pytest.mark.spark]
@pytest.mark.skipif(_spark_major_minor_version[0] < 4, reason="Requires Spark 4.0+")
def test_to_pandas_on_spark_temp_override():
import pyspark.pandas as ps
from pyspark.sql import Row
from flaml.automl.spark.utils import to_pandas_on_spark
spark_session = SparkSession.builder.getOrCreate()
spark, ansi_conf, adjusted = disable_spark_ansi_mode()
atexit.register(restore_spark_ansi_mode, spark, ansi_conf, adjusted)
# Ensure we can toggle options
orig = ps.get_option("compute.fail_on_ansi_mode")
try:
spark_session.conf.set("spark.sql.ansi.enabled", "true")
ps.set_option("compute.fail_on_ansi_mode", True)
# create tiny spark df
sdf = spark_session.createDataFrame([Row(a=1, b=2)])
# Should not raise as our function temporarily disables fail_on_ansi_mode
pds = to_pandas_on_spark(sdf)
assert "a" in pds.columns
finally:
# restore test environment
ps.set_option("compute.fail_on_ansi_mode", orig)
spark_session.conf.set("spark.sql.ansi.enabled", "false")
def test_with_parameters_spark():

View File

@@ -5,17 +5,38 @@ import sys
import unittest
import numpy as np
import openml
try:
import openml
except ImportError:
openml = None
import pandas as pd
import pytest
import scipy.sparse
from minio.error import ServerError
try:
from minio.error import ServerError
except ImportError:
class ServerError(Exception):
pass
from requests.exceptions import SSLError
from sklearn.metrics import mean_absolute_error, mean_squared_error
from flaml import AutoVW
from flaml.tune import loguniform, polynomial_expansion_set
try:
from vowpalwabbit import pyvw
except ImportError:
skip_vw_test = True
else:
skip_vw_test = False
pytest.skip("skipping if no openml", allow_module_level=True) if openml is None else None
VW_DS_DIR = "test/data/"
NS_LIST = list(string.ascii_lowercase) + list(string.ascii_uppercase)
logger = logging.getLogger(__name__)
@@ -351,14 +372,9 @@ def get_vw_tuning_problem(tuning_hp="NamesapceInteraction"):
return vw_oml_problem_args, vw_online_aml_problem
@pytest.mark.skipif(
"3.10" in sys.version or "3.11" in sys.version,
reason="do not run on py >= 3.10",
)
@pytest.mark.skipif(skip_vw_test, reason="vowpalwabbit not installed")
class TestAutoVW(unittest.TestCase):
def test_vw_oml_problem_and_vanilla_vw(self):
from vowpalwabbit import pyvw
try:
vw_oml_problem_args, vw_online_aml_problem = get_vw_tuning_problem()
except (SSLError, ServerError, Exception) as e:

View File

@@ -4,10 +4,17 @@ from collections import defaultdict
import numpy as np
import pytest
import thop
import torch
import torch.nn as nn
import torch.nn.functional as F
try:
import thop
import torch
import torch.nn as nn
import torch.nn.functional as F
except ImportError:
thop = None
torch = None
nn = None
F = None
try:
import torchvision
@@ -16,6 +23,11 @@ except ImportError:
from flaml import tune
if thop is None or torch is None or nn is None or F is None or torchvision is None:
pytest.skip(
"skipping test_lexiflow.py because torch, torchvision or thop is not installed.", allow_module_level=True
)
DEVICE = torch.device("cpu")
BATCHSIZE = 128
N_TRAIN_EXAMPLES = BATCHSIZE * 30

View File

@@ -0,0 +1,99 @@
"""Tests for SearchThread nested dictionary update fix."""
import pytest
from flaml.tune.searcher.search_thread import _recursive_dict_update
def test_recursive_dict_update_simple():
"""Test simple non-nested dictionary update."""
target = {"a": 1, "b": 2}
source = {"c": 3}
_recursive_dict_update(target, source)
assert target == {"a": 1, "b": 2, "c": 3}
def test_recursive_dict_update_override():
"""Test that source values override target values for non-dict values."""
target = {"a": 1, "b": 2}
source = {"b": 3}
_recursive_dict_update(target, source)
assert target == {"a": 1, "b": 3}
def test_recursive_dict_update_nested():
"""Test nested dictionary merge (the main use case for XGBoost params)."""
target = {
"num_boost_round": 10,
"params": {
"max_depth": 12,
"eta": 0.020168455186106736,
"min_child_weight": 1.4504723523894132,
"scale_pos_weight": 3.794258636185337,
"gamma": 0.4985070123025904,
},
}
source = {
"params": {
"verbosity": 3,
"booster": "gbtree",
"eval_metric": "auc",
"tree_method": "hist",
"objective": "binary:logistic",
}
}
_recursive_dict_update(target, source)
# Check that sampled params are preserved
assert target["params"]["max_depth"] == 12
assert target["params"]["eta"] == 0.020168455186106736
assert target["params"]["min_child_weight"] == 1.4504723523894132
assert target["params"]["scale_pos_weight"] == 3.794258636185337
assert target["params"]["gamma"] == 0.4985070123025904
# Check that const params are added
assert target["params"]["verbosity"] == 3
assert target["params"]["booster"] == "gbtree"
assert target["params"]["eval_metric"] == "auc"
assert target["params"]["tree_method"] == "hist"
assert target["params"]["objective"] == "binary:logistic"
# Check top-level param is preserved
assert target["num_boost_round"] == 10
def test_recursive_dict_update_deeply_nested():
"""Test deeply nested dictionary merge."""
target = {"a": {"b": {"c": 1, "d": 2}}}
source = {"a": {"b": {"e": 3}}}
_recursive_dict_update(target, source)
assert target == {"a": {"b": {"c": 1, "d": 2, "e": 3}}}
def test_recursive_dict_update_mixed_types():
"""Test that non-dict values in source replace dict values in target."""
target = {"a": {"b": 1}}
source = {"a": 2}
_recursive_dict_update(target, source)
assert target == {"a": 2}
def test_recursive_dict_update_empty_dicts():
"""Test with empty dictionaries."""
target = {}
source = {"a": 1}
_recursive_dict_update(target, source)
assert target == {"a": 1}
target = {"a": 1}
source = {}
_recursive_dict_update(target, source)
assert target == {"a": 1}
def test_recursive_dict_update_none_values():
"""Test that None values are properly handled."""
target = {"a": 1, "b": None}
source = {"b": 2, "c": None}
_recursive_dict_update(target, source)
assert target == {"a": 1, "b": 2, "c": None}

View File

@@ -324,3 +324,26 @@ def test_no_optuna():
import flaml.tune.searcher.suggestion
subprocess.check_call([sys.executable, "-m", "pip", "install", "optuna==2.8.0"])
def test_unresolved_search_space(caplog):
import logging
from flaml import tune
from flaml.tune.searcher.blendsearch import BlendSearch
if caplog is not None:
caplog.set_level(logging.INFO)
BlendSearch(metric="loss", mode="min", space={"lr": tune.uniform(0.001, 0.1), "depth": tune.randint(1, 10)})
try:
text = caplog.text
except AttributeError:
text = ""
assert (
"unresolved search space" not in text
), "BlendSearch should not produce a warning about an unresolved search space"
if __name__ == "__main__":
test_unresolved_search_space(None)

View File

@@ -53,6 +53,11 @@ def _easy_objective(config):
def test_nested_run():
"""
nested tuning example: Tune -> AutoML -> MLflow autolog
mlflow logging is complicated in nested tuning. It's better to turn off mlflow autologging to avoid
potential issues in FLAML's mlflow_integration.adopt_children() function.
"""
from flaml import AutoML, tune
data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)

View File

@@ -6,12 +6,12 @@ from sklearn.model_selection import train_test_split
from flaml import tune
from flaml.automl.model import LGBMEstimator
data = fetch_california_housing(return_X_y=False, as_frame=True)
data = fetch_california_housing(return_X_y=False, as_frame=True, data_home="test")
df, X, y = data.frame, data.data, data.target
df_train, _, X_train, X_test, _, y_test = train_test_split(df, X, y, test_size=0.33, random_state=42)
csv_file_name = "test/housing.csv"
df_train.to_csv(csv_file_name, index=False)
# X, y = fetch_california_housing(return_X_y=True, as_frame=True)
# X, y = fetch_california_housing(return_X_y=True, as_frame=True, data_home="test")
# X_train, X_test, y_train, y_test = train_test_split(
# X, y, test_size=0.33, random_state=42
# )

View File

@@ -4,7 +4,7 @@
**Date and Time**: 09.09.2024, 15:30-17:00
Location: Sorbonne University, 4 place Jussieu, 75005 Paris
Location: Sorbonne University, 4 place Jussieu, 75005 Paris
Duration: 1.5 hours

View File

@@ -4,7 +4,7 @@
**Date and Time**: 04-26, 09:00-10:30 PT.
Location: Microsoft Conference Center, Seattle, WA.
Location: Microsoft Conference Center, Seattle, WA.
Duration: 1.5 hours

View File

@@ -0,0 +1,159 @@
# Best Practices
This page collects practical guidance for using FLAML effectively across common tasks.
## General tips
- Start simple: set `task`, `time_budget`, and keep `metric="auto"` unless you have a strong reason to override.
- Prefer correct splits: ensure your evaluation strategy matches your data (time series vs i.i.d., grouped data, etc.).
- Keep estimator lists explicit when debugging: start with a small `estimator_list` and expand.
- Use built-in discovery helpers to avoid stale hardcoded lists:
```python
from flaml import AutoML
from flaml.automl.task.factory import task_factory
automl = AutoML()
print("Built-in sklearn metrics:", sorted(automl.supported_metrics[0]))
print(
"classification estimators:",
sorted(task_factory("classification").estimators.keys()),
)
```
## Classification
- **Metric**: for binary classification, `metric="roc_auc"` is common; for multiclass, `metric="log_loss"` is often robust.
- **Imbalanced data**:
- pass `sample_weight` to `AutoML.fit()`;
- consider setting class weights via `custom_hp` / `fit_kwargs_by_estimator` for specific estimators (see [FAQ](FAQ)); a minimal sample-weight sketch appears at the end of this section.
- **Probability vs label metrics**: use `roc_auc` / `log_loss` when you care about calibrated probabilities.
- **Label overlap control** (holdout evaluation only):
- By default, FLAML uses a fast strategy (`allow_label_overlap=True`) that ensures all labels are present in both training and validation sets by adding missing labels' first instances to both sets. This is efficient but may create minor overlap.
- For strict no-overlap validation, use `allow_label_overlap=False`. This slower but more precise strategy intelligently re-splits multi-instance classes to avoid overlap while maintaining label completeness.
```python
from flaml import AutoML
# Fast version (default): allows overlap for efficiency
automl_fast = AutoML()
automl_fast.fit(
X_train,
y_train,
task="classification",
eval_method="holdout",
allow_label_overlap=True,
) # default
# Precise version: avoids overlap when possible
automl_precise = AutoML()
automl_precise.fit(
X_train,
y_train,
task="classification",
eval_method="holdout",
allow_label_overlap=False,
) # slower but more precise
```
Note: This only affects holdout evaluation. CV and custom validation sets are unaffected.
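For the imbalanced-data tips above, here is a minimal sketch that passes balanced sample weights to `AutoML.fit()`. The synthetic dataset, the 95/5 class split, and the 30-second budget are illustrative assumptions, not recommendations.
```python
from flaml import AutoML
from sklearn.datasets import make_classification
from sklearn.utils.class_weight import compute_sample_weight

# Hypothetical imbalanced dataset (roughly 95% / 5% classes).
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.95, 0.05], random_state=0
)
sample_weight = compute_sample_weight(class_weight="balanced", y=y)

automl = AutoML()
automl.fit(
    X,
    y,
    task="classification",
    metric="roc_auc",
    time_budget=30,
    sample_weight=sample_weight,  # forwarded to each estimator's fit()
)
```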
## Regression
- **Default metric**: `metric="r2"` (minimizes `1 - r2`).
- If your target scale matters (e.g., dollar error), consider `mae`/`rmse`.
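A quick sketch of the regression defaults with an explicit `mae` metric; the dataset and budget below are arbitrary choices for illustration:
```python
from flaml import AutoML
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True)

automl = AutoML()
automl.fit(X, y, task="regression", metric="mae", time_budget=30)
print(automl.best_estimator, automl.best_loss)
```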
## Learning to rank
- Use `task="rank"` with group information (`groups` / `groups_val`) so metrics like `ndcg` and `ndcg@k` are meaningful.
- If you pass `metric="ndcg@10"`, also pass `groups` so FLAML can compute group-aware NDCG.
## Time series forecasting
- Use time-aware splitting. For holdout validation, set `eval_method="holdout"` and use a time-ordered dataset.
- Prefer supplying a DataFrame with a clear time column when possible (see the sketch below).
- Optional time-series estimators depend on optional dependencies. To list what is available in your environment:
```python
from flaml.automl.task.factory import task_factory
print("forecast:", sorted(task_factory("forecast").estimators.keys()))
```
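Below is a minimal forecasting sketch. The toy monthly series is an illustrative assumption, and the `ts_forecast` task name and `period` horizon should be checked against your installed FLAML version:
```python
import numpy as np
import pandas as pd

from flaml import AutoML

# Toy monthly series with a time column ("ds") and a numeric target ("y").
ds = pd.date_range("2020-01-01", periods=60, freq="MS")
y = np.sin(np.arange(60) / 6.0) + 0.05 * np.arange(60)
train_df = pd.DataFrame({"ds": ds, "y": y})

automl = AutoML()
automl.fit(
    dataframe=train_df,
    label="y",
    task="ts_forecast",
    period=12,              # forecast horizon; also used for the time-ordered holdout
    eval_method="holdout",
    time_budget=30,
)

# Predict the next 12 months after the training range.
future = pd.DataFrame({"ds": pd.date_range("2025-01-01", periods=12, freq="MS")})
print(automl.predict(future))
```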
## NLP (Transformers)
- Install the optional dependency: `pip install "flaml[hf]"`.
- When you provide a custom metric, ensure it returns `(metric_to_minimize, metrics_to_log)` with stable keys.
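As a reference for the return shape, here is a sketch of a custom metric following FLAML's documented signature; the use of `log_loss` and the logged keys are illustrative choices:
```python
from sklearn.metrics import log_loss


def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train, weight_val=None, weight_train=None,
    *args, **kwargs,
):
    # Return (value to minimize, dict of metrics to log with stable keys).
    val_loss = log_loss(y_val, estimator.predict_proba(X_val), labels=labels)
    train_loss = log_loss(y_train, estimator.predict_proba(X_train), labels=labels)
    return val_loss, {"val_loss": val_loss, "train_loss": train_loss}


# automl.fit(X_train, y_train, task="seq-classification", metric=custom_metric, ...)
```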
## Speed, stability, and tricky settings
- **Time budget vs convergence**: if you see warnings about not all estimators converging, increase `time_budget` or reduce `estimator_list`.
- **Memory pressure / OOM**:
- set `free_mem_ratio` (e.g., `0.2`) to keep free memory above a threshold;
- set `model_history=False` to reduce stored artifacts (see the settings sketch after this list).
- **Reproducibility**: set `seed` and keep `n_jobs` fixed; expect some runtime variance.
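Putting several of these knobs together, a hedged settings sketch (the values are illustrative, not recommendations):
```python
from flaml import AutoML

automl = AutoML()
settings = {
    "task": "classification",
    "time_budget": 120,                     # seconds
    "estimator_list": ["lgbm", "xgboost"],  # keep the list small while debugging
    "free_mem_ratio": 0.2,                  # keep at least 20% of memory free
    "model_history": False,                 # don't retain every trained model
    "seed": 1,                              # for (approximate) reproducibility
}
# automl.fit(X_train, y_train, **settings)
```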
## Persisting models
FLAML supports **both** MLflow logging and pickle-based persistence. For production deployment, MLflow logging is typically the most important option because it plugs into the MLflow ecosystem (tracking, model registry, serving, governance). For quick local reuse, persisting the whole `AutoML` object via pickle is often the most convenient.
### Option 1: MLflow logging (recommended for production)
When you run `AutoML.fit()` inside an MLflow run, FLAML can log metrics/params automatically (disable via `mlflow_logging=False` if needed). To persist the trained `AutoML` object as a model artifact and reuse MLflow tooling end-to-end:
```python
import mlflow
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from flaml import AutoML
X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
automl = AutoML()
mlflow.set_experiment("flaml")
with mlflow.start_run(run_name="flaml_run") as run:
automl.fit(X_train, y_train, task="classification", time_budget=3)
run_id = run.info.run_id
# Later (or in a different process)
automl2 = mlflow.sklearn.load_model(f"runs:/{run_id}/model")
assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
```
### Option 2: Pickle the full `AutoML` instance (convenient)
Pickling stores the *entire* `AutoML` instance (not just the best estimator). This is useful when you prefer not to rely on MLflow or when you want to reuse additional attributes of the AutoML object without retraining.
In Microsoft Fabric scenarios, these additional attributes are particularly important for re-plotting visualization figures without retraining the model.
```python
import mlflow
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from flaml import AutoML
X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
automl = AutoML()
mlflow.set_experiment("flaml")
with mlflow.start_run(run_name="flaml_run") as run:
automl.fit(X_train, y_train, task="classification", time_budget=3)
automl.pickle("automl.pkl")
automl2 = AutoML.load_pickle("automl.pkl")
assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
assert automl.best_config == automl2.best_config
assert automl.best_loss == automl2.best_loss
assert automl.mlflow_integration.infos == automl2.mlflow_integration.infos
```
See also: [Task-Oriented AutoML](Use-Cases/Task-Oriented-AutoML) and [FAQ](FAQ).

View File

@@ -49,7 +49,7 @@ print(flaml.__version__)
```
- Please ensure all **code snippets and error messages are formatted in
appropriate code blocks**. See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks)
appropriate code blocks**. See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks)
for more details.
## Becoming a Reviewer
@@ -62,10 +62,10 @@ There is currently no formal reviewer solicitation process. Current reviewers id
```bash
git clone https://github.com/microsoft/FLAML.git
pip install -e FLAML[notebook,autogen]
pip install -e ".[notebook]"
```
In case the `pip install` command fails, try escaping the brackets such as `pip install -e FLAML\[notebook,autogen\]`.
In case the `pip install` command fails, try escaping the brackets such as `pip install -e .\[notebook\]`.
### Docker
@@ -88,7 +88,7 @@ Run `pre-commit install` to install pre-commit into your git hooks. Before you c
### Coverage
Any code you commit should not decrease coverage. To run all unit tests, install the \[test\] option under FLAML/:
Any code you commit should not decrease coverage. To run all unit tests, install the [test] option under FLAML/:
```bash
pip install -e."[test]"

View File

@@ -2,7 +2,7 @@
### Prerequisites
Install the \[automl\] option.
Install the [automl] option.
```bash
pip install "flaml[automl]"

Some files were not shown because too many files have changed in this diff Show More