23 Commits

Author SHA1 Message Date
Li Jiang
1c9835dc0a Add support to Python 3.12, Sync Fabric till dc382961 (#1467)
* Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models

- Fix license-file
- Bump version to 2.3.5.post2
- Distribute source and wheel
- Log better models only
- Add artifact_path to register_automl_pipeline
- Improve logging of _automl_user_configurations

----
This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5.

The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields.
- `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`.
- `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4252053

* Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover

- Add warning message when feature_importances_ is None (#3982120)
- Catch RuntimeError and wait for spark cluster to recover (#3982133)

----
Bug fix.

This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message.
- `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early.
- `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #3982120, #3982133

* Removed deprecated metadata section

* Fix log_params, log_artifact doesn't support run_id in mlflow 2.6.0

* Remove autogen

* Remove autogen

* Remove autogen

* Merged PR 1776547: Fix flaky test test_automl

Don't throw error when time budget is not enough

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing a failing test in the AutoML notebook example.

#### PR Summary
This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures.
- `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4573514

* Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed

Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file'

----
Bug fix addressing package metadata configuration.

This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration.
- In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #4560034

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Disable openai workflow

* Add python 3.12 to test envs

* Manually trigger openai

* Support markdown files with underscore-prefixed file names

* Improve save dependencies

* SynapseML is not installed

* Fix syntax error:Module !flaml/autogen was never imported

* macos 3.12 also hangs

* fix syntax error

* Update python version in actions

* Install setuptools for using pkg_resources

* Fix test_automl_performance in Github actions

* Fix test_nested_run
2026-01-10 12:17:21 +08:00
Li Jiang
f27f98c6d7 Fix test mac os python 3.11 (#1328)
* add test

* Skip test_autohf_classificationhead.py for MacOS py311

* Skip test/nlp/test_default.py for MacOS py311

* Check test_tune

* Check test_lexiflow

* Check test_tune

* Remove checks

* Skip test_nested_run for macos py311)

* Skip test_nested_space for macos py311

* Test tune on MacOS Python 3.11 w/o pytest

* Split tests by folder

* Skip test lexiflow for MacOS py311

* Enable test_tune for MacOS py311

* Clean up
2024-08-06 05:50:44 +00:00
Gleb Levitski
3de0dc667e Add ruff sort to pre-commit and sort imports in the library (#1259)
* lint

* bump ver

* bump ver

* fixed circular import

---------

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-03-12 21:28:57 +00:00
Shaokun
7a64148676 support string alg in tune (#1093)
* support string alg in tune

* add test, enforce string feasible, support lexico in set_search_priorities in CFO

* fix bug

* fix bug

* fix bug

* fix bug

* fix bugs

* fix yiran

---------

Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
2023-07-01 03:01:14 +00:00
Jirka Borovec
a701cd82f8 set black with 120 line length (#975)
* set black with 120 line length

* apply pre-commit

* apply black
2023-04-10 19:50:40 +00:00
Chi Wang
90aea9c28b create dir for log file name (#867) 2022-12-30 10:21:30 -08:00
Chi Wang
860cbc233e move searcher and scheduler into tune (#746)
* move into tune

* correct path

* correct path

* import path
2022-10-04 16:03:22 -07:00
Chi Wang
d60d38b3e9 log_file_name in tune.run() (#681)
* log_file_name in tune.run()

* use_ray validates log_file_name

* assert no ray_args when not use_ray

* import os and use os.path
2022-08-15 06:15:31 -07:00
Chi Wang
1111d6d43a backup & recover global vars for nested tune.run (#584)
* backup & recover global vars for nested tune.run

* ensure recovering global vars before return
2022-06-14 11:03:54 -07:00
Chi Wang
a1c49ca27b allow evaluated_rewards shorter than points_to_evaluate (#522)
* allow evaluated_rewards shorter than points_to_evaluate

* docstr update
2022-04-23 16:22:34 -07:00
Chi Wang
9128c8811a handle failing trials (#505)
* handle failing trials

* clarify when to return {}

* skip ensemble in accuracy check
2022-03-28 16:57:52 -07:00
Chi Wang
6960a833ec Gpu support for xgboost (#442)
* xgboost gpu support

* test xgboost gpu

* test sparse data

* add xgboost test

* remove ray.init to avoid pytest error
2022-01-30 13:02:18 -08:00
Xueqing Liu
438ccaa0c9 adding catch for HTTP error (#432) 2022-01-29 22:53:32 -08:00
Qingyun Wu
17b17d084f tune api for schedulers (#322)
* revise api and tests

* rename prune_attr

* update finetune notebook

* add scheduler test and notebook

* update tune api for scheduler

* remove scheduler notebook

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

* fix imports

* clear notebook output

* fix ray import

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* improve docstr

* Update flaml/searcher/blendsearch.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* remove redundant import

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-04 21:52:20 -05:00
Xueqing Liu
42de3075e9 Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Chi Wang
e46573a01d warmstart blendsearch (#186)
* increase test coverage

* use define by run only when needed

* warmstart bs

* classification -> binary, multi

* warm start with evaluated rewards

* data transformer; resource attr for gs

* BlendSearchTuner bug fix and unittest

* bug fix

* docstr and import

* task type
2021-09-04 01:42:21 -07:00
Qingyun Wu
a229a6112a Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
Qingyun Wu
10082b9262 v0.5.12 (#150)
* remove extra comma

* exclusive bound

* log file name

* add cost to space

* dataset_format

* add load_openml_dataset test

* docstr

* revise test format

* simplify restore

* order categories

* openml server exception in test

* process space

* add warning

* log format

* reduce n_cpu

* nested space

* hierarchical search space for CFO

* non hierarchical for bs

* unflatten hierarchical config

* connection error

* random sample

* config signature

* check ray version

* preprocess numpy array

* catboost preprocess

* time budget

* seed, verbose, hpo_method

* test cfocat

* shallow copy in flatten_dict
prevent lgbm model duplication

* match estimator name

* quantize and log

* test qloguniform and qrandint

* test qlograndint

* thread.running

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
2021-08-11 23:02:22 -07:00
Xueqing Liu
eeaf5b5963 space -> main (#148)
* subspace in flow2

* search space and trainable from AutoML

* experimental features: multivariate TPE, grouping, add_evaluated_points

* test experimental features

* readme

* define by run

* set time_budget_s for bs

Co-authored-by: liususan091219 <Xqq630517>

* version

* acl

* test define_by_run_func

* size

* constraints

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-08-02 16:10:26 -07:00
Eduardo Büll
46752083a2 fix UnboundLocalError in tune.run (#142) (#145)
Fix UnboundLocalError exception in tune.run when training_function returns a value.

Resolves #142
2021-08-01 17:55:38 -07:00
Qingyun Wu
e24265ee5d automl fit with starting points (#141)
* add starting point in fit

* add estimator best config

* add test

* add doc string

* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that use low cost partial config as the start point; then, remove the points whose performance is worse than the converged, and start local search from the remaining ones ordered by their performance.

Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-31 13:39:31 -07:00
Chi Wang
b3bb00966d coverage (#135)
* coverage

* readme

* timeout
2021-07-20 17:00:44 -07:00
Chi Wang
0925e2b308 constraints (#88)
* pre-training constraints

* metric constraints after training
2021-05-18 15:57:42 -07:00