307 Commits

Author SHA1 Message Date
Copilot
fc4efe3510 Fix sklearn 1.7+ compatibility: BaseEstimator type detection for ensemble (#1512)
* Initial plan

* Fix ExtraTreesEstimator regression ensemble error with sklearn 1.7+

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: improve __sklearn_tags__ implementation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Emphasize pre-commit

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-23 10:20:59 +08:00
Copilot
d9e74031e0 Expose task-level and estimator-level preprocessors as public API (#1497)
* Initial plan

* Add public preprocess() API methods for AutoML and estimators

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add documentation for preprocess() API methods

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add example script demonstrating preprocess() API usage

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback - fix type hints and simplify test logic

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix formatting issues with pre-commit hooks

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Remove example.py, make tests faster

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:38:25 +08:00
Copilot
9233a52736 Add configurable label overlap handling for classification holdout strategy (#1491)
* Initial plan

* Fix training/test set overlap in holdout classification by only adding missing labels when needed

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: add bounds checking and fix edge cases

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix bounds checking: use correct comparison operator for array indexing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix potential ValueError with max() on empty lists and simplify test assertions

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add extra bounds checking for label_matches indices

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix pandas_on_spark compatibility by using iloc_pandas_on_spark util method

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve missing label handling to avoid overlap when possible

For classes with multiple instances that end up in one set, properly
re-split them between train and val instead of duplicating. Only add
to both sets when the class has exactly 1 instance (unavoidable overlap).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix: ensure at least 1 instance remains in original set when re-splitting

Addresses comments on lines 580 and 724 - prevents moving ALL instances
from validation to training or vice versa by using min(num_instances - 1, ...).

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
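
The `min(num_instances - 1, ...)` cap described above can be illustrated with a minimal sketch. The helper name `instances_to_move` and its `requested` parameter are hypothetical stand-ins, not names from FLAML's source; the sketch only shows how the cap keeps at least one instance in the original set.

```python
def instances_to_move(num_instances: int, requested: int) -> int:
    """Number of a class's instances to move into the set that lacks the class.

    Hypothetical helper: the move is capped at num_instances - 1 so that the
    set the instances come from keeps at least one of them.
    """
    if num_instances < 2:
        # A single instance cannot be split between two sets.
        return 0
    # Move at least one instance, but never all of them.
    return max(1, min(num_instances - 1, requested))
```

With three instances and a request to move ten, the cap limits the move to two, so one instance stays behind.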

* Refactor: Extract missing label handling into two methods (fast/slow)

- Created _handle_missing_labels_fast(): Fast version that adds first instance to both sets (may overlap)
- Created _handle_missing_labels_no_overlap(): Precise version that avoids overlap when possible
- Added allow_label_overlap parameter to AutoML.fit() (default=True for fast version)
- Updated documentation with new parameter
- Both versions maintain label completeness while offering speed/precision tradeoff

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add usage example for allow_label_overlap to Best-Practices.md

- Added comprehensive documentation in Classification section
- Included code examples showing both fast and precise versions
- Explained tradeoffs between speed and precision
- Noted that parameter only affects holdout evaluation

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: update documentation and tests

- Updated docstrings to clarify fast version only adds instances to missing sets
- Fixed documentation to reflect actual behavior (not "both sets" but "set with missing label")
- Completely rewrote test_no_overlap.py to test both allow_label_overlap modes
- Added tests with sample_weights for better code coverage
- Added test for single-instance class handling
- All 5 tests passing

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix AttributeError: initialize _allow_label_overlap in settings and retrain_from_log

- Added allow_label_overlap to settings initialization with default=True
- Added parameter defaulting in fit() method to use settings value if not provided
- Added _allow_label_overlap initialization in retrain_from_log method
- Fixes test failures in test_multiclass, test_regression, and spark tests

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add docstring to fit()

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-21 14:03:48 +08:00
Copilot
3d489f1aaa Add validation and clear error messages for custom_metric parameter (#1500)
* Initial plan

* Add validation and documentation for custom_metric parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Refactor validation into reusable method and improve error handling

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Apply pre-commit formatting fixes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-21 08:58:11 +08:00
Copilot
c64eeb5e8d Document that final_estimator parameters in ensemble are not auto-tuned (#1499)
* Initial plan

* Document final_estimator parameter behavior in ensemble configuration

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Address code review feedback: fix syntax in examples and use float comparison

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Run pre-commit to fix formatting issues

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 21:59:31 +08:00
Copilot
1687ca9a94 Fix eval_set preprocessing for XGBoost estimators with categorical features (#1470)
* Initial plan

* Initial analysis - reproduced eval_set preprocessing bug

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix eval_set preprocessing for XGBoost estimators with categorical features

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add eval_set tests to test_xgboost function

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix linting issues with ruff and black

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 20:41:21 +08:00
Copilot
4ea9650f99 Fix nested dictionary merge in SearchThread losing sampled hyperparameters (#1494)
* Initial plan

* Add recursive dict update to fix nested config merge

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
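
A recursive dictionary update of the kind named in this commit might look like the sketch below. The function name `recursive_update` and the sample configs are hypothetical and not taken from FLAML's source; the sketch only contrasts a shallow merge, which replaces a nested dict wholesale, with a merge that descends into it.

```python
from copy import deepcopy


def recursive_update(base: dict, updates: dict) -> dict:
    """Return a copy of `base` with `updates` merged in, descending into nested dicts."""
    merged = deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of overwriting.
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical sampled config and a partial override for one nested entry.
sampled = {"lgbm": {"n_estimators": 50, "learning_rate": 0.1}, "seed": 1}
override = {"lgbm": {"learning_rate": 0.05}}

shallow = {**sampled, **override}          # nested "n_estimators" is lost
deep = recursive_update(sampled, override)  # nested "n_estimators" is kept
```

The shallow merge drops `n_estimators` because the whole `lgbm` dict is replaced, while the recursive merge keeps it and changes only `learning_rate`.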

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2026-01-20 15:50:18 +08:00
Copilot
5f1aa2dda8 Fix: Preserve FLAML_sample_size in best_config_per_estimator (#1475)
* Initial plan

* Fix: Preserve FLAML_sample_size in best_config_per_estimator

Modified best_config_per_estimator property to keep FLAML_sample_size when returning best configurations. Previously, AutoMLState.sanitize() was removing this key, which caused the sample size information to be lost when using starting_points from a previous run.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add a test to verify the improvement of starting_points

* Update documentation to reflect FLAML_sample_size preservation

Updated Task-Oriented-AutoML.md to document that best_config_per_estimator now preserves FLAML_sample_size:
- Added note in "Warm start" section explaining that FLAML_sample_size is preserved for effective warm-starting
- Added note in "Get best configuration" section with example showing FLAML_sample_size in output
- Explains importance of sample size preservation for continuing optimization with correct sample sizes

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix unintended code change

* Improve docstrings and docs

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 07:42:31 +08:00
Copilot
67bdcde4d5 Fix BlendSearch OptunaSearch warning for non-hierarchical spaces with Ray Tune domains (#1477)
* Initial plan

* Fix BlendSearch OptunaSearch warning for non-hierarchical spaces

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Clean up test file

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add regression test for BlendSearch UDF mode warning fix

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Improve the fix and tests

* Fix Define-by-run function passed in  argument is not yet supported when using

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-20 00:01:41 +08:00
Copilot
46a406edd4 Add objective parameter to LGBMEstimator search space (#1474)
* Initial plan

* Add objective parameter to LGBMEstimator search_space

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Add test for LGBMEstimator objective parameter

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

* Fix format error

* Remove changes, just add a test to verify the current supported usage

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-19 21:10:21 +08:00
Li Jiang
f1817ea7b1 Add support to python 3.13 (#1486) 2026-01-19 18:31:43 +08:00
Li Jiang
ced1d6f331 Support pickling the whole AutoML instance, Sync Fabric till 0d4ab16f (#1481) 2026-01-12 23:04:38 +08:00
Copilot
0b138d9193 Fix log_training_metric causing IndexError for time series models (#1469)
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2026-01-10 18:07:17 +08:00
Li Jiang
1c9835dc0a Add support to Python 3.12, Sync Fabric till dc382961 (#1467)
* Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models

- Fix license-file
- Bump version to 2.3.5.post2
- Distribute source and wheel
- Log better models only
- Add artifact_path to register_automl_pipeline
- Improve logging of _automl_user_configurations

----
This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5.

The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields.
- `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`.
- `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`.

Related work items: #4252053

* Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover

- Add warning message when feature_importances_ is None (#3982120)
- Catch RuntimeError and wait for spark cluster to recover (#3982133)

----
Bug fix.

This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message.
- `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early.
- `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message.

Related work items: #3982120, #3982133

* Removed deprecated metadata section

* Fix log_params, log_artifact doesn't support run_id in mlflow 2.6.0

* Remove autogen

* Remove autogen

* Remove autogen

* Merged PR 1776547: Fix flaky test test_automl

Don't throw error when time budget is not enough

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing a failing test in the AutoML notebook example.

#### PR Summary
This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures.
- `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found.

Related work items: #4573514

* Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed

Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file'

----
Bug fix addressing package metadata configuration.

This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration.
- In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata.

Related work items: #4560034

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging

* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0

* Disable openai workflow

* Add python 3.12 to test envs

* Manually trigger openai

* Support markdown files with underscore-prefixed file names

* Improve save dependencies

* SynapseML is not installed

* Fix syntax error: Module !flaml/autogen was never imported

* macos 3.12 also hangs

* fix syntax error

* Update python version in actions

* Install setuptools for using pkg_resources

* Fix test_automl_performance in Github actions

* Fix test_nested_run
2026-01-10 12:17:21 +08:00
Li Jiang
1285700d7a Update readme, bump version to 2.4.0, fix CI errors (#1466)
* Update gitignore

* Bump version to 2.4.0

* Update readme

* Pre-download california housing data

* Use pre-downloaded california housing data

* Pin lightning<=2.5.6

* Fix typo in find and replace

* Fix estimators has no attribute __sklearn_tags__

* Pin torch to 2.2.2 in tests

* Fix conflict

* Update pytorch-forecasting

* Update pytorch-forecasting

* Update pytorch-forecasting

* Use numpy<2 for testing

* Update scikit-learn

* Run Build and UT every other day

* Pin pip<24.1

* Pin pip<24.1 in pipeline

* Loosen pip, install pytorch_forecasting only in py311

* Add support to new versions of nlp dependencies

* Fix formats

* Remove redefinition

* Update mlflow versions

* Fix mlflow version syntax

* Update gitignore

* Clean up cache to free space

* Remove clean up action cache

* Fix blendsearch

* Update test workflow

* Update setup.py

* Fix catboost version

* Update workflow

* Prepare for python 3.14

* Support no catboost

* Fix tests

* Fix python_requires

* Update test workflow

* Fix vw tests

* Remove python 3.9

* Fix nlp tests

* Fix prophet

* Print pip freeze for better debugging

* Fix Optuna search does not support parameters of type Float with samplers of type Quantized

* Save dependencies for later inspection

* Fix coverage.xml not exists

* Fix github action permission

* Handle python 3.13

* Address openml is not installed

* Check dependencies before run tests

* Update dependencies

* Fix syntax error

* Use bash

* Update dependencies

* Fix git error

* Loosen mlflow constraints

* Add rerun, use mlflow-skinny

* Fix git error

* Remove ray tests

* Update xgboost versions

* Fix automl pickle error

* Don't test python 3.10 on macos as it's stuck

* Rebase before push

* Reduce number of branches
2026-01-09 13:40:52 +08:00
Li Jiang
c2b25310fc Sync Fabric till 2cd1c3da (#1433)
* Sync Fabric till 2cd1c3da

* Remove synapseml from tag names

* Fix 'NoneType' object has no attribute 'DataFrame'

* Deprecated 3.8 support

* Fix 'NoneType' object has no attribute 'DataFrame'

* Still use python 3.8 for pydoc

* Don't run tests in parallel

* Remove autofe and lowcode
2025-05-23 10:19:31 +08:00
Stickic-cyber
468bc62d27 Fix issue with "list index out of range" when max_iter=1 (#1419) 2025-04-09 21:54:17 +08:00
Daniel Grindrod
d0a11958a5 fix: Fixed bug where group folds and sample weights couldn't be used in the same automl instance (#1405) 2025-02-15 10:41:27 +08:00
Will Charles
840f76e5e5 Changed tune.report import for ray>=2 (#1392)
* Changed tune.report import for ray>=2

* env: Changed pydantic restriction in env

* Reverted Pydantic install conditions

* Reverted Pydantic install conditions

* test: Check if GPU is available

* tests: uncommented a line

* tests: Better fix for Ray GPU checking

* tests: Added timeout to dataset loading

* tests: Deleted _test_hf_data()

* test: Reduce lrl2 dataset size

* bug: timeout error

* bug: timeout error

* fix: Added threading check for timeout issue

* Undo old commits

* Timeout fix from #1406

---------

Co-authored-by: Daniel Grindrod <dannycg1996@gmail.com>
2025-02-14 09:38:33 +08:00
Li Jiang
d8b7d25b80 Fix test hang issue (#1406)
* Add try except to resource.setrlimit

* Set time limit only in main thread

* Check only test model

* Pytest debug

* Test separately

* Move test_model.py to automl folder
2025-02-13 19:50:35 +08:00
Daniel Grindrod
c038fbca07 fix: KeyError no longer occurs when using groupfolds for regression tasks. (#1385)
* fix: Now resetting indexes for regression datasets when using group folds

* refactor: Simplified if statement to include all fold types

* docs: Updated docs to make it clear that group folds can be used for regression tasks

---------

Co-authored-by: Daniel Grindrod <daniel.grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-12-18 10:06:58 +08:00
Daniel Grindrod
42d1dcfa0e fix: Fixed bug with catboost and groups (#1383)
Co-authored-by: Daniel Grindrod <daniel.grindrod@evotec.com>
2024-12-17 13:54:49 +08:00
Daniel Grindrod
5a74227bc3 Flaml: fix lgbm reproducibility (#1369)
* fix: Fixed bug where every underlying LGBMRegressor or LGBMClassifier had n_estimators = 1

* test: Added test showing case where FLAMLised CatBoostModel result isn't reproducible

* fix: Fixing issue where callbacks cause LGBM results to not be reproducible

* Update test/automl/test_regression.py

Co-authored-by: Li Jiang <bnujli@gmail.com>

* fix: Adding back the LGBM EarlyStopping

* refactor: Fix tweaked to ensure other models aren't likely to be affected

* test: Fixed test to allow reproduced results to be better than the FLAML results, when LGBM earlystopping is involved

---------

Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-11-01 10:06:15 +08:00
Daniel Grindrod
a316f84fe1 fix: LinearSVC results now reproducible (#1376)
Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
2024-10-31 14:02:16 +08:00
Daniel Grindrod
72881d3a2b fix: Fixing the random state of ElasticNetClassifier by default, to ensure reproducibility. Also included elasticnet in reproducibility tests (#1374)
Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-10-29 14:21:43 +08:00
Li Jiang
69da685d1e Fix data transform issue, spark log_loss metric compute error and json dumps TypeError (Sync Fabric till 3c545e67) (#1371)
* Merged PR 1444697: Fix json dumps TypeError

Fix json dumps TypeError

----
Bug fix to address a `TypeError` in `json.dumps`.

This pull request fixes a `TypeError` encountered when using `json.dumps` on `automl._automl_user_configurations` by introducing a safe JSON serialization function.
- Added `safe_json_dumps` function in `flaml/fabric/mlflow.py` to handle non-serializable objects.
- Updated `MLflowIntegration` class in `flaml/fabric/mlflow.py` to use `safe_json_dumps` for JSON serialization.
- Modified `test/automl/test_multiclass.py` to test the new `safe_json_dumps` function.
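
A safe serialization helper of this kind could be sketched as follows. Only the name `safe_json_dumps` comes from the commit message; the body is an assumption (falling back to `str()` for values `json` cannot encode), and `CustomMetric` is a hypothetical stand-in for a non-serializable user configuration value.

```python
import json


def safe_json_dumps(obj, **kwargs) -> str:
    """Serialize `obj` to JSON, falling back to str() for non-serializable values."""
    # Assumed behavior: `default=str` is called only for objects that the
    # standard encoder cannot handle.
    return json.dumps(obj, default=str, **kwargs)


class CustomMetric:
    """Hypothetical user-supplied object that json cannot encode directly."""

    def __repr__(self) -> str:
        return "CustomMetric()"


config = {"time_budget": 10, "metric": CustomMetric()}

try:
    json.dumps(config)
    plain_dumps_failed = False
except TypeError:
    plain_dumps_failed = True

serialized = safe_json_dumps(config)
```

Plain `json.dumps` raises `TypeError` on the custom object, while the fallback version stores its string form instead.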

Related work items: #3439408

* Fix data transform issue and spark log_loss metric compute error
2024-10-29 11:58:40 +08:00
Daniel Grindrod
d224218ecf fix: FLAML catboost metrics aren't reproducible (#1364)
* fix: CatBoostRegressors metrics are now reproducible

* test: Made tests live, which ensure the reproducibility of catboost models

* fix: Added defunct line of code as a comment

* fix: Re-adding removed if statement, and test to show one issue that if statement can cause

* fix: Stopped ending CatBoost training early when time budget is running out

---------

Co-authored-by: Daniel Grindrod <Daniel.Grindrod@evotec.com>
2024-10-23 13:51:23 +08:00
Daniel Grindrod
a2a5e1abb9 test: Adding tests to verify model reproducibility (#1362) 2024-10-12 09:53:16 +08:00
Li Jiang
8e171bc402 Remove temporary pickle files (#1354)
* Remove temporary pickle files

* Update version to 2.3.1

* Use TemporaryDirectory for pickle and log_artifact

* Fix 'CatBoostClassifier' object has no attribute '_get_param_names'
2024-09-21 15:46:32 +08:00
Li Jiang
5bfa0b1cd3 Improve mlflow integration and add more models (#1331)
* Add more spark models and improved mlflow integration

* Update test_extra_models, setup and gitignore

* Remove autofe

* Remove autofe

* Remove autofe

* Sync changes in internal

* Fix test for env without pyspark

* Fix import errors

* Fix tests

* Fix typos

* Fix pytorch-forecasting version

* Remove internal funcs, rename _mlflow.py

* Fix import error

* Fix dependency

* Fix experiment name setting

* Fix dependency

* Update pandas version

* Update pytorch-forecasting version

* Add warning message for not has_automl

* Fix test errors with nltk 3.8.2

* Don't enable mlflow logging w/o an active run

* Fix pytorch-forecasting can't be pickled issue

* Update pyspark tests condition

* Update synapseml

* Update synapseml

* No parent run, no logging for OSS

* Log when autolog is enabled

* upgrade code

* Enable autolog for tune

* Increase time budget for test

* End run before start a new run

* Update parent run

* Fix import error

* clean up

* skip macos and win

* Update notes

* Update default value of model_history
2024-08-13 07:53:47 +00:00
Jirka Borovec
b348cb1136 configure & apply pyupgrade with py3.8+ (#1333)
* configure pyupgrade with `py3.8+`

* apply update

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 02:54:18 +00:00
Jirka Borovec
cd0e88e383 fix missing req. arg for new datasets package (#1334)
Co-authored-by: Li Jiang <bnujli@gmail.com>
2024-08-12 02:19:11 +00:00
Li Jiang
a17c6e392e Fix test errors of nltk and numpy (#1335)
* Fix test errors with nltk 3.8.2

* Fix test errors with numpy large

* Fix test errors with numpy large
2024-08-12 00:14:21 +00:00
Li Jiang
f27f98c6d7 Fix test mac os python 3.11 (#1328)
* add test

* Skip test_autohf_classificationhead.py for MacOS py311

* Skip test/nlp/test_default.py for MacOS py311

* Check test_tune

* Check test_lexiflow

* Check test_tune

* Remove checks

* Skip test_nested_run for macos py311

* Skip test_nested_space for macos py311

* Test tune on MacOS Python 3.11 w/o pytest

* Split tests by folder

* Skip test lexiflow for MacOS py311

* Enable test_tune for MacOS py311

* Clean up
2024-08-06 05:50:44 +00:00
Li Jiang
a68d073ccf Add support to python 3.11 (#1326)
* Add support to python 3.11

* Fix workflow python version comparison

* Ray is not supported in python 3.11

* Fix test_numpy
2024-07-31 00:18:41 +00:00
Li Jiang
d8129b9211 Fix typos, upgrade yarn packages, add some improvements (#1290)
* Fix typos, upgrade yarn packages, add some improvements

* Fix joblib 1.4.0 breaks joblib-spark

* Fix xgboost test error

* Pin xgboost<2.0.0

* Try update prophet to 1.5.1

* Update github workflow

* Revert prophet version

* Update github workflow

* Update install libomp

* Fix test errors

* Fix test errors

* Add retry to test and coverage

* Revert "Add retry to test and coverage"

This reverts commit ce13097cd5.

* Increase test budget

* Add more data to test_models, try fixing ValueError: Found array with 0 sample(s) (shape=(0, 252)) while a minimum of 1 is required.
2024-07-19 13:40:04 +00:00
Gleb Levitski
3de0dc667e Add ruff sort to pre-commit and sort imports in the library (#1259)
* lint

* bump ver

* bump ver

* fixed circular import

---------

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-03-12 21:28:57 +00:00
Li Jiang
b645da3ea7 Fix spark errors (#1274)
* Fix mlflow not found error

* Fix joblib>1.2.0 force cancel error

* Remove joblib version constraint

* Update log

* Improve joblib exception catch

* Added permissions
2024-02-09 01:08:24 +00:00
Gleb Levitski
6b93c2e394 [ENH] Add support for sklearn HistGradientBoostingEstimator (#1230)
* Update model.py

HistGradientBoosting support

* Create __init__.py

* Update model.py

* Create histgb.py

* Update __init__.py

* Update test_model.py

* added histgb to estimator list

* Update Task-Oriented-AutoML.md

added docs

* lint

* fixed bugs

---------

Co-authored-by: Gleb <gleb@Glebs-MacBook-Pro.local>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-10-31 14:45:23 +00:00
Chi Wang
fda9fa0103 improve docstr of preprocessors (#1227)
* improve docstr of preprocessors

* Update SynapseML version

* Fix test

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-09-29 03:07:21 +00:00
Chi Wang
868e7dd1ca support xgboost 2.0 (#1219)
* support xgboost 2.0

* try classes_

* test version

* quote

* use_label_encoder

* Fix xgboost test error

* remove deprecated files

* remove deprecated files

* remove deprecated import

* replace deprecated import in integrate_spark.ipynb

* replace deprecated import in automl_lightgbm.ipynb

* formatted integrate_spark.ipynb

* replace deprecated import

* try fix driver python path

* Update python-package.yml

* replace deprecated reference

* move spark python env var to other section

* Update setup.py, install xgb<2 for MacOS

* Fix typo

* assert

* Try assert xgboost version

* Fail fast

* Keep all test/spark to try fail fast

* No need to skip spark test in Mac or Win

* Remove assert xgb version

* Remove fail fast

* Found root cause, fix test_sparse_matrix_xgboost

* Revert "No need to skip spark test in Mac or Win"

This reverts commit a09034817f.

* remove assertion

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: levscaut <57213911+levscaut@users.noreply.github.com>
Co-authored-by: levscaut <lwd2010530@qq.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-09-22 06:55:00 +00:00
Chi Wang
4886cb5689 Rename Responsive -> Conversable (#1202)
* responsive -> conversable

* preview

* rename

* register reply

* rename and version

* bump version to 2.1.0

* notebook

* bug fix
2023-09-12 00:07:35 +00:00
Chi Wang
5f9b514be7 suffix in model name (#1206)
* suffix in model name

* bump version to 2.0.3
2023-09-04 02:32:51 +00:00
Yiran Wu
87c2361040 fix generate_reply when sender is None. (#1186)
* fix generate_reply

* code format

* add test case

* update

* update

* Update test/autogen/agentchat/test_responsive_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/autogen/agentchat/test_responsive_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/autogen/agentchat/responsive_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-08-25 10:50:22 +00:00
Yiran Wu
07b97eb469 cover function calls with no arguments (#1185) 2023-08-20 05:28:29 +00:00
Chi Wang
7ab4d114d7 silent; code_execution_config; exit; version (#1179)
* silent; code_execution_config; exit; version

* url

* url

* readme

* preview

* doc

* url

* endpoints

* timeout

* chess

* Fix retrieve chat

* config

* mathchat

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-08-14 07:09:45 +00:00
Li Jiang
700ff05874 Add RetrieveChat (#1158)
* Add RetrieveChat notebook, RetrieveAssistantAgent and RetrieveUserProxyAgent

* Update according to comments

* Add output

* Add tests, merge main, address comments

* Fix tests

* Merge main

* Remove unnecessary code

* Update test

* Update notebook, some functions

* Fix print issue

* Update notebook

* Update notebook

* Update notebook

* Improve retrieve utils and update notebook

* Update vector db creation method

* Update notebook

* Update notebook

* Add terminate if no more context

* Update prompt and notebook, add example for update context

* Update results

* Update results

* Update results of update context

* Fix typo

* Add table of contents

* Update table of contents
2023-08-13 12:51:54 +00:00
Chi Wang
c44d2f4a01 support async in agents (#1178)
* Make auto reply method pluggable

* support async

* async

* allow richer trigger types

* test list

* rename key
2023-08-08 01:34:47 +00:00
Chi Wang
a603e6dddc Make auto reply method pluggable (#1177)
* Make auto reply method pluggable

* allow richer trigger types

* test list
2023-08-07 18:41:58 +00:00
Chi Wang
2208dfb79e Improve auto reply registration (#1170)
* Improve auto reply registration

* object key

* fix test error

* bug fix in math user proxy agent

* allow send/receive without reply

* reset -> stop
2023-08-04 14:26:58 +00:00