FLAML

mirror of https://github.com/microsoft/FLAML.git synced 2026-02-09 02:09:16 +08:00

Author	SHA1	Message	Date
Li Jiang	1c9835dc0a	Add support to Python 3.12, Sync Fabric till dc382961 (#1467 ) * Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models - Fix license-file - Bump version to 2.3.5.post2 - Distribute source and wheel - Log better models only - Add artifact_path to register_automl_pipeline - Improve logging of _automl_user_configurations ---- This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5. The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields. - `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`. - `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`. Related work items: #4252053 * Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover - Add warning message when feature_importances_ is None (#3982120) - Catch RuntimeError and wait for spark cluster to recover (#3982133) ---- Bug fix. This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message. - `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early. - `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message. Related work items: #3982120, #3982133 * Removed deprecated metadata section * Fix log_params, log_artifact doesn't support run_id in mlflow 2.6.0 * Remove autogen * Remove autogen * Remove autogen * Merged PR 1776547: Fix flaky test test_automl Don't throw error when time budget is not enough ---- #### AI description (iteration 1) #### PR Classification Bug fix addressing a failing test in the AutoML notebook example. #### PR Summary This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures. - `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found. Related work items: #4573514 * Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file' ---- Bug fix addressing package metadata configuration. This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration. - In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata. Related work items: #4560034 * Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0 * Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging * Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0 * Disable openai workflow * Add python 3.12 to test envs * Manually trigger openai * Support markdown files with underscore-prefixed file names * Improve save dependencies * SynapseML is not installed * Fix syntax error:Module !flaml/autogen was never imported * macos 3.12 also hangs * fix syntax error * Update python version in actions * Install setuptools for using pkg_resources * Fix test_automl_performance in Github actions * Fix test_nested_run	2026-01-10 12:17:21 +08:00
Li Jiang	1285700d7a	Update readme, bump version to 2.4.0, fix CI errors (#1466 ) * Update gitignore * Bump version to 2.4.0 * Update readme * Pre-download california housing data * Use pre-downloaded california housing data * Pin lightning<=2.5.6 * Fix typo in find and replace * Fix estimators has no attribute __sklearn_tags__ * Pin torch to 2.2.2 in tests * Fix conflict * Update pytorch-forecasting * Update pytorch-forecasting * Update pytorch-forecasting * Use numpy<2 for testing * Update scikit-learn * Run Build and UT every other day * Pin pip<24.1 * Pin pip<24.1 in pipeline * Loosen pip, install pytorch_forecasting only in py311 * Add support to new versions of nlp dependecies * Fix formats * Remove redefinition * Update mlflow versions * Fix mlflow version syntax * Update gitignore * Clean up cache to free space * Remove clean up action cache * Fix blendsearch * Update test workflow * Update setup.py * Fix catboost version * Update workflow * Prepare for python 3.14 * Support no catboost * Fix tests * Fix python_requires * Update test workflow * Fix vw tests * Remove python 3.9 * Fix nlp tests * Fix prophet * Print pip freeze for better debugging * Fix Optuna search does not support parameters of type Float with samplers of type Quantized * Save dependencies for later inspection * Fix coverage.xml not exists * Fix github action permission * Handle python 3.13 * Address openml is not installed * Check dependencies before run tests * Update dependencies * Fix syntax error * Use bash * Update dependencies * Fix git error * Loose mlflow constraints * Add rerun, use mlflow-skinny * Fix git error * Remove ray tests * Update xgboost versions * Fix automl pickle error * Don't test python 3.10 on macos as it's stuck * Rebase before push * Reduce number of branches	2026-01-09 13:40:52 +08:00
Li Jiang	5bfa0b1cd3	Improve mlflow integration and add more models (#1331 ) * Add more spark models and improved mlflow integration * Update test_extra_models, setup and gitignore * Remove autofe * Remove autofe * Remove autofe * Sync changes in internal * Fix test for env without pyspark * Fix import errors * Fix tests * Fix typos * Fix pytorch-forecasting version * Remove internal funcs, rename _mlflow.py * Fix import error * Fix dependency * Fix experiment name setting * Fix dependency * Update pandas version * Update pytorch-forecasting version * Add warning message for not has_automl * Fix test errors with nltk 3.8.2 * Don't enable mlflow logging w/o an active run * Fix pytorch-forecasting can't be pickled issue * Update pyspark tests condition * Update synapseml * Update synapseml * No parent run, no logging for OSS * Log when autolog is enabled * upgrade code * Enable autolog for tune * Increase time budget for test * End run before start a new run * Update parent run * Fix import error * clean up * skip macos and win * Update notes * Update default value of model_history	2024-08-13 07:53:47 +00:00
Li Jiang	d8129b9211	Fix typos, upgrade yarn packages, add some improvements (#1290 ) * Fix typos, upgrade yarn packages, add some improvements * Fix joblib 1.4.0 breaks joblib-spark * Fix xgboost test error * Pin xgboost<2.0.0 * Try update prophet to 1.5.1 * Update github workflow * Revert prophet version * Update github workflow * Update install libomp * Fix test errors * Fix test errors * Add retry to test and coverage * Revert "Add retry to test and coverage" This reverts commit `ce13097cd5`. * Increase test budget * Add more data to test_models, try fixing ValueError: Found array with 0 sample(s) (shape=(0, 252)) while a minimum of 1 is required.	2024-07-19 13:40:04 +00:00
Li Jiang	700ff05874	Add RetrieveChat (#1158 ) * Add RetrieveChat notebook, RetrieveAssistantAgent and RetrieveUserProxyAgent * Update according to comments * Add output * Add tests, merge main, address comments * Fix tests * Merge main * Remove unnecessary code * Update test * Update notebook, some functions * Fix print issue * Update notebook * Update notebook * Update notebook * Improve retrieve utils and update notebook * Update vector db creation method * Update notebook * Update notebook * Add terminate if no more context * Update prompt and notebook, add example for update context * Update results * Update results * Update results of update context * Fix typo * Add table of contents * Update table of contents	2023-08-13 12:51:54 +00:00
Chi Wang	595f5a8025	gpt-4 support; openai workflow fix; model str; timeout; voting (#958 ) * workflow; model str; timeout * voting * notebook * pull request * recover workflow * voted answer * aoai * ignore None answer * default config * note * gpt-4 * n=5 * cleanup * config name * introduction * readme * avoid None * add output/ to gitignore * openai version * invalid var * comment long running cells	2023-03-26 17:13:06 +00:00
Li Jiang	50334f2c52	Support spark dataframe as input dataset and spark models as estimators (#934 ) * add basic support to Spark dataframe add support to SynapseML LightGBM model update to pyspark>=3.2.0 to leverage pandas_on_Spark API * clean code, add TODOs * add sample_train_data for pyspark.pandas dataframe, fix bugs * improve some functions, fix bugs * fix dict change size during iteration * update model predict * update LightGBM model, update test * update SynapseML LightGBM params * update synapseML and tests * update TODOs * Added support to roc_auc for spark models * Added support to score of spark estimator * Added test for automl score of spark estimator * Added cv support to pyspark.pandas dataframe * Update test, fix bugs * Added tests * Updated docs, tests, added a notebook * Fix bugs in non-spark env * Fix bugs and improve tests * Fix uninstall pyspark * Fix tests error * Fix java.lang.OutOfMemoryError: Java heap space * Fix test_performance * Update test_sparkml to test_0sparkml to use the expected spark conf * Remove unnecessary widgets in notebook * Fix iloc java.lang.StackOverflowError * fix pre-commit * Added params check for spark dataframes * Refactor code for train_test_split to a function * Update train_test_split_pyspark * Refactor if-else, remove unnecessary code * Remove y from predict, remove mem control from n_iter compute * Update workflow * Improve _split_pyspark * Fix test failure of too short training time * Fix typos, improve docstrings * Fix index errors of pandas_on_spark, add spark loss metric * Fix typo of ndcgAtK * Update NDCG metrics and tests * Remove unuseful logger * Use cache and count to ensure consistent indexes * refactor for merge maain * fix errors of refactor * Updated SparkLightGBMEstimator and cache * Updated config2params * Remove unused import * Fix unknown parameters * Update default_estimator_list * Add unit tests for spark metrics	2023-03-25 19:59:46 +00:00
Jirka Borovec	2ff1035733	precommit: end-of-file-fixer (#929 ) * precommit: end-of-file-fixer * exclude .gitignore * apply --------- Co-authored-by: Shaokun <shaokunzhang529@gmail.com>	2023-02-28 16:27:14 +00:00
levscaut	c6a2440348	add PySparkOvertimeMonitor to avoid exceeding time budget (#923 ) * merging * clean commit * Delete mylearner.py This file is not needed. * fix py4j import error * more tolerant cancelling time * fix problems following suggestions * Update flaml/tune/spark/utils.py Co-authored-by: Li Jiang <bnujli@gmail.com> * remove redundant model * Update test/spark/custom_mylearner.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * add docstr * reverse change in gitignore * Update test/spark/custom_mylearner.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> --------- Co-authored-by: Li Jiang <bnujli@gmail.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2023-02-24 08:07:00 +00:00
Xueqing Liu	5f97532986	adding evaluation (#495 ) * adding automl.score * fixing the metric name in train_with_config * adding pickle after score * fixing a bug in automl.pickle	2022-03-25 17:00:08 -04:00
Chi Wang	efd85b4c86	Deploy a new doc website (#338 ) A new documentation website. And: * add actions for doc * update docstr * installation instructions for doc dev * unify README and Getting Started * rename notebook * doc about best_model_for_estimator #340 * docstr for keep_search_state #340 * DNN Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu> Co-authored-by: Z.sk <shaokunzhang@psu.edu>	2021-12-16 17:11:33 -08:00
Xueqing Liu	42de3075e9	Make NLP tasks available from AutoML.fit() (#210 ) Sequence classification and regression: "seq-classification" and "seq-regression" Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-11-16 11:06:20 -08:00
Xueqing Liu	926589bdda	exception, coverage for autohf (#106 ) * increase coverage * fixing exception messages * fixing import	2021-06-14 14:11:40 -07:00
Xueqing Liu	a4049ad9b6	autohf (#43 ) automate huggingface transformer	2021-06-09 08:37:03 -07:00
Chi Wang	4a8110c87b	pickle the AutoML object (#37 ) * pickle the AutoML object * get best model per estimator * test deberta * stateless API * Add Gitter badge (#41) * prevent divide by zero * test roberta * BlendSearchTuner Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com> Co-authored-by: The Gitter Badger <badger@gitter.im>	2021-03-16 22:13:35 -07:00
Chi Wang	1560a6e52a	V0.2.7 (#35 ) * bug fix * admissible region * use CFO's init point as backup * step lower bound * test electra	2021-03-05 23:39:14 -08:00
Chi Wang	6ff0ed434b	v0.2.5 (#30 ) * test distillbert * import check * complete partial config * None check * init config is not suggested by bo * badge * notebook for lightgbm	2021-02-22 22:10:41 -08:00
Chi Wang	776aa55189	V0.2.2 (#19 ) * v0.2.2 separate the HPO part into the module flaml.tune enhanced implementation of FLOW^2, CFO and BlendSearch support parallel tuning using ray tune add support for sample_weight and generic fit arguments enable mlflow logging Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com> Co-authored-by: qingyun-wu <qw2ky@virginia.edu>	2021-02-05 21:41:14 -08:00
Chi Wang (MSR)	492990655d	v0.1.0	2020-12-04 09:40:27 -08:00

19 Commits