* Merged PR 1686010: Bump version to 2.3.5.post2, Distribute source and wheel, Fix license-file, Only log better models
- Fix license-file
- Bump version to 2.3.5.post2
- Distribute source and wheel
- Log better models only
- Add artifact_path to register_automl_pipeline
- Improve logging of _automl_user_configurations
----
This pull request fixes the project’s configuration by updating the license metadata for compliance with FLAML OSS 2.3.5.
The changes in `/pyproject.toml` update the project’s license and readme metadata by replacing deprecated keys with the new structured fields.
- `/pyproject.toml`: Replaced `license_file` with `license = { text = "MIT" }`.
- `/pyproject.toml`: Replaced `description-file` with `readme = "README.md"`.
Related work items: #4252053
* Merged PR 1688479: Handle feature_importances_ is None, Catch RuntimeError and wait for spark cluster to recover
- Add warning message when feature_importances_ is None (#3982120)
- Catch RuntimeError and wait for spark cluster to recover (#3982133)
----
Bug fix.
This pull request prevents an AttributeError in the feature importance plotting function by adding a check for a `None` value with an informative warning message.
- `flaml/fabric/visualization.py`: Checks if `result.feature_importances_` is `None`, logs a warning with possible reasons, and returns early.
- `flaml/fabric/visualization.py`: Imports `logger` from `flaml.automl.logger` to support the warning message.
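The early-return guard described above could look roughly like the following. This is a minimal sketch, not the code in `flaml/fabric/visualization.py`: the function name `top_feature_importances`, the `top_k` parameter, the fallback feature names, and the local `logger` (standing in for the one from `flaml.automl.logger`) are all assumptions made for illustration.

```python
import logging
from types import SimpleNamespace

# Stand-in logger; the PR imports `logger` from `flaml.automl.logger` instead.
logger = logging.getLogger("feature_importance_sketch")


def top_feature_importances(result, top_k=10):
    """Return the top_k (name, importance) pairs, or None when importances are missing.

    Hypothetical helper: it only illustrates the None check and early return.
    """
    importances = getattr(result, "feature_importances_", None)
    if importances is None:
        # Warn and return early instead of failing later with an AttributeError.
        logger.warning(
            "feature_importances_ is None; the model may not expose feature "
            "importances or may not have been trained. Skipping."
        )
        return None

    names = getattr(result, "feature_names_in_", None)
    if names is None:
        # Fall back to positional names when the model does not record them.
        names = [f"f{i}" for i in range(len(importances))]

    pairs = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
    return pairs[:top_k]


if __name__ == "__main__":
    missing = SimpleNamespace(feature_importances_=None)
    print(top_feature_importances(missing))
```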
Related work items: #3982120, #3982133
* Removed deprecated metadata section
* Fix log_params and log_artifact not supporting run_id in mlflow 2.6.0
* Remove autogen
* Merged PR 1776547: Fix flaky test test_automl
Don't throw error when time budget is not enough
----
#### PR Classification
Bug fix addressing a failing test in the AutoML notebook example.
#### PR Summary
This PR fixes a flaky test by adding a conditional check in the AutoML test that prints a message and exits early if no best estimator is set, thereby preventing unpredictable test failures.
- `test/automl/test_notebook_example.py`: Introduced a check to print "Training budget is not sufficient" and return if `automl.best_estimator` is not found.
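The guard described above can be sketched as follows. This is an illustration under assumptions, not the code in `test/automl/test_notebook_example.py`: the helper name `has_best_estimator` is hypothetical, and the `SimpleNamespace` objects stand in for a fitted `AutoML` instance.

```python
from types import SimpleNamespace


def has_best_estimator(automl):
    """Return False (after printing a note) when no best estimator was found.

    Hypothetical helper: a test would return early when this is False
    instead of asserting on results that were never produced.
    """
    if not getattr(automl, "best_estimator", None):
        print("Training budget is not sufficient")
        return False
    return True


if __name__ == "__main__":
    # Stand-ins for an AutoML object that ran out of budget and one that did not.
    print(has_best_estimator(SimpleNamespace(best_estimator=None)))
    print(has_best_estimator(SimpleNamespace(best_estimator="lgbm")))
```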
Related work items: #4573514
* Merged PR 1777952: Fix unrecognized or malformed field 'license-file' when uploading wheel to feed
Try to fix InvalidDistribution: Invalid distribution metadata: unrecognized or malformed field 'license-file'
----
Bug fix addressing package metadata configuration.
This pull request fixes the error with unrecognized or malformed license file fields during wheel uploads by updating the setup configuration.
- In `setup.py`, added `license="MIT"` and `license_files=["LICENSE"]` to provide proper license metadata.
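For reference, the two keyword arguments named above would be passed to `setuptools.setup` roughly as in this sketch. Only `license` and `license_files` come from the PR description; the package name and version are placeholders, and the real `setup.py` passes many more arguments.

```python
# Keyword arguments for setuptools.setup(); kept in a dict so the sketch can be
# inspected without actually running a build.
SETUP_KWARGS = {
    "name": "example-package",     # placeholder, not the real package name
    "version": "0.0.0",            # placeholder
    "license": "MIT",              # license metadata added by the PR
    "license_files": ["LICENSE"],  # license file to bundle, as added by the PR
}

# In setup.py this would be used as:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```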
Related work items: #4560034
* Cherry-pick Merged PR 1879296: Add support to python 3.12 and spark 4.0
* Cherry-pick Merged PR 1890869: Improve time_budget estimation for mlflow logging
* Disable openai workflow
* Add python 3.12 to test envs
* Manually trigger openai
* Support markdown files with underscore-prefixed file names
* Improve save dependencies
* SynapseML is not installed
* Fix syntax error: Module !flaml/autogen was never imported
* macos 3.12 also hangs
* fix syntax error
* Update python version in actions
* Install setuptools for using pkg_resources
* Fix test_automl_performance in Github actions
* Fix test_nested_run
* Update gitignore
* Bump version to 2.4.0
* Update readme
* Pre-download california housing data
* Use pre-downloaded california housing data
* Pin lightning<=2.5.6
* Fix typo in find and replace
* Fix estimators has no attribute __sklearn_tags__
* Pin torch to 2.2.2 in tests
* Fix conflict
* Update pytorch-forecasting
* Use numpy<2 for testing
* Update scikit-learn
* Run Build and UT every other day
* Pin pip<24.1
* Pin pip<24.1 in pipeline
* Loosen pip, install pytorch_forecasting only in py311
* Add support for new versions of nlp dependencies
* Fix formats
* Remove redefinition
* Update mlflow versions
* Fix mlflow version syntax
* Update gitignore
* Clean up cache to free space
* Remove clean up action cache
* Fix blendsearch
* Update test workflow
* Update setup.py
* Fix catboost version
* Update workflow
* Prepare for python 3.14
* Support no catboost
* Fix tests
* Fix python_requires
* Update test workflow
* Fix vw tests
* Remove python 3.9
* Fix nlp tests
* Fix prophet
* Print pip freeze for better debugging
* Fix Optuna search does not support parameters of type Float with samplers of type Quantized
* Save dependencies for later inspection
* Fix coverage.xml not exists
* Fix github action permission
* Handle python 3.13
* Address openml is not installed
* Check dependencies before run tests
* Update dependencies
* Fix syntax error
* Use bash
* Update dependencies
* Fix git error
* Loosen mlflow constraints
* Add rerun, use mlflow-skinny
* Fix git error
* Remove ray tests
* Update xgboost versions
* Fix automl pickle error
* Don't test python 3.10 on macos as it's stuck
* Rebase before push
* Reduce number of branches
* Add more spark models and improved mlflow integration
* Update test_extra_models, setup and gitignore
* Remove autofe
* Sync changes in internal
* Fix test for env without pyspark
* Fix import errors
* Fix tests
* Fix typos
* Fix pytorch-forecasting version
* Remove internal funcs, rename _mlflow.py
* Fix import error
* Fix dependency
* Fix experiment name setting
* Fix dependency
* Update pandas version
* Update pytorch-forecasting version
* Add warning message for not has_automl
* Fix test errors with nltk 3.8.2
* Don't enable mlflow logging w/o an active run
* Fix pytorch-forecasting can't be pickled issue
* Update pyspark tests condition
* Update synapseml
* Update synapseml
* No parent run, no logging for OSS
* Log when autolog is enabled
* upgrade code
* Enable autolog for tune
* Increase time budget for test
* End run before start a new run
* Update parent run
* Fix import error
* clean up
* skip macos and win
* Update notes
* Update default value of model_history
* Fix typos, upgrade yarn packages, add some improvements
* Fix joblib 1.4.0 breaks joblib-spark
* Fix xgboost test error
* Pin xgboost<2.0.0
* Try update prophet to 1.5.1
* Update github workflow
* Revert prophet version
* Update github workflow
* Update install libomp
* Fix test errors
* Fix test errors
* Add retry to test and coverage
* Revert "Add retry to test and coverage"
This reverts commit ce13097cd5.
* Increase test budget
* Add more data to test_models, try fixing ValueError: Found array with 0 sample(s) (shape=(0, 252)) while a minimum of 1 is required.
* add basic support to Spark dataframe
add support to SynapseML LightGBM model
update to pyspark>=3.2.0 to leverage pandas_on_Spark API
* clean code, add TODOs
* add sample_train_data for pyspark.pandas dataframe, fix bugs
* improve some functions, fix bugs
* fix dict changed size during iteration
* update model predict
* update LightGBM model, update test
* update SynapseML LightGBM params
* update synapseML and tests
* update TODOs
* Added support to roc_auc for spark models
* Added support to score of spark estimator
* Added test for automl score of spark estimator
* Added cv support to pyspark.pandas dataframe
* Update test, fix bugs
* Added tests
* Updated docs, tests, added a notebook
* Fix bugs in non-spark env
* Fix bugs and improve tests
* Fix uninstall pyspark
* Fix tests error
* Fix java.lang.OutOfMemoryError: Java heap space
* Fix test_performance
* Update test_sparkml to test_0sparkml to use the expected spark conf
* Remove unnecessary widgets in notebook
* Fix iloc java.lang.StackOverflowError
* fix pre-commit
* Added params check for spark dataframes
* Refactor code for train_test_split to a function
* Update train_test_split_pyspark
* Refactor if-else, remove unnecessary code
* Remove y from predict, remove mem control from n_iter compute
* Update workflow
* Improve _split_pyspark
* Fix test failure of too short training time
* Fix typos, improve docstrings
* Fix index errors of pandas_on_spark, add spark loss metric
* Fix typo of ndcgAtK
* Update NDCG metrics and tests
* Remove unnecessary logger
* Use cache and count to ensure consistent indexes
* refactor for merge main
* fix errors of refactor
* Updated SparkLightGBMEstimator and cache
* Updated config2params
* Remove unused import
* Fix unknown parameters
* Update default_estimator_list
* Add unit tests for spark metrics
* merging
* clean commit
* Delete mylearner.py
This file is not needed.
* fix py4j import error
* more tolerant cancelling time
* fix problems following suggestions
* Update flaml/tune/spark/utils.py
Co-authored-by: Li Jiang <bnujli@gmail.com>
* remove redundant model
* Update test/spark/custom_mylearner.py
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* add docstr
* reverse change in gitignore
* Update test/spark/custom_mylearner.py
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
---------
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* pickle the AutoML object
* get best model per estimator
* test deberta
* stateless API
* Add Gitter badge (#41)
* prevent divide by zero
* test roberta
* BlendSearchTuner
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
* v0.2.2
separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>