Mirror of https://github.com/microsoft/FLAML.git, synced 2026-02-14 20:59:16 +08:00

Compare commits (26 commits)

| SHA1 |
|---|
| c038fbca07 |
| 6a99202492 |
| 42d1dcfa0e |
| b83c8a7d3b |
| b9194cdcf2 |
| 9a1f6b0291 |
| 07f4413aae |
| 5a74227bc3 |
| 7644958e21 |
| a316f84fe1 |
| 72881d3a2b |
| 69da685d1e |
| c01c3910eb |
| 98d3fd2f48 |
| 9724c626cc |
| 0d92400200 |
| d224218ecf |
| a2a5e1abb9 |
| 5c0f18b7bc |
| e5d95f5674 |
| 49ba962d47 |
| 8e171bc402 |
| c90946f303 |
| 64f30af603 |
| f45582d3c7 |
| bf4bca2195 |
@@ -154,3 +154,9 @@ provided by the bot. You will only need to do this once across all repos using o
 This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
 For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
 contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
+
+## Contributors Wall
+
+<a href="https://github.com/microsoft/flaml/graphs/contributors">
+  <img src="https://contrib.rocks/image?repo=microsoft/flaml&max=204" />
+</a>
@@ -1,4 +1,5 @@
 import logging
+import warnings

 try:
     from flaml.automl import AutoML, logger_formatter
@@ -12,7 +13,8 @@ from flaml.version import __version__

 # Set the root logger.
 logger = logging.getLogger(__name__)
-logger.setLevel(logging.INFO)
+if logger.level == logging.NOTSET:
+    logger.setLevel(logging.INFO)

 if not has_automl:
-    logger.warning("flaml.automl is not available. Please install flaml[automl] to enable AutoML functionalities.")
+    warnings.warn("flaml.automl is not available. Please install flaml[automl] to enable AutoML functionalities.")
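A minimal sketch, not part of the diff, of the behavior this hunk enables: a level configured on the `flaml` logger before import is now respected instead of being reset to INFO.

```python
import logging

# Set a level before flaml is imported; with this change it is kept,
# because flaml only applies INFO when the logger level is still NOTSET.
logging.getLogger("flaml").setLevel(logging.WARNING)

import flaml  # noqa: E402

print(logging.getLogger("flaml").level)  # 30 (WARNING), not the default INFO
```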
@@ -1,5 +1,9 @@
-from flaml.automl.automl import AutoML, size
 from flaml.automl.logger import logger_formatter
-from flaml.automl.state import AutoMLState, SearchState

-__all__ = ["AutoML", "AutoMLState", "SearchState", "logger_formatter", "size"]
+try:
+    from flaml.automl.automl import AutoML, size
+    from flaml.automl.state import AutoMLState, SearchState
+
+    __all__ = ["AutoML", "AutoMLState", "SearchState", "logger_formatter", "size"]
+except ImportError:
+    __all__ = ["logger_formatter"]
@@ -203,7 +203,7 @@ class AutoML(BaseEstimator):
             * Valid str options depend on different tasks.
             For classification tasks, valid choices are
             ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified.
-            For regression tasks, valid choices are ["auto", 'uniform', 'time'].
+            For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group'].
             "auto" -> uniform.
             For time series forecast tasks, must be "auto" or 'time'.
             For ranking task, must be "auto" or 'group'.
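A hedged sketch of the documented addition ('group' as a regression `split_type`); the synthetic data and settings below are illustrative, not taken from the diff.

```python
import numpy as np
from flaml import AutoML

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ rng.normal(size=4) + rng.normal(scale=0.1, size=200)
groups = rng.integers(0, 5, size=200)  # one group label per sample

automl = AutoML()
automl.fit(
    X_train=X,
    y_train=y,
    task="regression",
    eval_method="cv",
    split_type="group",  # newly documented option for regression tasks
    groups=groups,
    estimator_list=["lgbm"],
    max_iter=3,
    time_budget=-1,
)
```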
@@ -739,7 +739,7 @@ class AutoML(BaseEstimator):
             * Valid str options depend on different tasks.
             For classification tasks, valid choices are
             ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified.
-            For regression tasks, valid choices are ["auto", 'uniform', 'time'].
+            For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group'].
             "auto" -> uniform.
             For time series forecast tasks, must be "auto" or 'time'.
             For ranking task, must be "auto" or 'group'.
@@ -1358,7 +1358,7 @@ class AutoML(BaseEstimator):
             * Valid str options depend on different tasks.
             For classification tasks, valid choices are
             ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified.
-            For regression tasks, valid choices are ["auto", 'uniform', 'time'].
+            For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group'].
             "auto" -> uniform.
             For time series forecast tasks, must be "auto" or 'time'.
             For ranking task, must be "auto" or 'group'.
@@ -293,7 +293,7 @@ class DataTransformer:
                 y = y.rename(TS_VALUE_COL)
             for column in X.columns:
                 # sklearn\utils\validation.py needs int/float values
-                if X[column].dtype.name in ("object", "category"):
+                if X[column].dtype.name in ("object", "category", "string"):
                     if X[column].nunique() == 1 or X[column].nunique(dropna=True) == n - X[column].isnull().sum():
                         X.drop(columns=column, inplace=True)
                         drop = True
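Why the `"string"` entry matters, shown standalone: pandas' dedicated StringDtype reports `dtype.name == "string"`, not `"object"`, so the old check skipped such columns.

```python
import pandas as pd

X = pd.DataFrame({"color": pd.array(["red", "blue", "red"], dtype="string")})
print(X["color"].dtype.name)                                      # "string"
print(X["color"].dtype.name in ("object", "category"))            # False: old check missed it
print(X["color"].dtype.name in ("object", "category", "string"))  # True: new check catches it
```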
@@ -157,7 +157,25 @@ class BaseEstimator:

     @property
     def estimator(self):
-        """Trained model after fit() is called, or None before fit() is called."""
+        """
+        Get the best trained estimator model.
+
+        Returns:
+            object or None: The trained model obtained after calling the `fit()` method,
+            representing the best estimator found during the training process. If `fit()` has
+            not been called yet, it returns `None`.
+
+        Examples:
+            >>> from flaml import AutoML
+            >>> automl = AutoML()
+            >>> automl.fit(X_train, y_train)
+            >>> best_estimator = automl.model.estimator
+            >>> print(best_estimator)
+            RandomForestClassifier()
+
+        Note:
+            To access the best estimator, use `automl.model.estimator`.
+        """
         return self._model

     @property
@@ -249,9 +267,11 @@ class BaseEstimator:
         mem = psutil.virtual_memory() if psutil is not None else None
         try:
             with limit_resource(
-                mem.available * (1 - free_mem_ratio) + psutil.Process(os.getpid()).memory_info().rss
-                if mem is not None
-                else -1,
+                (
+                    mem.available * (1 - free_mem_ratio) + psutil.Process(os.getpid()).memory_info().rss
+                    if mem is not None
+                    else -1
+                ),
                 budget,
             ):
                 train_time = self._fit(X_train, y_train, **kwargs)
@@ -1534,12 +1554,16 @@ class LGBMEstimator(BaseEstimator):
             if n_iter > 1:
                 max_iter = min(
                     n_iter,
-                    int((budget - time.time() + start_time - self._t1) / self._time_per_iter + 1)
-                    if budget is not None
-                    else n_iter,
-                    int((1 - free_mem_ratio) * mem0 / self._mem_per_iter)
-                    if psutil is not None and self._mem_per_iter > 0
-                    else n_iter,
+                    (
+                        int((budget - time.time() + start_time - self._t1) / self._time_per_iter + 1)
+                        if budget is not None
+                        else n_iter
+                    ),
+                    (
+                        int((1 - free_mem_ratio) * mem0 / self._mem_per_iter)
+                        if psutil is not None and self._mem_per_iter > 0
+                        else n_iter
+                    ),
                 )
             if trained and max_iter <= self.params[self.ITER_HP]:
                 return time.time() - start_time
@@ -1561,18 +1585,17 @@ class LGBMEstimator(BaseEstimator):
                         callbacks = None
                 if callbacks is None:
                     self._fit(X_train, y_train, **kwargs)
-                else:
-                    self._fit(X_train, y_train, callbacks=callbacks, **kwargs)
-                if callbacks is None:
                     # for xgboost>=1.6.0, pop callbacks to enable pickle
                     callbacks = self.params.pop("callbacks")
                     self._model.set_params(callbacks=callbacks[:-1])
+                else:
+                    self._fit(X_train, y_train, callbacks=callbacks, **kwargs)
                 best_iteration = (
                     getattr(self._model.get_booster(), "best_iteration", None)
                     if isinstance(self, XGBoostSklearnEstimator)
                     else self._model.best_iteration_
                 )
-                if best_iteration is not None:
+                if best_iteration is not None and best_iteration > 0:
                     self._model.set_params(n_estimators=best_iteration + 1)
             else:
                 self._fit(X_train, y_train, **kwargs)
@@ -2043,8 +2066,8 @@ class CatBoostEstimator(BaseEstimator):
             self.estimator_class = CatBoostRegressor

     def fit(self, X_train, y_train, budget=None, free_mem_ratio=0, **kwargs):
-        if "is_retrain" in kwargs:
-            kwargs.pop("is_retrain")
+        kwargs.pop("is_retrain", None)
+        kwargs.pop("groups", None)
         start_time = time.time()
         deadline = start_time + budget if budget else np.inf
         train_dir = f"catboost_{str(start_time)}"
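For reference, the two-argument `dict.pop` used here never raises, which is what replaces the explicit membership check:

```python
kwargs = {"is_retrain": True}
kwargs.pop("is_retrain", None)  # removes the key when present
kwargs.pop("groups", None)      # silently a no-op when absent, no KeyError
print(kwargs)                   # {}
```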
@@ -2090,7 +2113,8 @@ class CatBoostEstimator(BaseEstimator):
             if weight is not None:
                 kwargs["sample_weight"] = weight
         self._model = model
-        self.params[self.ITER_HP] = self._model.tree_count_
+        # Commented-out line below incorrectly assigned n_estimators - see https://github.com/microsoft/FLAML/pull/1364
+        # self.params[self.ITER_HP] = self._model.tree_count_
         train_time = time.time() - start_time
         return train_time
@@ -2184,6 +2208,11 @@ class SVCEstimator(SKLearnEstimator):

     def __init__(self, task="binary", **config):
         super().__init__(task, **config)
+        self.params.update(
+            {
+                "random_state": config.get("random_seed", 10242048),
+            }
+        )
         assert self._task.is_classification(), "LinearSVC for classification task only"
         self.estimator_class = LinearSVC
@@ -2428,6 +2457,11 @@ class ElasticNetEstimator(SKLearnEstimator):

     def __init__(self, task="regression", **config):
         super().__init__(task, **config)
+        self.params.update(
+            {
+                "random_state": config.get("random_seed", 10242048),
+            }
+        )
         assert self._task.is_regression(), "ElasticNet for regression task only"
         self.estimator_class = ElasticNet
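A hedged sketch of what the two hunks above buy: a user-supplied `random_seed` now flows into the estimator's `random_state`, so repeated fits are deterministic. The import path assumes these classes live in `flaml/automl/model.py` as in this diff; the fit-free usage is illustrative only.

```python
# Assumption: ElasticNetEstimator is importable from flaml.automl.model as in this repo.
from flaml.automl.model import ElasticNetEstimator

est = ElasticNetEstimator(task="regression", random_seed=123)
print(est.params["random_state"])  # 123; falls back to 10242048 when random_seed is not given
```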
@@ -2752,7 +2786,7 @@ class BaseResourceLimit:
     def check_resource_limits(self, current_time, current_iteration, mllib):
         if (mllib == "xgb" and current_iteration == 0) or (mllib == "cat" and current_iteration == 1):
             self._time_per_iter = current_time - self.start_time
-        if current_time + self._time_per_iter > self.deadline:
+        if mllib != "cat" and current_time + self._time_per_iter > self.deadline:
             return False
         if psutil is not None and self.free_mem_ratio is not None:
             mem = psutil.virtual_memory()
@@ -1,3 +1,4 @@
+import json
 from typing import Union

 import numpy as np
@@ -9,7 +10,7 @@ from pyspark.ml.evaluation import (
     RegressionEvaluator,
 )

-from flaml.automl.spark import F, psSeries
+from flaml.automl.spark import F, T, psDataFrame, psSeries, sparkDataFrame


 def ps_group_counts(groups: Union[psSeries, np.ndarray]) -> np.ndarray:
@@ -36,6 +37,16 @@ def _compute_label_from_probability(df, probability_col, prediction_col):
     return df


+def string_to_array(s):
+    try:
+        return json.loads(s)
+    except json.JSONDecodeError:
+        return []
+
+
+string_to_array_udf = F.udf(string_to_array, T.ArrayType(T.DoubleType()))
+
+
 def spark_metric_loss_score(
     metric_name: str,
     y_predict: psSeries,
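The helper itself is plain Python and can be exercised without Spark; the UDF registration just lifts it to column scope:

```python
import json


def string_to_array(s):
    try:
        return json.loads(s)
    except json.JSONDecodeError:
        return []


print(string_to_array("[0.1, 0.9]"))  # [0.1, 0.9]
print(string_to_array("not json"))    # [] rather than an exception
```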
@@ -135,6 +146,11 @@ def spark_metric_loss_score(
         )
     elif metric_name == "log_loss":
         # For log_loss, prediction_col should be probability, and we need to convert it to label
+        # handle data like "{'type': '1', 'values': '[1, 2, 3]'}"
+        # Fix cannot resolve "array_max(prediction)" due to data type mismatch: Parameter 1 requires the "ARRAY" type,
+        # however "prediction" has the type "STRUCT<type: TINYINT, size: INT, indices: ARRAY<INT>, values: ARRAY<DOUBLE>>"
+        df = df.withColumn(prediction_col, df[prediction_col].cast(T.StringType()))
+        df = df.withColumn(prediction_col, string_to_array_udf(df[prediction_col]))
         df = _compute_label_from_probability(df, prediction_col, prediction_col + "_label")
         evaluator = MulticlassClassificationEvaluator(
             metricName="logLoss",
@@ -87,7 +87,6 @@ class GenericTask(Task):
             "transformer": TransformersEstimator,
             "transformer_ms": TransformersEstimatorModelSelection,
             "histgb": HistGradientBoostingEstimator,
-            # Above are open-source, below are internal
             "svc": SVCEstimator,
             "sgd": SGDEstimator,
             "nb_spark": SparkNaiveBayesEstimator,
@@ -443,8 +442,8 @@ class GenericTask(Task):
                 X_train_all, y_train_all = shuffle(X_train_all, y_train_all, random_state=RANDOM_SEED)
             if data_is_df:
                 X_train_all.reset_index(drop=True, inplace=True)
-                if isinstance(y_train_all, pd.Series):
-                    y_train_all.reset_index(drop=True, inplace=True)
+            if isinstance(y_train_all, pd.Series):
+                y_train_all.reset_index(drop=True, inplace=True)

             X_train, y_train = X_train_all, y_train_all
             state.groups_all = state.groups
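If the dedent reads as intended, the point is that a pandas `y` can carry a shuffled index even when `X` is a plain numpy array, so `y` needs its own `reset_index`. A small sketch:

```python
import numpy as np
import pandas as pd
from sklearn.utils import shuffle

X = np.arange(6).reshape(3, 2)  # plain numpy, so data_is_df is False
y = pd.Series([10, 20, 30])
X, y = shuffle(X, y, random_state=1)
print(y.index.tolist())          # e.g. [2, 1, 0]: out of order after shuffling
y.reset_index(drop=True, inplace=True)
print(y.index.tolist())          # [0, 1, 2]
```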
@@ -706,7 +705,6 @@ class GenericTask(Task):
             fit_kwargs = {}
         if cv_score_agg_func is None:
             cv_score_agg_func = default_cv_score_agg_func
-        start_time = time.time()
         val_loss_folds = []
         log_metric_folds = []
         metric = None
@@ -813,8 +811,6 @@ class GenericTask(Task):
             if is_spark_dataframe:
                 X_train.spark.unpersist()  # uncache data to free memory
                 X_val.spark.unpersist()  # uncache data to free memory
-            if budget and time.time() - start_time >= budget:
-                break
         val_loss, metric = cv_score_agg_func(val_loss_folds, log_metric_folds)
         n = total_fold_num
         pred_time /= n
@@ -192,7 +192,7 @@ class Task(ABC):
             * Valid str options depend on different tasks.
             For classification tasks, valid choices are
             ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified.
-            For regression tasks, valid choices are ["auto", 'uniform', 'time'].
+            For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group'].
             "auto" -> uniform.
             For time series forecast tasks, must be "auto" or 'time'.
             For ranking task, must be "auto" or 'group'.
@@ -393,7 +393,7 @@ class DataTransformerTS:

         for column in X.columns:
             # sklearn/utils/validation.py needs int/float values
-            if X[column].dtype.name in ("object", "category"):
+            if X[column].dtype.name in ("object", "category", "string"):
                 if (
                     # drop columns where all values are the same
                     X[column].nunique() == 1
@@ -3,6 +3,7 @@ import os
 import pickle
 import random
 import sys
+import tempfile
 import time
 from typing import MutableMapping

@@ -55,12 +56,12 @@ def get_mlflow_log_latency(model_history=False):
         sk_model = tree.DecisionTreeClassifier()
-        mlflow.sklearn.log_model(sk_model, "sk_models")
-        pickle_fpath = f"tmp_{int(time.time()*1000)}"
-        with open(pickle_fpath, "wb") as f:
-            pickle.dump(sk_model, f)
-        mlflow.log_artifact(pickle_fpath, "sk_model1")
-        mlflow.log_artifact(pickle_fpath, "sk_model2")
-        os.remove(pickle_fpath)
+        mlflow.sklearn.log_model(Pipeline([("estimator", sk_model)]), "sk_pipeline")
+        with tempfile.TemporaryDirectory() as tmpdir:
+            pickle_fpath = os.path.join(tmpdir, f"tmp_{int(time.time()*1000)}")
+            with open(pickle_fpath, "wb") as f:
+                pickle.dump(sk_model, f)
+            mlflow.log_artifact(pickle_fpath, "sk_model1")
+            mlflow.log_artifact(pickle_fpath, "sk_model2")
         mlflow.set_tag("synapseml.ui.visible", "false")  # not shown inline in fabric
         mlflow.delete_run(run.info.run_id)
     et = time.time()
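The pattern adopted in this hunk, shown standalone: scratch files live in a `TemporaryDirectory`, so cleanup is automatic even when logging fails, replacing manual `os.remove` bookkeeping (the mlflow call is left as a comment since it needs a tracking context).

```python
import os
import pickle
import tempfile

obj = {"a": 1}
with tempfile.TemporaryDirectory() as tmpdir:
    fpath = os.path.join(tmpdir, "obj.pkl")
    with open(fpath, "wb") as f:
        pickle.dump(obj, f)
    # e.g. mlflow.log_artifact(fpath, "some_dir") would go here
print(os.path.exists(fpath))  # False: the directory was removed on exit
```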
@@ -126,6 +127,13 @@ def _get_notebook_name():
         return None


+def safe_json_dumps(obj):
+    def default(o):
+        return str(o)
+
+    return json.dumps(obj, default=default)
+
+
 class MLflowIntegration:
     def __init__(self, experiment_type="automl", mlflow_exp_name=None, extra_tag=None):
         try:
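What the new helper buys, demonstrated on a value plain `json.dumps` rejects:

```python
import json


def safe_json_dumps(obj):
    def default(o):
        return str(o)

    return json.dumps(obj, default=default)


config = {"metric": "f1", "splitter": object()}  # object() is not JSON-serializable
print(safe_json_dumps(config))  # succeeds; the object is rendered via str()
# json.dumps(config) would raise TypeError here
```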
@@ -348,12 +356,17 @@ class MLflowIntegration:
             else:
                 mlflow.sklearn.log_model(model, estimator, signature=signature)

-    def _pickle_and_log_artifact(self, obj, artifact_name, pickle_fpath="temp_.pkl"):
+    def _pickle_and_log_artifact(self, obj, artifact_name, pickle_fname="temp_.pkl"):
         if not self._do_log_model:
             return
-        with open(pickle_fpath, "wb") as f:
-            pickle.dump(obj, f)
-        mlflow.log_artifact(pickle_fpath, artifact_name)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            pickle_fpath = os.path.join(tmpdir, pickle_fname)
+            try:
+                with open(pickle_fpath, "wb") as f:
+                    pickle.dump(obj, f)
+                mlflow.log_artifact(pickle_fpath, artifact_name)
+            except Exception as e:
+                logger.debug(f"Failed to pickle and log artifact {artifact_name}, error: {e}")

     def pickle_and_log_automl_artifacts(self, automl, model, estimator, signature=None):
         """log automl artifacts to mlflow
@@ -432,7 +445,7 @@ class MLflowIntegration:
                 "flaml.meric": automl_metric_name,
                 "flaml.run_source": "flaml-automl",
                 "flaml.log_type": self.log_type,
-                "flaml.automl_user_configurations": json.dumps(automl._automl_user_configurations),
+                "flaml.automl_user_configurations": safe_json_dumps(automl._automl_user_configurations),
             },
             "params": {
                 "sample_size": search_state.sample_size,
@@ -260,6 +260,8 @@ def run(
     mlflow_exp_name: Optional[str] = None,
     automl_info: Optional[Tuple[float]] = None,
     extra_tag: Optional[dict] = None,
+    cost_attr: Optional[str] = "auto",
+    cost_budget: Optional[float] = None,
     **ray_args,
 ):
     """The function-based way of performing HPO.
@@ -462,6 +464,12 @@ def run(
             overwritten by the value of `n_concurrent_trials` in AutoML. When <= 0, the concurrent trials
             will be set to the number of executors.
         extra_tag: dict, default=None | Extra tags to be added to the mlflow runs created by autologging.
+        cost_attr: None or str to specify the attribute to evaluate the cost of different trials.
+            Default is "auto", which means that we will automatically choose the cost attribute to use (depending
+            on the nature of the resource budget). When cost_attr is set to None, cost differences between different trials will be omitted
+            in our search algorithm. When cost_attr is set to a str different from "auto" and "time_total_s",
+            this cost_attr must be available in the result dict of the trial.
+        cost_budget: A float of the cost budget. Only valid when cost_attr is a str different from "auto" and "time_total_s".
         **ray_args: keyword arguments to pass to ray.tune.run().
             Only valid when use_ray=True.
     """
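A hedged usage sketch of the two new arguments; `my_objective` and its `cost` key are assumptions chosen to satisfy the documented contract, not code from the diff.

```python
from flaml import tune


def my_objective(config):  # illustrative evaluation function
    # report the metric together with a custom cost attribute
    return {"loss": (config["x"] - 2) ** 2, "cost": config["x"]}


analysis = tune.run(
    my_objective,
    config={"x": tune.uniform(0, 10)},
    metric="loss",
    mode="min",
    num_samples=20,
    cost_attr="cost",  # must appear in each trial's result dict, per the docstring
    cost_budget=50.0,  # only valid with such a custom cost_attr
)
```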
@@ -600,6 +608,8 @@ def run(
             metric_constraints=metric_constraints,
             use_incumbent_result_in_evaluation=use_incumbent_result_in_evaluation,
             lexico_objectives=lexico_objectives,
+            cost_attr=cost_attr,
+            cost_budget=cost_budget,
         )
     else:
         if metric is None or mode is None:
@@ -1 +1 @@
-__version__ = "2.3.0"
+__version__ = "2.3.3"
@@ -1,11 +1,15 @@
 import unittest
 from datetime import datetime
+from test.conftest import evaluate_cv_folds_with_underlying_model

 import numpy as np
 import pandas as pd
+import pytest
 import scipy.sparse
 from sklearn.datasets import load_breast_cancer
-from sklearn.model_selection import train_test_split
+from sklearn.model_selection import (
+    train_test_split,
+)

 from flaml import AutoML, tune
 from flaml.automl.model import LGBMEstimator
@@ -420,6 +424,122 @@ class TestClassification(unittest.TestCase):
         print(automl_experiment.best_estimator)


+@pytest.mark.parametrize(
+    "estimator",
+    [
+        "catboost",
+        "extra_tree",
+        "histgb",
+        "kneighbor",
+        "lgbm",
+        # "lrl1",
+        "lrl2",
+        "rf",
+        "svc",
+        "xgboost",
+        "xgb_limitdepth",
+    ],
+)
+def test_reproducibility_of_classification_models(estimator: str):
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues where FLAML was providing an incorrect model - see here:
+    https://github.com/microsoft/FLAML/issues/1317
+    In this test we take the best model which FLAML provided us, and then retrain and test it on the
+    same folds, to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "time_budget": -1,
+        "task": "classification",
+        "n_jobs": 1,
+        "estimator_list": [estimator],
+        "eval_method": "cv",
+        "n_splits": 10,
+        "metric": "f1",
+        "keep_search_state": True,
+        "skip_transform": True,
+    }
+    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    config = best_model.get_params()
+    val_loss_flaml = automl.best_result["val_loss"]
+
+    # Take the best model, and see if we can reproduce the best result
+    reproduced_val_loss, metric_for_logging, train_time, pred_time = automl._state.task.evaluate_model_CV(
+        config=config,
+        estimator=best_model,
+        X_train_all=automl._state.X_train_all,
+        y_train_all=automl._state.y_train_all,
+        budget=None,
+        kf=automl._state.kf,
+        eval_metric="f1",
+        best_val_loss=None,
+        cv_score_agg_func=None,
+        log_training_metric=False,
+        fit_kwargs=None,
+        free_mem_ratio=0,
+    )
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss
+
+
+@pytest.mark.parametrize(
+    "estimator",
+    [
+        "catboost",
+        "extra_tree",
+        "histgb",
+        "kneighbor",
+        "lgbm",
+        # "lrl1",
+        "lrl2",
+        "svc",
+        "rf",
+        "xgboost",
+        "xgb_limitdepth",
+    ],
+)
+def test_reproducibility_of_underlying_classification_models(estimator: str):
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues where FLAML was providing an incorrect model - see here:
+    https://github.com/microsoft/FLAML/issues/1317
+    FLAML defines FLAMLised models, which wrap around the underlying (SKLearn/XGBoost/CatBoost) model.
+    Ideally, FLAMLised models should perform identically to the underlying model, when fitted
+    to the same data, with no budget. This verifies that this is the case for classification models.
+    In this test we take the best model which FLAML provided us, extract the underlying model,
+    before retraining and testing it on the same folds - to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "time_budget": -1,
+        "task": "classification",
+        "n_jobs": 1,
+        "estimator_list": [estimator],
+        "eval_method": "cv",
+        "n_splits": 10,
+        "metric": "f1",
+        "keep_search_state": True,
+        "skip_transform": True,
+    }
+    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    val_loss_flaml = automl.best_result["val_loss"]
+    reproduced_val_loss_underlying_model = np.mean(
+        evaluate_cv_folds_with_underlying_model(
+            automl._state.X_train_all, automl._state.y_train_all, automl._state.kf, best_model.model, "classification"
+        )
+    )
+
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss_underlying_model
+
+
 if __name__ == "__main__":
     test = TestClassification()
     test.test_preprocess()
@@ -77,6 +77,8 @@ class TestMLFlowLoggingParam:
             t = pickle.load(f)
         if __name__ == "__main__":
             print(t)
+        if not hasattr(automl.model._model, "_get_param_names"):
+            return
         for param in automl.model._model._get_param_names():
             assert eval("t._final_estimator._model" + f".{param}") == eval(
                 "automl.model._model" + f".{param}"
@@ -187,7 +187,6 @@ class TestMultiClass(unittest.TestCase):
     def test_custom_metric(self):
         df, y = load_iris(return_X_y=True, as_frame=True)
         df["label"] = y
-        automl = AutoML()
         settings = {
             "dataframe": df,
             "label": "label",
@@ -204,7 +203,8 @@ class TestMultiClass(unittest.TestCase):
             "pred_time_limit": 1e-5,
             "ensemble": True,
         }
-        automl.fit(**settings)
+        automl = AutoML(**settings)  # test safe_json_dumps
+        automl.fit(dataframe=df, label="label")
         print(automl.classes_)
         print(automl.model)
        print(automl.config_history)
@@ -1,9 +1,12 @@
 import unittest
+from test.conftest import evaluate_cv_folds_with_underlying_model

+import numpy as np
+import pytest
 import scipy.sparse
 from sklearn.datasets import (
     fetch_california_housing,
     make_regression,
 )

 from flaml import AutoML
@@ -205,7 +208,6 @@ class TestRegression(unittest.TestCase):


 def test_multioutput():
-    from sklearn.datasets import make_regression
     from sklearn.model_selection import train_test_split
     from sklearn.multioutput import MultiOutputRegressor, RegressorChain

@@ -230,5 +232,210 @@ def test_multioutput():
     print(model.predict(X_test))


+@pytest.mark.parametrize(
+    "estimator",
+    [
+        "catboost",
+        "enet",
+        "extra_tree",
+        "histgb",
+        "kneighbor",
+        "lgbm",
+        "rf",
+        "xgboost",
+        "xgb_limitdepth",
+    ],
+)
+def test_reproducibility_of_regression_models(estimator: str):
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues where FLAML was providing an incorrect model - see here:
+    https://github.com/microsoft/FLAML/issues/1317
+    In this test we take the best regression model which FLAML provided us, and then retrain and test it on the
+    same folds, to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 2,
+        "time_budget": -1,
+        "task": "regression",
+        "n_jobs": 1,
+        "estimator_list": [estimator],
+        "eval_method": "cv",
+        "n_splits": 3,
+        "metric": "r2",
+        "keep_search_state": True,
+        "skip_transform": True,
+        "retrain_full": True,
+    }
+    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    config = best_model.get_params()
+    val_loss_flaml = automl.best_result["val_loss"]
+
+    # Take the best model, and see if we can reproduce the best result
+    reproduced_val_loss, metric_for_logging, train_time, pred_time = automl._state.task.evaluate_model_CV(
+        config=config,
+        estimator=best_model,
+        X_train_all=automl._state.X_train_all,
+        y_train_all=automl._state.y_train_all,
+        budget=None,
+        kf=automl._state.kf,
+        eval_metric="r2",
+        best_val_loss=None,
+        cv_score_agg_func=None,
+        log_training_metric=False,
+        fit_kwargs=None,
+        free_mem_ratio=0,
+    )
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss
+
+
+def test_reproducibility_of_catboost_regression_model():
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues around the catboost model - see here:
+    https://github.com/microsoft/FLAML/issues/1317
+    In this test we take the best catboost regression model which FLAML provided us, and then retrain and test it on the
+    same folds, to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "time_budget": 7,
+        "task": "regression",
+        "n_jobs": 1,
+        "estimator_list": ["catboost"],
+        "eval_method": "cv",
+        "n_splits": 10,
+        "metric": "r2",
+        "keep_search_state": True,
+        "skip_transform": True,
+        "retrain_full": True,
+    }
+    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    config = best_model.get_params()
+    val_loss_flaml = automl.best_result["val_loss"]
+
+    # Take the best model, and see if we can reproduce the best result
+    reproduced_val_loss, metric_for_logging, train_time, pred_time = automl._state.task.evaluate_model_CV(
+        config=config,
+        estimator=best_model,
+        X_train_all=automl._state.X_train_all,
+        y_train_all=automl._state.y_train_all,
+        budget=None,
+        kf=automl._state.kf,
+        eval_metric="r2",
+        best_val_loss=None,
+        cv_score_agg_func=None,
+        log_training_metric=False,
+        fit_kwargs=None,
+        free_mem_ratio=0,
+    )
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss
+
+
+def test_reproducibility_of_lgbm_regression_model():
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues around LGBMs - see here:
+    https://github.com/microsoft/FLAML/issues/1368
+    In this test we take the best LGBM regression model which FLAML provided us, and then retrain and test it on the
+    same folds, to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "time_budget": 3,
+        "task": "regression",
+        "n_jobs": 1,
+        "estimator_list": ["lgbm"],
+        "eval_method": "cv",
+        "n_splits": 9,
+        "metric": "r2",
+        "keep_search_state": True,
+        "skip_transform": True,
+        "retrain_full": True,
+    }
+    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    config = best_model.get_params()
+    val_loss_flaml = automl.best_result["val_loss"]
+
+    # Take the best model, and see if we can reproduce the best result
+    reproduced_val_loss, metric_for_logging, train_time, pred_time = automl._state.task.evaluate_model_CV(
+        config=config,
+        estimator=best_model,
+        X_train_all=automl._state.X_train_all,
+        y_train_all=automl._state.y_train_all,
+        budget=None,
+        kf=automl._state.kf,
+        eval_metric="r2",
+        best_val_loss=None,
+        cv_score_agg_func=None,
+        log_training_metric=False,
+        fit_kwargs=None,
+        free_mem_ratio=0,
+    )
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss or val_loss_flaml > reproduced_val_loss
+
+
+@pytest.mark.parametrize(
+    "estimator",
+    [
+        "catboost",
+        "enet",
+        "extra_tree",
+        "histgb",
+        "kneighbor",
+        "lgbm",
+        "rf",
+        "xgboost",
+        "xgb_limitdepth",
+    ],
+)
+def test_reproducibility_of_underlying_regression_models(estimator: str):
+    """FLAML finds the best model for a given dataset, which it then provides to users.
+
+    However, there are reported issues where FLAML was providing an incorrect model - see here:
+    https://github.com/microsoft/FLAML/issues/1317
+    FLAML defines FLAMLised models, which wrap around the underlying (SKLearn/XGBoost/CatBoost) model.
+    Ideally, FLAMLised models should perform identically to the underlying model, when fitted
+    to the same data, with no budget. This verifies that this is the case for regression models.
+    In this test we take the best model which FLAML provided us, extract the underlying model,
+    before retraining and testing it on the same folds - to verify that the result is reproducible.
+    """
+    automl = AutoML()
+    automl_settings = {
+        "max_iter": 5,
+        "time_budget": -1,
+        "task": "regression",
+        "n_jobs": 1,
+        "estimator_list": [estimator],
+        "eval_method": "cv",
+        "n_splits": 10,
+        "metric": "r2",
+        "keep_search_state": True,
+        "skip_transform": True,
+        "retrain_full": False,
+    }
+    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
+    automl.fit(X_train=X, y_train=y, **automl_settings)
+    best_model = automl.model
+    assert best_model is not None
+    val_loss_flaml = automl.best_result["val_loss"]
+    reproduced_val_loss_underlying_model = np.mean(
+        evaluate_cv_folds_with_underlying_model(
+            automl._state.X_train_all, automl._state.y_train_all, automl._state.kf, best_model.model, "regression"
+        )
+    )
+
+    assert pytest.approx(val_loss_flaml) == reproduced_val_loss_underlying_model
+
+
 if __name__ == "__main__":
     unittest.main()
@@ -1,4 +1,5 @@
-from sklearn.datasets import fetch_openml
+import numpy as np
+from sklearn.datasets import fetch_openml, load_iris
 from sklearn.metrics import accuracy_score
 from sklearn.model_selection import GroupKFold, KFold, train_test_split

@@ -48,7 +49,7 @@ def test_time():
     _test(split_type="time")


-def test_groups():
+def test_groups_for_classification_task():
     from sklearn.externals._arff import ArffException

     try:
@@ -68,7 +69,7 @@ def test_groups():
         "model_history": True,
         "eval_method": "cv",
         "groups": np.random.randint(low=0, high=10, size=len(y)),
-        "estimator_list": ["lgbm", "rf", "xgboost", "kneighbor"],
+        "estimator_list": ["catboost", "lgbm", "rf", "xgboost", "kneighbor"],
         "learner_selector": "roundrobin",
     }
     automl.fit(X, y, **automl_settings)
@@ -88,6 +89,35 @@ def test_groups():
     automl.fit(X, y, **automl_settings)


+def test_groups_for_regression_task():
+    """Append nonsensical groups to iris dataset and use it to test that GroupKFold works for regression tasks"""
+    iris_dict_data = load_iris(as_frame=True)  # numpy arrays
+    iris_data = iris_dict_data["frame"]  # pandas dataframe data + target
+
+    rng = np.random.default_rng(42)
+    iris_data["cluster"] = rng.integers(
+        low=0, high=5, size=iris_data.shape[0]
+    )  # np.random.randint(0, 5, iris_data.shape[0])
+
+    automl = AutoML()
+    X = iris_data[["sepal length (cm)", "sepal width (cm)", "petal length (cm)"]].to_numpy()
+    y = iris_data["petal width (cm)"]
+    X_train, X_test, y_train, y_test, groups_train, groups_test = train_test_split(
+        X, y, iris_data["cluster"], random_state=42
+    )
+    automl_settings = {
+        "max_iter": 5,
+        "time_budget": -1,
+        "metric": "r2",
+        "task": "regression",
+        "estimator_list": ["lgbm", "rf", "xgboost", "kneighbor"],
+        "eval_method": "cv",
+        "split_type": "uniform",
+        "groups": groups_train,
+    }
+    automl.fit(X_train, y_train, **automl_settings)
+
+
 def test_stratified_groupkfold():
     from minio.error import ServerError
     from sklearn.model_selection import StratifiedGroupKFold
@@ -108,6 +138,7 @@ def test_stratified_groupkfold():
         "split_type": splitter,
         "groups": X_train["Airline"],
         "estimator_list": [
+            "catboost",
             "lgbm",
             "rf",
             "xgboost",
@@ -203,4 +234,4 @@ def test_object():


 if __name__ == "__main__":
-    test_groups()
+    test_groups_for_classification_task()
test/conftest.py (new file, +42 lines)
@@ -0,0 +1,42 @@
|
||||
from typing import Any, Dict, List, Union
|
||||
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
from catboost import CatBoostClassifier, CatBoostRegressor, Pool
|
||||
from sklearn.metrics import f1_score, r2_score
|
||||
|
||||
|
||||
def evaluate_cv_folds_with_underlying_model(X_train_all, y_train_all, kf, model: Any, task: str) -> pd.DataFrame:
|
||||
"""Mimic the FLAML CV process to calculate the metrics across each fold.
|
||||
|
||||
:param X_train_all: X training data
|
||||
:param y_train_all: y training data
|
||||
:param kf: The splitter object to use to generate the folds
|
||||
:param model: The estimator to fit to the data during the CV process
|
||||
:param task: classification or regression
|
||||
:return: An array containing the metrics
|
||||
"""
|
||||
rng = np.random.RandomState(2020)
|
||||
all_fold_metrics: List[Dict[str, Union[int, float]]] = []
|
||||
for train_index, val_index in kf.split(X_train_all, y_train_all):
|
||||
X_train_split, y_train_split = X_train_all, y_train_all
|
||||
train_index = rng.permutation(train_index)
|
||||
X_train = X_train_split.iloc[train_index]
|
||||
X_val = X_train_split.iloc[val_index]
|
||||
y_train, y_val = y_train_split[train_index], y_train_split[val_index]
|
||||
model_type = type(model)
|
||||
if model_type is not CatBoostClassifier and model_type is not CatBoostRegressor:
|
||||
model.fit(X_train, y_train)
|
||||
else:
|
||||
use_best_model = True
|
||||
n = max(int(len(y_train) * 0.9), len(y_train) - 1000) if use_best_model else len(y_train)
|
||||
X_tr, y_tr = (X_train)[:n], y_train[:n]
|
||||
eval_set = Pool(data=X_train[n:], label=y_train[n:], cat_features=[]) if use_best_model else None
|
||||
model.fit(X_tr, y_tr, eval_set=eval_set, use_best_model=True)
|
||||
y_pred_classes = model.predict(X_val)
|
||||
if task == "classification":
|
||||
reproduced_metric = 1 - f1_score(y_val, y_pred_classes)
|
||||
else:
|
||||
reproduced_metric = 1 - r2_score(y_val, y_pred_classes)
|
||||
all_fold_metrics.append(reproduced_metric)
|
||||
return all_fold_metrics
|
||||
tutorials/Automl2024DemoAutoMLTask.ipynb (new file, +5795 lines; diff suppressed because one or more lines are too long)
tutorials/Automl2024DemoTuneLLM.ipynb (new file, +2894 lines; diff suppressed because it is too large)
@@ -1,5 +1,6 @@
 Please find tutorials on FLAML below:

+- [AutoML 2024](flaml-tutorial-automl-24.md)
 - [PyData Seattle 2023](flaml-tutorial-pydata-23.md)
 - [A hands-on tutorial on FLAML presented at KDD 2022](flaml-tutorial-kdd-22.md)
 - [A lab forum on FLAML at AAAI 2023](flaml-tutorial-aaai-23.md)
tutorials/flaml-tutorial-automl-24.md (new file, +44 lines)
@@ -0,0 +1,44 @@
|
||||
# AutoML 2024 - Automated Machine Learning & Tuning with FLAML in Microsoft Fabric
|
||||
|
||||
## Session Information
|
||||
|
||||
**Date and Time**: 09.09.2024, 15:30-17:00
|
||||
|
||||
Location: Sorbonne University, 4 place Jussieu, 75005 Paris
|
||||
|
||||
Duration: 1.5 hours
|
||||
|
||||
For the most up-to-date information, see the [AutoML 2024 Agenda](https://2024.automl.cc/?page_id=1401) and the [tutorial page](https://2024.automl.cc/?page_id=1643).
|
||||
|
||||
## Abstract
|
||||
|
||||
In this tutorial, we will provide an in-depth and hands-on guidance on Automated Machine Learning & Tuning with FLAML in Microsoft Fabric. FLAML is a fast python library for AutoML and tuning. Microsoft Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. In Fabric, data scientists can use flaml.AutoML to automate their machine learning tasks. We will start with an overview of the AutoML problem and our solution. We will then introduce the hyperparameter optimization methods and 60+ estimators empowering the strong performance of FLAML. We will also demonstrate how to make the best use of FLAML in Microsoft Fabric to perform automated machine learning and hyperparameter tuning in various applications with the help of rich customization choices, parallel training and advanced auto logging functionalities. At last, we will share several new features of our solution based on our latest research and development work around FLAML in Microsoft Fabric and close the tutorial with open problems and challenges learned from AutoML practice.
|
||||
|
||||
## Motivation & Outline
|
||||
|
||||
As data becomes increasingly complex and voluminous, the demand for robust, scalable, and user-friendly tools for model selection, hyperparameter tuning, and performance optimization has never been higher. FLAML, a fast Python library for AutoML, and Microsoft Fabric, an advanced data platform, address these needs by offering a comprehensive suite of built-in machine learning tools. What sets FLAML in Microsoft Fabric apart is its unique support for visualization, auto-featurization, advanced auto logging capabilities, and a wider range of Spark models, distinguishing it from the open-source version of FLAML. Attendees of the AutoML conference will gain invaluable insights into leveraging these technologies to streamline their workflows, improve model accuracy, and enhance productivity. By mastering the integration of FLAML with Microsoft Fabric, participants can significantly reduce the time and expertise required for machine learning tasks, making this tutorial highly relevant and essential for advancing their work in data science and analytics.
|
||||
|
||||
In this tutorial, we will provide an in-depth and hands-on guidance on Automated Machine Learning & Tuning with FLAML in [Microsoft Fabric](https://aka.ms/fabric). FLAML (by [Wang et al., 2021](https://proceedings.mlsys.org/paper_files/paper/2021/file/1ccc3bfa05cb37b917068778f3c4523a-Paper.pdf)) is a fast python library for AutoML and tuning. It started as a research project in Microsoft Research and has grown to a popular open-source library. It has accumulated over 3.7k stars and 4M+ downloads since its first release in December 2020. FLAML is notable for being fast, economical, and easy to customize. FLAML enhances the efficiency and productivity of machine learning and data science professionals, while delivering superior predictive performance in models. FLAML’s flexibility and customizability make it an invaluable tool for research and development. Microsoft Fabric is a comprehensive analytics and data platform designed for enterprises seeking a unified solutionIt provides data science capabilities that enable users to manage the entire data science workflow—from data exploration and cleaning, through experimentation and modeling, to model scoring and delivering predictive insights into BI reports. On Microsoft Fabric, users accelerate their model training workflows through the code-first FLAML APIs available through Fabric Notebooks. Microsoft Fabric supports tracking machine learning lifecycle with MLflow. FLAML experiments and runs could be automatically logged for you to visualize, compare and analyze. All the 60+ [models](https://learn.microsoft.com/en-us/fabric/data-science/automated-machine-learning-fabric/#supported-models) trained with flaml.AutoML will be automatically recognized and logged for further usage. We will give a hands-on tutorial on (1) how to use FLAML in Microsoft Fabric to automate typical machine learning tasks and generic tuning on user-defined functions; (2) how to make the best use of FLAML in Microsoft Fabric to perform AutoML and tuning in various applications with the help of rich customization choices, parallel training and advanced auto logging functionalities; and (3) several new features of FLAML based on our latest research and development work around FLAML in Microsoft Fabric.
|
||||
|
||||
Part 1. Overview of AutoML in Microsoft Fabric
|
||||
|
||||
- Background of AutoML & Hyperparameter tuning
|
||||
- Quick introduction to FLAML and Microsoft Fabric
|
||||
- Task-oriented AutoML
|
||||
- Tuning generic user-defined functions
|
||||
|
||||
Part 2. A deep dive into FLAML in Microsoft Fabric
|
||||
|
||||
- Parallel training with spark and customizing estimator and metric
|
||||
- Track and analyze experiments and models with auto logging
|
||||
|
||||
Part 3. New features on FLAML in Microsoft Fabric
|
||||
|
||||
- Auto Featurization
|
||||
- Visualization
|
||||
- Tuning in-context-learning for LLM models
|
||||
|
||||
## Notebooks
|
||||
|
||||
- [AutoML with FLAML Library](https://github.com/microsoft/FLAML/blob/main/tutorials/Automl2024DemoAutoMLTask.ipynb)
|
||||
- [Use FLAML to Tune Large Language Models](https://github.com/microsoft/FLAML/blob/main/tutorials/Automl2024DemoTuneLLM.ipynb)
|
||||
@@ -32,6 +32,8 @@ print(automl.predict_proba(X_train))
 print(automl.model.estimator)
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample of output

 ```
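The same note is added to several task pages below; a small self-contained sketch of the access pattern it describes:

```python
from sklearn.datasets import load_iris

from flaml import AutoML

X, y = load_iris(return_X_y=True)
automl = AutoML()
automl.fit(X_train=X, y_train=y, task="classification", time_budget=5)
print(type(automl.model))      # the FLAML wrapper object
print(automl.model.estimator)  # the fitted underlying model, e.g. LGBMClassifier(...)
```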
@@ -47,6 +47,8 @@ if os.path.exists("data/output/"):
     shutil.rmtree("data/output/")
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -28,6 +28,8 @@ automl.fit(
 )
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -32,6 +32,8 @@ print(automl.predict(X_train))
 print(automl.model.estimator)
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -29,6 +29,8 @@ automl.fit(
 print(automl.predict(X_train[84:]))
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -29,6 +29,8 @@ settings = {
 automl.fit(X_train=X_train, y_train=y_train, **settings)
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -31,6 +31,8 @@ settings = {
 automl.fit(X_train=X_train, y_train=y_train, **settings)
 ```

+**Note**: You can access the best model's estimator using `automl.model.estimator`.
+
 #### Sample output

 ```
@@ -393,7 +393,7 @@ For holdout, you can also set:

 - `split_ratio`: the fraction for validation data, 0.1 by default.
 - `X_val`, `y_val`: a separate validation dataset. When they are passed, the validation metrics will be computed against this given validation dataset. If they are not passed, then a validation dataset will be split from the training data and held out from training during the model search. After the model search, flaml will retrain the model with best configuration on the full training data.
-  You can set `retrain_full` to be `False` to skip the final retraining or "budget" to ask flaml to do its best to retrain within the time budget.
+  You can set `retrain_full` to be `False` to skip the final retraining or "budget" to ask flaml to do its best to retrain within the time budget. When `retrain_full` is set to `True`, the user-provided validation data is not used in the final retraining of the model.

 For cross validation, you can also set `n_splits` of the number of folds. By default it is 5.
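A hedged sketch of the holdout options this paragraph documents, with the final retraining disabled; the dataset and budget are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from flaml import AutoML

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

automl = AutoML()
automl.fit(
    X_train=X_train,
    y_train=y_train,
    X_val=X_val,
    y_val=y_val,         # validation metrics are computed on this set
    task="classification",
    time_budget=10,
    retrain_full=False,  # skip the final retraining described above
)
```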
(One additional file diff suppressed because it is too large.)