create an automl option to remove unnecessary dependency for autogen and tune (#1007)

* version update post release v1.2.2

* automl option

* import pandas

* remove automl.utils

* default

* test

* type hint and version update

* dependency update

* link to open in colab

* use packaging.version to close #725

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Chi Wang
2023-05-24 16:55:04 -07:00
committed by GitHub
parent e9fdbc6e02
commit a0b318b12e
48 changed files with 2013 additions and 1154 deletions
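
Because this change moves the AutoML dependencies behind an `automl` extra, the notebooks touched below now install their extra packages explicitly. The following is a minimal sketch, not part of the commit, of how a reader could check whether the modules a notebook needs are importable before running it. The helper name `missing_modules` and the list `NOTEBOOK_MODULES` are hypothetical; the module names come from the notebooks' `%pip install flaml[automl] matplotlib openml` cells.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that are not importable."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: these are the top-level modules the updated notebooks rely on,
# taken from their `%pip install flaml[automl] matplotlib openml` cells.
NOTEBOOK_MODULES = ["flaml", "matplotlib", "openml"]

if __name__ == "__main__":
    missing = missing_modules(NOTEBOOK_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: pip install flaml[automl] matplotlib openml")
    else:
        print("All notebook modules are importable.")
```

The check only looks up top-level module names, so it does not import the packages or verify their versions.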

View File

@@ -1,5 +1,13 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -23,7 +31,7 @@
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [openai,blendsearch] option:\n",
"```bash\n",
"pip install flaml[openai,blendsearch]==1.2.2\n",
"pip install flaml[openai,blendsearch]\n",
"```"
]
},
@@ -40,7 +48,7 @@
},
"outputs": [],
"source": [
"# %pip install flaml[openai,blendsearch]==1.2.2 datasets"
"# %pip install flaml[openai,blendsearch] datasets"
]
},
{

View File

@@ -1,5 +1,13 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -23,7 +31,7 @@
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [autogen,blendsearch] option:\n",
"```bash\n",
"pip install flaml[autogen,blendsearch]==1.2.2\n",
"pip install flaml[autogen,blendsearch]\n",
"```"
]
},
@@ -40,7 +48,7 @@
},
"outputs": [],
"source": [
"# %pip install flaml[autogen,blendsearch]==1.2.2 datasets"
"# %pip install flaml[autogen,blendsearch] datasets"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
@@ -27,9 +28,9 @@
"\n",
"In this notebook, we demonstrate how to use FLAML library to tune hyperparameters of LightGBM with a regression example.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` option:\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `automl` option (this option is introduced from version 2, for version 1 it is installed by default):\n",
"```bash\n",
"pip install flaml[notebook]\n",
"pip install flaml[automl]\n",
"```"
]
},
@@ -39,7 +40,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[notebook]==1.0.10"
"%pip install flaml[automl] matplotlib openml"
]
},
{
@@ -786,11 +787,6 @@
"model = lgb.train(params, dtrain, valid_sets=[dtrain, dval], verbose_eval=10000) \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": 20,

File diff suppressed because one or more lines are too long

View File

@@ -25,7 +25,7 @@
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `synapse` option:\n",
"```bash\n",
"pip install flaml[synapse]>=1.1.3; \n",
"pip install flaml[synapse] \n",
"```\n",
" "
]
@@ -36,7 +36,7 @@
"metadata": {},
"outputs": [],
"source": [
"# %pip install \"flaml[synapse]>=1.1.3\""
"# %pip install \"flaml[synapse]\""
]
},
{

View File

@@ -8,6 +8,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -21,7 +22,7 @@
"\n",
"In this notebook, we demonstrate how to use FLAML library for time series forecasting tasks: univariate time series forecasting (only time), multivariate time series forecasting (with exogenous variables) and forecasting discrete values.\n",
"\n",
"FLAML requires Python>=3.7. To run this notebook example, please install flaml with the notebook and forecast option:\n"
"FLAML requires Python>=3.7. To run this notebook example, please install flaml with the [automl,ts_forecast] option:\n"
]
},
{
@@ -156,7 +157,7 @@
}
],
"source": [
"%pip install flaml[notebook,ts_forecast]==1.1.2\n",
"%pip install flaml[automl,ts_forecast] matplotlib openml\n",
"# avoid version 1.0.2 to 1.0.5 for this notebook due to a bug for arima and sarimax's init config"
]
},

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
@@ -27,9 +28,9 @@
"\n",
"In this notebook, we demonstrate how to use FLAML library to tune hyperparameters of XGBoost with a regression example.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` option:\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `automl` option (this option is introduced from version 2, for version 1 it is installed by default):\n",
"```bash\n",
"pip install flaml[notebook]==1.1.2\n",
"pip install flaml[automl]\n",
"```"
]
},
@@ -39,7 +40,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[notebook]==1.1.2"
"%pip install flaml[automl] matplotlib openml"
]
},
{

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
@@ -27,9 +28,9 @@
"\n",
"In this notebook, we use one real data example (binary classification) to showcase how to use FLAML library together with AzureML.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [azureml] option:\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [automl,azureml] option:\n",
"```bash\n",
"pip install flaml[azureml]\n",
"pip install flaml[automl,azureml]\n",
"```"
]
},
@@ -39,7 +40,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[azureml]"
"%pip install flaml[automl,azureml]"
]
},
{

View File

@@ -21,6 +21,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -39,12 +40,21 @@
"\n",
"In this notebook, we use one real data example (binary classification) to showcase how to use FLAML library.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` option:\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `[automl]` option (this option is introduced from version 2, for version 1 it is installed by default):\n",
"```bash\n",
"pip install flaml[notebook]\n",
"pip install flaml[automl]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[automl] openml"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -72,15 +82,6 @@
"#### As FLAML's AutoML module can be used as a transformer in Sklearn's pipeline, we get all the benefits of a pipeline and can thereby write extremely clean and reusable code."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[notebook]"
]
},
{
"cell_type": "markdown",
"metadata": {},

File diff suppressed because one or more lines are too long

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -22,7 +23,7 @@
"\n",
"*Running this notebook takes about one hour.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` and `nlp` options:\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the legacy `[nlp]` options:\n",
"\n",
"```bash\n",
"pip install flaml[nlp]==0.7.1 # in higher version of flaml, the API for nlp tasks changed\n",
@@ -362,10 +363,10 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[2m\u001B[36m(pid=50964)\u001B[0m {'eval_loss': 0.5942569971084595, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10434782608695652}\n",
"\u001B[2m\u001B[36m(pid=50964)\u001B[0m {'eval_loss': 0.5942569971084595, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10434782608695652}\n",
"\u001B[2m\u001B[36m(pid=50948)\u001B[0m {'eval_loss': 0.649192214012146, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.2}\n",
"\u001B[2m\u001B[36m(pid=50948)\u001B[0m {'eval_loss': 0.649192214012146, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.2}\n"
"\u001b[2m\u001b[36m(pid=50964)\u001b[0m {'eval_loss': 0.5942569971084595, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10434782608695652}\n",
"\u001b[2m\u001b[36m(pid=50964)\u001b[0m {'eval_loss': 0.5942569971084595, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10434782608695652}\n",
"\u001b[2m\u001b[36m(pid=50948)\u001b[0m {'eval_loss': 0.649192214012146, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.2}\n",
"\u001b[2m\u001b[36m(pid=50948)\u001b[0m {'eval_loss': 0.649192214012146, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.2}\n"
]
},
{
@@ -483,12 +484,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[2m\u001B[36m(pid=54411)\u001B[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=54411)\u001B[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=54411)\u001B[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=54417)\u001B[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=54417)\u001B[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=54417)\u001B[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n"
"\u001b[2m\u001b[36m(pid=54411)\u001b[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=54411)\u001b[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=54411)\u001b[0m {'eval_loss': 0.624100387096405, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=54417)\u001b[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=54417)\u001b[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=54417)\u001b[0m {'eval_loss': 0.5938675999641418, 'eval_accuracy': 0.7156862745098039, 'eval_f1': 0.8258258258258258, 'epoch': 0.5}\n"
]
},
{
@@ -588,18 +589,18 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[2m\u001B[36m(pid=57835)\u001B[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001B[2m\u001B[36m(pid=57835)\u001B[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001B[2m\u001B[36m(pid=57835)\u001B[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001B[2m\u001B[36m(pid=57835)\u001B[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001B[2m\u001B[36m(pid=57836)\u001B[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001B[2m\u001B[36m(pid=57836)\u001B[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001B[2m\u001B[36m(pid=57836)\u001B[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001B[2m\u001B[36m(pid=57836)\u001B[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001B[2m\u001B[36m(pid=57839)\u001B[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=57839)\u001B[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=57839)\u001B[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=57839)\u001B[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n"
"\u001b[2m\u001b[36m(pid=57835)\u001b[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001b[2m\u001b[36m(pid=57835)\u001b[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001b[2m\u001b[36m(pid=57835)\u001b[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001b[2m\u001b[36m(pid=57835)\u001b[0m {'eval_loss': 0.5822290778160095, 'eval_accuracy': 0.7058823529411765, 'eval_f1': 0.8181818181818181, 'epoch': 0.5043478260869565}\n",
"\u001b[2m\u001b[36m(pid=57836)\u001b[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001b[2m\u001b[36m(pid=57836)\u001b[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001b[2m\u001b[36m(pid=57836)\u001b[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001b[2m\u001b[36m(pid=57836)\u001b[0m {'eval_loss': 0.6087244749069214, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.10344827586206896}\n",
"\u001b[2m\u001b[36m(pid=57839)\u001b[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=57839)\u001b[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=57839)\u001b[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=57839)\u001b[0m {'eval_loss': 0.5486209392547607, 'eval_accuracy': 0.7034313725490197, 'eval_f1': 0.8141321044546851, 'epoch': 0.5}\n"
]
},
{
@@ -699,21 +700,21 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[2m\u001B[36m(pid=61251)\u001B[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=61251)\u001B[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=61251)\u001B[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=61251)\u001B[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=61251)\u001B[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001B[2m\u001B[36m(pid=61255)\u001B[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001B[2m\u001B[36m(pid=61255)\u001B[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001B[2m\u001B[36m(pid=61255)\u001B[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001B[2m\u001B[36m(pid=61255)\u001B[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001B[2m\u001B[36m(pid=61255)\u001B[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001B[2m\u001B[36m(pid=61236)\u001B[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001B[2m\u001B[36m(pid=61236)\u001B[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001B[2m\u001B[36m(pid=61236)\u001B[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001B[2m\u001B[36m(pid=61236)\u001B[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001B[2m\u001B[36m(pid=61236)\u001B[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n"
"\u001b[2m\u001b[36m(pid=61251)\u001b[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=61251)\u001b[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=61251)\u001b[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=61251)\u001b[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=61251)\u001b[0m {'eval_loss': 0.6236899495124817, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.5}\n",
"\u001b[2m\u001b[36m(pid=61255)\u001b[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001b[2m\u001b[36m(pid=61255)\u001b[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001b[2m\u001b[36m(pid=61255)\u001b[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001b[2m\u001b[36m(pid=61255)\u001b[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001b[2m\u001b[36m(pid=61255)\u001b[0m {'eval_loss': 0.6249027848243713, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.3}\n",
"\u001b[2m\u001b[36m(pid=61236)\u001b[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001b[2m\u001b[36m(pid=61236)\u001b[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001b[2m\u001b[36m(pid=61236)\u001b[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001b[2m\u001b[36m(pid=61236)\u001b[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n",
"\u001b[2m\u001b[36m(pid=61236)\u001b[0m {'eval_loss': 0.6138392686843872, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'epoch': 0.20689655172413793}\n"
]
},
{

View File

@@ -1,6 +1,15 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/zeroshot_lightgbm.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
@@ -19,16 +28,16 @@
"\n",
"In this notebook, we demonstrate a basic use case of zero-shot AutoML with FLAML.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml and openml:"
"FLAML requires `Python>=3.7`. To run this notebook example, please install the [autozero] option:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# %pip install -U flaml openml;"
"# %pip install flaml[autozero] lightgbm openml;"
]
},
{
@@ -51,7 +60,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -80,7 +89,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -101,7 +110,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "subslide"
@@ -113,7 +122,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"load dataset from ./openml_ds537.pkl\n",
"download dataset from openml\n",
"Dataset name: houses\n",
"X_train.shape: (15480, 8), y_train.shape: (15480,);\n",
"X_test.shape: (5160, 8), y_test.shape: (5160,)\n"
@@ -127,25 +136,38 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" median_income housing_median_age ... latitude longitude\n",
"19226 7.3003 19.0 ... 38.46 -122.68\n",
"14549 5.9547 18.0 ... 32.95 -117.24\n",
"9093 3.2125 19.0 ... 34.68 -118.27\n",
"12213 6.9930 13.0 ... 33.51 -117.18\n",
"12765 2.5162 21.0 ... 38.62 -121.41\n",
"... ... ... ... ... ...\n",
"13123 4.4125 20.0 ... 38.27 -121.26\n",
"19648 2.9135 27.0 ... 37.48 -120.89\n",
"9845 3.1977 31.0 ... 36.58 -121.90\n",
"10799 5.6315 34.0 ... 33.62 -117.93\n",
"2732 1.3882 15.0 ... 32.80 -115.56\n",
" median_income housing_median_age total_rooms total_bedrooms \\\n",
"19226 7.3003 19 4976.0 711.0 \n",
"14549 5.9547 18 1591.0 268.0 \n",
"9093 3.2125 19 552.0 129.0 \n",
"12213 6.9930 13 270.0 42.0 \n",
"12765 2.5162 21 3260.0 763.0 \n",
"... ... ... ... ... \n",
"13123 4.4125 20 1314.0 229.0 \n",
"19648 2.9135 27 1118.0 195.0 \n",
"9845 3.1977 31 1431.0 370.0 \n",
"10799 5.6315 34 2125.0 498.0 \n",
"2732 1.3882 15 1171.0 328.0 \n",
"\n",
" population households latitude longitude \n",
"19226 1926.0 625.0 38.46 -122.68 \n",
"14549 547.0 243.0 32.95 -117.24 \n",
"9093 314.0 106.0 34.68 -118.27 \n",
"12213 120.0 42.0 33.51 -117.18 \n",
"12765 1735.0 736.0 38.62 -121.41 \n",
"... ... ... ... ... \n",
"13123 712.0 219.0 38.27 -121.26 \n",
"19648 647.0 209.0 37.48 -120.89 \n",
"9845 704.0 393.0 36.58 -121.90 \n",
"10799 1052.0 468.0 33.62 -117.93 \n",
"2732 1024.0 298.0 32.80 -115.56 \n",
"\n",
"[15480 rows x 8 columns]\n"
]
@@ -168,7 +190,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -176,6 +198,13 @@
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:flaml.default.suggest:metafeature distance: 0.02197989436019765\n"
]
},
{
"name": "stdout",
"output_type": "stream",
@@ -206,7 +235,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -220,7 +249,7 @@
"0.8537444671194614"
]
},
"execution_count": 10,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -238,7 +267,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -251,7 +280,7 @@
"0.8296179648694404"
]
},
"execution_count": 11,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -309,9 +338,16 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:flaml.default.suggest:metafeature distance: 0.02197989436019765\n"
]
},
{
"name": "stdout",
"output_type": "stream",
@@ -341,9 +377,17 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 11,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:flaml.default.suggest:metafeature distance: 0.02197989436019765\n"
]
}
],
"source": [
"from flaml.default import preprocess_and_suggest_hyperparams\n",
"(\n",
@@ -365,7 +409,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -394,7 +438,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -415,7 +459,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "slide"
@@ -425,6 +469,17 @@
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LGBMRegressor(colsample_bytree=0.7019911744574896,\n",
" learning_rate=0.022635758411078528, max_bin=511,\n",
" min_child_samples=2, n_estimators=4797, num_leaves=122,\n",
" reg_alpha=0.004252223402511765, reg_lambda=0.11288241427227624,\n",
" verbose=-1)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LGBMRegressor</label><div class=\"sk-toggleable__content\"><pre>LGBMRegressor(colsample_bytree=0.7019911744574896,\n",
" learning_rate=0.022635758411078528, max_bin=511,\n",
" min_child_samples=2, n_estimators=4797, num_leaves=122,\n",
" reg_alpha=0.004252223402511765, reg_lambda=0.11288241427227624,\n",
" verbose=-1)</pre></div></div></div></div></div>"
],
"text/plain": [
"LGBMRegressor(colsample_bytree=0.7019911744574896,\n",
" learning_rate=0.022635758411078528, max_bin=511,\n",
@@ -433,7 +488,7 @@
" verbose=-1)"
]
},
"execution_count": 17,
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
@@ -451,7 +506,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
@@ -480,35 +535,45 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 04-28 02:51:45] {1663} INFO - task = regression\n",
"[flaml.automl.logger: 04-28 02:51:45] {1670} INFO - Data split method: uniform\n",
"[flaml.automl.logger: 04-28 02:51:45] {1673} INFO - Evaluation method: cv\n",
"[flaml.automl.logger: 04-28 02:51:45] {1771} INFO - Minimizing error metric: 1-r2\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 05-31 22:54:25] {2373} INFO - task = regression\n",
"[flaml.automl: 05-31 22:54:25] {2375} INFO - Data split method: uniform\n",
"[flaml.automl: 05-31 22:54:25] {2379} INFO - Evaluation method: cv\n",
"[flaml.automl: 05-31 22:54:25] {2448} INFO - Minimizing error metric: 1-r2\n",
"[flaml.automl: 05-31 22:54:25] {2586} INFO - List of ML learners in AutoML Run: ['lgbm']\n",
"[flaml.automl: 05-31 22:54:25] {2878} INFO - iteration 0, current learner lgbm\n",
"[flaml.automl: 05-31 22:56:54] {3008} INFO - Estimated sufficient time budget=1490299s. Estimated necessary time budget=1490s.\n",
"[flaml.automl: 05-31 22:56:54] {3055} INFO - at 149.1s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl: 05-31 22:56:54] {2878} INFO - iteration 1, current learner lgbm\n",
"[flaml.automl: 05-31 22:59:24] {3055} INFO - at 299.0s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl: 05-31 22:59:24] {2878} INFO - iteration 2, current learner lgbm\n",
"[flaml.automl: 05-31 23:01:34] {3055} INFO - at 429.1s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl: 05-31 23:01:34] {2878} INFO - iteration 3, current learner lgbm\n",
"[flaml.automl: 05-31 23:04:43] {3055} INFO - at 618.2s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl: 05-31 23:05:14] {3315} INFO - retrain lgbm for 31.0s\n",
"[flaml.automl: 05-31 23:05:14] {3322} INFO - retrained model: LGBMRegressor(colsample_bytree=0.7019911744574896,\n",
"INFO:flaml.default.suggest:metafeature distance: 0.02197989436019765\n",
"INFO:flaml.default.suggest:metafeature distance: 0.006677018633540373\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 04-28 02:51:45] {1881} INFO - List of ML learners in AutoML Run: ['lgbm']\n",
"[flaml.automl.logger: 04-28 02:51:45] {2191} INFO - iteration 0, current learner lgbm\n",
"[flaml.automl.logger: 04-28 02:53:39] {2317} INFO - Estimated sufficient time budget=1134156s. Estimated necessary time budget=1134s.\n",
"[flaml.automl.logger: 04-28 02:53:39] {2364} INFO - at 113.5s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl.logger: 04-28 02:53:39] {2191} INFO - iteration 1, current learner lgbm\n",
"[flaml.automl.logger: 04-28 02:55:32] {2364} INFO - at 226.6s,\testimator lgbm's best error=0.1513,\tbest estimator lgbm's best error=0.1513\n",
"[flaml.automl.logger: 04-28 02:55:54] {2600} INFO - retrain lgbm for 22.3s\n",
"[flaml.automl.logger: 04-28 02:55:54] {2603} INFO - retrained model: LGBMRegressor(colsample_bytree=0.7019911744574896,\n",
" learning_rate=0.02263575841107852, max_bin=511,\n",
" min_child_samples=2, n_estimators=4797, num_leaves=122,\n",
" reg_alpha=0.004252223402511765, reg_lambda=0.11288241427227633,\n",
" reg_alpha=0.004252223402511765, reg_lambda=0.11288241427227624,\n",
" verbose=-1)\n",
"[flaml.automl: 05-31 23:05:14] {2617} INFO - fit succeeded\n",
"[flaml.automl: 05-31 23:05:14] {2618} INFO - Time taken to find the best model: 149.06516432762146\n"
"[flaml.automl.logger: 04-28 02:55:54] {1911} INFO - fit succeeded\n",
"[flaml.automl.logger: 04-28 02:55:54] {1912} INFO - Time taken to find the best model: 113.4601559638977\n"
]
}
],
@@ -545,7 +610,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15 (main, Oct 26 2022, 03:47:43) \n[GCC 10.2.1 20210110]"
"version": "3.9.15"
}
},
"nbformat": 4,