LLAMBO ICLR 2024 - 凯的博客

LLAMBO

Warmstarting the bo process 零样本提示为热身采样点，提出三种方法no context、partial context、full context

surrogate modeling

Reference

Large Language Models to Enhance Bayesian Optimization

源码学习

python run_bayesmark.py –dataset $dataset –model $model –num_seeds 1 –sm_mode discriminative –engine “gpt35turbo_20230727”

task_context: model、task、tot_feats、cat_feats、num_feats、n_classes、metric、lower_is_better、num_samples、hyperparameter_constraints

benchmark = BayesmarkExprRunner(task_context, data, seed)

BayesmarkExpRunner seed、model、task、metric、dataset、hyperparameter_constraints、bbox_func

LLAMBO task_context、model_name、lower_is_better、n_candidates(10)、n_templates(2)、n_gens(10)、alpha(0.1)、n_initial_samples(5)、n_trials(25)、init_f(benchmark.generate_initialization)、bbox_eval_f(benchmark.evaluate_point) chat_engine(chat_engine)、top_pct(None) use_input_warping(False)、prompt_setting(None)、shuffle_features(False)、surrogate_model(LLM_DIS_SM)、acq_func(LLM_ACQ)

RateLimiter max_tokens(100000)、max_requests(720)、time_frame(60)、timestamps、tokens_used、request_count(0)

LLM_DIS_SM task_context、n_gens、lower_is_better、n_templates、rate_limiter、warping_transformer、chat_engine、prompt_setting、shuffle_features、bootstrapping(False)、use_recalibaration(False)、rate_limiter、apply_warping、recalibrator、verbose

LLM_ACQ task_context、n_candidates、n_templates、lower_is_better、rate_limiter、warping_transformer、chat_engine、prompt_setting、shuffle_features、n_gens、apply_jitter、apply_warping

运行过程 llambo.optimize()->self._initialize()->benmark.generate_initialization->self._evaluate_config->benmark.evaluate_point

optimization loop: 得到候选点 self.acq_func.get_candidate_points self._gen_prompt_tempates_acqusitions->self._async_generate_concurrently prompt模板生成过程 self._gen_prompt_tempates_acquisitions self._prepare_configurations_acqusition->self._jitter->self._prepare_configurations_acquisition 正式调用OPENAI_API self._async_generate_concurrently self._async_generate 返回openai_api的处理结果

挑选候选点 self.surrogate_model.select_query_point self._evaluate_candidate_points prompt模板生成过程 discriminative_sm_utils.py该文件都是在写prompt生成过程正式调用OPENAI_API self._predict->self._generate_concurrently->self._generate

评价候选点 self._evaluate_config

更新所观察点 self_update_observations

Bayesian Optimization

贝叶斯优化是一种求解函数最优值的算法，它最普遍的使用场景是在机器学习过程对超参数进行调优。贝叶斯优化算法的核心框架是SMBO(Sequential Model-Based Optimization),而贝叶斯优化狭义上特指代理模型为高斯过程回归模型的SMBO

步骤：

初始采样：随机选择一些参数点，并计算对应的目标函数值；这些点和目标函数值将用于初始化代理模型
构建代理模型：使用高斯过程或随机森林等方法，构建目标函数的代理模型。高斯过程常用，因为它不仅能预测函数值，还能提供预测不确定性
代理模型优化：使用代理模型预测新的参数点的目标函数值和不确定性。基于这些预测，计算一个采集函数(Acquisition Function),如期望改进(Expected Improvement, EI),上置信界(Upper Confidence Bound, UCB)等。采集函数用来平衡探索(选择那些不确定性较大的点，希望发现新的好点)和开发之间的权衡(选择那些预计目标函数值较好的点，利用已有信息改进的最优解)
更新代理模型：在采集函数的指导下选择下一个参数点，计算其目标函数值，将新的数据点加入已有的数据集中，更新代理模型
重复迭代：重复步骤3和4，逐步缩小参数空间，找到最优参数

FEATURED TAGS

Tensor Compiler Compiler Optimization Code Generation Heterogeneous Systems Operator Fusion Deep Neural Network Recursive Tensor Execution Deep Learning Compiler Classical Machine Learning Compiler Optimizations Bayesian Optimization Autotuning Spatial Accelerators Tensor Computations Code Reproduction Neural Processing Units Polyhedral Model Auto-tuning Machine Learning Compiler Neural Network Program Transformations Tensor Programs Deep learning Tensor Program Optimizer Search Algorithm Compiler Infrastructure Scalalbe and Modular Compiler Systems Tensor Computation GPU Task Scheduling GPU Streams Tensor Expression Language Automated Program optimization Framework AI compiler memory hierarchy data locality tiling fusion polyhedral model scheduling domain-specific architectures memory intensive TVM Sparse Tensor Algebra Sparse Iteration Spaces Optimizing Transformations Tensor Operations Machine Learning Model Scoring AI Compiler Memory-Intensive Computation Fusion Neural Networks Dataflow Domain specific Language Programmable Domain-specific Acclerators Mapping Space Search Gradient-based Search Deep Learning Systems Systems for Machine Learning Programming Models Compilation Design Space Exploration Tile Size Optimization Performance Modeling High-Performance Tensor Program Tensor Language Model Tensor Expression GPU Loop Transformations Vectorization and Parallelization Hierarchical Classifier TVM API Optimizing Compilers Halide Pytorch Optimizing Tensor Programs Gradient Descent debug Automatic Tensor Program Tuning Operators Fusion Tensor Program Cost Model Weekly Schedule Spatio-temporal Schedule tensor compilers auto-tuning tensor program optimization compute schedules Tensor Compilers Data Processing Pipeline Mobile Devices Layout Transformations Transformer Design space exploration GPU kernel optimization Compilers Group Tuning Technique Tensor Processing Unit Hardware-software Codeisgn Data Analysis Adaptive Systems Program Auto-tuning python api Code Optimization Distributed Systems High Performance Computing code generation compiler optimization tensor computation Instructions Integration Code rewriting Tensor Computing DSL CodeReproduction Deep Learning Compiler Loop Program Analysis Nested Data Parallelism Loop Fusion C++ Machine Learning System Decision Forest Optimizfing Compiler Decision Tree Ensemble Decision Tree Inference Parallelization Optimizing Compiler decision trees random forest machine learning parallel processing multithreading Tree Structure Performance Model Code generation Compiler optimization Tensor computation accelerator neural networks optimizing compilers autotuning performance models deep neural networks compilers auto-scheduling tensor programs Tile size optimization Performance modeling Program Functionalization affine transformations loop optimization Performance Optimization Subgraph Similarity deep learning compiler Intra- and Inter-Operator Parallelisms ILP tile-size operator fusion cost model graph partition zero-shot tuning tensor program kernel orchestration machine learning compiler Loop tiling Locality Polyhedral compilation Optimizing Transformation Sparse Tensors Asymptotic Analysis Automatic Scheduling Data Movement Optimization Operation Fusion Compute-Intensive Automatic Exploration data reuse deep reuse Tensorize docker graph substitution compiler Just-in-time compiler graph Tensor program construction tensor compilation graph traversal Markov analysis Deep Learning Compilation Tensor Program Auto-Tuning Decision Tree Search-based code generation Domain specific lanuages Parallel architectures Dynamic neural network mobile device spatial accelerate software mapping reinforcement learning Computation Graph Graph Scheduling and Transformation Graph-level Optimization Operator-level Optimization Partitioning Algorithms IR Design Parallel programming languages Software performance Digitial signal processing Retargetable compilers Equational logic and rewriting Tensor-level Memory Management Code Generation and Optimizations Scheduling Sparse Tensor Auto-Scheduling Tensor Coarse-Grained Reconfigurable Architecture Graph Neural Network Reinforcement Learning Auto-Tuning Domain-Specific Accelerator Deep learning compiler Long context Memory optimization code analysis transformer architecture-mapping

LLAMBO

Reference

源码学习

Bayesian Optimization

CATALOG

FEATURED TAGS