Automated Machine Learning

Automated machine learning (AutoML) is the practice of using algorithms to search for machine learning pipelines — including data preprocessing, model architecture, and hyperparameter configuration — that perform well on a given task, without requiring manual specification by a human practitioner. AutoML systems extend the reach of machine learning to practitioners without deep expertise, while also serving as research tools for discovering configurations that outperform manually designed systems. The dominant AutoML approaches include neural architecture search (searching over model structures), Bayesian optimization over hyperparameter spaces, and ensemble construction from candidate models.

The promise of AutoML is democratization: expert-level model performance without expert knowledge. The reality is more complex. AutoML systems encode substantial domain knowledge in their search spaces, that is, the sets of pipeline components and hyperparameter ranges from which configurations are drawn. A search over a space that excludes the right architecture cannot find that architecture, no matter how efficient the search strategy is. Designing the search space is itself an expert task, and the quality of an AutoML system is bounded by the quality of its search space definition. AutoML automates the search; it cannot automate the framing of what to search for.
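The bound imposed by the search space can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-weight model y ≈ w * transform(x), the hypothetical TRANSFORMS table, the ridge values, and the exhaustive grid search, which stands in for whatever strategy a real AutoML system would use. The same search routine is run over a narrow space and over a wider one that also contains the transform matching the data.

```python
import itertools

# Hypothetical feature transforms: the "pipeline components" of this toy space.
TRANSFORMS = {
    "identity": lambda x: x,
    "abs": abs,
    "square": lambda x: x * x,
}

def fit_weight(xs, ys, transform, ridge):
    """Closed-form ridge fit of y ~ w * transform(x) for a single weight w."""
    phi = [transform(x) for x in xs]
    num = sum(p * y for p, y in zip(phi, ys))
    den = sum(p * p for p in phi) + ridge
    return num / den

def mse(xs, ys, transform, w):
    """Mean squared error of the fitted one-weight model."""
    return sum((w * transform(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def search(space, train, valid):
    """Evaluate every configuration in `space`; return (best_error, best_config)."""
    best = None
    for name, ridge in itertools.product(space["transform"], space["ridge"]):
        transform = TRANSFORMS[name]
        w = fit_weight(train[0], train[1], transform, ridge)
        err = mse(valid[0], valid[1], transform, w)
        if best is None or err < best[0]:
            best = (err, {"transform": name, "ridge": ridge})
    return best

# Toy task: the target is exactly quadratic in x.
train_x = [-2.0, -1.0, 0.0, 1.0, 2.0]
train = (train_x, [x * x for x in train_x])
valid_x = [-1.5, -0.5, 0.5, 1.5]
valid = (valid_x, [x * x for x in valid_x])

# Two search spaces that differ only in whether "square" is a candidate.
narrow = {"transform": ["identity", "abs"], "ridge": [0.0, 0.1, 1.0]}
wide = {"transform": ["identity", "abs", "square"], "ridge": [0.0, 0.1, 1.0]}

narrow_err, narrow_cfg = search(narrow, train, valid)
wide_err, wide_cfg = search(wide, train, valid)

print("narrow space:", narrow_cfg, narrow_err)
print("wide space:  ", wide_cfg, wide_err)
```

The search procedure is identical in both runs; only the space changes. Over the narrow space the best it can do is the closest available approximation, while over the wide space it recovers the quadratic transform. No amount of additional search effort in the narrow space would close that gap, because the right component was never a candidate.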

The deeper implication: AutoML is a tool for optimization within a predefined space, not a tool for discovering that the space is wrong. Many major advances in deep learning, such as SGD with momentum, convolutional architectures, and the attention mechanism, came from recognizing that the existing space of options was inadequate, not from searching harder within it. AutoML can find the best CNN in a space of CNNs; it cannot discover that attention outperforms convolution unless attention is already among the candidates. Recognizing that the space itself must change requires scientific creativity, which no current AutoML system possesses.