Abstract
Differentiable Architecture Search (DARTS) is now a widely disseminated weight-sharing neural architecture search method. However, it suffers from well-known performance collapse due to an inevitable aggregation of skip connections. In this paper, we first disclose that its root cause lies in an unfair advantage in exclusive competition. Through experiments, we show that if either of two conditions is broken, the collapse disappears. Thereby, we present a novel approach called Fair DARTS where the exclusive competition is relaxed to be collaborative. Specifically, we let each operation’s architectural weight be independent of others. Yet there is still an important issue of discretization discrepancy. We then propose a zero-one loss to push architectural weights towards zero or one, which approximates an expected multi-hot solution. Our experiments are performed on two mainstream search spaces, and we derive new state-of-the-art results on CIFAR-10 and ImageNet (Code is available here: https://github.com/xiaomi-automl/FairDARTS).
T. Zhou and B. Zhang—Equal Contribution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Corresponding to the experiment (\(k=3\) ) in Fig. 2.
- 2.
We run DARTS 4 times and it holds every time.
- 3.
The maximum number of edges for a node is also limited to 2 as in DARTS.
- 4.
- 5.
This differs from DARTS’ reported values as it trains one model for several times.
References
Bi, K., Hu, C., Xie, L., Chen, X., Wei, L., Tian, Q.: Stabilizing DARTS with amended gradient estimation on architectural parameters. arXiv preprint arXiv:1910.11831 (2019)
Brock, A., Lim, T., Ritchie, J.M., Weston, N.: SMASH: one-shot model architecture search through hypernetworks. In: International Conference on Learning Representations (2018)
Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (2019)
Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: International Conference on Computer Vision (2019)
Chu, X., Zhang, B., Li, Q., Xu, R.: SCARLET-NAS: bridging the gap between scalability and fairness in neural architecture search. arXiv preprint arXiv:1908.06022 (2019)
Chu, X., Zhang, B., Xu, R.: MoGA: searching beyond MobileNetV3. In: International Conference on Acoustics, Speech, and Signal Processing (2020). https://arxiv.org/pdf/1908.01314.pdf
Chu, X., Zhang, B., Xu, R., Li, J.: FairNAS: rethinking evaluation fairness of weight sharing neural architecture search. arXiv preprint arXiv:1907.01845 (2019)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: learning augmentation policies from data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019)
Dong, X., Yang, Y.: One-shot neural architecture search via self-evaluated template network. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3681–3690 (2019)
Dong, X., Yang, Y.: Searching for a robust neural architecture in four GPU hours. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1761–1770 (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Howard, A., et al.: Searching for MobileNetV3. In: International Conference on Computer Vision (2019)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Jang, E., Gu, S., Poole, B.: Categorical reparameterization with gumbel-softmax. In: International Conference on Learning Representations (2017)
Li, G., Zhang, X., Wang, Z., Li, Z., Zhang, T.: StacNAS: towards stable and consistent optimization for differentiable Neural Architecture Search. arXiv preprint arXiv:1909.11926 (2019)
Li, G., Qian, G., Delgadillo, I.C., Müller, M., Thabet, A., Ghanem, B.: SGAS: sequential greedy architecture search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2020)
Li, Y., Yuan, Y.: Convergence analysis of two-layer neural networks with ReLU activation. In: Advances in Neural Information Processing Systems, pp. 597–607 (2017)
Liang, H., et al.: DARTS+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035 (2019)
Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (2019)
Maclaurin, D., Duvenaud, D., Adams, R.: Gradient-based hyperparameter optimization through reversible learning. In: International Conference on Machine Learning (2015)
Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. In: International Conference on Learning Representations (2017)
Nayman, N., Noy, A., Ridnik, T., Friedman, I., Jin, R., Zelnik-Manor, L.: XNAS: neural architecture search with expert advice. In: Advances in Neural Information Processing Systems (2019)
Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. In: International Conference on Machine Learning (2018)
Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017)
Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. In: International Conference on Machine Learning, AutoML Workshop (2018)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Tan, M., Chen, B., Pang, R., Vasudevan, V., Le, Q.V.: MnasNet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019)
Tan, M., Le, Q.V.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning (2019)
Tan, M., Le., Q.V.: MixConv: mixed depthwise convolutional kernels. In: The British Machine Vision Conference (2019)
Wu, B., et al.: : FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019)
Xie, S., Zheng, H., Liu, C., Lin, L.: SNAS: stochastic neural architecture search. In: International Conference on Learning Representations (2019)
Xu, Y., et al.: PC-DARTS: partial channel connections for memory-efficient differentiable architecture search. In: International Conference on Learning Representations (2020)
Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T., Hutter, F.: Understanding and robustifying differentiable architecture search. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=H1gDNyrKDS
Zheng, X., Ji, R., Tang, L., Zhang, B., Liu, J., Tian, Q.: Multinomial distribution learning for effective neural architecture search. In: International Conference on Computer Vision (2019)
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chu, X., Zhou, T., Zhang, B., Li, J. (2020). Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12360. Springer, Cham. https://doi.org/10.1007/978-3-030-58555-6_28
Download citation
DOI: https://doi.org/10.1007/978-3-030-58555-6_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58554-9
Online ISBN: 978-3-030-58555-6
eBook Packages: Computer ScienceComputer Science (R0)