Automated Machine Learning: A Survey of Recent Progress

Author: camel | 2019-08-11 11:46
Overview: This article first summarizes AutoML research at each stage of the pipeline from an end-to-end system perspective, then focuses on the recently and widely studied neural architecture search (NAS), and finally discusses some future research directions.

This article was written by Xin He of Hong Kong Baptist University; Leiphone AI Technology Review publishes it with the author's authorization.

Title | AutoML: A Survey of the State-of-the-Art

Authors | Xin He, Kaiyong Zhao, Xiaowen Chu

Affiliation | Hong Kong Baptist University

Paper | https://arxiv.org/abs/1908.00709


Deep learning has been applied in many fields and brings great convenience to people's lives. However, building a high-quality deep learning system for a specific task not only takes a great deal of time and resources, but also depends heavily on specialized domain knowledge.

Therefore, to let deep learning techniques be applied to more fields in a simpler way, automated machine learning (AutoML) has become a growing focus of attention.

This article first summarizes AutoML research at each stage of the pipeline from an end-to-end system perspective (see the figure below), then focuses on the recently and widely studied neural architecture search (NAS), and finally discusses some future research directions.

[Figure: The AutoML pipeline]


I. Data Preparation

Data is critical to deep learning tasks, so a good AutoML system should be able to automatically improve both the quality and the quantity of the data. We divide data preparation into two parts: data collection and data cleaning.

[Figure: Data preparation]


1. Data Collection

Public datasets keep appearing, such as MNIST, CIFAR-10, and ImageNet, and various datasets can also be obtained from public platforms such as Kaggle, Google Dataset Search, and Elsevier Data Search. For some special tasks, however, especially medical tasks or tasks involving personal privacy, data is hard to obtain, so a suitable dataset is often unavailable or very small. There are two main approaches to this problem: data generation and data search.

1) Data generation

Images:

  • Cubuk, Ekin D., et al. "AutoAugment: Learning augmentation policies from data." arXiv preprint arXiv:1805.09501 (2018).

Speech:

  • Park, Daniel S., et al. "SpecAugment: A simple data augmentation method for automatic speech recognition." arXiv preprint arXiv:1904.08779 (2019).

Text:

  • Xie, Ziang, et al. "Data noising as smoothing in neural network language models." arXiv preprint arXiv:1703.02573 (2017).

  • Yu, Adams Wei, et al. "QANet: Combining local convolution with global self-attention for reading comprehension." arXiv preprint arXiv:1804.09541 (2018).

GANs:

  • Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.

Simulators:

  • Brockman, Greg, et al. "OpenAI Gym." arXiv preprint arXiv:1606.01540 (2016).
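The augmentation policies cited above are found automatically by search. For orientation only, here is a minimal hand-written image augmentation pipeline sketched with torchvision; it is not the searched AutoAugment policy, just the kind of transforms its search space is built from.

```python
# A minimal, hand-specified augmentation pipeline for illustration only.
# AutoAugment *searches* over combinations and magnitudes of transforms like
# these; this sketch only shows the building blocks of such a search space.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),                   # random shift via padded crop
    transforms.RandomHorizontalFlip(p=0.5),                 # mirror half of the images
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # photometric perturbation
    transforms.ToTensor(),
])

# Hypothetical usage: pass it as the transform of a dataset such as CIFAR-10.
# train_set = torchvision.datasets.CIFAR10("./data", train=True, transform=augment)
```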

 

2) Data search

  • Roh, Yuji, Geon Heo, and Steven Euijong Whang. "A survey on data collection for machine learning: a big data-AI integration perspective." arXiv preprint arXiv:1811.03402 (2018).

  • Yarowsky, David. "Unsupervised word sense disambiguation rivaling supervised methods." 33rd Annual Meeting of the Association for Computational Linguistics. 1995.

  • Zhou, Yan, and Sally Goldman. "Democratic co-learning." 16th IEEE International Conference on Tools with Artificial Intelligence. IEEE, 2004.

 

2. Data Cleaning

  • Krishnan, Sanjay, and Eugene Wu. "AlphaClean: Automatic generation of data cleaning pipelines." arXiv preprint arXiv:1904.11827 (2019).

  • Chu, Xu, et al. "Katara: A data cleaning system powered by knowledge bases and crowdsourcing." Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 2015.

  • Krishnan, Sanjay, et al. "ActiveClean: An interactive data cleaning framework for modern machine learning." Proceedings of the 2016 International Conference on Management of Data. ACM, 2016.

  • Krishnan, Sanjay, et al. "SampleClean: Fast and Reliable Analytics on Dirty Data." IEEE Data Eng. Bull. 38.3 (2015): 59-75.
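The systems above learn which cleaning operations to apply and in what order. As a baseline for comparison, a minimal manual cleaning step with pandas might look like the following sketch; the file and column names are hypothetical.

```python
# Minimal hand-written cleaning with pandas; tools such as AlphaClean or
# ActiveClean automate the choice and ordering of operations like these.
import pandas as pd

df = pd.read_csv("raw_data.csv")                      # hypothetical input file

df = df.drop_duplicates()                             # remove exact duplicate rows
df = df.dropna(subset=["label"])                      # drop rows with a missing label
df["age"] = df["age"].fillna(df["age"].median())      # impute a numeric column
df = df[df["age"].between(0, 120)]                    # filter implausible values

df.to_csv("clean_data.csv", index=False)
```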

 

II. Feature Engineering

Feature engineering can be divided into three parts (a small illustrative sketch follows the references below):

1. Feature selection

2. Feature construction

  • H. Vafaie and K. De Jong, "Evolutionary feature space transformation," in Feature Extraction, Construction and Selection. Springer, 1998, pp. 307–323.

  • J. Gama, "Functional trees," Machine Learning, vol. 55, no. 3, pp. 219–250, 2004.

  • D. Roth and K. Small, "Interactive feature space construction using semantic information," in Proceedings of the Thirteenth Conference on Computational Natural Language Learning. Association for Computational Linguistics, 2009, pp. 66–74.

3. Feature extraction

  • Q. Meng, D. Catchpoole, D. Skillicorn, and P. J. Kennedy, "Relational autoencoder for feature extraction," in 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 364–371.

  • O. Irsoy and E. Alpaydin, "Unsupervised feature extraction with autoencoder trees," Neurocomputing, vol. 258, pp. 63–73, 2017.
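To make the three parts concrete, here is a minimal scikit-learn sketch covering selection and extraction on a toy dataset. An AutoML system would search over which steps to apply and over their hyperparameters rather than fixing them by hand as done here.

```python
# Illustrative feature selection + feature extraction with scikit-learn.
# The choices below (k=40, n_components=20) are arbitrary; automated feature
# engineering would treat them as part of the search space.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=40)),   # feature selection
    ("extract", PCA(n_components=20)),                     # feature extraction
])
X_new = pipe.fit_transform(X, y)
print(X_new.shape)   # (n_samples, 20)
```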

[Figure: Feature engineering]



III. Model Generation

There are two main ways to generate models. One is to build models with traditional machine learning methods such as SVMs and decision trees; open-source libraries of this kind include Auto-sklearn and TPOT. The other is neural architecture search (NAS). We summarize NAS from two aspects: the network structures it searches over and the search strategies it uses.

[Figure: Model generation]


1. Network Structures

1) Entire structure

Methods of this kind generate a complete network structure directly. Their drawbacks are obvious: the search space over whole architectures is very large, and the generated structures lack transferability and flexibility.

  • B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning." [Online]. Available: http://arxiv.org/abs/1611.01578

  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, "Efficient neural architecture search via parameter sharing," ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

2) Cell-based structure

Cell-based design was proposed to address the problems of whole-architecture search. As shown in the figure below, once a cell has been found, the final network is obtained by stacking a number of these cells. The search space thus shrinks from the whole network to a much smaller cell, and the network can be scaled simply by changing the number of cells (a minimal stacking sketch follows the references below). This approach still has an obvious problem, however: the number of cells and the way they are connected are not determined by the search, and most current methods set them by human experience.

[Figure: Cell-based architecture search]

  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, "Efficient neural architecture search via parameter sharing," ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition." [Online]. Available: http://arxiv.org/abs/1707.07012

  • Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, "Practical block-wise neural network architecture generation." [Online]. Available: http://arxiv.org/abs/1708.05552

  • B. Baker, O. Gupta, N. Naik, and R. Raskar, "Designing neural network architectures using reinforcement learning," ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167

  • E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, "Large-scale evolution of image classifiers." [Online]. Available: http://arxiv.org/abs/1703.01041

  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, "Regularized evolution for image classifier architecture search." [Online]. Available: http://arxiv.org/abs/1802.01548
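As a rough illustration of the stacking idea, the PyTorch sketch below builds a network whose depth is controlled only by the number of repeated cells. `SimpleCell` is a stand-in of our own; in NASNet or ENAS it is the internal wiring of the cell that the search actually discovers.

```python
# Minimal sketch of stacking a (searched) cell N times to form a network.
import torch
import torch.nn as nn

class SimpleCell(nn.Module):
    """Stand-in for a searched cell: one conv block with a residual connection."""
    def __init__(self, channels):
        super().__init__()
        self.op = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.op(x)

class CellNetwork(nn.Module):
    def __init__(self, num_cells=8, channels=32, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.cells = nn.Sequential(*[SimpleCell(channels) for _ in range(num_cells)])
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.cells(self.stem(x))
        x = x.mean(dim=[2, 3])               # global average pooling
        return self.head(x)

# Depth is just a count: changing num_cells rescales the network without a new search.
logits = CellNetwork(num_cells=6)(torch.randn(2, 3, 32, 32))
```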

3) Hierarchical structure

Unlike the chain-style stacking of cells above, hierarchical methods take the cells generated in one step as the basic building blocks of the next step, obtaining the final architecture by iterating this process. As shown in the figure below, in (a) the three items on the left are primitive operations and the right-hand side is one cell built from them; in (b) the left-hand side shows several cells produced in the previous step, which are combined according to some strategy into a higher-level cell.

  • H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu, "Hierarchical representations for efficient architecture search," in ICLR, p. 13.

[Figure: Hierarchical architecture representation]


4) Network morphism-based structure

The usual design workflow is to build a network, train it, check its performance on the validation set, and, if it performs poorly, design a new network from scratch; clearly much of this effort is wasted and a great deal of time is spent. Network morphism-based methods instead modify an existing network, so they largely preserve the strengths of the original, and the particular transformations they use guarantee that the new network can reproduce the original one, which means its performance is at least as good as the original's (a small function-preserving widening sketch follows the references below).


[Figure: Network morphism]


  • T. Chen, I. Goodfellow, and J. Shlens, "Net2Net: Accelerating learning via knowledge transfer," arXiv preprint arXiv:1511.05641, 2015.

  • T. Elsken, J. H. Metzen, and F. Hutter, "Efficient multi-objective neural architecture search via Lamarckian evolution." [Online]. Available: http://arxiv.org/abs/1804.09081

  • H. Cai, T. Chen, W. Zhang, Y. Yu, and J. Wang, "Efficient architecture search by network transformation," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
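The function-preserving idea behind network morphism (for instance Net2Net's widening operation) can be sketched in a few lines: new units are copies of existing ones, and the outgoing weights are rescaled so the network computes exactly the same function before any further training.

```python
# Sketch of a Net2Net-style "wider" morphism for two stacked linear layers,
# written with NumPy for clarity. New hidden units replicate randomly chosen
# existing units; the next layer's weights are divided by the replication
# count so the overall mapping is unchanged.
import numpy as np

def net2wider(W1, W2, new_width):
    old_width = W1.shape[1]                  # W1: (in, hidden), W2: (hidden, out)
    mapping = np.concatenate([
        np.arange(old_width),
        np.random.randint(0, old_width, new_width - old_width),
    ])
    counts = np.bincount(mapping, minlength=old_width)
    W1_new = W1[:, mapping]                              # copy columns for new units
    W2_new = W2[mapping, :] / counts[mapping][:, None]   # rescale outgoing weights
    return W1_new, W2_new

W1, W2 = np.random.randn(8, 16), np.random.randn(16, 4)
x = np.random.randn(1, 8)
W1w, W2w = net2wider(W1, W2, new_width=24)
assert np.allclose(x @ W1 @ W2, x @ W1w @ W2w)   # widened net computes the same outputs
```

The same bookkeeping works with an elementwise nonlinearity between the layers, which is why a morphed network can start training from exactly the parent's accuracy instead of from scratch.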

2. Search Strategies

1) Grid search

  • H. H. Hoos, Automated Algorithm Configuration and Parameter Tuning, 2011.

  • I. Czogiel, K. Luebke, and C. Weihs, Response surface methodology for optimizing hyper parameters. Universitatsbibliothek Dortmund, 2006.

  • C.-W. Hsu, C.-C. Chang, C.-J. Lin et al., "A practical guide to support vector classification," 2003.

  • J. Y. Hesterman, L. Caucci, M. A. Kupinski, H. H. Barrett, and L. R. Furenlid, "Maximum-likelihood estimation with a contracting-grid search algorithm," IEEE Transactions on Nuclear Science, vol. 57, no. 3, pp. 1077–1084, 2010.

2) Random search

  • J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," p. 25.

  • H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, "An empirical evaluation of deep architectures on problems with many factors of variation," in Proceedings of the 24th International Conference on Machine Learning. ACM, 2007, pp. 473–480.

  • L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, "Hyperband: A novel bandit-based approach to hyperparameter optimization." [Online]. Available: http://arxiv.org/abs/1603.06560
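The contrast between these two strategies fits in a few lines: grid search enumerates a fixed Cartesian product of values, while random search draws each trial independently, which for the same budget covers each individual dimension more densely (the observation behind Bergstra and Bengio's result). In the sketch below, `evaluate` is a hypothetical stand-in for training a model and returning its validation score.

```python
# Grid search vs. random search over two hyperparameters, same budget of 9 trials.
import itertools
import random

def evaluate(lr, wd):
    # Hypothetical stand-in for "train a model and return validation accuracy".
    return -(lr - 1e-2) ** 2 - (wd - 1e-4) ** 2

# Grid search: Cartesian product of hand-picked values.
grid = itertools.product([1e-3, 1e-2, 1e-1], [1e-5, 1e-4, 1e-3])
best_grid = max(grid, key=lambda cfg: evaluate(*cfg))

# Random search: each trial samples fresh values for every dimension (log-uniform here).
trials = [(10 ** random.uniform(-4, 0), 10 ** random.uniform(-6, -2)) for _ in range(9)]
best_random = max(trials, key=lambda cfg: evaluate(*cfg))

print("grid:", best_grid, "random:", best_random)
```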

3) Reinforcement learning

  • B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning." [Online]. Available: http://arxiv.org/abs/1611.01578

  • B. Baker, O. Gupta, N. Naik, and R. Raskar, "Designing neural network architectures using reinforcement learning," ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167

  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, "Efficient neural architecture search via parameter sharing," ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition." [Online]. Available: http://arxiv.org/abs/1707.07012

  • Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, "Practical block-wise neural network architecture generation." [Online]. Available: http://arxiv.org/abs/1708.05552

4) Evolutionary algorithms

  • L. Xie and A. Yuille, "Genetic CNN," ICCV. [Online]. Available: http://arxiv.org/abs/1703.01513

  • M. Suganuma, S. Shirakawa, and T. Nagao, "A genetic programming approach to designing convolutional neural network architectures." [Online]. Available: http://arxiv.org/abs/1704.00764

  • E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, "Large-scale evolution of image classifiers." [Online]. Available: http://arxiv.org/abs/1703.01041

  • K. O. Stanley and R. Miikkulainen, "Evolving neural networks through augmenting topologies," vol. 10, no. 2, pp. 99–127. [Online]. Available: http://www.mitpressjournals.org/doi/10.1162/106365602320169811

  • T. Elsken, J. H. Metzen, and F. Hutter, "Efficient multi-objective neural architecture search via Lamarckian evolution." [Online]. Available: http://arxiv.org/abs/1804.09081

5) Bayesian optimization

  • J. Gonzalez, "GPyOpt: A Bayesian optimization framework in Python," http://github.com/SheffieldML/GPyOpt, 2016.

  • J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," in Advances in Neural Information Processing Systems, 2012, pp. 2951–2959.

  • S. Falkner, A. Klein, and F. Hutter, "BOHB: Robust and efficient hyperparameter optimization at scale," p. 10.

  • F. Hutter, H. H. Hoos, and K. Leyton-Brown, "Sequential model-based optimization for general algorithm configuration," in Learning and Intelligent Optimization, C. A. C. Coello, Ed. Springer Berlin Heidelberg, vol. 6683, pp. 507–523. [Online]. Available: http://link.springer.com/10.1007/978-3-642-25566-3_40

  • J. Bergstra, D. Yamins, and D. D. Cox, "Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures," p. 9.

  • A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, "Fast Bayesian optimization of machine learning hyperparameters on large datasets." [Online]. Available: http://arxiv.org/abs/1605.07079
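At its core, sequential model-based optimization fits a probabilistic surrogate to the observations collected so far and chooses the next configuration by maximizing an acquisition function. The sketch below uses a Gaussian-process surrogate from scikit-learn and a simple upper-confidence-bound rule on a toy 1-D objective; real systems such as GPyOpt, SMAC, or BOHB add better acquisition functions, multi-fidelity evaluation, and support for mixed or conditional search spaces.

```python
# Minimal Bayesian-optimization loop: GP surrogate + UCB acquisition on a toy objective.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):
    # Hypothetical expensive black box, e.g. validation score as a function of one hyperparameter.
    return -(x - 0.3) ** 2

candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
X = list(np.random.uniform(0.0, 1.0, size=3))        # a few random initial points
y = [objective(x) for x in X]

for _ in range(10):
    gp = GaussianProcessRegressor().fit(np.array(X).reshape(-1, 1), y)
    mu, sigma = gp.predict(candidates, return_std=True)
    ucb = mu + 2.0 * sigma                            # favor points that look good or uncertain
    x_next = float(candidates[np.argmax(ucb)])
    X.append(x_next)
    y.append(objective(x_next))

print("best x found:", X[int(np.argmax(y))])
```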

6) Gradient-based methods

  • H. Liu, K. Simonyan, and Y. Yang, "DARTS: Differentiable architecture search." [Online]. Available: http://arxiv.org/abs/1806.09055

  • S. Saxena and J. Verbeek, "Convolutional neural fabrics," in Advances in Neural Information Processing Systems, 2016, pp. 4053–4061.

  • K. Ahmed and L. Torresani, "Connectivity learning in multi-branch networks," arXiv preprint arXiv:1709.09582, 2017.

  • R. Shin, C. Packer, and D. Song, "Differentiable neural network architecture search," 2018.

  • D. Maclaurin, D. Duvenaud, and R. Adams, "Gradient-based hyperparameter optimization through reversible learning," in International Conference on Machine Learning, 2015, pp. 2113–2122.

  • F. Pedregosa, "Hyperparameter optimization with approximate gradient," arXiv preprint arXiv:1602.02355, 2016.

  • H. Cai, L. Zhu, and S. Han, "ProxylessNAS: Direct neural architecture search on target task and hardware," 2019.

  • A. Hundt, V. Jain, and G. D. Hager, "sharpDARTS: Faster and more accurate differentiable architecture search," Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1903.09900.pdf
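The key trick in gradient-based methods such as DARTS is to relax the discrete choice of operation on each edge into a softmax-weighted mixture of candidate operations, so the architecture parameters can be trained by ordinary gradient descent alongside (in DARTS, alternating with) the network weights. A minimal sketch of such a mixed operation in PyTorch, with a small hand-picked candidate set:

```python
# Sketch of a DARTS-style mixed operation: the choice among candidate ops is
# relaxed into a softmax over architecture parameters `alpha`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),                                             # skip connection
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),   # 3x3 convolution
            nn.MaxPool2d(3, stride=1, padding=1),                      # 3x3 max pooling
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))          # architecture parameters

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)                         # continuous relaxation
        return sum(w * op(x) for w, op in zip(weights, self.ops))

op = MixedOp(16)
out = op(torch.randn(2, 16, 8, 8))
# After training, the operation with the largest alpha would be kept on this edge.
```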

IV. Model Evaluation

Once a model architecture has been designed, it must be evaluated. The simplest approach is to train the model to convergence and judge it by its performance on the validation set, but this takes a great deal of time and computing resources. Many algorithms have therefore been proposed to speed up model evaluation, summarized as follows:

[Figure: Methods for accelerating model evaluation]

1. Low-fidelity evaluation

  • A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, "Fast Bayesian optimization of machine learning hyperparameters on large datasets." [Online]. Available: http://arxiv.org/abs/1605.07079

  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition." [Online]. Available: http://arxiv.org/abs/1707.07012

  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, "Regularized evolution for image classifier architecture search." [Online]. Available: http://arxiv.org/abs/1802.01548

  • A. Zela, A. Klein, S. Falkner, and F. Hutter, "Towards automated deep learning: Efficient joint neural architecture and hyperparameter search." [Online]. Available: http://arxiv.org/abs/1807.06906

  • Y.-q. Hu, Y. Yu, W.-w. Tu, Q. Yang, Y. Chen, and W. Dai, "Multi-fidelity automatic hyper-parameter tuning via transfer series expansion," p. 8, 2019.

2. Transfer learning

  • C. Wong, N. Houlsby, Y. Lu, and A. Gesmundo, "Transfer learning with neural AutoML," in Advances in Neural Information Processing Systems, 2018, pp. 8356–8365.

  • T. Wei, C. Wang, Y. Rui, and C. W. Chen, "Network morphism," in International Conference on Machine Learning, 2016, pp. 564–572.

  • T. Chen, I. Goodfellow, and J. Shlens, "Net2Net: Accelerating learning via knowledge transfer," arXiv preprint arXiv:1511.05641, 2015.

3. Surrogate-based methods

  • K. Eggensperger, F. Hutter, H. H. Hoos, and K. Leyton-Brown, "Surrogate benchmarks for hyperparameter optimization," in MetaSel@ECAI, 2014, pp. 24–31.

  • C. Wang, Q. Duan, W. Gong, A. Ye, Z. Di, and C. Miao, "An evaluation of adaptive surrogate modeling based optimization with two benchmark problems," Environmental Modelling & Software, vol. 60, pp. 167–179, 2014.

  • K. Eggensperger, F. Hutter, H. Hoos, and K. Leyton-Brown, "Efficient benchmarking of hyperparameter optimizers via surrogates," in Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.

  • K. K. Vu, C. D'Ambrosio, Y. Hamadi, and L. Liberti, "Surrogate-based methods for black-box optimization," International Transactions in Operational Research, vol. 24, no. 3, pp. 393–424, 2017.

  • C. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy, "Progressive neural architecture search." [Online]. Available: http://arxiv.org/abs/1712.00559

4. Early stopping

  • A. Klein, S. Falkner, J. T. Springenberg, and F. Hutter, "Learning curve prediction with Bayesian neural networks," 2016.

  • B. Deng, J. Yan, and D. Lin, "Peephole: Predicting network performance before training," arXiv preprint arXiv:1712.03351, 2017.

  • T. Domhan, J. T. Springenberg, and F. Hutter, "Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves," in Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

  • M. Mahsereci, L. Balles, C. Lassner, and P. Hennig, "Early stopping without a validation set," arXiv preprint arXiv:1703.09580, 2017.
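In its simplest form, early stopping just monitors the validation metric and aborts training once it has stopped improving for a given number of evaluations; the learning-curve methods above go further and predict final performance from the first epochs. A minimal patience-based sketch, where `train_one_epoch` and `validate` are hypothetical placeholders for one training epoch and one validation pass:

```python
# Patience-based early stopping; the two callables are hypothetical placeholders.
def fit_with_early_stopping(model, train_one_epoch, validate, max_epochs=100, patience=5):
    best_loss = float("inf")
    epochs_since_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        val_loss = validate(model)
        if val_loss < best_loss:
            best_loss = val_loss
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                print(f"stopping early at epoch {epoch}")
                break
    return best_loss
```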

 

V. Summary of NAS Algorithm Performance


[Figure: Summary of NAS algorithms]

The figure below summarizes the time different NAS algorithms spend searching for networks on CIFAR-10 and the accuracy they achieve. Compared with methods based on reinforcement learning and evolutionary algorithms, methods based on gradient descent and random search can find well-performing models with far less search time.

[Figure: Search time and accuracy of NAS algorithms on CIFAR-10]


VI. Conclusion

From this summary of the latest AutoML research, we find the following problems still worth thinking about and solving:

1. A complete pipeline system

There are now quite a few open-source AutoML libraries, such as TPOT and Auto-sklearn, but each covers only one or a few stages of the pipeline; no system yet automates the entire process end to end. Integrating all of the stages above into one fully automated system is therefore a direction that needs continued research.

2. Interpretability

A well-known drawback of deep networks is their poor interpretability, and AutoML inherits this problem during architecture search. There is still no rigorous scientific explanation of why certain operations perform better. For cell-based networks, for example, it is hard to explain why simply stacking cells yields well-performing architectures, and why the weight sharing proposed by ENAS works at all is equally worth examining.

3. Reproducibility

Most AutoML studies only report their results; few release complete code, and some provide only the final searched architecture without the code for the search itself. Many proposed methods are also hard to reproduce, partly because the search relies on tricks that are not described in detail in the papers, and partly because architecture search is inherently stochastic. Ensuring the reproducibility of AutoML techniques is thus another direction for future work.

4. More flexible encoding schemes

Looking across NAS methods, every search space is designed on the basis of human experience, so the resulting architectures can never step outside the frame set by human designers. For example, current NAS cannot invent a new primitive operation comparable to convolution from scratch, nor can it generate a structure as complex as the Transformer. Defining a more general and more flexible encoding of network architectures is therefore a problem worth studying.

5. Lifelong learning

Most AutoML systems design architectures for a specific dataset and task and generalize poorly to new data. Humans, by contrast, can recognize cats and dogs they have never seen after learning from only some cat and dog photos. A robust AutoML system should therefore be capable of lifelong learning: retaining what it has learned from old data while continuing to learn from new data.

This is a contributed article to Leiphone; reproduction without authorization is prohibited.
