Today, our article "Automatically discovering clusters of algorithm and problem instance behaviors as well as their causes from experimental data, algorithm setups, and instance features" appeared in the Applied Soft Computing journal published by Elsevier. It describes our research topic Automating Scientific Research in Optimization. This article may mark the first contribution in which a significant part of the high-level work of a researcher in the fields of optimization and machine learning is automated by a process applying different machine learning steps.

Thomas Weise, Xiaofeng Wang, Qi Qi, Bin Li, and Ke Tang. Automatically discovering clusters of algorithm and problem instance behaviors as well as their causes from experimental data, algorithm setups, and instance features. Applied Soft Computing Journal (ASOC), 73:366–382, December 2018.
doi:10.1016/j.asoc.2018.08.030 / share link (valid until November 6, 2018)

In the field of heuristic optimization, we aim to get good solutions for computationally hard problems. Solving the Travelling Salesman Problem, for instance, means finding the shortest tour that goes through n cities and returns to the starting point. Such problems often cannot be solved to optimality in feasible time due to their complexity. Algorithms therefore often start with a more or less random initial guess about the solution and then improve it step by step. As a result, performance has two dimensions: the runtime we grant to the algorithm until we stop it and take the best-so-far result, and the solution quality of that best-so-far result. Since there are not yet sufficient theoretical tools to assess the performance of such algorithms, researchers conduct many experiments and compare the results. This often means applying many different setups of an algorithm to many different instances of a problem type. Since optimization algorithms are often randomized, multiple repetitions of the experiments are needed. Evaluating such experimental data is not easy. Moreover, as the result of the evaluation, we do not just want to know which algorithm performs best and which problem is the hardest: a researcher wants to know why.
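To make the two dimensions of performance concrete, here is a minimal Python sketch of such a step-by-step improving algorithm on a small random Travelling Salesman Problem instance. It is only an illustration and not the method from the article: the function names `tour_length` and `local_search`, the segment-reversal move, and the random instance are all assumptions chosen for brevity. The log it produces pairs the consumed runtime (counted in steps) with the best-so-far tour length.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed tour that visits `cities` in the order given by `tour`."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def local_search(cities, max_steps, rng):
    """Hypothetical hill climber: reverse a random segment, keep it if the tour gets shorter.

    Returns the best tour and a log with one (step, best_length) pair per improvement.
    """
    tour = list(range(len(cities)))
    rng.shuffle(tour)  # more or less random initial guess
    best = tour_length(tour, cities)
    log = [(0, best)]
    for step in range(1, max_steps + 1):
        i, j = sorted(rng.sample(range(len(tour)), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        length = tour_length(candidate, cities)
        if length < best:  # accept only strict improvements
            tour, best = candidate, length
            log.append((step, best))
    return tour, log


if __name__ == "__main__":
    rng = random.Random(1)
    cities = [(rng.random(), rng.random()) for _ in range(20)]  # made-up instance
    tour, log = local_search(cities, 2000, rng)
    for step, length in log:
        print(step, round(length, 4))
```

Stopping the loop at any step and reading the last log entry gives the best-so-far result for that runtime budget, which is exactly the kind of data such experiments collect.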

Call for Papers

Special Issue on Benchmarking of Computational Intelligence Algorithms

Applied Soft Computing by Elsevier B.V.


Computational Intelligence (CI) is a huge and expanding field which is rapidly gaining importance, attracting more and more interest from both academia and industry. It includes a wide and ever-growing variety of optimization and machine learning algorithms, which, in turn, are applied to an even wider and faster growing range of different problem domains. For all of these domains and application scenarios, we want to pick the best algorithms. Actually, we want to do more: we want to improve upon the best algorithm. This requires a deep understanding of the problem at hand, the performance of the algorithms we have for that problem, the features that make instances of the problem hard for these algorithms, and the parameter settings for which the algorithms perform best. Such knowledge can only be obtained empirically, by collecting data from experiments, by analyzing this data statistically, and by mining new information from it. Benchmarking has been the engine driving research in the fields of optimization and machine learning for decades, yet its potential has not been fully explored. Benchmarking the algorithms of Computational Intelligence is an application of Computational Intelligence itself! This special issue of the EI/SCIE-indexed Applied Soft Computing journal published by Elsevier B.V. solicited novel contributions from this domain according to the topics of interest listed below.

Here you can download the Call for Papers (CfP) in PDF format and here as plain text file. The journal's website where the papers appear is here. The special issue has now been completed and 14 articles were accepted and published.

One of my fundamental research interests is how we can determine which optimization algorithm is good for which problem.

Unfortunately, answering this question is quite complicated. For most practically relevant problems, we need to find a trade-off between the (run)time we can invest in getting a good solution and the quality of said solution. Furthermore, the performance of almost all algorithms cannot just be described by a single pair of "solution quality" and "time needed to get a solution of that quality". Instead, these (anytime) algorithms start with an initial (often not-so-good) guess about the solution and then improve it step by step. In other words, their runtime behavior can be described as something like a function relating solution quality to runtime. It is not a real function, though, since a) many algorithms are randomized, meaning that they behave differently every time you use them, even with the same input data, and b) an algorithm will usually behave differently on different instances of an optimization problem type.

This means that we need to do a lot of experiments: We need to apply an optimization algorithm multiple times to a given optimization problem instance in order to "average out" the randomness. Each time, we need to collect data about the whole runtime behavior, not just the final results. Then we need to do this for multiple instances with different features in order to learn how, e.g., the scale of a problem influences the algorithm behavior. This means that we will obtain quite a lot of data from many algorithm setups on many problem instances. The question that researchers face is thus "How can we extract useful information from that data?" How can we obtain information which helps us to improve our algorithms? How can we get data from which we can learn about the weaknesses of our methods so that we can improve them?
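The experimental procedure described above can be sketched in a few lines of Python. This is a hedged illustration, not the framework from the presentation: `randomized_hill_climber` is a made-up stand-in for a randomized anytime algorithm on a toy problem (minimizing a sum of squares), the problem dimension plays the role of the instance "scale", and `run_experiment` and `median_at` are hypothetical helper names. Each run records the best-so-far quality after every step, and the median over the runs "averages out" the randomness at a chosen runtime budget.

```python
import random
import statistics


def randomized_hill_climber(dimension, max_steps, rng):
    """Toy stand-in algorithm: minimize the sum of squares, starting in [-1, 1]^dimension.

    Returns the whole runtime behavior: the best-so-far quality after every step.
    """
    x = [rng.uniform(-1, 1) for _ in range(dimension)]
    best = sum(v * v for v in x)
    trace = [best]
    for _ in range(max_steps):
        y = [v + rng.gauss(0, 0.1) for v in x]  # small random modification
        quality = sum(v * v for v in y)
        if quality < best:
            x, best = y, quality
        trace.append(best)
    return trace


def run_experiment(dimensions, runs, max_steps, seed=0):
    """Apply the algorithm `runs` times to each instance; returns {dimension: [trace, ...]}."""
    rng = random.Random(seed)
    return {
        d: [randomized_hill_climber(d, max_steps, rng) for _ in range(runs)]
        for d in dimensions
    }


def median_at(traces, step):
    """Median best-so-far quality over all runs after `step` steps."""
    return statistics.median(trace[step] for trace in traces)


if __name__ == "__main__":
    data = run_experiment(dimensions=[2, 10, 50], runs=11, max_steps=1000)
    for d, traces in data.items():
        print(d, [median_at(traces, s) for s in (0, 10, 100, 1000)])
```

Even this tiny setup already yields three instances times eleven runs times 1001 recorded points, which hints at how quickly the data grows once many algorithm setups and instances are involved.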

In this research presentation, I discuss my take on this subject. I introduce a process for automatically discovering the reasons why a certain algorithm behaves as it does and why a problem instance is harder for a set of algorithms than another. This process has already been implemented in our open source framework; the current implementation is written in R.

Some time ago, I discussed why global optimization with an Evolutionary Algorithm (EA) is not necessarily better than local search. Actually, I get asked the question "Why should I use an EA?" quite a few times. Thus, today, it is time to write down a few ideas about why and why not you may benefit from using an EA. I tried to be objective, which is not entirely easy since I work in that domain.

Currently, two of the leading industrial nations, Germany and China, are pushing their industries to achieve a higher degree of automation. Automation is among the key technologies of concepts such as Industry 4.0 and Made in China 2025 [中国制造2025]. The goal is not automation in the traditional sense, i.e., the fixed and rigid implementation of static processes which are to be repeated millions of times in exactly the same way. Instead, decisions should be automated, i.e., the machinery carrying out processes in production and logistics should dynamically decide what to do based on its environment and its current situation. In other words, these machines should become intelligent.

As a researcher in optimization and operations research, this idea is not new to me. Actually, this is exactly the goal of our work, and it has been the goal for the past seven decades, with one major difference: the level at which the automated, intelligent decision process takes place. In this article I want to briefly discuss my point of view on this matter.
