Research Talk: Automating Scientific Research in Optimization

Written by: Thomas Weise; Created: 09 July 2017

One of my fundamental research interests is how we can determine which optimization algorithm is good for which problem.

Unfortunately, answering this question is quite complicated. For most practically relevant problems, we need to find a trade-off between the (run)time we can invest in getting a good solution and the quality of said solution. Furthermore, the performance of almost all algorithms cannot just be described by a single pair of "solution quality" and "time needed to get a solution of that quality". Instead, these (anytime) algorithms start with an initial (often not-so-good) guess about the solution and then improve it step-by-step. In other words, their runtime behavior can be described as something like a function relating solution quality to runtime. It is not a real function, though, since a) many algorithms are randomized, meaning that they behave differently every time you use them, even with the same input data, and b) an algorithm will usually behave differently on different instances of an optimization problem type.

This means that we need to do a lot of experiments: We need to apply an optimization algorithm multiple times to a given optimization problem instance in order to "average out" the randomness. Each time, we need to collect data about the whole runtime behavior, not just the final results. Then we need to do this for multiple instances with different features in order to learn about how, e.g., the scale of a problem influences the algorithm behavior. This means that we will obtain quite a lot of data from many algorithm setups on many problem instances. The question that researchers face is thus "How can we extract useful information from that data?" How can we obtain information which helps us to improve our algorithms? How can we get data from which we can learn about the weaknesses of our methods so that we can improve them?
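To make the experimental procedure described above more concrete, here is a minimal Python sketch. It is only an illustration and not part of the optimizationBenchmarking.org framework: the toy problem (a bit-flip hill climber minimizing the number of zero bits in a bit string) and all function names are assumptions chosen for this example. Each run logs every improvement as a (step, quality) pair, so the whole runtime behavior is kept, and several runs with different seeds are then summarized by the median quality at a few checkpoints.

```python
import random
import statistics


def run_hill_climber(n_bits, max_steps, seed):
    """One run of a randomized bit-flip hill climber (toy example).

    Quality is the number of zero bits (smaller is better, 0 is optimal).
    Returns the list of (step, best_quality) pairs, one per improvement.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    best = n_bits - sum(x)
    log = [(0, best)]  # the initial, often not-so-good, guess
    for step in range(1, max_steps + 1):
        i = rng.randrange(n_bits)
        x[i] ^= 1  # flip one randomly chosen bit
        quality = n_bits - sum(x)
        if quality < best:
            best = quality
            log.append((step, best))  # record the improvement
        else:
            x[i] ^= 1  # the flip made things worse: undo it
    return log


def quality_at(log, step):
    """Best quality a run had reached at or before the given step."""
    best = log[0][1]
    for t, q in log:
        if t > step:
            break
        best = q
    return best


def median_progress(logs, checkpoints):
    """Median quality over all runs at each checkpoint."""
    return {t: statistics.median(quality_at(log, t) for log in logs)
            for t in checkpoints}


# Eleven independent runs on the same instance, differing only in the seed.
logs = [run_hill_climber(n_bits=32, max_steps=200, seed=s) for s in range(11)]
progress = median_progress(logs, checkpoints=[0, 10, 50, 200])
for step, quality in progress.items():
    print(f"step {step:>3}: median quality {quality}")
```

Repeating this for several instance sizes (here the hypothetical parameter n_bits) would yield the kind of multi-instance, multi-run data set the text refers to.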

In this research presentation, I discuss my take on this subject. I introduce a process for automatically discovering the reasons why a certain algorithm behaves as it does and why a problem instance is harder for a set of algorithms than another. This process has already been implemented in our open source optimizationBenchmarking.org framework and is currently implemented in R.

Talk Abstract

In the fields of heuristic optimization and machine learning, experimentation is the way to assess the performance of an algorithm setup and the hardness of problems. Good experimentation is complicated. Most algorithms in the domain are anytime algorithms, meaning they can improve their approximation quality over time. This means that one algorithm may initially perform better than another one but converge to worse solutions in the end. Instead of single final results, the whole runtime behavior of algorithms needs to be compared (and runtime may be measured in multiple ways). We do not just want to know which algorithm performs best and which problem is the hardest; a researcher wants to know why. We introduce a process which can 1) automatically model the progress of algorithm setups on different problem instances based on data collected in experiments, 2) use these models to discover clusters of algorithm (or problem instance) behaviors, and 3) propose reasons why a certain algorithm setup (or problem instance) belongs to a certain algorithm (or problem instance) behavior cluster. These high-level conclusions are presented in the form of decision trees relating algorithm parameters (or instance features) to cluster ids. We emphasize the duality of analyzing algorithm setups and problem instances. Our process is implemented as open source software and tested in two case studies, on the Maximum Satisfiability Problem and the Traveling Salesman Problem. Besides its basic application to raw experimental data, yielding clusters and explanations of "quantitative" algorithm behavior, our process also allows for "qualitative" conclusions by feeding it with data which is normalized with problem features or algorithm parameters. It can also be applied recursively, e.g., to further investigate the behavior of the algorithms in the cluster with the best-performing setups on the problem instances belonging to the cluster of hardest instances. Both use cases are investigated in the case studies.
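The three steps named in the abstract (model, cluster, explain) can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions and not the method or API of the actual software: the exponential-decay model, the one-dimensional 2-means clustering, and the depth-1 decision tree ("stump") are stand-ins picked for brevity, the parameter name mutation_rates is hypothetical, and the progress curves are synthetic rather than experimental results.

```python
import math


def fit_decay(times, qualities):
    """Step 1 (model): least-squares fit of log(q) = log(a) - b * t.

    Returns (a, b); b is the decay rate summarizing the progress curve.
    """
    n = len(times)
    ys = [math.log(q) for q in qualities]
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    return math.exp(mean_y - slope * mean_t), -slope


def two_means(values, iterations=20):
    """Step 2 (cluster): 1-D k-means with k=2; returns a cluster id per value."""
    centers = [min(values), max(values)]
    ids = [0] * len(values)
    for _ in range(iterations):
        ids = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
               for v in values]
        for c in (0, 1):
            members = [v for v, i in zip(values, ids) if i == c]
            if members:
                centers[c] = sum(members) / len(members)
    return ids


def best_stump(params, ids):
    """Step 3 (explain): threshold on a parameter that best predicts cluster ids.

    Returns (errors, threshold, cluster id predicted below the threshold).
    """
    xs = sorted(set(params))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        for left in (0, 1):
            errors = sum((left if p < thr else 1 - left) != i
                         for p, i in zip(params, ids))
            if best is None or errors < best[0]:
                best = (errors, thr, left)
    return best


# Synthetic data (an assumption for this sketch): setups with a small
# mutation rate converge quickly, the others converge slowly.
times = [1, 2, 4, 8, 16]
mutation_rates = [0.01, 0.02, 0.05, 0.2, 0.5, 0.8]
curves = [[10.0 * math.exp(-(0.5 if m < 0.1 else 0.05) * t) for t in times]
          for m in mutation_rates]

decays = [fit_decay(times, c)[1] for c in curves]
cluster_ids = two_means(decays)
errors, threshold, left_cluster = best_stump(mutation_rates, cluster_ids)
print(f"setups with mutation rate < {threshold} fall into cluster "
      f"{left_cluster} ({errors} misclassified)")
```

Swapping the roles of algorithm parameters and instance features in the last step gives the dual analysis of problem instances that the abstract emphasizes.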

Slide Versions

the slides of the 1.5 hour version of this research presentation (~7 MiB)
the slides of the 1 hour version of this research presentation (~6 MiB)
the slides of the 30 minute version of this research presentation (~3 MiB)

Presentations (reverse chronological order)

at the Goethe University of Frankfurt in Frankfurt, Germany in 2018
at the Leibniz University Hannover in Hannover, Germany in 2018
at the Leuphana University of Lüneburg, Germany in 2018
at the University of Bremen in Bremen, Germany in 2018
at the Carl von Ossietzky University of Oldenburg, Germany in 2018
at the University of Kassel in Kassel, Germany in 2018
at the IAS and IAO in Stuttgart, Germany in 2018
at the Johannes Gutenberg University Mainz in Mainz, Germany in 2018
at the Julius-Maximilians-Universität in Würzburg, Germany in 2018
at the Technical University of Munich in Munich, Germany in 2018
at the University of Leipzig in Leipzig, Germany in 2018
at the TU Chemnitz in Chemnitz, Germany in 2018
at the TU Bergakademie Freiberg in Freiberg, Germany in 2018
at the TU Dresden in Dresden, Germany in 2018
at the Otto von Guericke University Magdeburg in Magdeburg, Germany in 2018
at the Zuse Institute Berlin in Berlin, Germany in 2018
during the First Institute Workshop on Applied Optimization in our own group in 2018
at the Anhui University in Hefei, Anhui, China as part of the Symposium on Evolutionary Computation in 2017
at the Ludwig-Maximilians-Universität Munich in Munich (München), Germany in 2017
at the Johannes Gutenberg University Mainz in Mainz, Germany in 2017
at the University of Cologne in Köln (Cologne), Germany in 2017
at the University of Leipzig in Leipzig, Germany in 2017
at the University of Applied Sciences Zittau/Görlitz in Görlitz, Germany in 2017
at the Otto von Guericke University Magdeburg in Magdeburg, Germany in 2017
at the Clausthal University of Technology in Clausthal-Zellerfeld, Germany in 2017
at the Technische Universität Ilmenau in Ilmenau, Germany in 2017
at the MiWoCI Workshop at the University of Applied Sciences Mittweida in Mittweida, Germany in 2017
at the Chemnitz University of Technology in Chemnitz, Germany in 2017
at the Friedrich Schiller University in Jena, Germany in 2017
