This paper proposes the use of multi-objective ensemble learning to monitor drinking-water quality. The problem involves a data set with an extreme imbalance ratio, in which events (the minority class) must be correctly detected from a time series of water-quality and operational data recorded at one-minute intervals. First, the data set is preprocessed by imputing missing data, adjusting for concept drift and adding new statistical features, such as the moving average, moving standard deviation, moving maximum and moving minimum. Next, two ensemble learning techniques are used, namely SMOTEBoost and RUSBoost. These techniques have been developed specifically for dealing with imbalanced data, where the base learners are trained by adjusting the ratio between the classes. The first algorithm oversamples the minority class, while the second under-samples the majority class. Finally, multi-objective optimisation is used to prune the base models of these ensembles in order to maximise the prediction score without reducing generalisation performance. In the training phase, the model is trained, optimised and evaluated using hold-out validation on a given training data set. Lastly, the trained model is embedded in a framework used for online event detection and for assessing the model's performance.
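As a hedged illustration of the moving-statistics feature engineering described above, the sketch below appends rolling features to one sensor column; the column name and window length are illustrative assumptions, not taken from the paper.

```python
import pandas as pd

def add_moving_features(df: pd.DataFrame, column: str = "chlorine", window: int = 60) -> pd.DataFrame:
    """Append moving average, std, max and min of one sensor column (window in minutes)."""
    rolled = df[column].rolling(window=window, min_periods=1)
    df[f"{column}_mavg"] = rolled.mean()
    df[f"{column}_mstd"] = rolled.std()
    df[f"{column}_mmax"] = rolled.max()
    df[f"{column}_mmin"] = rolled.min()
    return df
```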
In this paper, we apply the SOMA T3A algorithm to solve the 10 hard problems of the 100-Digit Challenge of the GECCO 2019 Competition. With effective improvements in how the Migrants and the Leader are chosen in the organization process, as well as adaptive Step and PRT parameters in the migration process, the algorithm achieves promising results. The total score achieved by the algorithm is 92.04 points.
Among constraint-handling techniques, penalty approaches (especially adaptive penalty methods) are simple and flexible, and have so far been combined with various Evolutionary Algorithms. In this paper, we propose a new adaptive penalty method combined with L-SHADE as a method to effectively optimize the 28 benchmark problems provided for the GECCO 2019 Competition on constrained single-objective numerical optimization. The penalty factor is adjusted based on the trade-off between the objective function value and the constraint violation attainable by individuals, the ranges of these two quantities over the individuals, and the proportion of feasible individuals in the current population. In this way, the proposed method balances the objective function value and the constraint violation, so that the population does not converge in a single direction in which one improves while the other worsens. In addition, the method requires only a few parameters, which are easy to set.
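A minimal sketch of an adaptive penalty of the general kind described above; the exact update rule of the paper is not reproduced, and scaling the factor by the feasibility ratio and the value ranges is an assumption for illustration.

```python
import numpy as np

def penalized_fitness(f_values: np.ndarray, violations: np.ndarray) -> np.ndarray:
    """Penalize objective values by constraint violation with an adaptively scaled factor."""
    feas_ratio = np.mean(violations == 0.0)      # proportion of feasible individuals
    f_range = np.ptp(f_values) + 1e-12           # range of objective values in the population
    v_range = np.ptp(violations) + 1e-12         # range of constraint violations
    penalty_factor = (1.0 - feas_ratio) * f_range / v_range
    return f_values + penalty_factor * violations
```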
In this paper, a hybrid-adaptive differential evolution with a decay function (HyDE-DF) is proposed for numerical function optimization. The proposed HyDE-DF is applied to the 100-Digit Challenge on a set of 10 benchmark functions. Results show that HyDE-DF can achieve a 93/100 score, proving its effectiveness for numerical optimization.
A novel multiobjective evolutionary algorithm for constrained optimisation is proposed in this paper. It transforms a constrained optimisation problem into a two-objective optimisation problem. One objective is equivalent to solving the original constrained problem, and the other is the degree of constraint violation, which plays only a helper role. This multiobjective problem is decomposed into several single-objective optimisation problems using a dynamic weighted-sum approach. Each single objective eventually tends to an equivalent objective. A decomposition-based multiobjective differential evolution algorithm is designed to solve these single-objective problems simultaneously.
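A hedged sketch of one subproblem in a dynamic weighted-sum decomposition of the two objectives (original objective f and constraint violation v): the helper weight decays over the run so the subproblem eventually tends to the equivalent objective. The linear schedule is an assumption; the paper's actual schedule may differ.

```python
def decomposed_objective(f: float, v: float, generation: int, max_generations: int, w0: float = 0.5) -> float:
    """Scalarised subproblem whose helper-objective weight decays to zero over the run."""
    w = w0 * (1.0 - generation / max_generations)  # weight of the helper objective (violation)
    return (1.0 - w) * f + w * v                   # tends to the original objective as w -> 0
```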
This paper describes a competition entry for the "100-Digit Challenge, and Four Other Numerical Optimization Competitions" at The Genetic and Evolutionary Computation Conference (GECCO) 2019, using function evaluation budgets of up to 1e+12 and large population sizes in Distance-based Success History Differential Evolution for the 100-Digit Challenge and Numerical Optimization Scenarios (DISHchain 1e+12).
One step on the way to sustainable cities and society is the massive growth of electric vehicles and of producers of renewable energy (RE) such as sun and wind. However, these advances challenge the efficient management of energy because of the uncertainty they introduce into the problem. In this paper, a Cellular Estimation of Distribution Algorithm (CUMDANCauchy-C1) is designed to solve the energy resource management problem under uncertainty. CUMDANCauchy-C1 is a cellular evolutionary algorithm with univariate estimation and a neighborhood of one, which learns a combination of Normal and Cauchy distributions from the global population to generate the new individuals.
The framework developed for the "2019 Evolutionary Computation in Uncertain Environments: A Smart Grid Application" competition was used to test the proposed algorithm. A comparison with the winning algorithms of the 2018 edition was also carried out within the current competition framework. The experimental results show that CUMDANCauchy-C1 outperforms the 2018 winning algorithms.
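The sketch below gives a hedged illustration of a univariate estimation-of-distribution step that combines Normal and Cauchy sampling, in the spirit of CUMDANCauchy-C1; the 50/50 mixture and the use of the population mean and standard deviation as location and scale are assumptions for illustration only.

```python
import numpy as np

def sample_offspring(population: np.ndarray, n_offspring: int, rng=np.random.default_rng()) -> np.ndarray:
    """Sample offspring from per-variable Normal/Cauchy models estimated from the population."""
    mu = population.mean(axis=0)                  # per-variable location
    sigma = population.std(axis=0) + 1e-12        # per-variable scale
    n_vars = population.shape[1]
    use_cauchy = rng.random((n_offspring, n_vars)) < 0.5
    normal = rng.normal(mu, sigma, size=(n_offspring, n_vars))
    cauchy = mu + sigma * rng.standard_cauchy(size=(n_offspring, n_vars))
    return np.where(use_cauchy, cauchy, normal)
```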
Legends of the Three Kingdoms is a Chinese card game based on the novel Romance of the Three Kingdoms. As a turn-based strategy game, Legends of the Three Kingdoms is an imperfect-information game in which agents interact with one another to deduce the unknown information. The best-known solution concept for an imperfect-information game is a Nash equilibrium, in which no agent can benefit from changing strategies. Tencent AI Lab has recently developed a novel learning-based Hierarchical Macro Strategy model for AI to master MOBA games, a sub-genre of RTS games. In this paper, we adapt this method to our AI to play Legends of the Three Kingdoms in a reduced four-player setting.
This extended abstract previews the use of Gaussian processes in a surrogate-model version of the CMA-ES, a state-of-the-art black-box continuous optimization algorithm. The proposed algorithm, DTS-CMA-ES, exploits the benefits of Gaussian-process uncertainty prediction, especially during the selection of points for evaluation with the surrogate model. Very brief results are presented here, while a much more elaborate description of the methods, parameter settings and detailed experimental results can be found in the original article, Gaussian Process Surrogate Models for the CMA Evolution Strategy [2], to appear in Evolutionary Computation.
The multiobjective knapsack problem (MOKP) is a combinatorial problem that arises in various applications, including resource allocation, computer science and finance. Evolutionary multiobjective optimization algorithms (EMOAs) can be effective in solving MOKPs. However, they often face difficulties due to the loss of solution diversity and poor scalability. To address these issues, our study [2] proposes to generate candidate solutions by artificial neural networks. This is intended to provide intelligence to the search. As gradient-based learning cannot be used when target values are unknown, neuroevolution is adapted to adjust the neural network parameters. The proposal is implemented within a state-of-the-art EMOA and benchmarked against traditional search operators based on binary crossover. The obtained experimental results indicate a superior performance of the proposed approach. Furthermore, it is advantageous in terms of scalability and can be readily incorporated into different EMOAs.
Understanding the exploration and exploitation powers of meta-heuristic stochastic optimization algorithms is very important for algorithm developers. For this reason, we have recently proposed an approach for making a statistical comparison of meta-heuristic stochastic optimization algorithms according to the distribution of the solutions in the search space, which is also presented in this paper. Its main contribution is support for identifying the exploration and exploitation powers of the compared algorithms. This is especially important when dealing with multimodal search spaces, which consist of many local optima with similar values, and with large-scale continuous optimization problems, where it is hard to understand the reasons for differences in performance. Experimental results show that our recently proposed approach gives very promising results.
The search for interpretable reinforcement learning policies is of high academic and industrial interest. Especially for industrial systems, domain experts are more likely to deploy autonomously learned controllers if they are understandable and convenient to evaluate. Basic algebraic equations are supposed to meet these requirements, as long as they are restricted to an adequate complexity. In our recent work "Interpretable policies for reinforcement learning by genetic programming" published in Engineering Applications of Artificial Intelligence 76 (2018), we introduced the genetic programming for reinforcement learning (GPRL) approach. GPRL uses model-based batch reinforcement learning and genetic programming and autonomously learns policy equations from preexisting default state-action trajectory samples. Experiments on three reinforcement learning benchmarks demonstrate that GPRL can produce human-interpretable policies of high control performance.
(Strictly) monotone functions are an important benchmark for evolutionary algorithms. For the (1 + 1)-EA with mutation rate c/n, it was known that it can optimize any monotone function on n bits in time O(n log n) if c < 1. However, it was also known that there are monotone functions on which the (1 + 1)-EA needs exponential time if c > 2.2. For c = 1 it was known that the runtime is always polynomial, but it was unclear whether it is quasilinear, and it was unclear whether c = 1 is the threshold at which the runtime jumps from polynomial to superpolynomial.
We show that there exists a c0 > 1 such that for all 0 < c ≤ c0 the (1 + 1)-Evolutionary Algorithm with mutation rate c/n finds the optimum in O(n log² n) steps in expectation. The proof is based on an adaptation of Moser's entropy compression argument: we show that a long runtime would allow us to encode the random steps of the algorithm with fewer bits than their entropy.
This short note summarizes the results that have been published as "When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument" in the proceedings of the 16th Workshop on Analytic Algorithmics and Combinatorics (ANALCO), SIAM, 2019 [9].
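For concreteness, a minimal sketch of the (1 + 1)-EA with mutation rate c/n is given below, using OneMax as a stand-in monotone function; the acceptance rule and stopping criterion are the standard ones and are assumptions for illustration.

```python
import random

def one_plus_one_ea(n: int, c: float, fitness, max_steps: int = 10**6) -> int:
    """Run the (1+1)-EA with mutation rate c/n; return the step at which the optimum is found."""
    x = [random.randint(0, 1) for _ in range(n)]
    for step in range(1, max_steps + 1):
        y = [bit ^ (random.random() < c / n) for bit in x]  # flip each bit with probability c/n
        if fitness(y) >= fitness(x):                        # accept offspring if not worse
            x = y
        if sum(x) == n:                                     # optimum of OneMax reached
            return step
    return max_steps

steps = one_plus_one_ea(n=100, c=1.0, fitness=sum)
```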
This abstract summarizes the results reported in the paper [3]. In this paper a method named Low-Dimensional Euclidean Embedding (LDEE) is proposed, which can be used for visualizing high-dimensional combinatorial spaces, for example search spaces of metaheuristic algorithms solving combinatorial optimization problems. The LDEE method transforms solutions of the optimization problem from the search space Ω to R^k (where in practice k = 2 or 3). Points embedded in R^k can be used, for example, to visualize populations in an evolutionary algorithm.
The paper shows how the assumptions underlying the t-Distributed Stochastic Neighbor Embedding (t-SNE) method can be generalized to combinatorial (for example permutation) spaces. The LDEE method combines the generalized t-SNE method with a new Vacuum Embedding method proposed in this paper to perform the mapping Ω → R^k.
We describe a genetic programming-based system for the automated discovery of new test statistics. Specifically, our system was able to discover test statistics as powerful as the t-test for comparing sample means from two distributions with equal variances [1].
Biclustering is a technique that looks for patterns hidden in some columns and some rows of the input data. Evolutionary search-based biclustering (EBIC) is probably the first biclustering method that combines highly accurate detection of multiple patterns with support for big data. EBIC has recently been extended to a multi-GPU method, allowing very large datasets to be analyzed. In this short paper, we discuss the scalability of EBIC as well as its suitability for RNA-seq and single-cell RNA-seq (scRNA-seq) experiments.
With the implementation of dynamic tariffs, the electricity retailer may define distinct energy prices throughout the day. These tariff schemes encourage consumers to adopt different consumption patterns with potential savings and enable the retailer to manage the interplay between wholesale and retail prices. In this work, the interaction between retailer and consumers is modelled hierarchically as a bi-level (BL) programming problem. However, if the lower-level (LL) problem, which deals with the optimal operation of the consumer's appliances, is difficult to solve, it may not be possible to obtain its optimal solution, and therefore the solution to the BL problem is not feasible. Considering a computation budget for solving LL problems, a hybrid particle swarm optimization (PSO) - mixed-integer programming (MIP) approach is proposed to estimate good-quality bounds for the upper-level (UL) objective function. This work is based on [4].
Stochastic synthesis of recursive functions has historically proved difficult, not least due to issues of non-termination and the often ad hoc methods for addressing this. We propose a general method of implicit recursion which operates via an automatically-derivable decomposition of datatype structure by cases, thereby ensuring well-foundedness. The method is applied to recursive functions of long-standing interest and the results outperform recent work which combines two leading approaches and employs a 'human in the loop' to define the recursion structure. We show that stochastic synthesis with the proposed method on benchmark functions is effective even with random search, motivating a need for more difficult recursive benchmarks in future. This paper summarizes work that appeared in [1].
A selection hyper-heuristic is used to optimise a number of well-known benchmark problems. The resulting sequences of heuristics and objective function values are used to create a database. The sequences in the database are broken down into subsequences, and the concept of a logarithmic return is used to discriminate between "effective" subsequences, which tend to decrease the objective value, and "disruptive" subsequences, which tend to increase the objective value. These subsequences are then employed in a sequence-based hyper-heuristic and evaluated on unseen benchmark problems. The results demonstrate that the "effective" subsequences perform better than the "disruptive" subsequences across a number of problem domains with 99% confidence. The identification of subsequences of heuristics that can be shown to be effective across a number of problems or problem domains could have important implications for the design of hyper-heuristics.
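A hedged illustration of using the logarithmic return of the objective value to label a heuristic subsequence as "effective" (value decreases) or "disruptive" (value increases); the zero threshold and minimisation setting are assumptions, and positive objective values are assumed.

```python
import math

def log_return(obj_before: float, obj_after: float) -> float:
    """Logarithmic return of the objective over a subsequence (objective values assumed > 0)."""
    return math.log(obj_after / obj_before)   # negative for an improving (minimisation) subsequence

def classify_subsequence(obj_before: float, obj_after: float) -> str:
    return "effective" if log_return(obj_before, obj_after) < 0 else "disruptive"
```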
The real-world implementation of Underwater Glider Path Planning (UGPP) over the dynamic and changing environment of deep ocean waters requires complex mission planning under very high uncertainty. Such a mission is also influenced to a large extent by remote sensing and weather-model forecasts used to predict spatial currents in the deep sea, further limiting the time available for accurate run-time decisions by the pilot, who needs to re-test several possible mission scenarios in a short time, usually a few minutes.
Hence, this paper optimizes the recently proposed UGPP mission scenarios with a well-performing algorithm for continuous numerical optimization, the Success-History Based Adaptive Differential Evolution algorithm (SHADE) with linear population size reduction (L-SHADE).
An algorithm for path optimization considering the ocean currents' model predictions, vessel dynamics and limited communication yields potential way-points for the vessel based on the most probable scenario; this is especially useful for short-term opportunistic missions where no reactive control is possible.
The newly obtained results with L-SHADE outperform existing literature results for the UGPP benchmark scenarios. This new application of Evolutionary Algorithms to UGPP therefore contributes significantly to the capacity of decision-makers when they use the improved UGPP expert system, which yields better trajectories.
In this paper, we study an integrated production-inventory-routing planning (PIRP) problem, which addresses important decisions in supply chains and is classified as NP-hard. Studies show that companies that solve their production, inventory and routing decisions simultaneously can reduce their total costs significantly and respond to customers' needs efficiently. Nowadays, due to strict regulations, companies must take environmental considerations into account as well as cost minimization in their processes. Therefore, in this paper we develop a mixed-integer linear programming model to formulate the green PIRP (GPIRP) problem, which optimizes the economic and social dimensions of the supply chain. We also propose a two-phase genetic algorithm (GA) in which the inventory and production decisions are solved in the first phase and the vehicle routing and transportation decisions are solved in the second phase. In the computational experiments we conduct a sensitivity analysis to investigate the efficiency of our proposed solution algorithm on large-sized instances.
Feature selection is a machine learning concept that entails selecting relevant features while eliminating irrelevant and redundant features. This process helps to speed up learning. In this paper, an Estimation of Distribution Algorithm (EDA) is applied to a feature selection problem originating from a legal business. The EDA was able to generate a realistic solution to the real-world problem.
Resource level allocation entails assigning execution times to a set of resource levels required to complete a task. This is an important problem, particularly in the services sector. We consider a real-world variant of this problem originating from a legal business. The objective considered is the maximisation of damages savings.
We apply an Estimation of Distribution Algorithm (EDA) to this problem and use a machine learning model, Random Forest, as a fitness approximation method. The hybrid EDA presents promising results.
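A hedged sketch of how a Random Forest might serve as a fitness-approximation (surrogate) model inside an EDA loop: the model is fitted on solutions already scored exactly and then used to rank new candidates. The binary encoding, toy data and surrogate usage shown here are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
evaluated = rng.integers(0, 2, size=(200, 30))              # toy binary solutions already evaluated exactly
true_fitness = evaluated.sum(axis=1) + rng.normal(0, 1, 200)  # toy exact fitness values

surrogate = RandomForestRegressor(n_estimators=100).fit(evaluated, true_fitness)

candidates = rng.integers(0, 2, size=(50, 30))              # candidates sampled by the EDA
approx = surrogate.predict(candidates)                      # surrogate fitness used to rank candidates
```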
In machine learning, a coreset is defined as a subset of the training set with which an algorithm obtains performance similar to what it would deliver if trained on the whole original data. Advantages of coresets include improving training speed and easing human understanding. Coreset discovery is an open line of research, as limiting the training might also impair the quality of the result. In contrast, virtual points, here called archetypes, might be far more informative for a machine learning algorithm. Starting from this intuition, a novel evolutionary approach to archetype-set discovery is presented: starting from a population seeded with candidate coresets, a multi-objective evolutionary algorithm modifies them and eventually creates archetype sets, minimizing both the number of points in the set and the classification error. Experimental results on popular benchmarks show that the proposed approach is able to deliver results that allow a classifier to obtain a lower error and a better ability to generalize on unseen data than state-of-the-art coreset discovery techniques.
I present a simple and yet effective GA-based approach to content generation in the Sudoku domain. The GA finds multiple full boards that can act as solutions for Sudoku and Killer Sudoku puzzles. In this work I use a binning-based diversity-maintenance approach to encourage the GA to find more solution boards. The results show that, although both approaches routinely manage to find multiple solution boards, it is in fact the simple GA without any diversity maintenance that finds more such boards. Using a simpler approach that manipulates the fitness function to penalize previously found solutions improves the algorithm further.
Generative adversarial networks (GANs) facilitate the learning of probability distributions of complex real-world data and allow neural networks to generate samples from the learned distribution. GANs (GAN and its variants) exhibit excellent performance in applications such as image generation and video generation. However, GANs sometimes experience problems during training with regard to capturing the distribution of the real data. We applied a genetic algorithm to improve and optimize the GAN's training performance. As a result, the convergence speed and stability during the training process improved compared to the conventional GAN.
Residual stresses, both macroscopic and microscopic, originate during conventional metallurgical processes. Knowing their magnitude and distribution is of great importance in the structural design of applications where fatigue, stress corrosion or thermal cycling occur (e.g., in the nuclear industry). The importance of these stresses is reflected in the large number of articles published in recent years, mainly focused on studying macroscopic stresses. However, there are no experimental studies that quantify the magnitude of microscopic triaxial stresses. This lack is due in part to the limitations of diffraction techniques (neutrons and synchrotron radiation). Since the measurement volume is much larger than the scale over which these microscopic stresses vary, their calculation is greatly complicated, because the methods used in the case of macroscopic stresses are not valid for microscopic ones. Furthermore, there is no reliable procedure to obtain the relaxed lattice parameter value, a key factor in the calculation of residual stresses. The aim of this project is to develop a methodology for mapping microscopic stresses, particularly in aluminium alloys. The procedure will use experimental diffraction results from large European facilities, mainly neutron diffraction, which will be analyzed using evolutionary algorithms, computational techniques that handle a large number of variables. The procedure will be based on the analysis of the displacements of the diffraction peaks and, fundamentally, their broadening.
A central problem in training artificial agents to perform complex skills is specifying appropriate cost functions whose optimization will lead to the desired behavior. Specifying detailed cost functions is laborious and often inefficient. The training of agents in competitive and cooperative multi-agent environments provides an avenue to circumvent these limitations: through competition and cooperation, agents provide each other with a natural curriculum that can lead to the emergence of complicated skills, even if the rewards of the multi-agent game are simple [1].
Here we explore the emergence of complex strategies and skills in a simple hide-and-seek game simulated in a 3-D physics environment. We show that training using deep reinforcement learning (RL) leads to the emergence of several rounds of strategies, counter-strategies, and skills composed of several sequential behaviors. Our results suggest that training multiple agents in a sufficiently complex environment using large-scale RL can lead to open-ended development of behavior. We furthermore show that emergent skills can be extracted and re-used in distinct environments. This skill transfer both provides a useful evaluation metric for multi-agent emergence and suggests that multi-agent pre-training might be a useful strategy for generating targeted skills.
This work proposes a multi-objective extension of a real-world inventory routing problem (IRP), a generalisation of the classical vehicle routing problem (VRP) with vendor-managed inventory (VMI) replenishment. While many mathematical formulations and solution models already exist, this study incorporates business-related and risk considerations that make it unique. It is known that a significant volume of hazardous materials travels every day. Considering the risks arising from the transportation of hazardous materials as a criterion for selecting distribution routes could potentially reduce the likelihood of accidents and/or the expected consequences of accidents.
Reinforcement learning is generally suitable for putting actions in a specific order within a short sequence, but in the long run its greedy nature leads to eventual incompetence. This paper presents a brief description and implementation analysis of Action Sequence, which was designed to deal with such a "penny-wise and pound-foolish" problem. Based on a combination of genetic operations and Monte-Carlo tree search, our proposed method is expected to show improved computational efficiency, especially on problems of high complexity in which situational difficulties are often troublesome to resolve with naive behaviors. We tested the method on a video game environment to validate its overall performance.
The complexity of a problem can be substantially reduced through a change of basis; however, it is not easy to find an appropriate basis for the representation because basis evaluation is difficult. To address this issue, a method has been proposed to evaluate a basis based on the epistasis, which reflects the problem difficulty. However, this basis evaluation is very time-consuming. In this study, a method is proposed to evaluate a basis quickly by developing a model that estimates the epistasis from the basis using deep neural networks. Experimental results on variant-OneMax and NK-landscape problems show that the epistasis is estimated successfully by the proposed method.
We modified GPQuick to use SIMD parallel floating-point AVX 512-bit instructions and 48 threads, giving up to 139 billion GP operations per second (139 giga GPops) on a single Intel Xeon Gold 6126 2.60 GHz server. The multi-threaded single-instruction multiple-data genetic programming (GP) interpreter has evolved binary trees of more than 396 million instructions using subtree crossover and has run populations for a million generations.
This study aimed to improve the performance of machine learning prediction through feature selection using a genetic algorithm (GA) for predicting depression in elderly people, based on the survey data of the Korean Longitudinal Study of Aging (KLoSA) conducted in South Korea. The proposed feature-selection method uses the GA to find an optimized feature set through a fitness function that maximizes the correlations between the selected features and the label to be predicted while minimizing the correlations among the selected features. The effectiveness of the proposed GA was shown through comparative experiments.
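A hedged sketch of a correlation-based fitness of the kind described above: it rewards feature-label correlation and penalizes feature-feature correlation for a candidate feature subset. The equal weighting of the two terms is an assumption; the paper's exact fitness design may differ.

```python
import numpy as np

def fitness(mask: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """Fitness of a binary feature mask: mean |corr(feature, label)| minus mean |corr(feature, feature)|."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return -np.inf
    feat_label = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in idx])
    if idx.size == 1:
        feat_feat = 0.0
    else:
        corr = np.abs(np.corrcoef(X[:, idx], rowvar=False))
        feat_feat = (corr.sum() - idx.size) / (idx.size * (idx.size - 1))  # mean off-diagonal correlation
    return feat_label - feat_feat
```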
This paper proposes a hybrid evolutionary approach to feature engineering for mining frequent patterns from multidimensional financial ultra-high-frequency time series. Experiments performed on real-world data from the London Stock Exchange Rebuilt Order Book database confirm that the evolutionary algorithm is capable of significantly improving the classification results.
This work discusses the solution of a large-scale global optimization problem, Security Constrained Optimal Power Flow (SCOPF), using a method based on Cross Entropy (CE) and Evolutionary Particle Swarm Optimization (EPSO). The obtained solution is compared to the Entropy Enhanced Covariance Matrix Adaptation Evolution Strategy (EE-CMAES) and the Shrinking Net Algorithm (SNA). Experiments show that the approach reaches competitive results.
With the rise of diseases related to unhealthy lifestyles, such as heart attacks, overweight and diabetes, encouraging healthy and balanced habits in the population is one of the most important action points for governments around the world. The situation is even more critical when a high percentage of patients are children and teenagers whose habits consist mainly of eating fast or ultra-processed food and leading a sedentary life.
The development of healthy and balanced menu plans has become a typical task for physicians and nutritionists, and it is at this point that Computer Science has taken on an important role. Discovering new approaches for generating healthy, balanced and inexpensive menu plans will play an important role in reducing diseases in current and future generations.
In this paper, a recently proposed multi-objective evolutionary algorithm is compared to traditional multi-objective evolutionary algorithms for solving a novel multi-objective formulation of the Menu Planning Problem designed for school cafeterias. In order to evaluate the performance of the approaches selected for comparison, an exhaustive experimental assessment was carried out. First, we focused on a suitable selection of the algorithm's parameter values, so that the best configuration found could afterwards be compared to the remaining multi-objective optimisers.
For consumption-loan planning in experimental economics, this paper proposes an Estimation of Distribution Algorithm (EDA) that uses the Hamming distance in order to avoid convergence to a local solution. The EDA with Hamming distance restrains the quantitative change in the solution structure during the EDA process.
Modern gradient-based optimization methods for deep neural networks demonstrate outstanding results on image classification tasks. However, methods that do not rely on gradient feedback fail to tackle deep network optimization. In the field of evolutionary computation, applying evolutionary algorithms directly to network weights remains an unresolved challenge. In this paper we examine a new framework for the evolution of deep nets. Based on an empirical analysis, we propose the use of linear sub-spaces of problems to search for promising optimization trajectories in parameter space, as opposed to weight evolution. We show that linear sub-spaces of loss functions are sufficiently well-behaved to allow trajectory evaluation. Furthermore, we introduce a fitness measure to show that it is possible to correctly categorize trajectories according to their distance from the optimal path. As such, this work introduces an alternative approach to the evolutionary optimization of deep networks.
This paper proposes a new approach to solving the Bike Rebalancing Problem (BRP) based on the Honey-Bee Mating Optimization (HBMO) algorithm. The aim is to reduce the overall traveling cost of redistribution operations under various constraints. The performance of the proposed algorithm is evaluated on a set of benchmark instances for the BRP. Preliminary results show that the proposed approach is promising.
Neural networks perform well when they are built for a specific task and the sets of inputs and outputs are well defined. However, these results are very limited in scope, and communication between different neural networks to share knowledge that can lead to the performance of more general tasks is still inadequate. Communication between specialized neural networks is the goal of the present work. We use independent sets of neural networks trained for specific tasks, and transferring knowledge among them allows them to evolve by chaining their input and output information. The idea is based on computer network architecture, a communication system that transfers data between components inside a computer or between computers. The idea can similarly allow each neural network to specialize in its own task while transferring information to and receiving information from other neural networks. This can allow different neural networks to be plugged in and knowledge transfer to evolve. It can also allow additional information to be requested when the task at hand is difficult to resolve.
In modern synthetic biology, the evolution of biological macromolecules in a test tube (in vitro evolution) can reasonably be interpreted as the implementation of genetic algorithms (GAs) in a biochemical experiment. This area of modern biology and bioengineering needs both new experimental approaches and new mathematical tools. In our report, we simulate how evolution occurs in vitro using the example of the selection of RNA control devices (or RNA-based sensors). We demonstrate that heuristic recombination algorithms are significantly more efficient in a test-tube evolution model than the standard mutation and crossover operators. We believe that the implementation of new biochemical methods based on such heuristic algorithms can significantly improve the efficiency of in vitro evolution.
Differential Evolution (DE) is a highly competitive numerical optimization method for constrained optimization problems. In this study, a new selective-pressure technique is applied in DE, which considers both function values and constraint violations with respect to an ϵ-constraint level. The new modification was tested against known variants of constrained DE, and it is shown that the selective pressure significantly improves algorithm performance in various application scenarios.
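A hedged sketch of an ϵ-constrained comparison between two candidate solutions, combining objective value and constraint violation in the standard way; the concrete selection rule used in the paper may differ.

```python
def better(f_a: float, viol_a: float, f_b: float, viol_b: float, eps: float) -> bool:
    """Return True if solution a is preferred over solution b under an epsilon-constraint level."""
    if viol_a <= eps and viol_b <= eps:      # both (almost) feasible: compare objectives
        return f_a < f_b
    if viol_a == viol_b:                     # equal violation: compare objectives
        return f_a < f_b
    return viol_a < viol_b                   # otherwise prefer the smaller violation
```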
In this paper, we present an application of genetic algorithms to the problem of optimizing a car trajectory in a closed-loop car race setting. The goal is to minimize the amount of turning that the car needs to do, so that it can drive faster, or the total distance that it needs to travel to finish the race. We compare the results with a procedurally computed trajectory and with a version of it optimized using smoothing methods.
Designing deep neural networks by hand can be challenging because there are many possible choices of network structure. We developed a deep neuroevolution system to automatically design deep neural network structures. Our approach defines a set of structures using a probabilistic grammar and searches for the best network structures using Probabilistic Model Building Genetic Programming, taking advantage of the probabilistic dependencies found among the structures of networks. The system was applied to the problem of classifying abnormal heart rhythms from physiological signals. On this classification problem, our discovered model is more accurate than AlexNet while using only about 2% of its parameters.
Identifying the interactions between the search variables of a black-box optimization problem is beneficial for the optimization task. However, very little research pays attention to the quality of the information source, i.e., what information is useful for identifying the interactions between variables. In this paper, we propose a new method that uses multiple local optima as information sources to identify the interactions between variables. First, a multimodal optimization algorithm is used to search for multiple local optima of the optimization problem. Then, hierarchical clustering is used to cluster and discretize the local optima. Finally, the interaction between variables is quantified using the mutual information of the local optima. Experimental results on three 12-dimensional multimodal problems show that the proposed method can effectively identify the interactions among decision variables.
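A hedged sketch of quantifying pairwise variable interaction via the mutual information of discretized local optima, as outlined above; the clustering and discretization details of the paper are not reproduced here.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def interaction_matrix(discretized_optima: np.ndarray) -> np.ndarray:
    """discretized_optima: (n_local_optima, n_variables) array of per-variable cluster labels."""
    n_vars = discretized_optima.shape[1]
    mi = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            mi[i, j] = mi[j, i] = mutual_info_score(discretized_optima[:, i],
                                                    discretized_optima[:, j])
    return mi
```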
As the demand for computationally expensive optimization has increased, so has the interest in surrogate-assisted evolutionary algorithms (SAEAs). However, if a fitness landscape is approximated using only a surrogate model, thereby replacing the fitness function, a solution may evolve toward a false optimum of the surrogate model. Therefore, many conventional studies deal with the real fitness function and the surrogate model simultaneously. Nevertheless, such an approach leaves much to be desired, because further work is needed on real fitness function evaluation and on surrogate-model-aware search mechanisms. In this study, we found that the approximation error of the surrogate model in low dimensions has a significant relationship with the performance of SAEAs in high dimensions for three binary-encoding problems and three real-encoding problems. Therefore, if the approximation error is sufficiently small in the low dimension, then high GA performance can be obtained even when the real fitness function is not used, because a high-quality surrogate model can be obtained in the high dimension.
Emergency resource allocation is an important issue in the emergency response to marine oil spill accidents. This paper addresses an emergency resource allocation problem that involves multiple relief resource centers and one accident point. To solve this problem, a bi-objective model is constructed to minimize the disposal time and the disposal costs. Due to the uncertainty present in different aspects of the emergency response to a marine oil spill, some fuzzy parameters are also incorporated into the model. An algorithm based on the non-dominated sorting genetic algorithm II and the differential evolution algorithm is presented to obtain the Pareto set. The technique for order preference by similarity to ideal solution (TOPSIS) method is employed to reach a final compromise solution. A case study demonstrates the effectiveness of the proposed model and the hybrid algorithm, which can support decision making in the process of oil spill emergency disposal.
The Traveling Thief Problem (TTP) is a problem recently proposed in the literature. The TTP combines two well-known optimization problems: the knapsack problem (KP) and the travelling salesman problem (TSP). In this paper, new hybrid ant colony algorithms are presented. We study and compare different approaches for solving the TTP. The first approach is a centralized and static metaheuristic, the second is a dynamic metaheuristic and the third is a distributed metaheuristic. The obtained results show that our algorithms are efficient on TTP instances.
The benefits of optimising vehicle fleets with regard to scheduling tasks are threefold: reduced costs, reduced road use and, most importantly, reduced emissions. However, optimisation methods, both exact and meta-heuristic, scale poorly. This issue is addressed with Partial-ACO, a novel variant of ACO that scales by having ants only partially modify good solutions. For real-world fleet optimisation problems of up to 298 jobs and 32 vehicles supplied by a Birmingham company, Partial-ACO demonstrates better scalability than ACO and GAs, reducing the company's fleet traversal by over 40%.
Random drift particle swarm optimization (RDPSO) is a swarm intelligence algorithm inspired by the trajectory analysis of the canonical particle swarm optimization (PSO) and the free electron model in metal conductors placed in an external electric field. However, the RDPSO algorithm may easily encounter premature convergence when solving multimodal optimization problems. In order to deal with this issue, a new collaborative diversity control strategy for RDPSO is presented in this paper. Within this strategy, two kinds of diversity measures are used and changed in a collaborative manner to make the evolving process of the RDPSO controllable, so that premature convergence can be avoided and a good final solution can be found. Experimental results, compared with the canonical RDPSO and the canonical RDPSO using a ring neighborhood topology, show that the proposed collaborative diversity control strategy can significantly improve the performance of the RDPSO algorithm on multimodal optimization problems in most cases.
Inferring models of dynamic systems from their time-series data is a challenging task for optimization algorithms due to its potentially expensive computational cost and the underlying large search space. In this study we aim to infer both the structure and the parameters of a dynamic system model simultaneously by Particle Swarm Optimization (PSO), enhanced by effective stratified sampling strategies. More specifically, we apply Latin Hypercube Sampling (LHS) with PSO. This leads to a novel swarm-inspired algorithm, LHS-PSO, which can be used to efficiently learn the structure and parameters of simple and complex dynamic system models. We used a complex biological cancer model called Kinetochores to assess the performance of PSO and LHS-PSO. The experimental results demonstrate that LHS-PSO can find promising solutions with the corresponding structure and parameters, and that it outperforms PSO in our experiments.
Artificial life (A-life) simulations present a natural way to study interesting phenomena emerging in a population of evolving agents. In this paper, we investigate whether allowing A-life agents to select mates can extend the lifetime of a population. In our approach, each agent evaluates potential mates via a preference function. The role of this function is to map information about an agent and its candidate mate to a scalar preference used to decide whether or not to form an offspring. We encode the parameters of the preference function genetically within each agent, thus allowing such preferences to be agent-specific as well as to evolve over time. We evaluate this approach in a simple predator-prey A-life environment and demonstrate that the ability to evolve a per-agent mate-selection preference function indeed significantly increases the extinction time of the population. Additionally, an inspection of the evolved preference-function parameters shows that agents evolve to favor mates with survival traits.
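A hedged sketch of a per-agent mate-selection preference function: a linear map, with genetically encoded weights, from features of the candidate mate to a scalar preference. The linear form, feature set and acceptance threshold are assumptions for illustration; the paper's actual function may differ.

```python
def mate_preference(weights, mate_features) -> float:
    """Scalar preference from genetically encoded weights and candidate-mate features."""
    return sum(w * f for w, f in zip(weights, mate_features))

def accepts_mate(weights, mate_features, threshold: float = 0.0) -> bool:
    return mate_preference(weights, mate_features) > threshold
```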
The performance of self-organized collective decision-making systems depends highly on the interactions with the environment. Environmental bias factors can introduce indirect modifications in the behaviour of such systems; however, not all changes are for the worse. In this paper, we show how isomorphic changes in the environment can improve the performance of the collective decision-making strategies most used in current state-of-the-art swarm robotics research. The idea is based on the use of a special kind of equivalence relation, namely isomorphism, which provides local changes in the environment while preserving the global information. The obtained results indicate that isomorphic transformations, which share a certain structure of the environment, can significantly shorten the consensus time without compromising the correctness of the final decision.
This paper presents a comparative summary of the evolution of multi-state cellular automata by means of various representations of their transition functions. In particular, a conventional table-based representation and an advanced approach using so-called Conditionally Matching Rules are applied. The French flag development from a seed is considered as a case study. The results show some remarkable differences in the behaviour of the cellular automata that are evidently caused by the representation used. These include the emergence of the pattern from a chaotic state versus a more systematic construction, the stability of the developed pattern, and the limitation of its successful construction to fixed-size automata only. The comparison of the results is enabled by a custom variant of a genetic algorithm that provided working solutions for both representations of the transition function.
This paper presents a novel distributed on-line evolutionary learning algorithm for swarm robotics that can cope with very limited hardware, as expected when using a swarm of low-cost robots. The algorithm is able to deal with hardware constraints on the communication bandwidth by sharing only a limited amount of information, using a recombination operator inspired by bacterial conjugation. Using a classic foraging task, we show that the algorithm converges towards stable and efficient solutions even though, as expected, it converges more slowly when the bandwidth is limited. However, we also show that the proposed algorithm performs a trade-off between convergence speed and absolute performance that depends on the amount of bandwidth available. The recombination operator yields better performance if communication is limited, as recombination makes the most of the genetic material already present in the population. In other words, quality outweighs convergence speed if the bandwidth is limited.
Simultaneous Localisation and Mapping (SLAM) algorithms are expensive to run on smaller robotic platforms such as Micro-Aerial Vehicles. Bug algorithms are an alternative that use relatively little processing power and avoid high memory consumption by not building an explicit map of the environment. In this work we explore the performance of neuroevolution - specifically NEAT - at evolving control policies for simulated differential-drive robots carrying out generalised maze navigation. We compare this performance with that of a particular bug algorithm known as I-Bug. We extend NEAT to include Gated Recurrent Units (GRUs) to help deal with long-term dependencies. We show that both NEAT and our NEAT-GRU can repeatably generate controllers that outperform I-Bug on a test set of 209 indoor maze-like environments. We show that NEAT-GRU is superior to NEAT in this task. Moreover, we show that of the two systems, only NEAT-GRU can continuously evolve successful controllers for a much harder task in which no bearing information about the target is provided to the agent.
The aim of this paper is to apply theoretical bounds known for Evolutionary Algorithms (EAs) to the genetic engineering technique of Systematic Evolution of Ligands by EXponential enrichment (SELEX). We discuss how EAs optimizing Royal Road or Royal Staircase fitness functions may be considered as models of evolutionary search "from scratch". We consider the design of synthetic enhancers and promoters in SELEX. This problem asks for a tight cluster of supposedly unknown motifs to be found from an initial random set of DNA sequences using SELEX. We apply the upper bounds on the expected hitting time of a target area of genotypic space (the EA runtime) in order to upper-bound the expected number of rounds of SELEX until a series of binding sites for protein factors is found. The theoretical bounds are compared to the results of computational experiments modelling bacterial promoters and enhancers. Our results suggest that for some cases with large population size, the theoretical bounds give favorable predictions, while computational experiments require prohibitive CPU resources.
The route to the solution of complex design problems often lies through intermediate "stepping stones" which bear little resemblance to the final solution. By greedily following the path of greatest fitness improvement, objective-based search overlooks and discards stepping stones which might be critical to solving the problem. Here, we hypothesize that Quality Diversity (QD) algorithms are a better way to generate stepping stones than objective-based search: by maintaining a large set of solutions that are high-quality but phenotypically different, these algorithms collect promising stepping stones while protecting them in their own "ecological niche". To demonstrate the capabilities of QD we revisit the challenge of recreating images produced by user-driven evolution, a classic challenge which spurred work in novelty search and illustrated the limits of objective-based search. We show that QD far outperforms objective-based search in matching user-evolved images. Further, our results suggest some intriguing possibilities for leveraging the diversity of solutions created by QD.
Quality-diversity algorithms focus on discovering multiple diverse and high-performing solutions. MAP-Elites is such an algorithm, as it partitions the solution space into bins and searches for the best possible solution for each bin. In this paper, multi-behavior variants of MAP-Elites are tested in which the MAP-Elites grid partitions the solution space based on one dimension, while selection is guided by measures of diversity on another dimension. Four divergent search algorithms are tested for this selection process, targeting novelty, surprise, or their combination, and their performance on a soft robot evolution task is discussed.
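For readers unfamiliar with the base algorithm, the sketch below is a minimal single-dimension MAP-Elites loop (not the multi-behavior variants studied above): the behavior space is partitioned into bins and each bin keeps the best solution mapping into it. The evaluation, mutation and bin count are toy assumptions.

```python
import random

def map_elites(evaluate, mutate, random_solution, n_bins=20, iterations=10_000):
    """evaluate(x) -> (fitness, behavior in [0, 1)); returns the archive of elites per bin."""
    archive = {}                                   # bin index -> (fitness, solution)
    for _ in range(iterations):
        if archive and random.random() < 0.9:
            parent = random.choice(list(archive.values()))[1]
            candidate = mutate(parent)
        else:
            candidate = random_solution()
        fitness, behavior = evaluate(candidate)
        b = min(int(behavior * n_bins), n_bins - 1)
        if b not in archive or fitness > archive[b][0]:
            archive[b] = (fitness, candidate)      # keep the best solution seen in this bin
    return archive
```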
In continual learning, an agent is exposed to a changing environment, requiring it to adapt during execution time. While traditional reinforcement learning (RL) methods have shown impressive results in various domains, there has been less progress in addressing the challenge of continual learning. Current RL approaches do not allow the agent to adapt during execution but only during a dedicated training phase. Here we study the problem of continual learning in a 2D bipedal walker domain, in which the legs of the walker grow over its lifetime, requiring the agent to adapt. The introduced approach combines neuroevolution, to determine the starting weights of a deep neural network, and a version of deep reinforcement learning that is continually running during execution time. The proof-of-concept results show that the combined approach gives a better generalisation performance when compared to evolution or reinforcement learning alone. The hybridization of reinforcement learning and evolution opens up exciting new research directions for continually learning agents that can benefit from suitable priors determined by an evolutionary process.
Quality Diversity (QD) algorithms evolve a set of high-performing phenotypes that each behaves as differently as possible. However, current algorithms are all elitist, which makes them unable to cope with stochastic fitness functions and behavior evaluations. In fact, many of the promising applications of QD algorithms, for instance games and robotics, are stochastic. Here we propose two new extensions to the QD algorithm MAP-Elites --- adaptive sampling and drifting-elites --- and demonstrate empirically that these extensions increase the quality of solutions in a noisy artificial test function and the behavioral diversity in a 2D bipedal walker environment.
We present a feasibility study on evolving controllers for a group of wheeled robot predators that need to capture a prey robot. Our solution method works by evolving controllers in simulation for 100 generations, followed by 10 generations on real robots. The best controllers are further evaluated for their sensitivity to the initial positions. The results demonstrate the practical feasibility of this approach and give an indication of the time required to develop good solutions for the predator-prey problem.
The evolutionary cost of morphological complexity in biological populations remains an open question. This study investigates the impact of imposing a cost on morphological complexity given co-adapting behavior-morphology couplings in simulated robots. Specifically we investigate the environmental and evolutionary conditions for which morphological complexity can be evolved without sacrificing behavioral efficacy. This study evaluates the relationship between task difficulty (environment complexity) and evolved morphological complexity. We use multi-objective neuro-evolution to evolve robot controller-morphology couplings in task environments of increasing difficulty where the objectives are to minimize the cost of (morphological) complexity and to maximize behavior quality (task performance). Results indicate that imposing a cost of complexity induces the evolution of simpler morphologies with negligible differences in behavior (task performance) across increasingly complex environments (increasing task difficulty).
To investigate how encodings influence the evolution of the morphology and control of modular robots, we compared three encodings: a direct encoding and two generative encodings---a compositional pattern producing network (CPPN) and a Lindenmayer System (L-System). The evolutionary progression and final performance of the direct encoding and the L-System were significantly better than those of the CPPN. The generative encodings converge more quickly than the direct encoding in terms of morphological and controller diversity.
We present the first nature-inspired algorithm for the NP-complete Nurikabe pencil puzzle. Our method, based on Ant Colony Optimization (ACO), offers competitive performance with a direct logic-based solver, with improved run-time performance on smaller instances, but poorer performance on large instances. Importantly, our algorithm is "problem agnostic", and requires no heuristic information. This suggests the possibility of a generic ACO-based framework for the efficient solution of a wide range of similar logic puzzles and games. We further suggest that Nurikabe may provide a challenging benchmark for nature-inspired optimization.
The sounds we associate with particular places are tightly interwoven with our memories and sense of belonging. We describe a platform designed to assist in gathering the sounds a group of people associate with a place. A web-based evolutionary algorithm, with human-in-the-loop fitness evaluations, ranks and recombines sounds to find collections that the group rates as familiar. An experiment covering four geographical locations shows that the process does indeed find sounds deemed familiar by participants.
This paper presents an original approach based on evolutionary algorithms for building structures that are stable under gravity for the physics-based puzzle game Angry Birds, with the ultimate objective of creating Angry Birds levels with the minimum number of constraints. We have created a custom open-source evolutionary computation library that implements an evolutionary algorithm whose main challenge has been the design of a fitness function that, first, avoids when possible the time-consuming actual execution of the game and, second, takes into account the different ways in which a structure can fail to be sound and eliminates them. In order to test the method, six experiments have been carried out, obtaining a variety of stable structures, which is a first step towards the generation of levels that are aesthetically pleasing as well as playable.
Computer-based music generation (synthesis) has a rich history spanning several decades [1]. Current music evolution methods are interactive (periodic user evaluation drives evolutionary selection) or otherwise feature-based, where specific musical feature metrics are incorporated into the fitness function for the evaluation and selection of synthesized music [2]. In the former case, various musical styles and compositions have been evolved to suit user preferences, though the diversity and complexity of evolved compositions are limited by user fatigue and by the fitness function (for example, which musical features the user evaluates). In the latter case, the diversity and complexity of evolved music are similarly limited by the fitness function metrics. Thus, metrics conforming to specific musical styles or genres will only result in the artificial evolution of musical compositions that resemble such styles or genres [1], [2].
Neighborhood operators play a crucial role in defining effective Local Search solvers, allowing one to limit the explored search space and prune the fitness landscape. Still, there is no accepted formal representation of such operators: they are usually modeled as algorithms in a procedural language, lacking compositionality and readability. In this paper we outline a new formalization capable of representing several neighborhood operators without coding them in a fully Turing-complete language. The expressiveness of our proposal stems from a rich problem representation, as used in Constraint Programming models. We compare our system to competing approaches and show a clear increase in expressiveness.
We investigate the effectiveness of a set of evolutionary algorithms on noisy combinatorial optimisation problems. Despite some of these having polynomial runtime bounds for noisy OneMax, we find that in practice they are not able to solve this problem in reasonable time, with the exception of the Paired Crossover EA and the UMDA. We further study the performance of these two algorithms on noisy versions of SubsetSum and Knapsack.
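A hedged illustration of the noisy OneMax setting referred to above: the observed fitness of a bit string is its number of ones after each bit is independently flipped with probability p before evaluation. This is one common prior-noise model; the exact noise model used in the paper may differ.

```python
import random

def noisy_onemax(x, p=0.01):
    """OneMax evaluated under prior bit-flip noise with per-bit flip probability p."""
    return sum(bit ^ (random.random() < p) for bit in x)
```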
The Quadratic Assignment Problem (QAP) is an especially challenging permutation-based NP-hard combinatorial optimization problem, since instances of size n > 40 are seldom solved using exact methods. In this sense, many approximate methods have been published to tackle this problem, including Estimation of Distribution Algorithms (EDAs). In particular, EDAs have been used to solve permutation problems by introducing distance-based exponential models, such as the Mallows Models. In this paper we approximate the QAP with Hamming-distance-based kernels of Mallows Models.
Based on the benchmark instances, we have observed that our approach is competitive, reaching the best-known solution in 71% of the tested instances, and is especially strong on large instances (n > 125), where it is able to outperform state-of-the-art results in 43 out of 288 instances.
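For illustration, a minimal sketch of the Hamming-based Mallows density used by such models is given below, up to its normalisation constant; the spread parameter theta and the central permutation are placeholders, not values from the paper.

```python
import math

def hamming(sigma, pi):
    """Hamming distance between two permutations: positions where they differ."""
    return sum(s != p for s, p in zip(sigma, pi))

def mallows_weight(sigma, center, theta):
    """Unnormalised Mallows probability exp(-theta * d(sigma, center));
    dividing by the normalisation constant over all permutations yields
    the distribution sampled inside the EDA."""
    return math.exp(-theta * hamming(sigma, center))
```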
In this paper, we introduce a copula-based EDA that uses a Discrete Vine-Copula (DVC) model. This model is particularly suited to representing distributions over permutation representations. To demonstrate the effectiveness of the proposed Discrete Vine-Copula based EDA (DVCEDA), we perform a set of experiments on instances of the well-known TSP. The results obtained are promising and motivate extending the work to other classes of problems.
This paper studies the Single-Vehicle Cyclic Inventory Routing Problem (SV-CIRP) with the objective of simultaneously minimizing distribution and inventory costs for the customers and maximizing the collected rewards. A subset of customers is selected for the vehicle, together with the quantity to be delivered to each of them. Simulated Annealing (SA) is proposed for solving the problem. Experimental results on 50 benchmark instances show that SA is comparable to the state-of-the-art algorithms and is able to obtain 12 new best-known solutions.
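As a point of reference, a generic SA skeleton with Metropolis acceptance and geometric cooling is sketched below; the cost function, neighbourhood move and cooling parameters are placeholders, not the SV-CIRP-specific operators used in the paper.

```python
import math
import random

def simulated_annealing(initial, cost, neighbour,
                        t0=100.0, alpha=0.95, iters=10000):
    """Generic SA loop: accept worse moves with probability exp(-delta / T)
    and cool the temperature geometrically."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(iters):
        candidate = neighbour(current)
        delta = cost(candidate) - current_cost
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= alpha
    return best, best_cost
```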
This paper uses both centralised and decentralised implementations of Evolutionary Algorithms to solve a dynamic variant of the Multi-Agent Travelling Salesman Problem. The problem consists of allocating an active set of tasks to a set of agents whilst simultaneously planning the route for each agent. Since the allocation and routing are closely coupled parts of the same problem, this paper attempts to align with the real-world implementation demands of a decentralised solution by using multiple populations with well-defined interactions to exploit the problem structure.
We address a discrete competitive facility location problem for an entering firm with a binary customer choice rule and an asymmetric objective function. A heuristic optimization algorithm based on the ranking of candidate locations and specially adapted to discrete facility location problems is designed. The proposed algorithm is experimentally investigated by solving different instances of the facility location problem with an asymmetric objective function.
In this paper, we investigate the effect of balanced and unbalanced crossover operators on the problem of finding non-linear balanced Boolean functions: we consider three different balanced crossover operators and compare their performance with classic one-point crossover. The statistical comparison shows that the use of balanced crossover operators gives the GA a definite advantage over one-point crossover.
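To illustrate the idea, the sketch below shows one possible counter-based balanced crossover that preserves the Hamming weight of balanced parents; it is only an assumed example of the operator family, and the three operators compared in the paper may differ in their details.

```python
import random

def balanced_crossover(p1, p2):
    """Build a child bit by bit from a random parent while counters force
    the child to end with exactly as many ones as the (balanced) parents."""
    n = len(p1)
    target_ones = sum(p1)          # parents are assumed balanced (n/2 ones)
    child, ones, zeros = [], 0, 0
    for i in range(n):
        bit = random.choice((p1[i], p2[i]))
        # force the bit once one of the counters is exhausted
        if ones == target_ones:
            bit = 0
        elif zeros == n - target_ones:
            bit = 1
        child.append(bit)
        ones += bit
        zeros += 1 - bit
    return child
```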
The Quadratic Assignment Problem (QAP) is a classical NP-hard combinatorial optimization problem. This paper presents a suitable metaheuristic algorithm, HC12. The algorithm is population based and uses a massively parallel search of the binary space that represents the solution space of the QAP. The presented implementation of the HC12 metaheuristic utilizes the latest GPU CUDA platform. Results are presented on standard test problems from the QAP library.
Data-intensive Web service composition (DWSC) combines numerous interoperating, distributed and reusable data-intensive services to accomplish a specific task. In this paper, we propose a Memetic Algorithm (MA) with a novel crossover operator to solve the distributed DWSC problem. In particular, we define two heuristics that make explicit use of two important sources of information when producing offspring solutions in the MA: 1) location information from the problem, to quickly determine the suitability of any solution, and 2) useful information embedded in sub-solutions (e.g. common parts of the solutions). Experiments have been performed on benchmark datasets and the results show that our method is more effective than a state-of-the-art baseline method.
Detecting the emotional content of text is one of the most popular NLP tasks. In this paper, we propose a new methodology based on identifying "idealised" words that capture the essence of an emotion; we define these words as those minimising the distance (under some metric) between the word and the texts that convey the relevant emotion (e.g. a tweet, a sentence). We search for these words in the space of word embeddings using CMA-ES. Our method produces state-of-the-art results, surpassing classic supervised learning methods.
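A minimal sketch of this kind of search is given below, assuming the pycma package and a matrix of embedding vectors for texts labelled with the target emotion; the embedding dimensionality, the Euclidean distance and the mean aggregation are assumptions, not necessarily the metric used by the authors.

```python
import numpy as np
import cma  # the pycma package

def mean_distance(candidate, emotion_vectors):
    """Average Euclidean distance from a candidate 'idealised' word vector
    to the embeddings of texts labelled with the target emotion."""
    return float(np.mean(np.linalg.norm(emotion_vectors - candidate, axis=1)))

def find_idealised_word(emotion_vectors, dim=300, sigma0=0.5):
    """Run CMA-ES over the embedding space to minimise the mean distance."""
    es = cma.CMAEvolutionStrategy(np.zeros(dim), sigma0)
    while not es.stop():
        candidates = es.ask()
        es.tell(candidates,
                [mean_distance(np.asarray(c), emotion_vectors) for c in candidates])
    return es.result.xbest  # point in embedding space closest to the emotion
```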
Support Vector Machines (SVMs) deliver state-of-the-art performance in real-world applications and are established as one of the standard tools for machine learning and data mining. A key problem of these methods is how to choose an optimal kernel function. Real-world applications have also emphasized the need to adapt the kernel to the characteristics of heterogeneous data in order to boost classification accuracy. Our goal is therefore to automatically search for a task-specific kernel function. We use reinforcement-learning-based search mechanisms to discover custom kernel functions and verify the effectiveness of our approach by conducting an empirical evaluation with the discovered kernel function on MNIST classification. Our experiments show that the discovered kernel function yields significantly better classification performance than well-known classic kernels. Our solution should be particularly effective for resource-constrained systems with a low memory footprint that rely on traditional machine learning algorithms such as SVMs for classification tasks.
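To show how a discovered kernel can be plugged into a standard SVM, the sketch below uses scikit-learn's support for callable kernels; the composite kernel expression here is an invented placeholder for illustration, not the kernel actually found by the search.

```python
import numpy as np
from sklearn.svm import SVC

def custom_kernel(X, Y):
    """Placeholder composite kernel: a weighted mix of an RBF term and a
    polynomial term, returning the Gram matrix of shape (len(X), len(Y))."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Y**2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    rbf = np.exp(-0.5 * sq_dists)
    poly = (X @ Y.T + 1.0) ** 2
    return 0.7 * rbf + 0.3 * poly

clf = SVC(kernel=custom_kernel)  # scikit-learn accepts a callable kernel
# clf.fit(X_train, y_train); clf.predict(X_test)
```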
Transfer learning has attracted increasing attention in many fields in recent years. However, studies on transfer learning for symbolic regression are still rare. This work proposes a new instance weighting framework for genetic programming (GP) based symbolic regression for transfer learning. The key idea is to use differential evolution to search for optimal weights during the evolutionary process of GP, which helps GP identify and learn from more useful source-domain instances and reduce the influence of less useful ones. The results show that the proposed method achieves notably better and more stable cross-domain generalisation performance than GP without the instance weighting framework and than support vector regression.
We present an evolutionary approach for Data Augmentation (DA) in deep Face Detection (FD). The approach is fully automatic and creates new face instances by recombining facial parts from different faces. We explore the selection of the facial parts that construct each new face instance using two strategies: random and evolutionary. The evolutionary strategy employs a Genetic Algorithm (GA) with automatic fitness assignment based on a pre-trained face detector. The evolutionary approach is able to find new face instances that exploit the vulnerabilities of the detector. We then add these new instances to the training dataset, retrain the detector, and analyse the improvement in its performance. The presented approach is tested using deep object detectors trained with instances from the CelebFaces Attributes (CelebA) dataset. The experimental results show that the presented approach improves face detection performance compared to base models trained using standard DA techniques. The approach also generates new realistic faces with interesting and unexpected features.
Support vector machine (SVM) classifiers can cope with many different classification tasks but improperly selected hyperparameters may deteriorate their performance. Moreover, datasets are getting bigger in terms of their size and the number of features. This is often coupled with low training data quality and presence of redundant features, which can adversely affect classification accuracy and time performance. Furthermore, high memory and computational complexity of SVM training can be a limiting factor for its application over huge datasets. We address these issues with evolutionarily-tuned SVM, where we utilize evolutionary algorithms for optimizing hyperparameters, along with selecting features and training instances. The performance of our method is compared on several benchmark datasets to other methods for optimizing SVMs, as well as to other classifiers. The results show that our algorithm gives high performance in both accuracy and classification time when compared with the state-of-the-art methods for SVM optimization.
Modern intrusion detection systems must be able to discover new types of attacks in real-time. To this aim, automatic or semi-automatic techniques can be used; outlier detection algorithms are particularly apt for this task, as they can work in an unsupervised way. However, due to the different nature and behavior of the attacks, the performance of different outlier detection algorithms varies widely. In this ongoing work, we describe an approach aimed at understanding whether an ensemble of outlier algorithms can be used to effectively detect new types of attacks in intrusion detection systems. In particular, Genetic Programming (GP) is adopted to build the combining function of an ensemble of local and global outlier detection algorithms, which are used to detect different types of attack. Preliminary experiments, conducted on the well-known NSL-KDD dataset, are encouraging and confirm that, depending on the type of attacks, it can be better to use only local or only global detection algorithms, and that the GP-based ensemble improves the performance in comparison with commonly used combining functions.
We used multiobjective genetic algorithms with neuroevolution of augmenting topologies (NEAT) to evolve effective micro behaviors for opposing groups with heterogeneous compositions in StarCraft II, an RTS game. We used the Fast Nondominated Sorting Genetic Algorithm to maximize damage done and minimize damage received in a skirmish, and used this two-objective fitness to guide NEAT in evolving the structure and weights of a neural-network-based controller. The evolved NEAT network controls the movement and attack commands for each unit. We show that non-dominated selection and NEAT can be used together to generate effective micro for groups with two or three types of units on each side. The evolved micro also generalized well to random configurations, doing well along both objectives. We also manually co-evolved against the best-performing individuals produced during a run for multiple cycles and show that this improves the micro, resulting in better performance against the default StarCraft II AI.
While the performance of many neural network and machine learning schemes has been improved through the automated design of various components of their architectures, the automated improvement of Adaptive Resonance Theory (ART) neural networks remains relatively unexplored. Recent work introduced a genetic programming (GP) approach to improve the performance of the Fuzzy ART neural network employing a hyper-heuristic approach to tailor Fuzzy ART's category choice function to specific problems. The GP method showed promising results. However, GP is not the only tool that can be used for automatic improvement. Among other methods, Nested Monte Carlo Search (NMCS) was recently applied to expression discovery and outperformed traditional evolutionary approaches by finding better solutions in fewer evaluations. This work applies NMCS to the discovery of new Fuzzy ART category choice functions targeted to specific problems with results demonstrating its ability to find better performing Fuzzy ART networks than the GP approach.
Reinforcement learning (RL) problems often feature deceptive local optima, and methods that optimize purely for reward often fail to learn strategies for overcoming them [2]. Deep neuroevolution and novelty search have been proposed as effective alternatives to gradient-based methods for learning RL policies directly from pixels. We introduce and evaluate the use of novelty search over agent action sequences, measured by Levenshtein distance, as a means for promoting innovation. We also introduce a method for stagnation detection and population regeneration inspired by recent developments in the RL community [5], [1] that is derived from novelty search. Our methods extend a state-of-the-art method for deep neuroevolution using a simple genetic algorithm (GA) designed to efficiently learn deep RL policy network weights [6]. Results provide further evidence that GAs are competitive with gradient-based algorithms for deep RL in the Atari 2600 benchmark. Results also demonstrate that novelty search over agent action sequences can be effectively used as a secondary source of evolutionary selection pressure.
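A minimal sketch of this novelty measure is given below: the novelty of an agent is taken as the mean Levenshtein (edit) distance between its action sequence and previously archived sequences. The archive handling and aggregation are assumptions for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two action sequences via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ai in enumerate(a, 1):
        cur = [i]
        for j, bj in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (ai != bj)))  # substitution
        prev = cur
    return prev[-1]

def novelty(actions, archive):
    """Novelty score: mean edit distance to previously archived sequences."""
    if not archive:
        return 0.0
    return sum(levenshtein(actions, other) for other in archive) / len(archive)
```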
This paper identifies scalability bounds of evolutionarily induced decision trees (DTs). To overcome the barriers posed by large-scale data, we propose a novel multi-GPU approach. It incorporates knowledge of global DT induction and EA parallelization. The search for a tree structure and tests is performed sequentially by a CPU, while the fitness calculations are delegated to GPUs, so the core evolution is unchanged. The results show that the evolutionary induction is accelerated several thousand times by using up to 4 GPUs on datasets with up to 1 billion objects.
Representations, or sensor-independent internal models of the environment, are important for any type of intelligent agent to process and act in an environment. Imbuing an artificially intelligent system with such a model of the world it functions in remains a difficult problem. However, using neuro-evolution as the means to optimize such a system allows the artificial intelligence to evolve proper models of the environment. Previous work has found an information-theoretic measure, R, which quantifies how much information a neural computational architecture (henceforth loosely referred to as a brain) has about its environment, and which can additionally be used to speed up the neuro-evolutionary process. However, it is possible that this improved evolutionary adaptation comes at a cost to the brain's ability to generalize or its robustness to noise. In this paper, we show that this is not the case; on the contrary, we find an improved ability of the brains to evolve in noisy environments when the neuro-correlate R is used to augment evolutionary adaptation.
The use of Convolutional Neural Networks (CNNs) has proven to be a solid approach to many Machine Learning problems, such as image classification and natural language processing tasks. The manual design of CNNs is a hard task, due to the high number of configurable parameters and possible configurations. Recent studies on the automatic design of CNNs have shown positive results. In this work, we propose to explore the design of CNN architectures through Grammatical Evolution (GE), where a BNF grammar is used to define the CNN components and structural rules. Experiments were performed using the MNIST and CIFAR-10 datasets. The obtained results show that the presented approach achieved competitive results while maintaining relatively small architectures when compared with similar state-of-the-art approaches.
While optimized neural network architectures are essential for effective training with gradient descent, their development remains a challenging and resource-intensive process full of trial-and-error iterations. We propose to encode neural networks with a differentiable variant of Cartesian Genetic Programming (dCGPANN) and present a memetic algorithm for architecture design: local searches with gradient descent learn the network parameters while evolutionary operators act on the dCGPANN genes shaping the network architecture towards faster learning. Studying a particular instance of such a learning scheme, we are able to improve the starting feed forward topology by learning how to rewire and prune links, adapt activation functions and introduce skip connections for chosen regression tasks. The evolved network architectures require less space for network parameters and reach, given the same amount of time, a significantly lower error on average.
In the XCS classifier system, an idealised assumption in the latest XCS learning theory means that, in practical use, it is impossible for XCS to distinguish accurate rules from any other rules with a 100% success rate. This paper presents preliminary work to remove this assumption. Furthermore, it reveals a dilemma in setting a crucial XCS parameter: to guarantee a 100% success rate, the learning rate should be greater than 0.5; however, a rule fitness updated with such a high learning rate would not converge to its true value, so rule discovery would not act properly.
Biclustering is an increasingly popular machine learning technique that searches for patterns in subsets of rows and subsets of columns. One of the recent advances in biclustering was the development of EBIC, a multi-GPU method based on evolutionary computation, which was demonstrated to outperform some of the leading methods in the field. In this short paper, we evaluate a couple of potential improvements to the method.
Genetic programming (GP) may evolve biased classifiers when faced with class imbalance. Class imbalance is a difficult but important issue, and high dimensionality adds further difficulty when addressing it. This paper focuses on addressing the performance bias of GP in classification with high-dimensional unbalanced data, with the goal of increasing the accuracies of both the majority class and the minority class, as well as saving training time. A new fitness function is developed to address the class imbalance issue, and moreover, a strategy is proposed to reuse previous good GP trees when using multiple GP processes to build a multi-classifier system. Experimental results show the better performance of the proposed method.
Recent research showed that deep neural networks can be trained to create shared languages to communicate and cooperate with each other. These approaches used fixed, handcrafted network architectures which were trained with reinforcement learning. We extend this approach by using neuroevolution to automate network design and find network weights of communicating agents. We show that neuroevolution is a viable approach for training agents to develop novel languages so as to communicate amongst themselves.
Artificial neural networks typically use backpropagation methods for the optimization of weights. In this paper, we investigate the potential of applying so-called evolution strategies (ESs) to the weight optimization task. Three commonly used ESs are tested on a multilayer feedforward network trained on the well-known MNIST data set. The performance is compared to the Adam algorithm; the results show that although the (1 + 1)-ES exhibits a higher convergence rate in the early stage of training, it quickly stagnates, and thus Adam still outperforms the ESs at the final stage of training.
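For reference, a minimal (1+1)-ES over a flat weight vector is sketched below, with a 1/5-success-rule-style step-size adaptation; the loss function, step sizes and iteration budget are placeholders, not the exact setup used in the paper.

```python
import numpy as np

def one_plus_one_es(loss, dim, sigma=0.1, iters=10000,
                    rng=np.random.default_rng(0)):
    """(1+1)-ES: mutate the parent with isotropic Gaussian noise and keep
    the offspring if it is no worse; sigma follows a 1/5-success-style rule."""
    parent = 0.01 * rng.standard_normal(dim)
    parent_loss = loss(parent)
    for _ in range(iters):
        child = parent + sigma * rng.standard_normal(dim)
        child_loss = loss(child)
        if child_loss <= parent_loss:
            parent, parent_loss = child, child_loss
            sigma *= 1.5              # success: widen the search
        else:
            sigma *= 1.5 ** (-0.25)   # failure: shrink slowly
    return parent, parent_loss
```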
This work adopts the notion of Ceteris Paribus (CP) as an interpretation of the Decision Maker's (DM) preferences and incorporates it into a constrained multiobjective problem known as virtual machine placement (VMP). VMP is an essential multiobjective problem in the design and operation of cloud data centers, concerned with placing each virtual machine on a physical machine (a server) in the data center. We analyze the effectiveness of the CP interpretation on VMP problems and propose an NSGA-II variant with which preferred solutions are returned at almost no extra time cost.
Many real-world problems can be naturally formulated as discrete multi-objective optimization (DMOO) problems. In this research we propose a novel bio-inspired Physarum competition algorithm (PCA) to tackle DMOO problems by modelling the discrete Physarum motility over a hexagonal cellular automaton. Our algorithm is based on chemo-attraction forces towards food resources (objective functions) and repulsive forces between the competing Physarum. Numerical experiments clearly demonstrate that our PCA algorithm has the best performance on the spread indicator against three state-of-the-art evolutionary algorithms, and that it is effective in terms of commonly used metrics. These results indicate the superiority of PCA in exploring the search space and keeping diversity, which makes PCA a promising algorithm for solving DMOO problems.
Incorporating a restart operator into a multi-objective evolutionary algorithm (MOEA) can improve its performance. Restarting an algorithm aims at preventing stagnation and at reaching solutions uniformly distributed along the whole Pareto front. The presented experimental results for two MOEAs with the restart operator demonstrate the vast potential of this metaheuristic. The use of the restart operator is limited by the necessity to adjust its key parameters for the problem being solved.
The Iterated Prisoner's Dilemma (IPD) is an intriguing problem for which the Nash Equilibrium is not globally optimal. Typically treated as a single-objective problem, a player's goal is to maximize their own score. In some work, minimizing the opponent's score has been added as an additional objective. We explore the role of mutual cooperation in IPD player performance. We implement a genetic algorithm in which the population is divided into four multi-objective sub-populations: selfish, communal, cooperative, and selfless, the last three of which use a measure of mutual cooperation as an objective. Game play occurs among all members, without regard to sub-population, while crossover and selection occur only within a sub-population. Testing is against a population of well-known strategies and is single objective, using only self score. We find that players evolved to cooperate perform very well, in some cases dominating the competition. Thus, learning to play nicely with others is a successful strategy for maximizing personal reward.
The Kriging-assisted reference vector guided evolutionary algorithm (K-RVEA) is a recently proposed algorithm for many-objective optimization problems involving computationally expensive objective functions. It employs Kriging as the surrogate and identifies multiple infill locations based on the angle-penalized distance (APD) metric, guided by a set of reference vectors originating from the ideal point. In this paper we investigate the performance implications of the underlying schemes, in particular: (a) is APD-based selection necessary, given that it involves an additional parameter; (b) can the full archive be used for surrogate training, as opposed to the fixed archive size in K-RVEA; (c) can the infill solutions be further improved through angle-constrained local search; and finally (d) what are the limitations of a single set of reference vectors, and what are the benefits of a dual set of reference vectors (one originating from the ideal point and the other from the nadir point)? These investigations are based on the suite of standard and inverted DTLZ and WFG problems with up to 10 objectives.
In the era of big data and the e-science revolution, the execution of High Performance Computing (HPC) applications is becoming a challenging issue. To face these challenges, promising Large Scale Distributed Systems (LSDS) such as Grid and Cloud Computing have emerged. These HPC applications are commonly arranged as sets of interdependent tasks named workflows. Nevertheless, the scheduling of these scientific workflows in LSDS is a well-known NP-hard problem. The goal of this work is to design a Non-dominated Sorting Genetic Algorithm II (NSGA-II)-based approach for optimizing the multi-objective scheduling of scientific workflows in hybrid distributed systems. This work proposes two execution models: i) a cumulative one, aiming to improve the Pareto front quality in terms of the makespan-cost trade-off; and ii) an incremental one, a cost-driven approach leading to greater solution diversity of the Pareto front in the objective space. Experiments conducted with multiple common scientific workflows point out significant improvement over the classic NSGA-II algorithm.
In the area of multi-objective optimization, a special challenge is posed by dynamic optimization problems. These problems change their optimal values or optimal configurations of input variables over time, making it harder for metaheuristic algorithms to track these changes and find the new optima. To test the search ability of dynamic multi-objective algorithms, a dynamic benchmark called the Dynamic Distance Minimization Problem (dDMP) was proposed in the literature. The dDMP implements multiple changes, not only in the location and fitness values of the Pareto-optimal sets, but also in the complexity of the problem. This work tests the performance of two well-known dynamic multi-objective algorithms on different instances of the dDMP with varying complexity. This involves changes in the number of objectives and changes of the distance metric at runtime, which has not been done before with this problem in the literature. The results show that both algorithms struggled to recover after the number of objectives was reduced and then increased again. When the distance metric was changed over time, both algorithms performed reasonably well. However, there were gaps in the found Pareto fronts when switching between the Euclidean and the Manhattan distance metrics.
A large number of real-world multi-objective problems are subject to uncertainty, leading to the need to develop optimization methods that deal with it appropriately. In this work, we extend the decomposition-based algorithm to search for set-based robust solutions with the achievement scalarizing functions. Our preliminary work shows promising results that indicate our approach is able to outperform state-of-the-art algorithms that aim for set-based robust solutions.
Several studies report that the solutions obtained by the indicator-based evolutionary algorithm (IBEA) with the binary additive epsilon indicator are biased towards specific regions of the objective space. This paper reveals that the reason for the bias is the asymmetry of the epsilon indicator, which does not consider the dominance relationship. This paper also proposes a modified epsilon indicator to obtain uniformly distributed solutions in the objective space. The proposed indicator uses an indicator value based on the Chebyshev distance when no dominance relationship exists between two solutions. The results of the experiments show that the diversity of IBEA is enhanced by the proposed indicator on the WFG5 problem.
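A minimal sketch of the idea, for two objective vectors under minimisation, is given below: the standard binary additive epsilon value is replaced by the symmetric Chebyshev distance whenever the two solutions are mutually non-dominated. The exact definition used in the paper may differ.

```python
def additive_epsilon(a, b):
    """Standard binary additive epsilon indicator (minimisation): the
    smallest shift of a needed to weakly dominate b."""
    return max(ai - bi for ai, bi in zip(a, b))

def dominates(a, b):
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def modified_epsilon(a, b):
    """Use the symmetric Chebyshev distance when a and b are mutually
    non-dominated; otherwise fall back to the usual epsilon indicator."""
    if dominates(a, b) or dominates(b, a):
        return additive_epsilon(a, b)
    return max(abs(ai - bi) for ai, bi in zip(a, b))
```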
Many real-world applications contain complex optimization problems with several conflicting objectives. Finding a solution which can satisfy all the objectives is usually a challenging task for optimization algorithms. When dealing with these complex multi-objective problems, decision-makers want to find the best trade-off between the conflicting objectives. Another challenge occurs in problems where multiple configurations of the input variables yield the same objective function values. Such problems are called multimodal problems. For a decision maker, it might be important to obtain enough information about all the alternative optimal solutions that reach the same objective value. Traditionally, Evolutionary Algorithms make use of selection processes based only on objective function values, which can be a disadvantage when faced with multimodal problems. In this article, we present two operators for use in multimodal multi-objective algorithms, namely a modified crowding distance operator and a neighbourhood-based polynomial mutation, which take into account the distribution of solutions in the decision space at run-time. Our experimental results demonstrate that the proposed operators are able to outperform a state-of-the-art method on six current multimodal benchmark functions.
Multi-view problems generalize standard machine learning problems to situations in which data entities are described from multiple different perspectives, a situation that arises in many applications due to the consideration of multiple data sources or multiple metrics of dissimilarity between entities. Multi-view algorithms for data clustering offer the opportunity to fully consider and integrate this information during the clustering process, but current algorithms are often limited to the use of two views.
Here, we describe the design of an evolutionary algorithm for the problem of multi-view data clustering. The use of a many-objective evolutionary algorithm addresses limitations of previous work, as the resulting method should be capable of scaling to settings with four or more views. We evaluate the performance of our proposed algorithm for a set of traditional benchmark datasets, where multiple views are derived using distinct measures of dissimilarity. Our results demonstrate the ability of our method to effectively deal with a many-view setting, as well as the performance boost obtained from the integration of complementary measures of dissimilarity for both synthetic and real-world datasets.
The key characteristic of the Multi-Objective Evolutionary Algorithm Based on Decomposition (MOEA/D) is that a multi-objective problem is decomposed into multiple single-objective subproblems. In standard MOEA/D, all subproblems receive the same computational effort. However, as each subproblem relates to a different area of the objective space, it is expected that some subproblems are more difficult than others. Resource Allocation techniques allocate computational effort in proportion to each subproblem's difficulty, which is estimated by a priority function. Using Resource Allocation, MOEA/D can spend less effort on easier subproblems and more on harder ones, improving efficiency. We propose that using diversity as the priority criterion results in better allocation of computational effort, and therefore propose a new priority function: decision space diversity. We compare the proposed diversity-based priority with previous approaches on the UF benchmarks. The proposed decision-space priority achieved good IGD values and an excellent rate of non-dominated solutions on the benchmark problems.
Many combinatorial optimization problems involve scheduling or ordering work that will ultimately be completed by a company's employees. If solution quality is measured by a simple weighted sum of the constraint violations for each employee, an optimizer may produce solutions in which a small number of employees suffer a highly disproportionate share of these violations. We present the results of experiments in generating rosters whilst considering fairness as an additional optimization objective.
In this paper, an effective optimization approach that integrates a probabilistic surrogate model, a non-dominated sorting genetic algorithm, and a reliability index method is proposed for multi-objective reliability-based design optimization. To reduce the computational cost and improve the efficiency of the optimization process, the problem is surrogated by a probabilistic response surface method and a Bayesian neural network acting as high-fidelity metamodels built with statistical modelling methods. After verification through simulation results on a numerical test problem, these techniques have been applied to an engineering problem, simultaneously optimizing multiple performance or objective functions subject to reliability constraints.
Constrained multi-objective optimization problems (CMOPs) have wide applications in many areas. One of the difficulties in solving CMOPs is handling constraints and optimizing objective values simultaneously. In this paper, three improvements are proposed for CMOPs. Firstly, a new adaptive grouping strategy is proposed to ensure the diversity of the algorithm. Secondly, each sub-population evolves according to the designed crossover operator, which improves the search ability of the algorithm and thus accelerates its convergence. Finally, a new sorting method based on the information of representative solutions is designed to measure the quality of individuals. The experimental results show that the proposed algorithm performs better in convergence and diversity.
One of the main tasks in agriculture is deciding which crop should be planted on which field. Agricultural companies often cultivate dozens of crops on hundreds of fields, making this problem extremely computationally complex. It was solved within an evolutionary many-objective optimisation (EMO) framework. Objective functions included profit, yield risk, price risk, scatteredness, crop rotation and environmental impact (total amounts of fertiliser and pesticide used). As the decision variables were categories (crops) and not real values, NSGA-III was adapted by changing the genetic operators of mutation and crossover from numerical to categorical. Optimisation was performed on a dataset provided by a partnering agricultural company. Out of the resulting population of solutions, characteristic crop configurations were chosen and compared to the benchmark, i.e. the company's current strategy.
Most state-of-the-art multi-objective evolutionary algorithms (MOEAs) treat the decision variables of a multi-objective optimization problem (MOP) as a whole. However, when dealing with MOPs with a large number of decision variables (more than 100), their efficacy decreases as the number of decision variables of the MOP increases. Problem decomposition, in terms of decision variables, has been found to be extremely efficient and effective for solving large-scale optimization problems. In this work, we study the effect of what we call "operational decomposition", a novel framework based on coevolutionary concepts for applying a MOEA's crossover operator without adding any extra cost. We investigate the improvements that NSGA-III can achieve when combined with operational decomposition. This new scheme is capable of improving the efficiency of a MOEA when dealing with large-scale MOPs having from 200 up to 1200 decision variables.
Dynamical compartmental models capture the population dynamics of multi-objective optimization evolutionary algorithms. In these models, solutions at each generation are classified into compartments according to Pareto dominance. The size of each compartment is affected by the other compartments and changes throughout the generations. Once the dynamics are learned, given the composition of each compartment for the initial population, the model can predict their changes in future generations. In this work, we use these models to select an appropriate population size for a given budget of fitness evaluations. We first learn model parameters for a subset of population sizes and then extract new models from existing ones by spline interpolation. The proposed method allows the investigation of population sizes in between the ones for which data is available. We verify the proposed method using MNK-landscapes with twenty bits and one epistatic interaction, running the Adaptive ϵ-Sampling ϵ-Hood MOEA on several population configurations.
The success of Multi-Objective Evolutionary Algorithms based on Decomposition (MOEA/D) has generated great interest in the use of a reference set of weight vectors to promote diversity within non-dominated solutions. However, the quality of the solution set obtained heavily depends on the relation between the weight distribution and the Pareto front's shape.
This study focuses on a performance comparison of classical techniques for weight vector generation, based either on mixture design or on low-discrepancy sequences, and a novel approach for updating the weight vectors during the evolutionary process. This approach uses information from the non-dominated individuals to generate weight vectors through a repulsion criterion. Preliminary experiments indicate that this dynamic strategy provides significant benefits compared to the static Simplex Lattice Design (SLD).
An important benefit of multi-objective search is that it maintains a diverse population of candidates, which helps in deceptive problems in particular. Not all diversity is useful, however: candidates that optimize only one objective while ignoring others are rarely helpful. A recent solution is to replace the original objectives by their linear combinations, thus focusing the search on the most useful trade-offs between objectives. To compensate for the loss of diversity, this transformation is accompanied by a selection mechanism that favors novelty. This paper improves this approach further by introducing novelty pulsation, i.e. a systematic method to alternate between novelty selection and local optimization. In the highly deceptive problem of discovering minimal sorting networks, it finds state-of-the-art solutions significantly faster than before; in fact, our method has so far established a new world record for the 20-line sorting network with 91 comparators. In the real-world problem of stock trading, it discovers solutions that generalize significantly better on unseen data. Composite Novelty Pulsation is therefore a promising approach to solving deceptive real-world problems through multi-objective optimization.
Multi-objective problems (MOPs) are of significant interest to both the multi-criteria decision making (MCDM) and evolutionary multi-objective optimization (EMO) research communities. A core technique common to both is scalarization, which combines multiple objectives into one in such a way that solving the resulting problem provides a solution to the original MOP. In this paper, we look closely at two scalarization methods: the augmented achievement scalarizing function (AASF) and penalty boundary intersection (PBI). While the former has its roots in the MCDM literature, the latter was developed in the EMO field with a focus on decomposition-based algorithms. We observe the conventional limits of the parameters involved in these methods and then demonstrate that, by relaxing those limits, one can be made to behave like the other. The aim is to gain a deeper understanding of both measures, as well as to expand their parametric range to provide more control over the search behavior of EMO algorithms. It also lays the groundwork for complete analytical derivations of equivalence conditions between the two metrics.
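For reference, commonly used forms of the two scalarizations are sketched below (minimisation, objective vector f, reference point z, weight vector w); the augmentation coefficient ρ for AASF and the penalty parameter θ for PBI are the parameters whose conventional limits are in question. The precise formulations analysed in the paper may differ slightly.

```latex
% Augmented achievement scalarizing function (AASF), one common form
s_{\mathrm{AASF}}(\mathbf{f}) = \max_{i} \, w_i \,(f_i - z_i)
  \;+\; \rho \sum_{i} w_i \,(f_i - z_i)

% Penalty boundary intersection (PBI)
d_1 = \frac{\left\| (\mathbf{f} - \mathbf{z})^{\top} \mathbf{w} \right\|}{\|\mathbf{w}\|},
\qquad
d_2 = \left\| \mathbf{f} - \Bigl( \mathbf{z} + d_1 \tfrac{\mathbf{w}}{\|\mathbf{w}\|} \Bigr) \right\|,
\qquad
s_{\mathrm{PBI}}(\mathbf{f}) = d_1 + \theta \, d_2
```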
This study focuses on a bi-objective mathematical programming formulation of the supply chain design and operation problem, which aims at simultaneously minimizing total costs and delays in order delivery. For small instances, a commercial solver (Gurobi Optimization) with different scalarizing techniques achieves a good representation of the true Pareto front. With a view to treating real-size problems, NSGA-II and MOEA/D are applied to the same instances. Computational results highlight the mixed performance of both MOEAs and provide some insights regarding future research paths for adapting MOEAs to such complex problems.
Many applications, such as hyper-parameter tuning in Machine Learning, can be cast as multi-objective black-box problems that are challenging to optimize. Bayesian Optimization (BO) is an effective method for dealing with black-box functions. This paper focuses on balancing exploration and exploitation in multi-objective black-box optimization problems by drawing multiple samples per BO iteration. In each iteration, multiple recommendations are generated via two different trade-off strategies: the expected improvement (EI), and a multi-objective framework in which the mean and variance functions of the GP posterior form two conflicting objectives. We compare our algorithm with ParEGO on 12 test functions. Hypervolume (HV, also known as the S-metric) results show that our algorithm handles the exploration-exploitation trade-off well for multi-objective black-box optimization problems.
Expensive black-box problems are usually optimized with Bayesian Optimization (BO), since it can reduce evaluation costs via cheaper surrogates. The most popular model used in Bayesian Optimization is the Gaussian process (GP), whose posterior is based on a joint GP prior built from initial observations, so the posterior is also a Gaussian process. Observations are often not noise-free, so in most cases a noisy transformation of the objective space is observed. Many single-objective optimization algorithms have succeeded in extending efficient global optimization (EGO) to noisy settings, whereas ParEGO fails to consider noise. In order to deal with noisy expensive black-box problems, we extend ParEGO to noisy optimization by adding a Gaussian noise term when fitting the surrogate. We call the result noisy-ParEGO, and S-metric results indicate that the algorithm works well on noisy expensive multi-objective black-box problems.
In preference-based multi-objective optimization, the lack of prior knowledge makes it difficult for the decision maker to specify an informed preference. Thus, knees are regarded as the naturally preferred solutions on the Pareto-optimal front. However, most research is based on a given large number of solutions and identifies the knee candidates among them a posteriori.
Based on the α-dominance relationship, this paper proposes a new framework to search a priori for knee regions. Firstly, a number of reference vectors are generated in the objective space. During environmental selection, all solutions are associated with their closest reference vectors. Solutions associated with different reference vectors are deemed to be non-α-dominated with respect to each other. If they are associated with the same reference vector, the α-dominance relationship is adopted to sort the solutions into different fronts. Therefore, the knee candidates are assigned to the first front and selected with a higher priority, so that more knee information from the previous generation is preserved and more potential knee regions are explored. The comparative experiments demonstrate that the proposed method is competitive in identifying convex knee regions.
Real-world optimization problems often have objective functions that are expensive in terms of cost and time, so it is desirable to find near-optimal solutions with very few function evaluations. Surrogate-assisted optimizers reduce the required number of function evaluations by replacing the real function with an efficient mathematical model built on a few evaluated points. Problems with a high condition number are a challenge for many surrogate-assisted optimizers, including SACOBRA. To address such problems we propose a new online whitening method operating in the black-box optimization paradigm. We show on a set of high-conditioning functions that online whitening tackles SACOBRA's early stagnation issue and reduces the optimization error by a factor between 10 and 10^12 compared to plain SACOBRA, although it requires many extra function evaluations. The covariance matrix adaptation evolution strategy (CMA-ES) achieves even lower errors for very high numbers of function evaluations, whereas SACOBRA performs better in the expensive setting (≤ 10^3 function evaluations). If we count all parallelizable function evaluations (population evaluation in CMA-ES, online whitening in our approach) as one iteration, then both algorithms have comparable strength even in the long run. This holds for problems with dimension D ≤ 20.
This work presents multi-point search algorithms exploiting several surrogate models, implemented in Minamo, the multi-disciplinary optimization platform of Cenaero. Many types of surrogate models are used in the literature, each with its own strengths and weaknesses; more generally, each one models a given problem differently and provides its own representation of reality. The idea of this paper is to exploit different types of surrogate models simultaneously in order to automatically capture their strengths and mitigate some of their weaknesses. This strategy is based on a multi-point enrichment at each iteration, with each candidate point provided by one kind of surrogate model and/or criterion. The strategy can be tuned by selecting different infill criteria, based on different surrogate models, in order to target specific aspects such as feasibility, exploration and/or exploitation. The performance of this surrogate-based optimization framework is illustrated on well-known constrained benchmark problems from the literature (such as the GX functions and the MOPTA08 test case). Good performance is demonstrated both in terms of the identification of feasible regions and in terms of objective gains.
When a new optimization algorithm is proposed, it is compared with state-of-the-art methods. That comparison is made using implementations of the algorithms, but names and versions of the implementations are usually not revealed. This paper compares five implementations of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) taken from a trusted source. The comparisons were performed using the Comparing Continuous Optimizers (COCO) platform. The results show that all examined implementations produce a different outcome. The variation of the results stems from differences in the auxiliary codes of the implementations and from implementing an algorithm which is still under development. It is therefore important to use an appropriate implementation for experiments. Using a weak implementation can lead to the wrong conclusions.
Quality-Diversity (QD) algorithms are a recent type of optimisation method that searches for a collection of both diverse and high-performing solutions. They can be used to effectively explore a target problem according to features defined by the user. However, the field of QD still lacks extensive methodologies and reference benchmarks for comparing these algorithms. We propose a simple benchmark to compare the reliability of QD algorithms by optimising the Rastrigin function, an artificial landscape function often used to test global optimisation methods.
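For reference, the d-dimensional Rastrigin function is commonly defined as follows; it has its global minimum of 0 at the origin and many regularly spaced local optima:

```latex
f(\mathbf{x}) = 10\,d + \sum_{i=1}^{d} \left( x_i^{2} - 10 \cos(2\pi x_i) \right),
\qquad x_i \in [-5.12,\, 5.12]
```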
Contextual policy search (CPS) is a class of multi-task reinforcement learning algorithms that is particularly useful for robotic applications. A recent state-of-the-art method is Contextual Covariance Matrix Adaptation Evolution Strategies (C-CMA-ES), which is based on the standard black-box optimization algorithm CMA-ES. There are two useful extensions of CMA-ES that we transfer to C-CMA-ES and evaluate empirically: ACM-ES, which uses a comparison-based surrogate model, and aCMA-ES, which uses an active update of the covariance matrix. We show that the improvements obtained with these methods can be impressive in terms of sample efficiency, although this is no longer relevant for the robotic domain.
Constrained optimization problems are often characterized by multiple constraints that, in practice, must be satisfied with different tolerance levels. Here, we evaluate the applicability of MAP-Elites to "illuminate" constrained search spaces by mapping them into feature spaces where each feature corresponds to a different constraint. We demonstrate the feasibility of this approach on a large set of benchmark problems, in various dimensionalities, and with different algorithmic configurations. As expected, numerical results show that a basic version of MAP-Elites cannot compete with state-of-the-art algorithms that use gradient information or constraint-handling techniques. Nevertheless, it can find trade-offs between constraint violations and objectives and thus provide new problem information. As such, it could be used in the future as an effective building block for designing new constrained optimization algorithms.
Swarm Intelligence (SI), a term first used by Beni and Wang, describes systems of simple, self-organized agents interacting with each other. This concept is implemented in many algorithms. Developed by Kennedy, Eberhart and Shi, Particle Swarm Optimization (PSO) is one of them. Its behavior is based on the movements of bird flocks, and its effectiveness in finding the optimal solution of a given problem is well established. Nevertheless, PSO is known for its weakness in local search, and its behavior strongly depends on internal parameter settings. To address these problems, we propose a new type of self-adaptive Quantum PSO (QPSO), called QUAntum Particle Swarm Optimization (QUAPSO), based on quantum superposition to set the velocity parameters and hybridized with a Kangaroo Algorithm in order to improve its efficiency in local search. QUAPSO was compared with five known algorithms from the literature (classical PSO, Kangaroo Algorithm, Simulated Annealing Particle Swarm Optimization, Bat Algorithm and Simulated Annealing Gaussian Bat Algorithm) on a set of 19 test functions. The results show that QUAPSO outperforms the competing algorithms.
In this work we study the generalization of Stochastic Derivative-Free Optimization (SDFO) algorithms from Euclidean spaces to Riemannian manifolds. In the literature, Riemannian adaptations of SDFO rely on the Riemannian exponential map, which imposes local restrictions. We aim to address this restriction using only the intrinsic geometry of the Riemannian manifold.
We first propose Riemannian SDFO (RSDFO), a generalized framework for adapting SDFO algorithms on Euclidean spaces to Riemannian manifolds. We then propose a novel algorithm, Extended RSDFO, and discuss its convergence behaviour on finite-volume Riemannian manifolds.
Copula-based EDA algorithms nowadays represent a promising technique for problem optimization in the continuous domain. This paper provides a detailed analysis of how six key parameters of the parallel copula-based EDA with model migration (mCEDA) influence the quality of optimization. In order to improve the performance of this kind of algorithm, the most suitable settings of these control parameters are evaluated on the well-known CEC 2013 benchmark using inferential statistics.
Benchmark sets and landscape features are used to test algorithms and to train models to perform algorithm selection or configuration. These approaches are based on the assumption that algorithms have similar performance on problems with similar feature sets. In this paper, we test different configurations of differential evolution (DE) against the BBOB set. We then use the landscape features of those problems and a case-based reasoning approach for DE configuration selection. We show that, although this method obtains good results for BBOB problems, it fails to select the best configurations when facing a new set of optimisation problems with a distinct array of landscape features. This demonstrates the limitations of the BBOB set for algorithm selection. Moreover, by examining the relationship between features and algorithm performance, we show that there is no correlation between the feature space and the performance space. We conclude by identifying some important open questions raised by this work.
Due to difficulties such as multiple local optima and flat landscape, it is suggested to use global optimization techniques to discover the global optimum of the auxiliary optimization problem of finding good Gaussian Processes (GP) hyperparameters. We investigated the performance of genetic algorithms (GA), particle swarm optimization (PSO), differential evolution (DE), and covariance matrix adaptation evolution strategy (CMA-ES) for optimizing hyperparameters of GP. The study was performed on two artificial problems and also one real-world problem. From the results, we observe that PSO, CMA-ES, and DE/local-to-best/1 consistently outperformed two variants of GA and DE/rand/1 with per-generation-dither on all problems. In particular, CMA-ES is an attractive method since it is quasi-parameter free and it also demonstrates good exploitative and explorative power on optimizing the hyperparameters.
In this paper, we analyze the convergence behavior of Differential Evolution (DE) and theoretically prove that, under certain adversarial conditions, the generic DE algorithm may not converge to the global optimum at all, even on apparently simple fitness landscapes. We characterize these function classes and initialization conditions theoretically and provide mathematical support for the non-convergence behavior of DE. To overcome these adversarial conditions, we propose a slightly modified variant of DE called Differential Evolution with Noisy Mutation (DENM), which incorporates a noise term in the mutation step. We analytically show that DENM can converge to the global optimum within a finite budget of function evaluations.
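A minimal sketch of the core ingredient, a DE/rand/1 mutation with an additive Gaussian noise term, is given below; the noise scale, its schedule and the choice of base mutation are assumptions for illustration and may differ from the exact DENM definition.

```python
import numpy as np

def noisy_de_mutation(pop, i, F=0.5, noise_scale=0.01,
                      rng=np.random.default_rng()):
    """DE/rand/1 mutant for individual i of a 2-D population array, with an
    additive noise term of the kind DENM uses to escape non-convergence."""
    idx = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(idx, size=3, replace=False)
    noise = noise_scale * rng.standard_normal(pop.shape[1])
    return pop[r1] + F * (pop[r2] - pop[r3]) + noise
```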
This paper surveys and compares a wide range of derivative-free optimization algorithms in an open-source context. We also propose a genetic variant of differential evolution, an adaptation of population control for the multimodal noise-free case, new multiscale deceptive functions, and, as a contribution to the debate on genetic crossovers, a test function with useless variables on which two-point crossover achieves great performance. We include discrete, continuous and mixed settings; sequential and parallel optimization; rotated, separable and partially rotated settings; and noisy and noise-free settings. We include real-world applications and compare with recent optimizers that have not yet been extensively compared to the state of the art.
Large-scale global optimization (LSGO) problems are known to be hard for many evolutionary algorithms (EAs). LSGO problems are usually computationally costly, so an experimental analysis for choosing an appropriate algorithm and its parameter settings is difficult or sometimes impossible. In this study, we have investigated the performance of a novel EA for LSGO based on Success-History based Adaptive Differential Evolution (SHADE) and cooperative coevolution (CC) with random adaptive grouping (RAG). SHADE is a self-adaptive DE algorithm, and the RAG approach is able to identify effective combinations of variables (subcomponents) in the problem decomposition stage. Thus, the proposed approach contains only two control parameters: the population size and the number of subcomponents. We have evaluated the performance of CC-SHADE-RAG with different settings on the IEEE CEC LSGO benchmark. The experimental results show that the approach outperforms some state-of-the-art LSGO techniques. CC-SHADE-RAG performs better with a small number of subcomponents and a large population size. We have performed a statistical analysis of the experimental results and established that the population size has a greater effect on algorithm performance than the number of subcomponents. This information can be used for the further development of a completely self-adaptive LSGO technique by introducing an adaptive population sizing scheme into CC-SHADE-RAG.
This paper proposes a region-learning based JADE algorithm, named RL-JADE, to solve numerical optimization problems. To exploit as much as possible the most promising areas known to the current population, the worst individuals of the population are eliminated, and new individuals are regenerated in the area where the best individuals are located. RL-JADE is tested on the COCO benchmarks and compared with other DE variants. Experimental results show that RL-JADE performs better than JADE on the tested 5-D problems.
This paper proposes a new mutation operator, current-to-better-AC, with an angle control rule for the DE algorithm. This new mutation operator has the potential to automatically adjust exploration and exploitation based on the angle control rule. Under this rule, the pair of individuals used to generate the mutation vector is selected in a biased manner. In current-to-better-AC, only high-ranking individuals are permitted to use the angle control rule, while low-ranking individuals are encouraged to move towards a random high-ranking individual. This method is embedded into JADE as a new algorithm, AC-JADE, and its performance is tested on the COCO benchmarks. Experimental results show that AC-JADE performs better than JADE.
The GECCO Workshop on Real-Parameter Black-Box Optimization Benchmarking Series is a series of benchmarking workshops held every year since 2009 that evaluates the performance of new optimization algorithms.
Originally, the workshop organizers provided results for every year the workshop took place. In this article, we directly compare algorithms from every workshop held from 2009 to 2018. We analyze the compared algorithms on the noiseless benchmark function set using the competition's official empirical approach, and with a statistical approach called Deep Statistical Comparison.
Our goal is to investigate how algorithm performance has evolved throughout the years, and to show differences between empirical and statistical approaches to evaluating results.
Self-adjustment of parameters can significantly improve the performance of evolutionary algorithms. A notable example is the (1 + (λ, λ)) genetic algorithm, where the adaptation of the population size helps to achieve a linear runtime on the OneMax problem. However, on problems which interact badly with the self-adjustment procedure, its usage can lead to performance degradation compared to static parameter choices. In particular, the one-fifth rule can raise the population size too quickly on problems which are too far from perfect fitness-distance correlation.
We propose a modification of the one-fifth rule that has less negative impact in scenarios where the original rule reduces performance. Our modification, while still performing well on OneMax both theoretically and in practice, also shows better results on linear functions with random weights and on random satisfiable MAX-SAT instances.
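For reference, a sketch of the standard one-fifth success rule for the offspring population size λ in the self-adjusting (1 + (λ, λ)) GA is given below; the factor F = 1.5 and the cap at the problem size n are conventional choices, and the paper's modified rule (which dampens the growth on failure) is not reproduced here.

```python
def update_lambda(lmbda, success, n, F=1.5):
    """One-fifth success rule for the offspring population size lambda:
    shrink by F on success, grow by F**(1/4) on failure, so that a success
    rate of one fifth keeps lambda roughly stationary."""
    if success:                               # the best offspring beat the parent
        return max(lmbda / F, 1.0)
    return min(lmbda * F ** 0.25, float(n))   # cap lambda at the problem size
```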
Metric space properties currently limit the indexing methods available for content-based similarity search. The goal of this paper is a data-driven transformation of a semimetric model into a metric one while keeping data indexability high. We propose a genetic algorithm for the evolutionary design of semimetric-to-metric modifiers. The precision of our algorithm stays close to the specified error threshold while indexability remains good. The paper's contribution is a proof of concept showing that genetic algorithms can effectively design semimetric modifiers applicable in similarity search engines.
This work presents a multi-population biased random-key genetic algorithm (BRKGA) for the electric distribution network reconfiguration problem (DNR). DNR belongs to the class of network design problems, which includes transportation problems, computer network restoration and telecommunication network design; it can be used for loss minimization and load balancing, making it an important tool for distribution network operators. A BRKGA is a class of genetic algorithms in which solutions are encoded as vectors of random keys, i.e. randomly generated real numbers from a uniform distribution in the interval [0, 1). A vector of random keys is translated into a solution of the optimization problem by a decoder. The decoder used here generates only feasible solutions by using an efficient codification based upon the fundamentals of graph theory, restricting the search space. The parallelization is based on the single-program multiple-data paradigm and is executed on the cores of a multi-core processor. Time-to-target plots, which characterize the running times of stochastic algorithms for combinatorial optimization, are used to compare the performance of the serial and parallel algorithms. The proposed method has been tested on two standard distribution systems and the results show the effectiveness and performance of the parallel algorithm.
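The random-key encoding and a generic sorting decoder can be sketched as follows; the paper's actual decoder is graph-based and DNR-specific, so the permutation decoder here is only an illustrative stand-in.

```python
import random

def random_key_chromosome(n):
    """A BRKGA chromosome: n random keys drawn uniformly from [0, 1)."""
    return [random.random() for _ in range(n)]

def decode_permutation(keys):
    """Generic random-key decoder: sorting the keys induces a permutation of
    item indices; a problem-specific decoder then maps this permutation to a
    feasible solution of the optimisation problem."""
    return sorted(range(len(keys)), key=keys.__getitem__)

chromosome = random_key_chromosome(5)
print(decode_permutation(chromosome))    # e.g. [3, 0, 4, 1, 2]
```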
Event Takeover Values (ETV) measure the impact of each individual on the population dynamics of evolutionary algorithms (EAs). Previous studies argue that the ETV distribution of panmictic EAs fits power laws with exponent between 2.2 and 2.5 and that this property is insensitive to fitness landscapes and to design choices of the EAs. One exception is cellular EAs (cEAs), for which there are deviations from the power law for large values. In this paper, ETVs for structured and panmictic EAs with different population sizes and mutation probabilities on several fitness landscapes were computed. Although the ETV distribution of panmictic EAs is heavy-tailed, the log-log plot of the complementary cumulative distribution function shows no linearity. Furthermore, Vuong's tests on the distributions generated by several instances of the problems conclude that power law models cannot be favored over log-normal models. On the other hand, the tests confirm that cEAs impose significant deviations on the distribution tail.
We consider the application of a genetic algorithm (GA) to the problem of hiding information in program executables. In a nutshell, our approach is to re-order instructions in a program in such a way as to maximize the amount of data that can be embedded while, at the same time, ensuring the functionality of the executable is not altered. In this work, we focus on the problem of identifying a large set of instructions which may be re-ordered, and some initial results on the IA-64 architecture are then presented that illustrate the potential benefit of such an approach.
This work proposes a biased random key genetic algorithm (BRKGA) for the integrated scheduling of manufacturing, transport, and storage/retrieval operations in flexible manufacturing systems (FMSs). Research on this problem has only recently been reported, and no heuristic approaches have yet been proposed. The computational results show the BRKGA to be capable of finding good quality solutions quickly.
Limiting the number of required settings is an important part of any evolutionary method's development. The final objective of this process is a method version that is parameter-less. Based on the research results presented so far, the leading methods in combinatorial optimization are the Linkage Tree Genetic Algorithm (LTGA), the Parameter-less Population Pyramid (P3) and the Dependency Structure Matrix Genetic Algorithm II (DSMGA-II). P3 was originally proposed as a parameter-less method, while LTGA and DSMGA-II in their original forms both require one parameter, the population size. Recently, a population-sizing technique was used to propose a parameter-less version of LTGA (psLTGA). However, population sizing has not been introduced into DSMGA-II, which therefore lacks an effective parameter-less version. To fill this gap, in this paper we propose a Population-sizing DSMGA-II (psDSMGA-II) that is parameter-less. We also show that psDSMGA-II is more effective than its predecessor and that it can successfully compete with psLTGA and P3.
This paper proposes the use of genetic algorithms as shrinkers for the counterexamples generated by QuickChick, a property-based testing framework for Coq. The present study incorporates the flexibility and versatility of evolutionary algorithms into the realm of rigorous software development, in particular by making the results of property-based testing observable and comprehensible for humans. The program code for merge sort is investigated as a showcase. Due to the lack of similar proposals in the literature, random sampling is used as the baseline for comparison. The experimental results indicate that the proposed genetic algorithm outperforms random sampling. Moreover, the minimal counterexamples, through which programmers are able to pinpoint program mistakes with ease, can be successfully obtained by using genetic algorithms as shrinkers.
The Convex Search Algorithm (CSA) is a generalization across representations of Evolutionary Algorithms (EAs) with crossover and no mutation. The Standard Evolutionary Search Algorithm (SESA) is a more accurate generalization of EAs with crossover and no mutation, using a standard two-parent crossover. This work extends the runtime analysis of the CSA on quasi-concave landscapes [4] to the SESA. We instantiate the analysis to binary strings and integer vectors endowed with the Hamming distance and the Manhattan distance. We find that the SESA requires a larger population size to converge to a global optimum, resulting in a larger runtime upper bound than for the CSA. Empirical studies on LeadingOnes confirmed the existence of a smallest population size above which both algorithms are guaranteed to find the global optimum. Below this threshold, the SESA is less successful than the CSA.
In recent decades, deep learning approaches have shown impressive results in many applications. However, most of these approaches rely on manually crafted architectures for a specific task in a large design space, allowing room for sub-optimal designs which are more prone to becoming stuck in local minima and to overfitting. Therefore, there is considerable motivation for performing architecture search for a specific task. In this work, we propose an initialization technique for design space exploration of deep neural network architectures based on Latin Hypercube Sampling (LHS). When compared with random initialization using standard machine learning datasets such as MNIST and CIFAR-10, the proposed approach shows promise in the neural architecture search domain, outperforming the commonly used random initialization.
We propose a novel method for evolutionary network analysis that uses the genetic algorithm (GA), called the multiple world genetic algorithm, to coevolve appropriate individual behaviors of many agents on complex networks without sacrificing diversity. We conducted experiments using simulated games of social networking services to evaluate the proposed method. The results indicate that it can effectively evolve diverse strategies for each agent, and the resulting fitness values were almost always larger than those obtained through conventional GA-based evolutionary network analysis.
Biased Random-Key Genetic Algorithm (BRKGA) is a variation of genetic algorithms that represents the solutions of a problem as real-key vectors whose values are randomly generated in the continuous interval [0, 1), and uses a deterministic decoder to map these vectors into feasible solutions to the problem. In this work, we investigate the impact of replacing the uniform distribution used to generate the BRKGA random keys with the Lévy distribution. This variation is inserted into two versions of BRKGA from the literature. The proposed versions were compared with the literature versions using a set of global optimization benchmark functions employed in several works in the area. The algorithms are compared in terms of performance and quality of the solutions, and the variations proposed in this work achieve very promising results.
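A minimal sketch of generating Lévy-distributed keys is shown below; folding the heavy-tailed samples into [0, 1) via the fractional part is an assumption for illustration, since the abstract does not state how the samples are mapped back into the key interval.

```python
import numpy as np

def levy_keys(n, rng=None):
    """Generate n random keys from a standard Levy distribution instead of
    U[0, 1).  A Levy(0, 1) variate can be obtained as 1 / Z**2 with Z ~ N(0, 1);
    since Levy variates live on [0, inf), they are folded into [0, 1) here by
    taking the fractional part."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(n)
    z = np.where(z == 0.0, 1e-12, z)     # avoid division by zero
    return (1.0 / z ** 2) % 1.0
```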
We propose a novel genetic algorithm to solve the image deformation estimation problem by preserving genetic diversity. As a classical problem, image deformation always involves a trade-off between the complexity of the deformation model and the difficulty of the parameter search. A 2D cubic B-spline surface is a highly free-form deformation model and is able to handle complex deformations such as fluid image distortions. However, it is challenging to estimate an appropriate global solution.
To tackle this problem, we develop a genetic operation named probabilistic bitwise operation (PBO) to replace the crossover and mutation operations, which preserves diversity across generations and achieves a better coverage ratio of the solution space. Furthermore, a selection strategy named annealing selection is proposed to control convergence. Qualitative and quantitative results on synthetic data show the effectiveness of our method.
Genetic algorithms (GAs) are stochastic optimization methods that mimic the evolution of living organisms. Among them, a GA that uses real-coded genes is called a Real-Coded GA (RCGA). Several generation alternation models have been proposed to improve RCGA. Problem decomposition is a known method for improving search performance, so in this study we considered a practical, real-world instance of problem decomposition: group work. In group work, each member collaborates with the others and is evaluated comprehensively based on the group's outcome and their personal contribution to the group. We considered that this individual evaluation method could be applied to the individual evaluation of RCGA, and therefore propose a model that introduces the concept of group work into a generation alternation model of RCGA. We evaluated the search performance of the proposed method using benchmark functions and confirmed the performance improvement by comparing with conventional methods.
Finding meaningful structures in big data is challenging, especially when the data are also noisy. In this short paper, we present the results of applying 6 different biclustering methods to a massive human RNA-seq dataset with over 35k genes from over 125k samples. We assess which biclustering methods can handle data of that scale and compare the results to mini-batch k-means, a popular clustering approach. Finally, we assess the importance of evolutionary-based approaches in biclustering 'big data'.
Parallel genetic algorithms (PGAs) can be used to accelerate optimization by exploiting large-scale computational resources. In this work, we describe a PGA framework for evolving spiking neural networks (SNNs) for neuromorphic hardware implementation. The PGA framework is based on an islands model with migration. We show that using this framework, better SNNs for neuromorphic systems can be evolved faster.
We present a new algorithm for network clustering based upon genetic algorithm methods to optimize modularity. The algorithm proposes an innovative, more abstract representation, along with newly designed domain-specific genetic operators. We then analyze the performance of the algorithm using popular real-world data sets taken from multiple domains. The analysis demonstrates that our algorithm consistently finds high quality or even optimal solutions without any a priori knowledge of the network or the desired number of clusters. Furthermore, we compare our results with five previously published methods and yield the highest quality for the largest of the benchmark datasets.
In black-box optimization scenarios, researchers have no control over the fitness functions, and hence genetic algorithms (GAs) are usually compared by the number of function evaluations. Commonly used statistics include the arithmetic mean, median, and standard deviation. However, these statistics can be misleading. For example, when there exist instances that are unsolvable within a limited time, the median simply ignores those instances, and the arithmetic mean is not applicable at all. In this paper, we propose comparison methods from a practical point of view. Specifically, we propose three use cases which cover most of the situations that GA practitioners may encounter. Two of these use cases are matchups, which require a pair of GAs to be compared with each other, while the third serves as a standalone performance indicator for GAs.
Parameter-less Population Pyramid (P3) is a parameter-less optimization method that employs the Dependency Structure Matrix (DSM) to discover dependencies between genes. The results published so far show that P3's effectiveness is low on bimodal trap problems. Therefore, the Parameter-less Population Pyramid with Feedback (fP3) was proposed. fP3 extends P3 with a feedback operation which improves P3's effectiveness on bimodal trap problems. However, fP3 is no longer a parameter-less method. In this paper, we propose an adaptive strategy that sets the feedback probability in each iteration of the method. The newly proposed Parameter-less Population Pyramid with Automatic Feedback (afP3) is therefore parameter-less and preserves the advantages of fP3.
In January 2019, DeepMind revealed AlphaStar to the world---the first artificial intelligence (AI) system to beat a professional player at the game of StarCraft II---representing a milestone in the progress of AI. AlphaStar draws on many areas of AI research, including deep learning, reinforcement learning, game theory, and evolutionary computation (EC). In this paper we analyze AlphaStar primarily through the lens of EC, presenting a new look at the system and relating it to many concepts in the field. We highlight some of its most interesting aspects---the use of Lamarckian evolution, competitive co-evolution, and quality diversity. In doing so, we hope to provide a bridge between the wider EC community and one of the most significant AI systems developed in recent times.
Cellular Automata (CA) can be successfully applied in various image processing tasks because they have a number of advantages over traditional methods of computation: simplicity of implementation, complexity of behaviour, parallelisation, extensibility, scalability and robustness. In this paper, an edge detection method for binary images, based on CA and Evolutionary Algorithms (EAs), is presented. The rule of a two-dimensional CA is evolved by means of two EAs, one that evolves the rule to detect edge points and another that evolves the rule to detect non-edge points. The focus is on the implementation of the EAs: how individuals are represented, how the evolving process is evaluated and which genetic operators are used. The results of the experiments show a better performance of the proposed approach in comparison with similar approaches presented in the literature.
We aim to showcase the benefit of transfer optimization for route planning problems by illustrating how the solution accuracy of travelling salesman problem instances can be enhanced via autonomous and positive transfer of knowledge from related source problems that have been encountered previously. Our approach is able to achieve better solution accuracy by exploiting useful past experiences at runtime, based on a source-target similarity measure learned online.
Concurrent evolutionary algorithms use threads that communicate via messages. Parametrizing the work in every thread and the way they communicate results is a major challenge in their design. In this paper we work with concurrent evolutionary algorithms implemented in Perl 6 and explore different options for single-thread evolution parametrization, communication and mixing of results, showing that scalability is achieved in a multi-core environment.
Semantic segmentation on a pixel basis is necessary for the semantic understanding of an image. Although the use of CNNs is mainstream when sufficient test images are available, in this research we aim to develop a method that is robust in an environment with few test images. Specifically, we improve Textonboost by using a differential evolution method.
This paper develops a hybrid between an RBF-assisted Evolutionary Programming (EP) algorithm and the CONORBIT trust region method for constrained expensive black-box optimization. The proposed hybrid combines the advantages of each approach and results in better performance than either the RBF-assisted EP or CONORBIT alone on test problems from the CEC 2010 benchmark.
The CMA-ES algorithm searches a fitness landscape by sampling from a multivariate normal distribution and updating its mean by taking a weighted average of the highest fitness candidate solutions. In this work, we explore the possibility of using Genetic Programming to evolve new mean-update selection methods that take into account information other than just raw fitness values. Our results show that CMA-ES can be tuned to specific problem classes to achieve better results.
For evolutionary algorithms (EAs), selection is one of the main components deciding which solutions enter the new population. Most selection strategies are fitness-based and wasteful of fitness evaluations, since many evaluated solutions are discarded immediately due to their inferior values. It is desirable to predict the quality of new solutions before selection, without evaluating them, so that the efficiency of EAs can be improved. Naturally, selection can be considered as a classification problem: selected solutions belong to the 'good' class and the discarded ones to the 'bad' class. This paper demonstrates this idea by introducing a classification-based selection (CBS) strategy for EAs. CBS is integrated into a state-of-the-art algorithm and studied on a test suite. The experimental results show that CBS saves fitness evaluations when compared with the original algorithm.
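A minimal sketch of this idea is given below; the classifier type, the median labelling threshold, and the pre-screening rule are illustrative assumptions rather than the paper's exact design.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def classification_based_selection(archive_x, archive_y, candidates):
    """Train a classifier on previously evaluated solutions labelled 'good'
    (fitness no worse than the median, minimisation assumed) or 'bad', and
    return only the candidates predicted as good; only these are then passed
    to the true (expensive) fitness function."""
    labels = (archive_y <= np.median(archive_y)).astype(int)   # 1 = good, 0 = bad
    clf = KNeighborsClassifier(n_neighbors=5).fit(archive_x, labels)
    return candidates[clf.predict(candidates) == 1]
```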
We introduce an enhanced algorithm with a population-entropy-based population adaptation strategy under the framework of SHADE (PE-SHADE). The distribution state of the population is characterized, and the population size is then adapted with a population increasing strategy and a population reduction strategy. Experimental results on CEC2014 statistically support the effectiveness of the proposed algorithm.
Feature selection and construction can potentially help reduce dimensionality and build more effective features that improve performance. This paper utilises Genetic Programming (GP) to automatically select and construct high-level features for three cybersecurity-related tasks, namely ransomware detection, spam detection, and phishing website detection. The effectiveness of the features constructed by the proposed method has been assessed using three commonly-used machine learning algorithms on three datasets, and compared against the performance of these machine learning algorithms applied to the original set of features and to features selected and constructed by another GP-based method. The experimental results show that the proposed method significantly improves performance compared to the other methods.
The Uncertain Capacitated Arc Routing Problem (UCARP) is an important combinatorial optimisation problem. Genetic Programming (GP) has shown effectiveness in automatically evolving routing policies to handle the uncertain environment in UCARP. However, when the scenario changes, the current routing policy can no longer work effectively, and one has to retrain a new policy for the new scenario, which is time-consuming. On the other hand, knowledge from solving previous similar scenarios may help improve the efficiency of the retraining process. In this paper, we propose different methods for transferring knowledge from a source scenario to a similar target scenario and examine them in different settings. The experimental results show that with knowledge transfer the retraining process becomes more efficient, and the same performance can be obtained within a much shorter time without any negative transfer.
This work concerns the automatic adaptation of the probabilities of occurrence of the genetic operators in Genetic Programming. We experiment with different adaptation methods, different types of problems, and different tree-based Genetic Programming flavors with a variable number of genetic operators. Based on the published literature and on our own results, we claim that operator probabilities should be neither fixed nor carefully adapted, but instead they should be constantly and randomly changed during the evolution.
We investigate in this paper the suitability of multi-objective algorithms for Symbolic Regression (SR), where desired properties of parsimony and diversity are explicitly stated as optimization goals. We evaluate different secondary objectives such as length, complexity and diversity on a selection of symbolic regression benchmark problems. Our experiments comparing two multi-objective evolutionary algorithms against standard GP show that multi-objective configurations combining diversity and parsimony objectives provide the best balance of numerical accuracy and model parsimony, allowing practitioners to select suitable models from a diverse set of solutions on the Pareto front.
Robustness is a key characteristic of genetic programming candidate solutions, providing the ability to maintain a satisfactory level of performance under dynamic and uncertain environments. In this paper we perform experiments on Tartarus problem instances [2] exploring the hypothesis that the introduction of a fitness distribution bias, inspired by the Dunning-Kruger effect [5], leads to an increase in the diversity and robustness of candidate solutions.
In this paper, we search for Boolean functions with properties useful in cryptography for preventing side-channel attacks. We use three genetic programming methods, including linear genetic programming, which has not been used to design these functions before. Our results aim to provide a fair comparison by performing parameter optimization for each individual method, and deliver an insight into how well they cope with a growing number of function inputs and higher degrees of correlation immunity.
Approaches that attempt to combine feature selection and symbolic feature evolution have gained popularity in recent years for solving symbolic regression problems. However, much of the existing research applies only to explicit equations (y = f(x)) and cannot be directly applied to discover implicit equations (f(x) = 0). In this paper, we investigate a potential approach that uses Linear Programming to construct meaningful implicit equations out of a set of evolving features.
Here, we demonstrate the use of tags (evolvable labels that can be specified with imperfect matching) to identify memory positions in genetic programming (GP). Specifically, we conducted a series of experiments using simple linear-GP representations on five problems from the general program-synthesis benchmark suite [2]. We show that tag-indexed memory does not substantively affect problem solving success relative to more traditional, direct-indexed memory.
For the problem of symbolic regression, we propose a novel space-partition-based gene expression programming (GEP) algorithm named SP-GEP, which helps GEP escape from local optima and improves its search accuracy by letting individuals jump efficiently between segmented subspaces and by preserving population diversity. It first partitions the space of mathematical expressions into k mutually exclusive subspaces. Then, in order for individuals to jump efficiently between these subspaces, it uses a subspace selection method which combines a multi-armed bandit with an ε-greedy strategy. Through experiments on a set of standard SR benchmarks, the results show that the proposed SP-GEP maintains higher population diversity and finds more accurate results than canonical GEP.
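The subspace selection step can be sketched as follows; the reward definition (average fitness improvement per visit) and the value of ε are illustrative assumptions.

```python
import random

def select_subspace(counts, rewards, epsilon=0.1):
    """Epsilon-greedy choice over k expression subspaces: with probability
    epsilon pick a random subspace (exploration); otherwise pick the subspace
    with the best average reward so far (exploitation).  Unvisited subspaces
    get an infinite average so each is tried at least once."""
    if random.random() < epsilon:
        return random.randrange(len(counts))
    avg = [r / c if c > 0 else float('inf') for r, c in zip(rewards, counts)]
    return max(range(len(avg)), key=avg.__getitem__)

def update_subspace(counts, rewards, arm, reward):
    """Record the outcome (e.g. fitness improvement) of sampling an individual
    from the chosen subspace."""
    counts[arm] += 1
    rewards[arm] += reward
```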
In machine learning, transfer learning is concerned with utilising prior knowledge as a way to improve the process of training a new model in a different, but related, domain. Transfer learning has been shown to be beneficial across a large set of problems. One of the main questions any transfer learning approach must address is "What to transfer?". This paper proposes a new transfer learning method in genetic programming (GP) to improve solving symbolic regression problems by extracting all potentially good and unique building blocks from a source problem. The proposed method is compared against standard GP and a state-of-the-art GP method on ten regression datasets. The experimental results show that the proposed method has achieved significantly better or comparable performance to that of the competitive methods. Furthermore, the proposed method shows better initial population and convergence compared to the other methods.
Dynamic graphs are an essential tool for representing a wide variety of concepts that change over time. In the case of static graph representations, random graph models are often useful for analyzing and predicting the characteristics of a given network. Even though random dynamic graph models are a trending research topic, the field is still relatively unexplored. The selection of available models is limited and manually developing a model for a new application can be difficult and time-consuming. This work leverages hyper-heuristic techniques to automate the design of novel random dynamic graph models. A genetic programming approach is used to evolve custom heuristics that emulate the behavior of real-world dynamic networks.
The search performance of conventional Genetic Programming (GP) methods is strongly guided by the performance of the fitness function. In each generation, the fitness function evaluates every program in the population and measures the distance between the final output of the programs and the desired output. Human programmers often rely on the feedback from the intermediate execution states, which are the semantics, to localize and resolve software bugs. However, the semantics of a program is seldom explicitly considered in the fitness function to assess the quality of a program in GP. In this paper, we invent methods to improve fitness evaluation leveraging semantics in GP. We propose semantics flow analysis for programs using information theoretic concepts. Next, we develop a novel semantic fitness evaluation technique to rank programs using semantics based on the semantics flow analysis. Our evaluation results show that adopting our method can improve the success rates in Grammar-Based GP.
In this paper, we examine the effectiveness of multimodal genetic programming (MMGP) on the wall-following problem, which is a well-known benchmark problem for genetic programming (GP). MMGP aims to obtain multiple local optimal programs, including global optimal programs, that is, programs that achieve the same goal with different program structures. In this paper, we apply MMGP to the wall-following problem. The purpose of the wall-following problem is to find a program to control a robot having twelve distance sensors and four movements to follow irregular walls. We expect that there are several local optimal programs in the wall-following problem, which use different combinations of sensors. An experiment is conducted to investigate whether MMGP can obtain local optimal programs simultaneously for the wall-following problem. This experiment compares MMGP with a simple GP. The experimental results reveal that MMGP achieves a higher acquisition ratio of local optimal programs than the simple GP.
This work chronicles research into the solution of portfolio problems with metaheuristic solvers. In particular, a genetic algorithm for solving the cardinality constrained portfolio optimisation problem with minimum asset proportions is presented and tested on the datasets of [1]. These datasets form benchmark instances used to test portfolio optimisers and are based upon indices ranging from 31 to 225 assets. The results of the GA are indicatively compared to solutions of [2] for a variety of minimum proportions, suggesting that solutions exhibit certain clustering characteristics for higher proportions. Further work is also discussed. This research is based upon the first part of the ongoing PhD thesis of the first author.
Robust optimisation, in recent years, has surfaced as an essential technique to handle data uncertainty in mathematical programming models. However, the resulting robust counterparts are often hard to solve even for modern state-of-the-art Mixed Integer Programming solvers, underlining the need for approximate algorithms. Based on the works of Gonçalves and Resende [3], we propose genetic algorithms for the network slice design problem (NSDP) under uncertainty. We investigate the performance of the proposed algorithms using realistic datasets from SNDlib [4].
Phoneme awareness provides a path to high-resolution speech recognition that overcomes the difficulties of classical word recognition. Here we present the results of a preliminary study on Artificial Neural Network (ANN) and Hidden Markov Model (HMM) methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet, with a specific focus on evolutionary optimisation of bio-inspired classification methods. A set of audio clips were recorded by subjects from the United Kingdom and Mexico. For each recording, the data were pre-processed using Mel-Frequency Cepstral Coefficients (MFCC) with a sliding window of 200ms per data object, as well as a further MFCC time-series format for forecast-based models, to produce the dataset. We found that an evolutionary optimised deep neural network achieves 90.77% phoneme classification accuracy, as opposed to 86.23% accuracy for the best HMM with 150 hidden units. Many of the evolutionary solutions take substantially longer to train than the HMM; however, one solution scoring 87.5% (+1.27%) requires fewer resources than the HMM.
This paper presents a new multi-gene genetic programming (MGGP) approach for prediction of bond strength of Fiber Reinforced Polymers (FRP) bars to concrete. The MGGP technique models the bond behavior of FRP bars by integrating the capabilities of standard genetic programming and classical regression. The factors that affect the bond strength of FRP bars were identified from the test data of 223 beam-test specimens in the literature and fed into the GP system to produce a model. Our study concludes that the proposed MGGP model predicts bond strength of FRP bars in concrete quite accurately. Indeed, the proposed MGGP model equation has better performance than the code equation currently used by the American Concrete Institute (ACI).
This paper describes an application of multi-objective genetic algorithms (MOGA) to an optimization problem in the context of lithographic testing. The overall aim is to find a general procedure for reducing mark read-out effort in lithographic tests with limited degradation of related performance indicators. In these tests, silicon wafers are exposed with marks which are then read out to determine lithographic performance, expressed, for instance, as the 99.7th percentile of the read-out marks.
The problem was solved by applying MOGA in two stages. The first stage aims at determining a reduced layout in which the same marks are read out for each field. In the second stage, the aim is to determine a reduced layout in which different fields have different marks read out. In both stages there are two conflicting objectives: the number of read-out marks and the chosen performance indicator.
The recombination and mutation operators applied in MOGA are different in the two stages and are derived from a statistical analysis of the input data.
This approach, when applied to overlay test data, leads to a 50% reduction in read-out marks with a degradation in performance indicators within a 10% range when compared to the full layout.
In this paper, we study a multi-depot electric bus vehicle scheduling problem (MD-EVSP) and propose a genetic algorithm based column generation approach (GA-CG) for it. In GA-CG, a column refers to the driving plan of a vehicle. CG first generates a set of candidate columns. Then, a GA is used to select a subset of candidate columns from the column set to form the final solution. Experiments show that GA-CG can effectively solve MD-EVSP and runs much faster than a branch-and-price algorithm.
We attack the problem of generating balanced and efficient routing for automated inspection by using genetic algorithms to solve the equivalent Min-Max k Windy Chinese Postman Problem. Specifically, we use k robots to collectively inspect every member of a steel truss bridge. Experimental results show that the genetic algorithm produces efficient routes that are well-balanced among the robots. Additionally, we demonstrate that with our novel representation, as the number of robots increases, the generated routes exhibit near-linear speedup in the time needed to complete the inspection task: k robots take approximately 1/k of the time needed by one robot. Finally, our genetic algorithm produces similar results on a set of benchmark arc routing problem instances from the literature.
We extend the context-free grammar mapping method in the Grammatical Evolution search heuristic. Grammatical Evolution guarantees the generation of transparent and syntactically correct sentences (phenotypes), but not necessarily semantically correct or feasible ones. Generating syntactically valid phenotypes and post-processing to filter out semantically invalid ones suffers from several issues, e.g. the introduction of bias toward short phenotypes and a loss in search efficiency. These issues become significant in legal application domains. We first demonstrate that applying Grammatical Evolution with a context-free grammar to legal non-compliance detection problems might not be a tenable solution. We then demonstrate how the addition of context sensitivity improves the search efficiency and achieves greater diversity in the case of the iBoB problem regarding legal non-compliance.
We present EvoIsland, an interactive evolutionary augmented reality interface for parametric 3D OpenSCAD models. Our mobile prototype enables content creators to explore the full range of design possibilities encoded in a model's source code through the combination and separation of hexagonal evolutionary tiles embedded with genetic data. As these tiles are grouped into islands, localized clusters of design populations emerge for creators to explore. Interactions that take place within our EvoIsland prototype provide content creators with a novel approach for shaping evolutionary populations in an immersive environment.
We propose a genetic algorithm (GA)-based optimal equipment assignment method for oil spill response. We devised a repair operation suitable for constrained equipment assignment. In addition, the assignment strategies were evaluated by simulation to ensure that they conform to current South Korean standards. At sixteen locations in South Korea, the assignment for the response work optimized by the GA took 1.1% less time on average than the current assignment. Furthermore, an optimal assignment was determined which achieved a 29% reduction in the total capacity of the oil skimmers compared to the current standard.
Various EA-based methods have been applied to design and optimize logic circuits since the early nineties. These unconventional methods, however, typically suffer from various scalability issues preventing them from being adopted in practice. Recent improvements in the fitness computation procedure, connected with the introduction of formal methods such as SAT solvers or BDDs into the fitness evaluation, have pushed the limits forward, approaching the complexity of industrial problems. It was demonstrated that EAs can be applied to optimize gate-level circuits consisting of thousands of gates without introducing any decomposition technique. Despite that, the efficiency decreases with increasing circuit complexity. This problem can be managed by adopting the concept of so-called iterative resynthesis, based on the extraction of smaller sub-circuits from a complex circuit, their local optimization, and implantation back into the original circuit. Recently, a method based on the computation of so-called cuts was proposed. In this paper, we propose an alternative approach which is able to select more complex sub-graphs consisting of more nodes and more inputs. Compared to the previous method, the proposed approach improves the efficiency of the optimization. More than 9% and 20% reduction was observed on highly optimized logic and arithmetic circuits, respectively.
Quantum Key Distribution (QKD) allows for two parties to establish a shared secret key which is secure against an all-powerful adversary (a task impossible to achieve using only classical communication). Furthermore, any attack against these systems creates an observable "noise signature." This work develops a new solution representation for QKD protocols allowing a GA to evolve optimal protocols to counter observed attacks against the communication.
Prostate cancer (PCa) is the most common oncological disease in Western men. Even though a significant effort has been carried out by the scientific community, accurate and reliable automated PCa detection methods are still a compelling issue. In this clinical scenario, high-resolution multiparametric Magnetic Resonance Imaging (MRI) is becoming the most used modality, also enabling quantitative studies. Recently, deep learning techniques have achieved outstanding results in prostate MRI analysis tasks, in particular with regard to image classification. This paper studies the feasibility of using the Semantic Learning Machine (SLM) neuroevolution algorithm to replace the fully-connected architecture commonly used in the last layers of Convolutional Neural Networks (CNNs). The experimental phase considered the PROSTATEx dataset composed of multispectral MRI sequences. The achieved results show that, on the same non-contrast-enhanced MRI series, SLM outperforms with statistical significance a state-of-the-art CNN trained with backpropagation. The SLM performance is achieved without pre-training the underlying CNN with backpropagation. Furthermore, on average the SLM training time is approximately 14 times faster than the backpropagation-based approach.
This study presents a method to evolve planar mechanism prototypes using an evolutionary computing approach. Ultimately, the idea is to provide drafts for designers at the conceptual design stage of mechanism design which meet their design brief. The designers can review, analyze and incorporate the drafts into the development of a mechanical system. This work proposes a stepping stone towards a generative design system which is capable of evolving such planar mechanisms with a focus on the configuration and shape of their components. In this paper, we present a use-case of evolving four-bar linkages with attached shaped component to study the evolutionary approach. In addition to kinematics, we consider properties such as the shape of levers, placement of rotation joints, torque, friction, restitution, and mass. The design aim was to move a mechanism as fast as possible in the forward direction whilst overcoming obstacles in a virtual environment involving gravity. Our method was evaluated using complex environments with different obstacles. The results verify that it was able to evolve a variety of high performing mechanism drafts. The study provides a promising outlook of using this method to support the conceptual stage of mechanical design.
In real world applications, variation in deployment environments, such as changes in data collection techniques, can affect the effectiveness and/or efficiency of machine learning (ML) systems. In this work, we investigate techniques to allow a previously trained population of Linear Genetic Programming (LGP) insider threat detectors to adapt to an expanded feature space. Experiments show that appropriate methods can be adopted to enable LGP to incorporate the new data efficiently, hence reducing computation requirements and expediting deployment under the new conditions.
Particle Swarm Optimization is a well-researched meta-heuristic for single- and multi-objective problems. It is based on the movement of particles, which enables its application to collective search in robotic applications. However, in the robotic context some assumptions regarding the performance measurement of PSO algorithms do not apply. In terms of energy and time, reading a sensor is usually cheap, whereas movement is costly in robotic applications. Traditional methods do not take into account any cost metrics associated with the actual movement of the particles. Therefore, new metrics are required to understand how well a PSO algorithm can perform as a collective search mechanism. This article proposes two metrics, the Normalized Movement Energy Cost and the Normalized Movement Time Cost, that enable researchers to analyze an algorithm's performance not just with respect to the obtained solution quality, but also with respect to movement time and energy costs.
This paper concerns optimizing counter-epidemic strategies by selecting actions which differ with respect to cost, damage to the animal population and the potential of stopping the disease. Important factors that have to be taken into account are the time it takes for a vaccine to become effective and the dynamics of the epidemic. This paper focuses on optimizing control strategies rather than individual decisions in the case of a particular epidemic. An evolutionary optimizer using simulations for solution evaluation is run in order to gather training data. Based on the training data, a regression model is built which can subsequently be used to improve the results obtained when solving new instances of the proposed problem. Experimental results show that the model can be used to transfer knowledge from previously solved instances to new ones.
The stable conformation of a chemical structure can be searched for by optimizing the spatial coordinates of the given arrangement of functional groups that determine the structure. Typically, the process seeks to optimize the geometry of a known particular arrangement. In our work, we combine a genetic algorithm with Open Babel's optimization algorithms to perform a bi-level search, simultaneously optimizing the combination of functional components and their geometry, aiming to find stable chemical structures with minimum energy. We focus our application on the optimization of chemical structures that are similar to actual samples of graphene seawater desalination filters observed in the laboratory. This kind of filter has the potential to reduce the cost and increase the efficiency of desalination to obtain drinking water.
Evolutionary algorithms have been applied to numerous architectural design applications in what is popularly known as evolutionary design [3], [4], [6]. Such applications include architectural support [7] and structural design for buildings [5] and floor-plan layout design [8]. However, evolutionary design of optimally shaped building façades is less explored in evolutionary architectural design applications [6], [12], [13].
This research investigates the evolutionary design of building façades, optimally shaped for a given climate. This study applies evolutionary methods to optimally design sun-shades (covering windows on building façades). Ideally, sun-shades will maximally block direct sunlight but minimize window coverage, thus allowing unobstructed views out of the window and maximizing ambient natural lighting inside. Also, sun-shades help to passively control building climate and determine occupant comfort. Optimal sun-shade designs allow direct sunlight (solar penetration) to enter interior spaces in winter months, heating the building, and minimize solar penetration in summer months, cooling the building [11].
This study applies an Evolutionary Strategy (ES) [1] to automate sun-shade design such that solar penetration is minimized for both east and west facing windows, given summer solstice daylight hours in various geographic locations. An ES was selected given the demonstrated effectiveness of such evolutionary optimization on a range of engineering design problems with various constraints [9]. We focus on sun-shade design for rectangular shaped windows (vertical Y axis is 1.5 times the length of the horizontal X axis), where we anticipate sun-shade design will be replicated for many identical windows comprising a building's façade, as is the case for many modern tall buildings [14].
The ES was initialized with 20000 uniform random [1] points in a continuous three-dimensional (1.0 x 1.0 x 1.0) space adjacent to the window (figure 1). These points were possible mesh vertices for sun-shade design and thus defined the design solution space. The fitness function computed sun-shade effectiveness by calculating how many sun-rays were blocked, assuming an increasing or decreasing sun height above the horizontal plane (angle V in figure 1). Thus, we tested the portion of sun-rays blocked by an evolving sun-shade (a mesh formed by 20000 vertices) over half of daylight hours (separate sun-shades were evolved for east and west facing façades). In successive generations, sun-shade mesh vertices blocking sun-rays (at varying degrees of inclination and declination) aimed at the window were selected for as vertices in evolving designs.
Evolving sun-shade effectiveness was computed as the intersection of sun-rays at 15 second intervals during simulated half-days: for east facing façades, from the point where the sun is on the horizontal plane (Y axis in figure 1) and incrementally rises until it is directly above the vertical axis of the building façade (Y-Z plane in figure 1), and for west facing façades from this midday point as the sun incrementally declines. Sun-shades were evolved for east and west facing façades given half of the summer solstice daylight hours (for east versus west façades) indicative of Cape Town, South Africa, and Amsterdam, the Netherlands (approximately 14 hours, 25 minutes and 16 hours, 48 minutes, respectively).
At these two geographic locations, 15 second intervals indicated incremental sun movements during daylight hours. For Cape Town, this was approximated as 0.052° increases and decreases, and for Amsterdam as 0.045° increases and decreases (for east and west facing façades, respectively). Half-day simulations thus tested, every 15 seconds, sun-ray intersection (a vector Xp, Yp, Zp at angle V from the horizontal or vertical plane) with any point in the sun-shade. This was a point-cloud in generation 1 and mesh-points in subsequent generations (figure 1). Points intersecting the sun-ray were given a maximum (normalized) fitness of 1.0, and points within a given ray distance were assigned a logarithmically decreasing fitness that equalled 0.0 at the maximum ray distance. To account for random variation and diffusion of sun-ray light, every 15 seconds a random angle (in the range [-0.01°, +0.01°]) was added to the sun-ray's vector angle V.
Evolutionary design used a µ+λ ES [1], where λ = 20000 offspring were created per generation. This combined population was ranked by fitness and the least fit λ genotypes discarded. Each genotype encoded an (x, y, z) point in an N-point mesh (the evolving sun-shade design), and a corresponding σ mutation step-size for each coordinate. For simplicity, the X, Y, Z dimensions of the 3D solution space for evolving sun-shades (adjacent to the window) were normalized to the range [0.0, 1.0] and the window dimensions normalized to the range [0.0, 1.5] for the X, Y window axes, respectively. Thus, sun-shades only evolved to cover the top two-thirds of a window, ensuring that sufficient ambient light still entered the building and that occupants have a view out of the window.
One generation was the evaluation of all 20000 genotypes (in sun-ray simulations), where the fittest 10% were selected and the mutation operators σx·Nx(0,1), σy·Ny(0,1), σz·Nz(0,1) were applied to perturb each genotype's coordinate and step-size values (with probability p = 1.0 and p = 0.05, respectively), such that λ = 20000 offspring genotypes were created. All µ+λ genotypes were then evaluated and the fittest 20000 selected as survivors [1]. Sun-shade evolution for Cape Town and Amsterdam constituted experiment sets 1 and 2, respectively. Each experiment set comprised 10 ES runs, for east and west facing façades, and each run lasted 100 generations (the ES run stopping condition).
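A minimal sketch of this per-vertex mutation step is given below; the log-normal step-size update and the value of tau are conventional ES choices assumed here, since the text only states that coordinates are always perturbed (p = 1.0) and step sizes are perturbed with p = 0.05.

```python
import numpy as np

def mutate_vertex(point, sigma, rng, p_sigma=0.05, tau=0.1):
    """Mutate one sun-shade mesh vertex: each step size is perturbed
    log-normally with probability p_sigma, then each coordinate is perturbed
    by sigma_i * N_i(0, 1); vertices stay inside the normalised solution cube."""
    mask = rng.random(3) < p_sigma
    new_sigma = np.where(mask, sigma * np.exp(tau * rng.standard_normal(3)), sigma)
    new_point = point + new_sigma * rng.standard_normal(3)
    return np.clip(new_point, 0.0, 1.0), new_sigma

rng = np.random.default_rng()
point, sigma = rng.random(3), np.full(3, 0.05)
child_point, child_sigma = mutate_vertex(point, sigma, rng)
```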
Sun-shade fitness was the portion of points (constituting a sun-shade design) that blocked or partially blocked sun-rays during each half-day simulation. Points that intersected a sun-ray were assigned a maximum fitness of 1.0, and points close to a sun-ray (within the ray distance) were assigned a partial fitness in the range (0.0, 1.0). In generation 1, all 20000 possible points were considered for sun-shade design. In subsequent generations only points given a fitness value were considered part of the evolving sun-shade (point-mesh) design. For simplicity, sun-shade fitness was normalized to the range [0.0, 1.0], where 0.0 indicated no sun-rays blocked and 1.0 indicated all sun-rays blocked (over all daylight hours tested).
As a benchmark comparison for evolved sun-shade effectiveness, the fittest sun-shades evolved for east and west facing façades (at both locations) were selected from each run and compared to ten heuristic design sun-shades (figure 1). The effectiveness of these sun-shades was similarly computed using sun-ray simulations of 15 second intervals during half-day periods for east and west facing façades and a given number of day-light hours at both locations.
Thus for each heuristic design sun-shade a fitness value was similarly calculated and normalized to the range [0.0, 1.0], where 0.0 indicated no sun-rays were blocked and 1.0 indicated that all sun-rays were blocked during a sun-ray simulation.
Results indicated that, on average, evolved sun-shades, for both shorter and longer day lengths and for east versus west facing façades, were significantly more effective (two-tailed t-test, p < 0.05 [2]) than the ten tested heuristic designed sun-shades. Results also indicated that evolutionary design is suitable for automating optimal sun-shade (and potentially building façade) design, and support current hypotheses on the efficacy of evolutionary design for improving current architectural designs and automating efficient and effective industrial design production [3], [4], [12]. Ongoing work is evaluating sun-shade evolution in comparison to other heuristic designs in various geographic locations, as well as evolving sun-shades that dynamically adapt their form to suit varying daylight lengths and sun intensity.
In this paper, we try to combine the best from the worlds of heuristics and algebraic constructions for the design of S-boxes: we evolve algebraic constructions that produce S-boxes with as low a differential uniformity as possible. Our approach is novel yet very simple, and it allows us to obtain constructions valid for any S-box size of practical interest.
The first and last mile problem describes the difficulty of starting and completing a journey when using public transport, where options are limited once away from centralised transport infrastructure such as train stations. It is particularly an issue for commuter journeys, where time lost on regular repeated trips becomes a barrier to the use of public transport, so commuters use their own cars, which in turn contributes to increasing pollution. With the potential advent of connected autonomous vehicles (CAVs) on our roads in the future, it may be possible to reduce the travel time of commuters between the train station and their home or place of work, and therefore potentially reduce the number of car trips that are made. In this paper, we model commuters using CAVs and, for the first time, use an evolutionary optimiser to optimise the locations where CAV hubs should be placed to increase the number of commuters opting to travel by train. In a real-world case study between Exmouth and Digby & Sowton stations in the UK, we demonstrate that with only 11 hubs shared between the origin and destination stations, we may halve the number of car trips.
We propose a hybridization approach called Regularized-Surrogate-Optimization (RSO) aimed at overcoming difficulties related to high-dimensionality. It combines standard Kriging-based SMBO with regularization techniques. The employed regularization methods use the least absolute shrinkage and selection operator (LASSO). An extensive study is performed on a set of artificial test functions and two real-world applications: the electrostatic precipitator problem and a multilayered composite design problem. Experiments reveal that RSO requires significantly less time than Kriging to obtain comparable results. The pros and cons of the RSO approach are discussed and recommendations for practitioners are presented.
DADIMODO is an evolutionary-algorithm-based software program for solving an inverse problem arising from biological Small Angle X-ray Scattering (SAXS) data analysis. The problem consists in refining the three-dimensional model of a multi-domain protein complex against SAXS experimental data. We use an "all-atoms" structure representation, allowing an energy control for every newly generated model so as to prevent steric clashes and converge to a physically feasible structure. Sophisticated variation operators had to be specifically tailored for this application and are the distinguishing feature of our optimization algorithm. Because atomic structure manipulations are computationally expensive, we report a critical implementation on a computer cluster that parallelizes the feasibility check and evaluation stages. DADIMODO has been successfully used in many biostructural data analysis cases and was recently made available to the corresponding research community as a web service.
Alzheimer's Disease (AD) is a growing pandemic affecting over 50 million individuals worldwide. While individual molecular traits have been found to be associated with AD at the DNA, RNA, protein, and epigenetic levels, the underlying genetic etiology of AD remains unknown. Integrating multiple omics data types simultaneously has the potential to reveal interactions within and between these molecular features. In order to identify disease-driving mechanisms, a standardized framework for integrating multiomics data is needed. Due to high variability in the size, structure, and availability of high-throughput omics data, there is currently no gold standard for combining different data types in a biologically meaningful way. Thus, we propose a pathway-centric, neural network-based framework to integrate multiomics AD data. In this knowledge-driven approach, we evaluate different gene ontologies to map data to the pathway level. Preliminary results show that integrating multiple data types under this framework produces more robust AD pathway models compared to models from single data types alone.
When applying genetic algorithms (GAs) to the design of real products, reducing the computational cost of fitness functions is one of the major challenges. In some cases, the computational cost of calculating specific eigenvalues is a predominant factor and needs to be reduced. We propose the use of a GA with "neural-network (NN) assistance," which enables this computational cost to be reduced. With this GA, the NN assistance infers approximate eigenvalues. These approximate eigenvalues are then used as starting points for the convergence calculation that obtains the precise eigenvalues. This procedure is effective in reducing the total computational cost of some of the fitness functions of real products. In addition, the precision of the eigenvalues is retained because the precise eigenvalues are obtained by the convergence calculation. The results of our case study show that the GA using our method achieves a 3x speed-up in fitness computation while maintaining equivalent solution quality.
Electrical wires are a very important part of a vehicle, as well as of many other objects with electrical functions. These wires are bundled into harnesses. Engineers often create the virtual paths for wires manually, drawing on their expert knowledge. More variants of a product often require different wires, and the criteria for the cables vary. Criteria such as heat or length define the wire path and configuration and have to be taken into account. Engineers may not be able to consider all criteria, notably when there is an extensive number of product variants or when the object configuration is changing. This paper evaluates whether a multi-objective evolutionary algorithm can compute feasible wire paths where two of these criteria are optimised. We show that discretising free space into a graph structure enables an evolutionary algorithm to compute wire suggestions using customised reproduction operators for path planning. The developed crossover and mutation operators are suitable for multi-objective path finding, may be used in other fields as well, and will be extended in future research.
The Aircraft-Gateway Allocation Problem (AGAP) has long been a key research issue in the field of management and operations. When a satellite hall is expanded, passengers' transit processes and times increase. This paper re-models and optimizes the AGAP. From the perspectives of airports, passengers and airlines, the objective of minimizing overall transit tension is added to the traditional optimization goals of maximizing the number of flights allocated to appropriate gates and minimizing the average passenger transit time. A non-dominated sorting genetic algorithm (NSGA-III) is used to solve the new multi-objective AGAP model. Finally, the models and methods are analyzed using modified case data of China Eastern Airlines at Pudong Airport. Firstly, the feasibility of the model and method is verified. Secondly, by comparison with the NSGA-II algorithm, NSGA-III is shown to obtain a more ideal Pareto frontier when solving three-objective optimization problems.
In the last 20 years, literally dozens of optimization algorithms based on swarm intelligence have been proposed. Particle Swarm Optimization, Artificial Bee Colony, Cuckoo Search, Firefly Optimization, and Cat Swarm Optimization are just a small sample of the exuberance of swarm-like algorithms. Although they differ in implementation details, they all share a common structure: an update rule is applied to each solution, followed by a drop rule that decides whether to keep the updated solution or not. In this poster we explore the idea of automatically generating swarm-like optimizers. Our proposal is divided into two stages: first, we decompose popular, human-crafted, swarm-like optimizers such as PSO, CS, ABC (as well as DE/GA) into a list of basic rules. Second, we use Grammatical Evolution to procedurally generate variations on this base structure by recombining these operators. We generate three instances of algorithms and observe that they have performance comparable to DE and PSO. Our framework will be useful for gaining insight into the design space of meta-heuristics and the nature of swarm-like algorithms.
Automated search in the form of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), plus manual code changes, transforms the 512 Newton-Raphson floating-point starting values from the table-driven square root function of glibc, the open-source GNU C library, into a new bespoke custom implementation of the double-precision binary logarithm log2 for C in seconds.
In general, when a high-dimensional problem is solved with particle swarm optimization (PSO), a large number of particles is used. However, it is quite difficult to understand this method's search process. To address this, the best known solutions of PSO were fixed as the center point, after which the solutions were reduced to 2 dimensions using Sammon mapping and the search process was visualized using a heatmap. As a result, the PSO search process on various optimization functions could be understood intuitively, and PSOs with different parameters could be compared.
Web applications have become increasingly essential in many domains that operate on confidential business data. SQL injection attacks are among the most significant web application security risks. Detecting SQL injection vulnerabilities is essential for protecting the underlying web application. However, manually enumerating test cases is extremely challenging, if not impossible, given the potentially infinite number of user inputs and the likely nonexistence of a one-to-one mapping between user inputs and malicious SQL statements. This paper proposes an automatic security test case generation approach to detect SQL injection vulnerabilities in web applications, following a search-based software engineering (SBSE) paradigm. In particular, we propose a novel fitness function that evaluates the similarity between the SQL statements produced by feeding user inputs into the system under test and a known malicious SQL statement. For the search algorithm, we exploit differential evolution, which is robust in continuous optimization but under-investigated in SBSE. Based on three real-world web applications, we conduct experiments on 19 configurations covering diverse forms of SQL statements and types of attacks. Results demonstrate that our approach is more effective, with statistical significance and high effect sizes, than the state-of-the-art.
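As a rough illustration of such a similarity-driven fitness (a sketch only, not the authors' exact formulation), a candidate user input could be scored by how close the SQL statement it triggers is to a known malicious statement, for example with Python's standard difflib; execute_and_capture_sql is a hypothetical test-harness hook.

    import difflib

    def sql_injection_fitness(candidate_input, execute_and_capture_sql, malicious_sql):
        """Higher is better: similarity between the SQL statement produced by the
        system under test for this input and a known malicious SQL statement."""
        produced_sql = execute_and_capture_sql(candidate_input)  # hypothetical harness hook
        return difflib.SequenceMatcher(None, produced_sql, malicious_sql).ratio()

A differential evolution loop would then evolve real-valued vectors, decode them into user inputs, and maximise this score.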
Mechanized theorem proving is becoming the basis of reliable systems programming and rigorous mathematics. Despite decades of progress in proof automation, writing mechanized proofs still requires engineers' expertise and remains labor intensive. Recently, researchers have extracted heuristics of interactive proof development from existing large proof corpora using supervised learning. However, such existing proof corpora present only one way of proving conjectures, while there are often multiple equivalently effective ways to prove one conjecture. In this abstract, we identify challenges in discovering heuristics for automatic proof search and propose our novel approach to improve heuristics of automatic proof search in Isabelle/HOL using evolutionary computation.
Web service composition aims to provide added value by loosely coupling web services to accommodate users' complex requirements. Evolutionary computation techniques have been used to efficiently find near-optimal composite services that satisfy users' requirements reasonably well. Often, the quality of a composite service is measured by two important quality criteria related to non-functional quality (i.e., Quality of Service, QoS for short) and functional quality (i.e., Quality of Semantic Matchmaking, QoSM for short). One recent work [2] proposed a Hybrid method that combines NSGA-II and MOEA/D with swap-based local search to enhance the performance of NSGA-II. This Hybrid method handles two quality criteria in QoS as two trade-off objectives. However, the local search of this method is randomly applied to a predefined large number of subproblems without focusing on the most suitable candidate solutions. In this paper, we propose a memetic NSGA-II with EDA-based local search. In particular, the EDA performs local improvements on a few well-selected composite services in different regions of the Pareto front. We also aim to handle two practical trade-off objectives with respect to QoS and QoSM. Our experiments show that our proposed method outperforms recent state-of-the-art algorithms and the baseline NSGA-II method with respect to effectiveness and efficiency.
The binary value function, or BinVal, has appeared in several studies in the theory of evolutionary computation as one of the extreme examples of linear pseudo-Boolean functions. Its unbiased black-box complexity was previously shown to be at most [MATHS HERE], where n is the problem size.
We augment this with an upper bound of log2 n + 2.42141558 - o(1), which is more precise for roughly half of the values of n. We also present a lower bound of log2 n + 1.1186406 - o(1) and provide an algorithm to compute the exact black-box complexity of BinVal for a given n.
It seems very intuitive that for the maximization of the OneMax problem [MATHS HERE] the best that an elitist unary unbiased search algorithm can do is to store a best so far solution, and to modify it with the operator that yields the best possible expected progress in function value. This assumption has been implicitly used in several empirical works. [Doerr, Doerr, Yang: GECCO 2016] formally proved that this approach is indeed almost optimal.
In this work we demonstrate that drift maximization is not optimal. More precisely, we show that for most fitness levels between n/2 and 2n/3 the optimal mutation strengths are larger than the drift-maximizing ones. This implies that the optimal RLS is more risk-affine than the variant maximizing the step-wise expected progress. We show similar results for the mutation rates of the classic (1+1) Evolutionary Algorithm (EA) and its resampling variant, the (1+1) EA_{>0}.
As a result of independent interest, we show that the optimal mutation strengths, unlike the drift-maximizing ones, can be even.
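For context, and using standard definitions rather than the authors' notation: on OneMax, if the current RLS solution has fitness i and exactly k distinct bits are flipped, the number Z of flipped zero-bits is hypergeometrically distributed, and the elitist step-wise expected progress that the drift-maximizing variant optimises is

    \mathbb{E}[\text{progress}] \;=\; \sum_{z\,:\,2z>k} (2z - k)\,\frac{\binom{n-i}{z}\binom{i}{k-z}}{\binom{n}{k}}.

The results above say that the mutation strength k maximising this one-step quantity is, for many fitness levels, not the strength that is optimal overall.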
Performance analysis of randomised search heuristics is a rapidly growing and developing field. We contribute to its further development by introducing a novel analytical perspective that we call unlimited budget analysis. It has its roots in the very recently introduced approximation error analysis and bears some similarity to fixed budget analysis. The focus is on the progress an optimisation heuristic makes towards a set goal, not on the time it takes to reach this goal, setting it far apart from runtime analysis. We present the framework and apply it to simple mutation-based algorithms, covering both local and global search. We provide analytical results for a number of simple example functions under unlimited budget analysis and compare them to results derived within the fixed budget framework for the same algorithms and functions.
Estimation of Distribution Algorithms have been successfully used for solving many combinatorial optimization problems. One type of problem on which Estimation of Distribution Algorithms have shown strongly competitive results is permutation-based combinatorial optimization problems. In this case, the algorithms use probabilistic models specifically designed for codifying probability distributions over permutation spaces. One class of these probability models is distance-based exponential models, and one example of this class is the Mallows model. In spite of this practical success, the theoretical analysis of Estimation of Distribution Algorithms for permutation-based combinatorial optimization problems has not been extensively developed. With this motivation, this paper presents a first mathematical analysis of the convergence behavior of Estimation of Distribution Algorithms based on the Mallows model, using an infinite population to associate a dynamical system with the algorithm. Several scenarios, with different fitness functions and initial probability distributions of increasing complexity, are analyzed, obtaining unexpected results in some cases.
In this paper we propose the use of Minimum Spanning Tree-based clustering to recursively cluster large sets of potentially Pareto-optimal solutions. We present preliminary results for the multiobjective traveling salesperson problem. The clustering is based on the distances between solutions defined in the decision space but it generates clusters that correspond to clearly defined regions in the objective space.
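A minimal, non-recursive sketch of this idea (assuming a precomputed pairwise distance matrix defined in the decision space, e.g. the number of edges by which two tours differ; the recursive variant would simply reapply the split within each resulting cluster):

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

    def mst_clusters(dist, k):
        """Split solutions into k clusters by cutting the k-1 longest MST edges."""
        mst = minimum_spanning_tree(csr_matrix(dist)).tocoo()       # n-1 edges of the MST
        keep = np.ones(len(mst.data), dtype=bool)
        keep[np.argsort(mst.data)[::-1][:k - 1]] = False            # drop the k-1 longest edges
        pruned = csr_matrix((mst.data[keep], (mst.row[keep], mst.col[keep])), shape=mst.shape)
        _, labels = connected_components(pruned, directed=False)    # components = clusters
        return labels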
In this paper we propose a tabu search-based memetic algorithm (TSM) for the multi-objective flexible job shop scheduling problem (FJSSP), with the objectives of minimizing the makespan, the total workload and the critical workload. The problem is addressed in a Pareto manner, which targets a set of Pareto-optimal solutions. The novelty of our method lies in the use of tabu search (TS) as the local search method together with a mutation operator, and in the use of the hypervolume indicator to avoid stagnation by increasing the flow of individuals in the local search. To the best of our knowledge, the use of TS in the context of the multi-objective FJSSP has not been reported so far. We apply our algorithm to well-known test instances and compare our results to state-of-the-art algorithms. The results show that our approach yields competitive solutions in 6 of the 10 instances against two of these algorithms, demonstrating that the use of TS as a local search method can provide competitive results.
Predicting the remaining useful life (RUL) of plasma equipment has become an important issue in semiconductor manufacturing over the past decade. If the RUL can be accurately estimated, maintenance can be scheduled appropriately to reduce waste and production cost. The digital Radio Frequency Matching Box (RF-MB) is an essential piece of equipment in the semiconductor manufacturing process. The status of the RF-MB is recorded by Fault Detection and Classification (FDC). In order to estimate the RUL of the RF-MB, we use Fisher Discriminant Analysis (FDA) for feature selection to identify the leading variables in the FDC data. We marked the first 2 days of RF-MB operation as "Good" and the last 2 days before the failure of the RF-MB as "Bad". We then used the extended Classifier System with continuous-valued inputs (XCSR) to learn the labeled FDC data. The results show that XCSR can quickly find patterns and meaningful variables. The average accuracy of XCSR is 97.3% and the average missing rate of rules is only about 1.6%. The results confirm that XCSR is capable of alerting the relevant operator before the plasma component reaches the end of its useful life. In the future, we will use XCS with Function approximation (XCSF) to more accurately approximate the RUL function. We look forward to building a complete assessment of RUL.
In application domains like data analysis or image processing, ever-increasing performance demands push the capabilities of computational systems to their limits. With technology scaling plateauing out, engineers are forced to rethink their approach to system design. The research field of approximate computing provides a new design paradigm which trades off accuracy against computational resources. In a complex system, multiple approximation methods can be combined to maximize the resulting benefits, but because of error propagation in the system, doing this in a controlled manner is challenging. To solve this problem, we propose to use concepts developed in the field of evolutionary machine learning to optimize approximation parameters, focusing on systems implemented on FPGA hardware. Our approach uses the rules of a learning classifier system to adjust approximation parameters. The resulting effects on both the application quality and resource usage are estimated on-the-fly and fed back to the rules with every fitness update, allowing the system to be carefully tuned to specific design goals. We illustrate the application of the proposed system to a real-world image processing problem and highlight some practical implications. As this is work in progress, we outline remaining open questions and future directions for our research.
Since Internet addiction (IA) was reported in 1996, research on IA assessment has attracted considerable interest. The development of a real-time detection system can help communities, educational institutes, or clinics immediately assess the risk of IA in Internet users. However, current questionnaires are designed to ask Internet users to self-report their Internet experiences over at least 6 months. Physiological measurements have been used to assist questionnaires in the short-term assessment of IA, but physiological properties cannot assess IA in real time due to a lack of algorithms. Therefore, the real-time detection of IA is still a work in progress. In this study, we adopted an extended classifier system with continuous real-coded variables (XCSR), which can solve non-Markovian problems with continuous real values to produce an optimal policy, and determined high-risk and low-risk IA using Chen Internet Addiction Scale (CIAS) data or respiratory instantaneous frequency (IF) components of Internet users as input information. The results show that the classification accuracy of XCSR can reach close to 100%. We also used XCSR to verify the items of the CIAS and extract important respiratory indexes to assess IA. We expect that a real-time detector that immediately assesses the risk of IA may be designed in this way.
The paper describes the first attempts toward designing and evaluating anticipatory classifier systems working in a real-valued input domain using interval predicates representation. Promising results are obtained by testing two environments - real-valued multiplexer and checkerboard from the classical XCSR problem domain.
Learning Classifier Systems (LCSs) are a unique machine learning paradigm. Probably the most well-known and investigated instance of these is XCS. LCSs, and with them XCS, have developed in parallel to mathematically more rigorously founded paradigms such as today's reinforcement learning. This is probably the reason why XCS was initially defined without a formal basis. Nevertheless, the pursuit of a formal understanding of XCS has been one of the primary goals since its invention. Over the years, this led to a large and seemingly underestimated body of formal analysis. We present our attempt at a comprehensive overview of the various angles from which XCS has been regarded formally. With this paper, we aim at (1) mitigating the misconception we have sometimes observed that research on XCS contains some sort of 'formal theory gap', (2) supporting researchers interested in formal advances regarding XCS and (3) identifying future research directions.
To represent a dataset concisely, it is crucial to find common attributes among the data. The extended learning classifier system (XCS) finds common attributes across multiple data points and acquires generalized rules that match multiple data. In real-world problems, it may be challenging to find common attributes due to noise in the data and the resulting inability of XCS to acquire generalized rules. Taking a classification problem as an example, noise may be included in the input, in the output, as well as in the evaluation of the output. To tackle this problem, our previous work proposed XCS variants that acquire appropriately generalized rules, each designed for a problem with one of the three mentioned types of noise added. In real-world problems, it is difficult to identify the type of noise in advance, which requires an XCS that can cope with multiple types of noise. To address this issue, this paper proposes an XCS that can handle noise in the input, output, and evaluation of the output, and investigates the effectiveness of the proposed XCS on multiplexer problems including any of the three types of noise.
Connected vehicles are revolutionizing the way in which transport and mobility are conceived. The main technology behind them is so-called Vehicular Ad-Hoc Networks (VANETs), which are communication networks that connect vehicles and facilitate various services. Usually, these services require a centralized architecture where a main server collects and disseminates information from/to vehicles. In this paper, we focus on improving downlink information dissemination in VANETs with this centralized architecture. To this end, we model the problem as a Vertex Covering optimization problem and propose four new nature-inspired methods to solve it: the Bat Algorithm (BA), the Firefly Algorithm (FA), Particle Swarm Optimization (PSO), and Cuckoo Search (CS). The new methods are tested on data from a real scenario. Results show that these metaheuristics, especially BA, FA and PSO, can be considered powerful solvers for this optimization problem.
The Firefighter Problem (FFP) is a graph-based optimization problem that is an abstraction of real-life problems such as epidemics control, economic crisis prevention, etc. In the FFP, the spreading of fire is simulated on a graph in discrete time steps. In the original formulation of the problem, a fixed number of graph nodes Nf can be defended in each time step. In this paper the problem is reformulated, and three different solution representations are studied. In one of the representations (N+P), the Nf parameter is a decision variable, and in the other two (P, using permutations, and T, using integer vectors) it is determined when the solution is decoded. Because higher Nf values mean more resources used for defense, it is desirable to minimize this value, but on the other hand we want to minimize the number of graph nodes consumed by fire. Therefore, the Parameterless FFP is tackled as a multiobjective optimization problem with two and three objectives using two well-known multiobjective evolutionary algorithms: MOEA/D and NSGA-II. The results presented in the paper show that for the Parameterless FFP the best solution representation is N+P.
Finding communities of interrelated nodes is a learning task that often arises in problems that can be modeled as a graph. In any case, detecting an optimal partition of a graph is highly time-consuming and complex. For this reason, the use of search-based metaheuristics arises as an alternative for addressing these problems. This manuscript focuses on optimally partitioning dynamic network instances, in which the connections between vertices change dynamically over time. Specifically, the application of the Novelty Search mechanism to the problem of finding communities in dynamic networks is studied in this paper. For this goal, the procedure has been embedded in the search process undertaken by three different bio-inspired meta-heuristic schemes: the Bat Algorithm, the Firefly Algorithm and Particle Swarm Optimization. All these methods have been properly adapted to deal with this discrete and dynamic problem, using a reformulated expression of the modularity coefficient as the fitness function. A thorough experimentation has been conducted using a benchmark composed of 12 synthetically created instances, with the main objective of analyzing the performance of the proposed Novelty Search mechanism when facing this problem. In light of the outperforming behavior of our approach, and its significance confirmed by two different statistical tests, we conclude that Novelty Search is a promising procedure for finding communities in evolving graph data.
An appropriate bus timetable is vital for bus enterprises to improve service quality and save operational cost. Most existing literature on bus timetable optimization divides a whole day into several periods and assumes that departure time intervals are the same within each period. As passenger flow varies over time within a period, using the same time interval throughout a period cannot really meet passenger demand. In this paper, we study the optimization of bus timetables with unequal time intervals. Tailored to the characteristics of this problem, a memetic algorithm is devised that combines a genetic algorithm with an elitist strategy and a local search. To handle infeasible solutions, a repair method is proposed to fix solutions that do not satisfy the constraints. A new metric reflecting how well bus carrying capacity matches passenger demand is introduced. This metric, together with passengers' waiting time, is used to evaluate a bus timetable. Experiments show that, compared to the timetables actually in use, timetables optimized by the proposed approach save about 4.54-12.84% of cost and 11.87-37.76% of passengers' waiting time.
Intelligent AI systems using approaches containing emergent elements often encounter acceptance problems. Results are not sufficiently explained and the procedure itself cannot be fully retraced, because the flow of control depends on stochastic elements. Trust in such algorithms must be established so that users will accept results without questioning whether the algorithm is sound. In this position paper we present an approach in which the user is involved in the optimization procedure by letting them choose alternative solutions from a structure archive created by the MAP-Elites algorithm. Analysis of these alternatives along the criteria of multiobjective optimization problems makes solutions comprehensible and hence is a means to build trust. We propose that the solution-focused nature of MAP-Elites allows the history of a solution to be easily shown to the user, explaining why that solution was included in those presented to the user. Here we demonstrate our ideas using a logistics problem previously explored by the authors.
MABE (Modular Agent-based Evolver) is an open-source evolutionary computation (EC) research platform designed to be used by biologists, engineers, computer scientists, and other researchers. MABE's primary goal is to reduce the time between thinking up a new hypothesis and generating results. The design assumes that there are common elements in many EC research projects. MABE improves efficiency by allowing for the reuse of these common elements and standardizing of interfaces for non-common elements so that they can be used interchangeably. As of the writing of this paper, the MABE framework is five years old. Here, we reflect on the current version of MABE, including its successes and shortcomings, and propose upgrades for the next release.
Some modern evolutionary multiobjective algorithms have a high computational complexity in their internal data processing. To further complicate this problem, researchers often wish to alter some of these procedures, and to do so with little effort.
The problem is even more pronounced for steady-state algorithms, which update the internal information as each single individual is computed. In this paper we explore the applicability of the principles behind the existing framework, called generalized offline orthant search, to the typical problems arising in steady-state evolutionary multiobjective algorithms.
We show that the variety of possible problem formulations is higher than in the offline setting. In particular, we state a problem which cannot be solved in an incremental manner faster than from scratch. We present an efficient algorithm for one of the simplest possible settings, incremental dominance counting, and formulate the set of requirements that enable efficient solution of similar problems. We also present an algorithm to evaluate fitness within the IBEA algorithm and show when it is efficient in practice.
Evolutionary Algorithms (EAs) have become a well-recognized population-based metaheuristic. The flexibility of EAs is the primary reason for their broad domain of practical applications. By contrast, this flexibility has made it challenging to design formal languages to represent optimization models. Modeling languages, in particular, are useful high-level languages for the compact formulation and description of optimization problems. However, the lack of integration between EAs and modeling languages can delay the development of sophisticated and advanced generalized symbolic EAs, including gray-box algorithms. Thus, this paper presents a Symbolic Evolutionary Algorithm Software Platform built around a proposed modeling language: the Procedural Modeling Language (PML). This software platform has a particular link to a symbolic compiler that allows the creation of sub-models in real time. In the course of this paper, a case study is used to exemplify our proposed platform.
jMetal is a Java-based framework for multi-objective optimization with metaheuristics providing, among other features, a wide set of algorithms that are representative of the state-of-the-art. Although it has become a widely used tool in the area, it lacks support for automatic tuning of algorithm parameter settings, which can prevent obtaining accurate Pareto front approximations, especially for inexperienced users. In this paper, we present a first approach to combining jMetal and irace, a package for automatic algorithm configuration; NSGA-II is chosen as the target algorithm to be tuned. The goal is to facilitate the combined use of both tools so that jMetal users avoid wasting time manually adjusting the parameters of the algorithms. Our proposal involves the definition of a new algorithm template for evolutionary algorithms, which allows the flexible composition of multi-objective evolutionary algorithms from a set of configurable components, as well as the generation of configuration files for adjusting the algorithm parameters with irace. To validate our approach, NSGA-II is tuned on a set of benchmark problems and compared with the same algorithm using standard settings, resulting in a new variant that shows competitive behavior.
Reproducible experimental work is a vital part of the scientific method. It is a concern that is, however, often overlooked in modern computational intelligence research. Scientific research within the areas of programming language theory and mathematics has made advances that are directly applicable to the research areas of evolutionary and swarm intelligence. Through the use of functional programming and the established abstractions that functional programming provides, it is possible to define the elements of evolutionary and swarm intelligence algorithms as compositional computations. These compositional blocks then compose together to allow the declaration of an algorithm, whilst considering the declaration as a "sub-program". These sub-programs may then be executed at a later time and provide the blueprints of the computation. Storing experimental results in a robust dataset file format, widely supported by analysis tools, provides additional flexibility and allows different analysis tools to access datasets in the same efficient manner. This paper presents an open-source software library for evolutionary and swarm-intelligence algorithms which allows the type-safe, compositional, monadic and functional declaration of algorithms while tracking and managing effects (e.g. usage of a random number generator) that directly influence the execution of an algorithm.
ECJ is now 20 years old. Begun as a genetic programming and evolutionary computation library in Java, it has since established itself as historically one of the most popular EC toolkits worldwide. In 2016 we received a National Science Foundation grant to improve ECJ in many ways with an eye toward making it a useful toolkit not just for EC but for the broader metaheuristics community. This paper is a report on our efforts to this end. We discuss new metaheuristics frameworks and representations added to ECJ and the design challenges that they raise for a general-purpose framework, as well as testing facilities and other support tools. We conclude with our future directions for the library.
Multi-modal multi-objective problems (MMMOPs) have two or more distinct Pareto-optimal sets (PSs) mapping to the same Pareto-front (PF). Identifying all such PSs assists in informed decision-making. However, existing multi-objective evolutionary algorithms are not equipped to discover multiple PSs. Recently, a few studies have been conducted to design algorithms for such MMMOPs. However, the diversity of the solutions in the PF obtained by these algorithms is poor. Two strategies identified as effective for addressing MMMOPs are niching methods and population filtering based on the convergence and diversity of solutions in the PF along with the diversity of solutions in the PS. Motivated by these requirements, this study presents Differential Evolution for MMMOPs (DE-TriM). Its novel contributions include a mating pool selection strategy and a resource allocation scheme based on reference-vector-based decomposition of the objective space. The effectiveness of DE-TriM is validated by performance analysis on 11 benchmark MMMOPs in terms of four performance measures, compared to three recent optimization algorithms. The results demonstrate similar performance of DE-TriM in the decision space and superior performance in the objective space compared to the state-of-the-art multi-modal multi-objective evolutionary algorithm.
Local Optima Networks (LONs) have been proposed as a coarse-grained model of discrete (combinatorial) fitness landscapes, where nodes are local optima and edges are search transitions based on an exploration search operator. This paper presents one of the first complex network analyses of continuous fitness landscapes. We use benchmark functions with well-known global structure and an existing implementation of a Basin-Hopping algorithm to extract the networks. We also explore the impact of varying the Basin-Hopping perturbation step-size. Our results suggest that the landscape's connectivity pattern (global structure) varies strongly with the perturbation step-size, with extreme values of this parameter being detrimental to search and fragmenting the global structure. Our LON visualisations strikingly illustrate the landscape's global (funnel) structure, indicating that LONs serve as a tool for visualising high-dimensional functions.
There are situations where the need for optimisation with a global precision tolerance arises - for example, due to measurement, numerical or evaluation errors in the objective function. In such situations, a global tolerance ϵ > 0 can be predefined such that two objective values are declared equal if the absolute difference between them is less than or equal to ϵ. This paper presents an overview of fitness landscape analysis under such conditions. We describe the formulation of common landscape categories in the presence of a global precision tolerance. We then proceed by discussing issues that can emerge as a result of using tolerance, such as the increase in the neutrality of the fitness landscape. To this end, we propose two methods to exhaustively explore plateaus in such application domains - one point-based and the other set-based.
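One plausible reading of the point-based method (a sketch under our own assumptions; the paper's exact definition may differ) is a breadth-first walk that adds a neighbour to the plateau whenever its fitness equals the seed point's fitness within the tolerance ϵ:

    from collections import deque

    def explore_plateau(seed, f, neighbours, eps):
        """Collect all points reachable from 'seed' through neighbours whose
        fitness equals f(seed) within the precision tolerance eps."""
        f_seed = f(seed)
        plateau, queue = {seed}, deque([seed])
        while queue:
            x = queue.popleft()
            for y in neighbours(x):
                if y not in plateau and abs(f(y) - f_seed) <= eps:
                    plateau.add(y)
                    queue.append(y)
        return plateau

A set-based variant might instead judge equality relative to the plateau built so far; the non-transitivity of ϵ-equality is one reason the two formulations can differ.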
Following a general trend in artificial intelligence, Evolutionary Computation has, in recent years, witnessed substantial performance gains from landscape-aware selection and parameter tuning of algorithmic modules. Such approaches, however, rely critically on suitable benchmarks, or training sets, that provide the appropriate blend of performance and generality. With this position paper we argue that, on a landscape analysis basis, the benchmark design problem will form a substantial part of the next generation of automated, on-demand algorithm design principles.
Local optima networks (LONs) represent the landscape of optimisation problems. In a LON, graph vertices represent local optima in the search domain, their radii the basin sizes, and directed edges between vertices the ability to transit from one basin to another (with the edge width denoting how easy this is). Recently, a network construction approach inspired by LONs has been proposed for multi-objective problems which uses an undirected graph, representing mutually non-dominating solutions and neighbouring links, but not basin sizes. In contrast, here we introduce two formulations for multi/many-objective problems which are analogous to the traditional LON, using dominance-based hill-climbing to characterise the search domain. Each vertex represents a set of locally optimal solutions, with basins and the ease of transition between them shown. These LONs vary depending on whether a point-based (dominance neutral optima) or set-based (Pareto local optima) representation is used to define mode construction. We illustrate these alternative formulations on some example problems. We discuss some of the underlying computational issues in constructing LONs in a multi-objective as opposed to a uni-objective problem domain, along with the inherent issue of neutrality, as each vertex in these graphs almost invariably represents a set in our proposed constructs.
Local Optima Networks (LONs) are a valuable tool for understanding the fitness landscapes of optimization problems observed from the perspective of a search algorithm. Local optima of the optimization problem are linked by an edge in a LON when an operation in the search algorithm allows one of them to be reached from the other. Previous work analyzed several combinatorial optimization problems using LONs and provided a visual guide to understand why the instances are difficult or easy for the search algorithms. In this work we analyze the MAX-SAT problem for the first time. Given a Boolean formula in Conjunctive Normal Form, the goal of the MAX-SAT problem is to find an assignment maximizing the number of satisfied clauses. Several random and industrial instances of MAX-SAT are analyzed using Iterated Local Search to sample the search space.
In this position paper we describe challenges related to uncertainty handling when solving stacking problems within storage zones in the steel production value chain. Manipulations in those zones are often relocations of materials performed with gantry cranes. In doing so, the crane operators themselves or dispatchers constantly solve a complex stacking problem with the goal of minimizing relocation effort, under constraints to adhere to various time windows and to satisfy quality demands.
Dynamic constrained optimization problems (DCOPs) pose greater complexity to an optimization algorithm by changing the problem landscape throughout the optimization process. Introducing constraints into an already changing dynamic environment increases the observed complexity of the problem space. Allowing such constraints to have irregular shapes which change along with the problem space itself provides an even greater level of complexity for an optimization algorithm. This paper proposes a function generator capable of creating dynamically constrained dynamic environments by extending the moving peaks benchmark (MPB) function generator. An analysis of the environments produced by the generator is performed using fitness landscape analysis (FLA). A visual inspection of the generated environments is also included.
Detecting environmental changes in dynamic optimization problems is an essential phase for a dynamic evolutionary algorithm. By determining the time points at which the problem changes, the evolutionary algorithm is capable of adapting and responding to these changes efficiently. This may be even more crucial for multiobjective optimization problems, since a lack of efficient change detectors may allow the evolutionary process to keep utilizing non-dominated solutions invalidated by the changes. Change detection becomes a challenge when dealing with problems that expose less detectable environmental changes, which is a common characteristic of some real-world problems. In this paper, we investigate the performance of sensor-based and population-based change detection schemes on less detectable environmental changes. Additionally, a hybrid scheme is proposed that incorporates sensor-based schemes with population-based ones. We validate the performance of all three schemes on four different problems with less detectable environments, considering different characteristics of dynamism, and find that the hybrid techniques significantly outperform the other alternatives.
Human-based evolutionary computation (EC), in which people act as executors of all evolutionary operators, can be used to solve problems in human organizations. We previously developed a human-based EC system that represents solutions as tags (words) and allows people to evaluate solutions by clicking the corresponding tags. Although the system was easy and intuitive to use, it could not handle problems for which solutions are represented as long sentences. In addition, the system could not trace the evolution of solutions. Traceability is a must for the system to be widely and reliably used. In this study, we thus develop a human-based EC system that allows solutions to be represented as both sentences and tags. A function for tracing the evolution of solutions is embedded into the system. The function asks a solution creator to specify which existing solutions influenced the solution creation. We conduct an experiment in which 18 human subjects use the system and then fill out a survey. The results show that the system creates better solutions than those created by each human subject independently. Furthermore, the evolution tree generated from the information given by solution creators confirms that the system allows the evolution of solutions to be traced.
Many complex real-world problems such as bin-packing are optimised using evolutionary computation (EC) techniques. Involving a human user during this process can avoid producing theoretically sound solutions that do not translate to the real world, but it slows down the process and introduces the problem of user fatigue. Gamification can alleviate user boredom, concentrate user attention, or make a complex problem easier to understand. This paper explores the use of gamification as a mechanism to extract problem-solving behaviour from human subjects through interaction with a gamified version of the bin-packing problem, with heuristics extracted by machine learning. The heuristics are then embedded into an evolutionary algorithm through the mutation operator to create a human-guided algorithm. Experimentation demonstrates that good human performers augment EA performance, but that poorer performers can be detrimental to it in certain circumstances. Overall, the introduction of human expertise is seen to benefit the algorithm.
Per-instance Algorithm Selection and Automatic Algorithm Configuration have recently gained considerable interest. However, these approaches face many limitations. For instance, their performance is deeply influenced by factors such as the accuracy of the underlying prediction model, feature space correlation, an incomplete performance space for new instances, instance sampling, and many others. In this paper, an effort to address such limitations is described. Indeed, we propose a cooperative architecture, labeled the "SAPIAS" concept, composed of a self-adaptive online Algorithm Selection system and an offline Automatic Algorithm Configuration system, working together in order to deliver the most accurate performance. Additionally, SAPIAS is proposed as a methodic concept that the metaheuristics community might adopt to fill the gap between theory and practice in the field, by providing theoreticians with the ability to continuously analyze the evolution of problem characteristics and the behavior of solving techniques, as well as providing a ready-to-use solving framework for practitioners.
The set of primitive operations available to a generative hyper-heuristic can have a dramatic impact on the overall performance of the heuristic search in terms of efficiency and final solution quality. When constructing a primitive set, users are faced with a trade-off between generality and time spent searching. A set consisting of low-level primitives provides the flexibility to find most or all potential solutions, but the resulting heuristic search space might be too large to find adequate solutions in a reasonable time frame. Conversely, a set of high-level primitives can enable faster discovery of mediocre solutions, but prevent the fine-tuning necessary to find the optimal heuristics. By varying the set of primitives throughout evolution, the heuristic search can exploit the advantages of both high-level and low-level primitive sets. This permits the heuristic search to either quickly traverse parts of the search space as needed, or to modify the minutiae of the search to find optimal solutions in reasonable amounts of time, which is not feasible with a single, fixed level of primitive granularity. This paper demonstrates this potential by presenting empirical evidence of improvements to solvers for the Traveling Thief Problem, a combination of the Traveling Salesman Problem and the Knapsack Problem, a recent and difficult problem designed to more closely emulate real-world complexity.
This work uses genetic programming to explore the design space of local optimisation algorithms. Optimisers are expressed in the Push programming language, a stack-based language with a wide range of typed primitive instructions. The evolutionary framework provides the evolving optimisers with an outer loop and information about whether a solution has improved, but otherwise they are relatively unconstrained in how they explore optimisation landscapes. To test the utility of this approach, optimisers were evolved on four different types of continuous landscape, and the search behaviours of the evolved optimisers analysed. By making use of mathematical functions such as tangents and logarithms to explore different neighbourhoods, and also by learning features of the landscapes, it was observed that the evolved optimisers were often able to reach the optima using relatively short paths.
Model-based evolutionary algorithms (MBEAs) are praised for their broad applicability to black-box optimization problems. In practical applications however, they are mostly used to repeatedly optimize different instances of a single problem class, a setting in which specialized algorithms generally perform better. In this paper, we introduce the concept of a new type of MBEA that can automatically specialize its behavior to a given problem class using tabula rasa self-learning. For this, reinforcement learning is a naturally fitting paradigm. A proof-of-principle framework, called SL-ENDA, based on estimation of normal distribution algorithms in combination with reinforcement learning is defined. SL-ENDA uses an RL-agent to decide upon the next population mean while approaching the rest of the algorithm as the environment. A comparison of SL-ENDA to AMaLGaM and CMA-ES on unimodal noiseless functions shows mostly comparable performance and scalability to the broadly used and carefully manually crafted algorithms. This result, in combination with the inherent potential of self-learning model-based evolutionary algorithms with regard to specialization, opens the door to a new research direction with great potential impact on the field of model-based evolutionary algorithms.
Dynamic graphs are an essential tool for representing a wide variety of concepts that change over time. Examples include modeling the evolution of relationships and communities in a social network or tracking the activity of users within an enterprise computer network. In the case of static graph representations, random graph models are often useful for analyzing and predicting the characteristics of a given network. Even though random dynamic graph models are a trending research topic, the field is still relatively unexplored. The selection of available models is limited and manually developing a model for a new application can be difficult and time-consuming. This work leverages hyper-heuristic techniques to automate the design of novel random dynamic graph models. A genetic programming approach is used to evolve custom heuristics that emulate the behavior of a variety of target models with high accuracy. Results are presented that illustrate the potential for the automated design of custom random dynamic graph models.
This paper details an investigation of the extent to which performance can be improved for the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) by tuning the selection of individuals used for the mean-update algorithm. A hyper-heuristic is employed to explore the space of algorithms which select individuals from the population. We show the increase in performance obtained with a tuned selection algorithm, versus the unmodified CMA-ES mean-update algorithm. Specifically, we measure performance on instances from several real-valued benchmark function classes to demonstrate generalization of the improved performance.
Dynamic optimisation problems (DOPs) are optimisation problems that change over time. Typically, DOPs have been defined as a sequence of static problems, and the dynamism has been inserted into existing static problems using different techniques. In the case of dynamic permutation problems, this has usually been done by rotating the landscape. This technique modifies the encoding of the problem and maintains its structure over time. Commonly, the changes are performed based on the previous state, recreating a concatenated changing problem. However, despite its simplicity, our intuition is that, in general, landscape rotation may induce severe changes that lead to problems whose resemblance to the previous state is limited, if not null. In that case, the problem should not be classified as a DOP, but as a sequence of unrelated problems. In order to test this, we consider the flow shop scheduling problem (FSSP) as a case study and the rotation technique that relabels the encoding of the problem according to a permutation. We compare the performance of two versions of the state-of-the-art algorithm for that problem in a wide experimental study: an adaptive version that benefits from previous knowledge and a restarting version. The conducted experiments confirm our intuition and reveal that, surprisingly, it is preferable to restart the search when the problem changes, even for some slight rotations. Consequently, this work calls into question the use of the rotation technique to recreate dynamic permutation problems.
In this paper we provide a comparative empirical analysis of four different generating sets for the algebraic Differential Evolution for Permutations (DEP) applied to the Traveling Salesman Problem (TSP). In particular, DEP has been extended to use reversal moves as the generating set. Two different randomized decomposers are proposed for the reversal generators. The experiments have been conducted on a selected set of commonly adopted TSP instances, and the results show that the newly proposed generating set leads to better performance with respect to the other three generating sets based on alternative search moves.
In this work we analyze, from a qualitative point of view, the structure of the connections among the local optima in the fitness landscapes of the Quadratic Assignment Problem (QAP). In particular, we are interested in determining which search moves, intended as pairwise exchanges of permutation items, are beneficial for moving from one optimum to another. Novel algebraic methods are introduced for determining, and measuring the effectiveness of, the exchange moves connecting two given optima. The analysis considers real-like QAP instances whose local optima networks are clustered in communities. The results of the conducted experimentation show the presence of a few preferred search moves that look more effective for moving across intra-community optima, while the same is not so apparent when the optima are taken from different communities.
A non-empty string x over an ordered alphabet is said to be a Lyndon word if it is alphabetically smaller than all of its cyclic rotations. Any string can be uniquely factored into Lyndon words and efficient algorithms exist to perform the factorization process in linear time and constant space. Lyndon words find wide-ranging applications including string matching and pattern inference in bioinformatics. Here we investigate the impact of permuting the alphabet ordering on the resulting factorization and demonstrate significant variations in the numbers of factors obtained. We also propose an evolutionary algorithm to find optimal orderings of the alphabet to enhance this factorization process and illustrate the impact of different operators. The flexibility of such an approach is illustrated by our use of five fitness functions which produce different factorizations suitable for different downstream tasks.
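For concreteness, Duval's linear-time factorization can be parameterised by a permuted alphabet ordering; one simple fitness for a candidate ordering is then the number of factors it induces (an illustrative objective, not necessarily one of the paper's five):

    def lyndon_factorization(s, rank):
        """Duval's algorithm; 'rank' maps each symbol to its position in a permuted alphabet."""
        key = [rank[c] for c in s]
        factors, i, n = [], 0, len(s)
        while i < n:
            j, k = i + 1, i
            while j < n and key[k] <= key[j]:
                k = i if key[k] < key[j] else k + 1
                j += 1
            while i <= k:
                factors.append(s[i:i + j - k])
                i += j - k
        return factors

    rank = {c: r for r, c in enumerate("bdac")}            # one permutation of the alphabet {a,b,c,d}
    fitness = len(lyndon_factorization("abdacbdb", rank))  # number of Lyndon factors under that ordering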
This paper proposes an evolutionary approach to solving the Inventory Routing Problem (IRP) using knowledge extracted from Local Optima Networks (LONs). Solving the IRP involves a simultaneous optimization of the transportation routes and the inventory levels. One of the important steps in solving the IRP is determining the optimal route of each supplying vehicle for each date of the planning horizon. As the transportation cost is based only on the distance matrix, which is constant in time and independent of the supplying vehicle, this step consists of solving a TSP on a certain subset of facilities. This paper aims to improve the solving of the IRP by deriving knowledge from the full TSP and reusing it when solving the TSP sub-problems. Experiments carried out on popular benchmark IRP instances show that using the knowledge derived from LONs increases the efficiency of the evolutionary algorithm, and the proposed approach outperforms simple evolutionary algorithms in solving the IRP.
This work presents multi-point search algorithms exploiting several surrogate models, implemented in MINAMO, the multi-disciplinary optimization platform of Cenaero. Many types of surrogate models are used in the literature, each with its own strengths and weaknesses. More generally, each one models a given problem differently and provides its own representation of reality. The idea of this paper is to exploit different types of surrogate models simultaneously in order to automatically capture their strengths and mitigate some of their weaknesses. This strategy is based on a multi-point enrichment at each iteration, each candidate point being provided by one kind of surrogate model and/or criterion. The strategy can be tuned by selecting different infill criteria, based on different surrogate models, in order to improve specific aspects such as feasibility, exploration and/or exploitation. The performance of this surrogate-based optimization framework is illustrated on well-known constrained benchmark problems available in the literature (such as the GX functions and the MOPTA08 test case). Good performance both in terms of identification of feasible regions and objective gains is demonstrated.
We propose a new approach for building recommender systems by adapting surrogate-assisted interactive genetic algorithms. A pool of user-evaluated items is used to construct an approximate model which serves as a surrogate fitness function in a genetic algorithm for optimizing new suggestions. The surrogate is used to recommend new items to the user, which are then evaluated according to the user's liking and subsequently removed from the search space. By updating the surrogate model after new recommendations have been evaluated by the user, we enable the model itself to evolve towards the user's preferences.
In order to precisely evaluate the performance of this approach, the human's subjective evaluation is replaced by common continuous objective benchmark functions for evolutionary algorithms. The system's performance is compared to a conventional genetic algorithm and random search. We show that, given a very limited number of allowed evaluations of the true objective, our approach outperforms these baseline methods.
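A minimal sketch of the evaluate-fit-suggest loop (with the genetic search over the surrogate replaced, purely for brevity, by screening a fixed candidate pool; all names are illustrative, and the user's rating is stood in for by a benchmark function as described above):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def interactive_recommend(rate, items, n_seed=10, n_rounds=20, seed=0):
        """'rate' stands in for the user's evaluation; 'items' holds one candidate per row."""
        rng = np.random.default_rng(seed)
        shown = list(rng.choice(len(items), size=n_seed, replace=False))
        ratings = [rate(items[i]) for i in shown]
        for _ in range(n_rounds):
            surrogate = GaussianProcessRegressor().fit(items[shown], ratings)
            unseen = [i for i in range(len(items)) if i not in shown]
            best = unseen[int(np.argmax(surrogate.predict(items[unseen])))]
            shown.append(best)                   # recommend the most promising unseen item
            ratings.append(rate(items[best]))    # user feedback refines the surrogate next round
        return shown, ratings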
Surrogate models are used to reduce the burden of expensive-to-evaluate objective functions in optimization. By creating models which map genomes to objective values, these models can estimate the performance of unknown inputs, and so be used in place of expensive objective functions. Evolutionary techniques such as genetic programming or neuroevolution commonly alter the structure of the genome itself. A lack of consistency in the genotype is a fatal blow to data-driven modeling techniques: interpolation between points is impossible without a common input space. However, while the dimensionality of genotypes may differ across individuals, in many domains, such as controllers or classifiers, the dimensionality of the input and output remains constant. In this work we leverage this insight to embed differing neural networks into the same input space. To judge the difference between the behavior of two neural networks, we give them both the same input sequence and examine the difference in output. This difference, the phenotypic distance, can then be used to situate these networks in a common input space, allowing us to produce surrogate models which can predict the performance of neural networks regardless of topology. In a robotic navigation task, we show that models trained using this phenotypic embedding perform as well as or better than those trained on the weight values of a fixed-topology neural network. We establish such phenotypic surrogate models as a promising and flexible approach which enables surrogate modeling even for representations that undergo structural changes.
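A sketch of the phenotypic distance under simple assumptions (each network exposed as a callable mapping an input vector to an output vector; the probe inputs and the Euclidean norm are illustrative choices):

    import numpy as np

    def phenotypic_distance(net_a, net_b, probe_inputs):
        """Feed both networks the same input sequence and measure how far apart
        their outputs are; the networks' topologies are irrelevant."""
        out_a = np.array([net_a(x) for x in probe_inputs])
        out_b = np.array([net_b(x) for x in probe_inputs])
        return np.linalg.norm(out_a - out_b)

The pairwise distances obtained this way can then feed a distance-based embedding (e.g. multidimensional scaling) that places networks of differing topology into one common surrogate input space.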
This paper describes an approach for the calibration of environmental models in the presence of time and quality restrictions. The advantages of the suggested strategy rest on two main concepts. The first is a reduction of the overall optimisation time through surrogate modelling of the fitness function, with iterative, gradual refinement of the environmental model fidelity (spatial and temporal resolution) to improve the fitness approximation. To demonstrate the efficiency of the surrogate-assisted multi-fidelity approach, it was compared with a baseline evolutionary calibration approach. The second is an additional increase in optimisation quality in the presence of a strict deadline, achieved by building the multi-fidelity fitness approximation strategy directly during execution of the evolutionary algorithm. To confirm the efficiency of the proposed dynamic strategy, it was compared with a preliminary meta-optimisation approach. As a case study, the wind wave model SWAN is used. The conducted experiments confirm the effectiveness of the proposed anytime approach and its applicability to the calibration of complex environmental models' parameters.
Surrogate models are invaluable tools that greatly assist the process of computationally expensive analyses and optimization. Engineering optimization reaps the benefit of surrogate models in order to perform expensive optimization that would potentially have been computationally intractable in the pre-high-performance-computing age. Moreover, surrogate models provide a means to allow engineering design exploration with high-fidelity computer simulations. Despite their wide use and substantial research progress, there are still some key issues and challenges that need to be addressed by researchers. Most of these issues stem from the growing complexity of engineering design optimization and exploration in real-world problems. In other words, the sophistication of the problems that we have to tackle increases faster than that of computing power and technology. It is thus imperative to have accurate and yet computationally efficient surrogate models that are suitable for real-world engineering problems. In this paper, we discuss key issues and challenges of the application of surrogate models in engineering design optimization and exploration. The paper is directed toward general readers, and we aim to present general discussions regarding the effectiveness, issues, and future of surrogate-based optimization and exploration in engineering.
In this position paper, we discuss the need for systematic benchmarking of surrogate-assisted evolutionary algorithms and give an overview of existing suitable function suites. Based on the findings, we hope to encourage more comparative studies in this field supported by benchmarks and outline how a concerted effort of the community could create better insight into the various previously proposed algorithms and concepts.
A new framework mixing an evolutionary approach, discrete-event simulation and deep neural networks is proposed to achieve multi-asset collection/image acquisition scheduling in a surveillance context. It combines an extended graph-based hybrid genetic algorithm (GA) used for satellite image acquisition scheduling with a predictive simulation-based deep neural network and knowledge-based capabilities to solve a heterogeneous collection asset scheduling problem. Plan execution simulation and neural networks predict track trajectories and target behaviors. In contrast, a knowledge-based approach is used to estimate target identification. Both assessments are exploited to instantiate key solution quality parameters of a generalized decision model aimed at maximizing task collection value subject to a variety of collector capacity constraints. The mixed framework departs from basic point target/area coverage task modeling, introducing tracking and identification tasks while expanding resource allocation to various space-, air- and ground-based deployable image acquisition/collection asset types.
The primary focus of a machine learning model is to train a system to achieve self-reliance. However, due to the absence of inbuilt security functions, the learning phase itself is not secured, which allows an attacker to exploit the security vulnerabilities of the machine learning model. When a malicious adversary manipulates the input data, it exploits vulnerabilities of machine learning algorithms that can compromise the entire system. In this research study, we conduct a vulnerability assessment of a malware classification model by injecting adversarial examples into the datasets in order to degrade the quality of classification currently obtained by the trained model. The objective is to find the security gaps that are exploitable in the model. The vulnerability assessment is done by exposing the malware classification model to an adversarial machine learning (AML) environment using a black-box attack. The simulation provides insight into the inputs injected into the classifiers and demonstrates that inherent security vulnerabilities exist in the classification model.
Malware is malicious code that intends to harm computers and networks. Each year, a huge number of malicious programs are released. Therefore, detecting malware has become one of the most important challenges for the security of computer systems. Various methods have been defined for detecting and classifying malware, such as signature-based and heuristic-based techniques. This paper proposes a new malware detection method based on the operational codes (OpCodes) within an executable file, using an evolutionary algorithm. There are several steps in the proposed method: disassembling the executable files, generating a graph of OpCodes and using the evolutionary algorithm to find the most similar graph to each suspicious instance. Finally, the label of each suspicious instance is determined based on the most similar graph obtained by the evolutionary algorithm for each class (family of malware and benign). The results show that the proposed method can be used for malware detection and malware categorisation.
In this paper, we ask whether evolutionary algorithms can evolve cryptographic algorithms when no precise design criteria are given. Our strategy utilizes Cartesian Genetic Programming in a bi-level optimization setting with multiple populations trying to evolve a cryptographic algorithm and to break it. To challenge our design paradigm, we consider a number of scenarios with varying criteria on the system and its security. We are able to obtain interesting results in several scenarios where the attacker is not able to understand the text with more than a random chance. Interestingly, our system is able to develop various versions of one-time pads, which are the only systems that ensure perfect secrecy. Although our system is far from practical, we consider it interesting since it gives good results that are also human-readable.
The link prediction problem, which involves determining the likelihood of a relationship between objects, has numerous applications in the areas of recommendation systems, social networking, anomaly detection, and others. A variety of link prediction techniques have been developed to improve predictive performance for different application domains. Selection of the appropriate link prediction heuristic is critical, which demonstrates the need for tailored solutions. This work explores the use of hyper-heuristics to automate the selection and generation of customized link prediction algorithms. A genetic programming approach is used to evolve novel solutions from functionality present in existing techniques that exploit characteristics of a specific application to improve performance. Applications of this approach are tested using data from a real-world enterprise computer network to differentiate normal activity from randomly generated anomalous events. Results are presented that demonstrate the potential for the automated design of custom link prediction heuristics that improve upon the predictive capabilities of conventional methods.
The use of virtual local area network (VLAN) technology is a well-known method of restricting access to network elements in heterogeneous network infrastructures, including critical infrastructures. The essence of VLAN usage is to create a set of virtual subnets and to distribute computers across these subnets. This distribution is described by a "computer-subnet" matrix, called the VLAN-based access control scheme. The formation of an access control scheme is an NP-complete problem, and several methods are known for its solution, including genetic algorithms. However, for large critical infrastructures, solving this problem with known methods is very laborious due to its large dimension. The paper proposes to introduce the concept of "roles" for VLANs and to use the role-based approach to optimize VLAN-based access control schemes. To solve the optimization problem, an improved genetic algorithm is proposed. The improvements are associated with a multi-chromosome representation of individuals, a new type of fitness function, a two-dimensional type of crossover operation, and a number of other aspects. The results of the experiments show the high efficiency of the proposed genetic algorithm.
Google's Android platform dominates the smartphone world, but this tremendous popularity has led to an appealing target for malicious actors. This is especially troublesome because the diverse nature of Android devices, and the modifications that are made by each manufacturer, make consistency in system software difficult across all devices. As new attack and evasion behaviours emerge, security researchers work to create more sophisticated detection systems. Many of these systems are based on machine learning approaches. An especially promising avenue that is being actively researched is the use of evolutionary systems, where detectors are bred, rather than built. In this paper, we examine two such evolutionary systems, training them on established datasets before comparing their performance on large, unknown datasets. Our experiments demonstrate that these systems are effective in predicting contemporaneous and evolved malicious applications, and meet or exceed the results of a state-of-the-art, rule-based system.
Distributed Denial of Service (DDoS) cyber attacks continue to increase and cause disruptions in both industry and politics. As more critical information and services are provided through networks, it is important to keep these networks available. However, since adversaries are continuously changing and adapting, stationary defense strategies do not effectively secure networks against attacks. We investigate Nash equilibria in cyber security problems by modeling attacker-defender interactions using competitive coevolutionary algorithms. In particular, we examined the performances of two algorithms (and their variations) that look for Nash equilibria, NashSolve and HybridCoev, and compared their performances against other existing heuristics. Using two evaluation techniques, one that looked at average fitness scores and one that created a compendium of MEU, MinMax, and inverse Pareto front ratio scores, we found that NashSolve and HybridCoev did not perform significantly better for both attacker and defender populations relative to other heuristics.
Stochastic population-based nature-inspired metaheuristics have been proven a robust tool for mining association rules. These algorithms are very scalable, as well as very fast compared with some deterministic ones that search for solutions exhaustively. Typically, algorithms for association rule mining identify a large number of rules, depending on the transaction database and the number of attributes. Therefore, evaluating these rules is very complex. On the other hand, establishing the relationships between discovered association rules can be considered a very hard problem that cannot easily be solved manually. In this paper, we propose a new algorithm based on stochastic population-based nature-inspired metaheuristics for discovering dependencies among association rules.
Border detection of melanoma and other skin lesions from images is an important step in the medical image processing pipeline. Although this task is typically carried out manually by the dermatologists, some recent papers have applied evolutionary computation techniques to automate this process. However, these works are only focused on the polynomial case, ignoring the more powerful (but also more difficult) case of rational curves. In this paper, we address this problem with rational Bézier curves by applying the bat algorithm, a popular bio-inspired swarm intelligence technique for optimization. Experimental results on two examples of medical images of melanomas show that this method is promising, as it outperforms the polynomial approach and can be applied to medical images without further pre/post-processing.
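For readers unfamiliar with the curve model used in the work above, the following minimal sketch evaluates a rational Bézier curve from control points and weights; these are the parameters a swarm optimizer such as the bat algorithm would tune to fit a lesion border. The toy control points and weights are illustrative assumptions, not the paper's data or code.

```python
import numpy as np
from math import comb

def rational_bezier(ts, control_points, weights):
    """Evaluate a rational Bezier curve at parameter values ts in [0, 1]."""
    P = np.asarray(control_points, dtype=float)       # (n+1, 2) control points
    w = np.asarray(weights, dtype=float)              # (n+1,) weights
    n = len(P) - 1
    ts = np.asarray(ts, dtype=float)
    # Bernstein basis values, shape (len(ts), n+1)
    B = np.stack([comb(n, i) * ts**i * (1.0 - ts)**(n - i) for i in range(n + 1)], axis=1)
    num = (B * w) @ P                                 # weighted combination of control points
    den = (B * w).sum(axis=1, keepdims=True)          # rational normalisation term
    return num / den

# toy cubic curve; unit weights would reduce it to an ordinary (polynomial) Bezier curve
pts = [(0.0, 0.0), (1.0, 3.0), (3.0, 3.0), (4.0, 0.0)]
curve = rational_bezier(np.linspace(0.0, 1.0, 100), pts, [1.0, 2.0, 2.0, 1.0])
```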
In our proposed method, an object can be detected from time-series images taken by two or a few cameras. When one of the cameras detects that the object has moved, the system locates that object in the images taken by the other camera(s) by using the timestamps. Then, position data for that object can be obtained autonomously, without taking many photographs of the object or providing teaching data to the system in advance. Moreover, the system passes the position data and a label of that object to YOLO, a learning model for discriminating objects. YOLO learns the data and becomes able to indicate the label of the object. The results of an experiment showed that this system could discriminate the moving object with only two cameras and without previously prepared teaching data.
In this paper, four systems for disaster countermeasures were implemented. The first was a 3D reconstruction system, which would construct a 3D model from images taken by a drone. The second was a human detection and posture-analyzing system: a drone would find a person in its field of view, take a picture and have the posture analyzed, which would help find people in need. The third was a self-flying system for drones: drones would be given a mission to carry out, and real-time analysis of their view would allow autonomous decisions on flying routes. The fourth was a system in which one server would control multiple drones at the same time to gather information about remote disaster areas. In an experiment, the AI drone flew autonomously and created useful data for monitoring the area. As multiple drones can be used at the same time, this system has a large potential for making disaster area monitoring easier and disaster countermeasures more efficient.
When a big disaster occurs, a rescue team has to send food and/or supplies to an unspecified number of affected people. Therefore, an evolutionary method that can provide service to an unspecified, large number of users is needed. In this paper, such an evolutionary method, the Self-Administered Genetic Algorithm (SA-GA), is proposed. This method applies systems of intelligent autonomous agents that operate on a blockchain network and are evaluated by cryptocurrency. The intelligent autonomous agents can operate by themselves by paying their cryptocurrency to a node of the blockchain. The intelligent autonomous agents (parents) multiply by creating copies (children) of themselves. The children initially acquire cryptocurrency from their parents. As a result, the intelligent autonomous agents can evolve autonomously in SA-GA. SA-GA can thus serve as a method that optimizes a system consisting of a large number of autonomous agents.
Caregivers in nursing facilities are too busy to pay constant attention to care receivers. Recently, cameras have been set up in some nursing facilities. However, the caregivers cannot continuously monitor the care receivers on a display during the daytime. Two-dimensional (2D) poses of many people in an image can now be detected by the OpenPose software, which detects human skeletons using a deep learning method. Therefore, we thought that if the 2D poses of care receivers were detected by OpenPose immediately before a new action (preliminary action), and if the poses could be classified by a deep learning method, it might be possible to infer the care receivers' intentions. In this paper, we created a learning model that can discriminate the preliminary action based on the coordinate data of keypoints detected by OpenPose from an image of a person. We examined whether a subject's action of intending to interrupt a conversation could be predicted or not. The result suggests that the learning model can discriminate the preliminary action of the subject from the coordinate data.
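A minimal sketch of the classification step described above: flattened (x, y) keypoint coordinates, such as those produced by OpenPose's BODY_25 model, are fed to a small classifier that labels the preliminary action. The random placeholder data and the scikit-learn classifier are illustrative assumptions; the paper's own model is a deep learning one trained on real keypoint data.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# OpenPose's BODY_25 model returns 25 keypoints; flattening (x, y) pairs gives 50 features.
N_KEYPOINTS = 25
rng = np.random.default_rng(0)
X = rng.random((200, N_KEYPOINTS * 2))     # placeholder keypoint coordinates per frame
y = rng.integers(0, 2, size=200)           # 1 = preliminary action observed, 0 = other pose

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0).fit(X, y)
print(clf.predict(X[:5]))                  # predicted labels for five frames
```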
Automatic text summarisation has drawn considerable interest in the field of software engineering. It can improve the efficiency of software developers, enhance the quality of products, and ensure timely delivery. In this paper, we present our initial work towards automatically generating human-like multi-document summaries from heterogeneous software artefacts. Our analysis of the text properties of 545 human-written summaries from 15 software engineering projects will ultimately guide heuristics searches in the automatic generation of human-like summaries.
Genetic improvement uses automated search to find improved versions of existing software. Software can either be evolved with general-purpose intentions or with a focus on a specific application (e.g., to improve its efficiency for a particular class of problems). Unfortunately, software specialisation to each problem application is generally performed independently, fragmenting and slowing down an already very time-consuming search process. We propose to incorporate specialisation as an online mechanism of the general search process, in an attempt to automatically devise application classes, by benefiting from past execution history.
The optimisation of non-functional properties of software is of growing importance in all scales of modern computing (from embedded systems to data-centres). In mobile computing, smart devices have complex interactions between their hardware and software components. Small changes in the environment can greatly impact the measurements of non-functional properties of software. In-vivo optimisation of applications on a platform can be used to evolve robust new solutions. However, the portability of such solutions' performance across different platforms is questionable. In this paper we discuss the issue of optimising the non-functional properties of applications running in the Android ecosystem. We also propose a distributed framework that can mitigate such issues.
The subject of f-restrictions has garnered considerable interest recently as it encompasses many different types of combinatorial objects, all of which have unique and important applications. One of the most popular of these is an ingredient in the generation of covering arrays, which are used for discovering faulty interactions among software components. We focus on existential f-restrictions, which have a structure that can be exploited by genetic algorithms. In particular, recent work on such restrictions considers affine transformations while maximizing the corresponding "score" of the formed restriction. We propose to use genetic algorithms for existential f-restrictions by providing a general framework that can be applied to all such objects.
CMA-ES plus manual code changes rapidly transforms the 512 Newton-Raphson start points of a table-driven GNU C library version of sqrt into a double-precision reciprocal square root function. The genetically improved x^(-1/2) is far more accurate than Quake's InvSqrt fast inverse square root.
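For context, the Newton-Raphson refinement underlying such a reciprocal square root routine can be sketched as follows; this is an illustrative description of the numerical scheme only, not the genetically improved C library code itself.

```python
def rsqrt(x, y0, iterations=3):
    """Newton-Raphson refinement of an initial guess y0 for 1/sqrt(x).

    Each step applies y <- y * (1.5 - 0.5 * x * y * y), which converges
    quadratically when y0 is reasonably close to the true value. In the
    table-driven approach, y0 comes from a lookup table indexed by x.
    """
    y = y0
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)
    return y

# crude starting guess (a table lookup or bit-level trick would normally supply this)
x = 2.0
print(rsqrt(x, y0=0.7), x ** -0.5)
```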
Genetic Improvement (GI) uses automated search to improve existing software. Most GI work has focused on empirical studies that successfully apply GI to improve software's running time, fix bugs, add new features, etc. There has been little research into why GI has been so successful. For example, genetic programming has been the most commonly applied search algorithm in GI. Is genetic programming the best choice for GI? Initial attempts to answer this question have explored GI's mutation search space. This paper summarises the work published on this question to date.
Stress testing is part of today's bank risk management and often required by the governing regulatory authority. Performing such a stress test with stress scenarios derived from a distribution, instead of pre-defined expert scenarios, results in a systematic approach in which new severe scenarios can be discovered. The required scenario distribution is obtained from historical time series via a Vector-Autoregressive time series model.
The worst-case search, i.e. finding the scenario yielding the most severe situation for the bank, can be stated as an optimization problem. The problem itself is a constrained optimization problem in a high-dimensional search space. The constraints are the box constraints on the scenario variables and the plausibility of a scenario. The latter is expressed by an elliptic constraint.
As the evaluation of the stress scenarios is performed with a simulation tool, the optimization problem can be seen as a black-box optimization problem. Evolution Strategy, a well-known optimizer for black-box problems, is applied here. The necessary adaptations to the algorithm are explained and a set of different algorithm design choices is investigated. It is shown that a simple box constraint handling method, i.e. setting variables which violate a box constraint to the respective boundary of the feasible domain, in combination with a repair of implausible scenarios, provides good results.
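A minimal sketch of the constraint handling described above, under the assumption that plausibility is measured by a Mahalanobis-style elliptic bound derived from the scenario distribution: box violations are clipped to the boundary, and implausible scenarios are shrunk towards the mean until they lie on the plausibility ellipse. All names and values are illustrative.

```python
import numpy as np

def repair_scenario(z, lower, upper, mean, cov_inv, max_maha):
    """Clip box violations, then pull the scenario back onto the plausibility ellipse."""
    z = np.clip(z, lower, upper)                 # box constraints: set to the boundary
    d = z - mean
    maha = float(np.sqrt(d @ cov_inv @ d))       # elliptic plausibility measure
    if maha > max_maha:                          # implausible: shrink towards the mean
        z = mean + d * (max_maha / maha)
    return z

# toy usage with an identity covariance and symmetric box bounds
rng = np.random.default_rng(0)
repaired = repair_scenario(rng.normal(0, 3, size=5), -2.0, 2.0,
                           mean=np.zeros(5), cov_inv=np.eye(5), max_maha=2.5)
```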
Computational models are crucial in understanding brain function. Their architecture is designed to replicate known brain structures, and the behavior that emerges is then compared to observed fMRI and other imaging techniques. As the models become more complex with more parameters, they can explain more of the observed phenomena, and may eventually be used for diagnosis and design of treatments of brain disorders. However, those parameters need to be carefully optimized for the models to work, which becomes intractable as the models grow. In this preliminary work, CMA-ES has been configured to optimize continuous parameters of a functional connectivity model, resulting in a better fit to empirical data than manually selected parameters in all trial runs. This approach will be combined with other EC techniques to optimize other parameters. The techniques will be scaled up to more detailed structural and functional data and local parameters.
Real-world problems often involve optimization of multiple conflicting objectives. Significant research has recently been directed towards the development of multi-objective evolutionary algorithms that are scalable, i.e., able to deal with problems involving more than 3 objectives, commonly referred to as many-objective optimization problems. This has led to the emergence of several new techniques that can deliver a set of trade-off solutions to approximate the Pareto optimal front of the problem. However, means to select solution(s) from this large trade-off set for final implementation/decision making have received relatively scarce attention. This paper aims to study and demonstrate the performance of the recursive expected marginal utility (EMUr) approach for informed decision-making. Towards this goal, we apply the EMUr approach to identify solutions of interest for two practical examples and analyze the obtained set of solutions. The study demonstrates the desirable trade-off characteristics that the chosen solutions have over the rest of the trade-off set, highlighting the approach's potential as a decision-making tool, especially in cases where other preference information or domain knowledge is unavailable.
Hyperparameter tuning is an important mixed-integer optimisation problem, especially in the context of real-world applications such as games. In this paper, we propose a function suite around hyperparameter optimisation of game AI based on the card game Splendor and using the Rinascimento framework. We propose several different functions and demonstrate their complexity and diversity. We further suggest various possible extensions of the proposed suite.
When a machine learning algorithm is able to obtain the same performance given a complete training set, and a small subset of samples from the same training set, the subset is termed a coreset. As using a coreset improves training speed and allows human experts to gain a better understanding of the data, by reducing the number of samples to be examined, coreset discovery is an active line of research. Often in the literature the problem of coreset discovery is framed as i. single-objective, attempting to find the candidate coreset that best represents the training set, and ii. independent from the machine learning algorithm used. In this work, an approach to evolutionary coreset discovery is presented. Building on preliminary results, the proposed approach uses a multi-objective evolutionary algorithm to find compromises between two conflicting objectives, i. minimizing the number of samples in a candidate coreset, and ii. maximizing the accuracy of a target classifier, trained with the coreset, on the whole original training set. Experimental results on popular classification benchmarks show that the proposed approach is able to identify candidate coresets with better accuracy and generality than state-of-the-art coreset discovery algorithms found in the literature.
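The two conflicting objectives described above can be sketched as follows, using a boolean mask as the coreset encoding and a scikit-learn decision tree as an illustrative target classifier; this is a simplified stand-in for the paper's actual setup, not its implementation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def coreset_objectives(mask, X, y, make_classifier=DecisionTreeClassifier):
    """Return (size, error) for a candidate coreset encoded as a boolean mask.

    Both objectives are to be minimised: the number of retained samples, and
    1 - accuracy of a classifier trained on the coreset and scored on the full set.
    """
    if mask.sum() == 0 or len(np.unique(y[mask])) < len(np.unique(y)):
        return mask.size, 1.0                    # degenerate coreset: worst possible fitness
    clf = make_classifier().fit(X[mask], y[mask])
    return int(mask.sum()), 1.0 - clf.score(X, y)

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(1)
mask = rng.random(len(y)) < 0.1                  # random 10% candidate coreset
print(coreset_objectives(mask, X, y))
```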
One of the biggest challenges in evolutionary computation concerns the selection and configuration of a best-suitable heuristic for a given problem. While in the past both of these problems have primarily been addressed by building on experts' experience, the last decade has witnessed a significant shift towards automated decision making, which capitalizes on techniques proposed in the machine learning literature.
A key success factor in automated algorithm selection and configuration are good training sets, whose performance data can be leveraged to build accurate performance prediction models. With the long-term goal to build landscape-aware parameter control mechanisms for iterative optimization heuristics, we consider in this discussion paper the question how well the 24 functions from the BBOB test bed cover the characteristics of (hyper-)parameter tuning problems. To this end, we perform a preliminary landscape analysis of two hyper-parameter selection problems, and compare their feature values with those of the BBOB functions. While we do see a good fit for one of the tuning problems, our findings also indicate that some parameter tuning problems might not be very well represented by the BBOB functions. This raises the question if one can nevertheless deduce reliable performance-prediction models for hyper-parameter tuning problems from the BBOB test bed, or whether for this specific target the BBOB benchmark should be adjusted, by adding or replacing some of its functions.
Independently of the aspect of training automated algorithm selection and configuration techniques, hyper-parameter tuning problems offer a plethora of problems which might be worthwhile to study in the context of benchmarking iterative optimization heuristics.
A typical scenario when solving industrial single or multiobjective optimization problems is that no explicit formulation of the problem is available. Instead, a dataset containing vectors of decision variables together with their objective function value(s) is given, and a surrogate model (or metamodel) is built from the data and used for optimization and decision-making. This data-driven optimization process strongly depends on the ability of the surrogate model to predict the objective value of decision variables not present in the original dataset. Therefore, the choice of surrogate modelling technique is crucial. While many surrogate modelling techniques have been discussed in the literature, there is no standard procedure that will select the best technique for a given problem.
In this work, we propose the automatic selection of a surrogate modelling technique based on exploratory landscape features of the optimization problem that underlies the given dataset. The overall idea is to learn offline from a large pool of benchmark problems, on which we can evaluate a large number of surrogate modelling techniques. When given a new dataset, features are used to select the most appropriate surrogate modelling technique. The preliminary experiments reported here suggest that the proposed automatic selector is able to identify high-accuracy surrogate models as long as an appropriate classifier is used for selection.
Reinforcement learning is gaining prominence in the machine learning community. It dates back over three decades in areas such as cybernetics and psychology, but has more recently been applied widely in robotics, game playing and control systems. There are many approaches to reinforcement learning, most of which are based on the Markov decision process model. The goal of reinforcement learning is to learn the best strategy (referred to as a policy in reinforcement learning) of an agent interacting with its environment in order to reach a specified goal. Recently, evolutionary computation has been shown to be of benefit to reinforcement learning in some limited scenarios. Many studies have shown that the performance of evolutionary computation algorithms is influenced by the structure of the fitness landscapes of the problem being optimised. In this paper we investigate the global structure of the policy search spaces of simple reinforcement learning problems. The aim is to highlight structural characteristics that could influence the performance of evolutionary algorithms in a reinforcement learning context. Results indicate that the problems we investigated are characterised by enormous plateaus that form unimodal structures, resulting in a kind of needle-in-a-haystack global structure.
The use of well-established statistical testing procedures to compare the performance of evolutionary algorithms often yields pessimistic results. This requires increasing the number of independent samples, and thus the computation time, in order to get results with the necessary precision.
We aim at improving this situation by developing statistical tests that are good at answering typical questions arising in the benchmarking of evolutionary algorithms. Our first step, presented in this paper, is a procedure that determines whether the performance distributions of two given algorithms are identical on each of the benchmarks. Our experimental study shows that this procedure is able to spot very small differences in the performance of algorithms while requiring computational budgets that are an order of magnitude smaller (e.g. 15x) than those of the existing approaches.
The most commonly used statistics in Evolutionary Computation (EC) are of the Wilcoxon-Mann-Whitney-test type, in its either paired or non-paired version. However, using such statistics for drawing performance comparisons has several known drawbacks. At the same time, Bayesian inference for performance analysis is an emerging statistical tool, which has the potential to become a promising complement to the statistical perspectives offered by the aforementioned p-value type test. This work exhibits the practical use of Bayesian inference in a typical EC setting, where several algorithms are to be compared with respect to various performance indicators. Explicitly we examine performance data of 11 evolutionary algorithms (EAs) over a set of 23 discrete optimization problems in several dimensions. Using this data, and following a brief introduction to the relevant Bayesian inference practice, we demonstrate how to draw the algorithms' probabilities of winning. Apart from fixed-target and fixed-budget results for the individual problems, we also provide an illustrative example per groups of problems. We elaborate on the computational steps, explain the associated uncertainties, and articulate considerations such as the prior distribution and the sample sizing. We also present as a reference the classical p-value tests.
Automated benchmarking environments aim to support researchers in understanding how different algorithms perform on different types of optimization problems. Such comparisons carry the potential to provide insights into the strengths and weaknesses of different approaches, which can be leveraged into designing new algorithms. Carefully selected benchmark problems are also needed as training sets in the context of algorithm selection and configuration. With the ultimate goal to create a meaningful benchmark set for iterative optimization heuristics, we compile and assess in this work a selection of discrete optimization problems that subscribe to different types of fitness landscapes. All problems have been implemented and tested within IOHprofiler, our recently released software built to assess iterative heuristics solving combinatorial optimization problems. For each selected problem we compare performances of eleven different heuristics. Apart from fixed-target and fixed-budget results for the individual problems, we also derive ECDF results for groups of problems. To obtain these, we have implemented an add-on for IOHprofiler which allows aggregation of performance data across different benchmark functions.
Benchmarking aims to investigate the performance of one or several algorithms for a set of reference problems by empirical means. An important motivation for benchmarking is the generation of insight that can be leveraged for designing more efficient solvers, for selecting a best algorithm, and/or for choosing a suitable instantiation of a parametrized algorithm. An important component of benchmarking is its design of experiment (DoE), which comprises the selection of the problems, the algorithms, the computational budget, etc., but also the performance indicators by which the data is evaluated. The DoE very strongly depends on the question that the user aims to answer. Flexible benchmarking environments that can easily adapt to users' needs are therefore in high demand.
With the objective to provide such a flexible benchmarking environment, the recently released IOHprofiler not only allows the user to choose the sets of benchmark problems and reference algorithms, but provides in addition a highly interactive, versatile performance evaluation. However, it still lacks a few important performance indicators that are relevant to practitioners and/or theoreticians. In this discussion paper we focus on one particular aspect, the probability that a considered algorithm reaches a certain target value within a given time budget. We thereby suggest to extend the classically regarded fixed-target and fixed-budget analyses by a fixed-probability measure. Fixed-probability curves are estimated using Pareto layers of (target, budget) pairs that can be realized by the algorithm with the required certainty. We also provide a first implementation towards a fixed-probability module within the IOHprofiler environment.
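One simple way to estimate such a fixed-probability curve from raw run data, assuming minimisation and a matrix of best-so-far function values per run and per budget, is sketched below; this is an illustrative estimator written for this summary, not the IOHprofiler module itself.

```python
import numpy as np

def fixed_probability_curve(best_so_far, p):
    """Estimate the target reachable with probability >= p at each budget.

    best_so_far: array of shape (runs, budgets) holding the best f-value found
    by each independent run after each number of evaluations (minimisation).
    Returns, for every budget, the smallest target value that at least a
    fraction p of the runs have reached within that budget.
    """
    runs = best_so_far.shape[0]
    k = int(np.ceil(p * runs)) - 1               # index of the p-th best run per budget
    sorted_runs = np.sort(best_so_far, axis=0)   # ascending per budget column
    return sorted_runs[k, :]

# toy data: 10 runs, 100 budgets, monotonically improving best-so-far values
data = np.minimum.accumulate(np.random.default_rng(2).random((10, 100)), axis=1)
curve_90 = fixed_probability_curve(data, p=0.9)  # targets reached by >= 90% of runs
```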
Kinder Surprise™ are chocolate eggs that house a small toy. Fans of the toys across the world are collecting and trading these toys, with some toys being very rare and highly priced. With this paper, we present and publish a data set that was extracted from the German Kinder Surprise pricing guide for 2019. The data set contains prices and photos of 187 sets of figurines produced between 1979 and 2018. In total 2366 items are included. We also provide a few first ideas for using this data as a benchmark data set for discrete optimisation, i.e., for subadditive functions with matroid constraints, for combinatorial auctions, and for functions with occasionally violated conditions.
This paper proposes a new discrete optimization benchmark 100b-Digit, a binary discretized version for the 100-Digit Challenge. The continuous version 100-Digit Challenge utilizing continuous input parameters for a fitness function was suggested for the competitions at 2019 GECCO and 2019 CEC, while this paper proposes an extension, a discrete version of the 100-Digit Challenge, discretizing using an IEEE-754 double precision floating-point representation for the search space variables.
Furthermore, the paper also presents a viable recent state-of-the-art algorithm discretization to be applicable on the new 100b-Digit benchmark. V-shape transfer function is applied for binarization of the variables from the Distance-based Success History Differential Evolution (DISH) to create the new DISHv algorithm, which is then run on the 100b-Digit benchmark. The preliminary results for the DISHv algorithm are then reported for a basic value-to-reach termination criterion over 100b-Digit functions. This combination of 100b-Digit benchmark and DISHv algorithm demonstrates how to bridge a gap between recent continuous optimizers and different discrete benchmarks, and vice-versa.
This study investigates how to improve the predictions of glucose values obtained with genetic programming models. A set of statistical techniques is used to discover glucose profiles that identify similar situations in patients with type 1 diabetes mellitus, and to incorporate this knowledge into the models. Glucose time series are divided into 4-hour non-overlapping slots and clustered using the decision-tree-based technique called chi-square automatic interaction detection, to classify glucose profiles into groups using two decision variables: day of the week and time slot of the day. The objective is to customize models for the different glucose profiles that appear in the patient's day-to-day life. Genetic programming models created with glucose values from the original data-set are compared to models created with classified glucose values. Significant differences (p-value < 0.05) and associations are observed between the glucose profiles. In general, using classified glucose values in models created with genetic programming improves the accuracy of the predictions in comparison with models created with the original data-set. We conclude that the classification process can be useful to correct and improve habits or clinical therapies in patients, and to obtain more accurate models through automatic learning techniques and artificial intelligence.
Considering that Prostate Cancer (PCa) is the most frequently diagnosed tumor in Western men, considerable attention has been devoted to computer-assisted PCa detection approaches. However, this task still represents an open research question. In clinical practice, multiparametric Magnetic Resonance Imaging (MRI) is becoming the most used modality, aiming at defining biomarkers for PCa. In the latest years, deep learning techniques have boosted the performance in prostate MR image analysis and classification. This work explores the use of the Semantic Learning Machine (SLM) neuroevolution algorithm to replace the backpropagation algorithm commonly used in the last fully-connected layers of Convolutional Neural Networks (CNNs). We analyzed the non-contrast-enhanced multispectral MRI sequences included in the PROSTATEx dataset, namely: T2-weighted, Proton Density weighted, and Diffusion Weighted Imaging. The experimental results show that the SLM significantly outperforms XmasNet, a state-of-the-art CNN. In particular, with respect to XmasNet, the SLM achieves higher classification accuracy (without pre-training the underlying CNN or relying on backpropagation) as well as a speed-up of one order of magnitude.
MicroRNAs (miRNAs) play an important role in various biological processes by regulating gene expression. Their abnormal expression may lead to cancer. Therefore, analysis of such data may reveal potential biological insight for cancer diagnosis. In this regard, many feature selection methods have recently been developed to identify such miRNAs. These methods have their own merits and demerits, as the task is very challenging in nature. Thus, in this article, we propose a novel wrapper-based feature selection technique integrating Rough and Fuzzy sets, Random Forest and Particle Swarm Optimization, to identify putative miRNAs that can solve the underlying biological problem effectively, i.e. separate tumour and control samples. Here, Rough and Fuzzy sets help to address the vagueness and overlapping characteristics of the dataset while performing clustering. On the other hand, Random Forest is applied to perform the classification task on the clustering results to yield better solutions. The integrated clustering and classification tasks are considered as the underlying optimization problem for the Particle Swarm Optimization method, where particles encode features, in this case miRNAs. The performance of the proposed wrapper-based method has been demonstrated quantitatively and visually on next-generation sequencing data of breast cancer from The Cancer Genome Atlas (TCGA). Finally, the selected miRNAs are validated through biological significance tests. The code and dataset used in this paper are available online.
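As a rough illustration of the wrapper idea above, the sketch below evaluates a particle by thresholding its continuous position into a feature (miRNA) subset and scoring a Random Forest on that subset. The cross-validated accuracy used here is an illustrative stand-in for the paper's integrated clustering-and-classification score, and the threshold encoding is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def particle_fitness(position, X, y, threshold=0.5):
    """Continuous particle position -> boolean feature mask -> Random Forest CV accuracy."""
    mask = np.asarray(position) > threshold      # keep features whose activation exceeds the threshold
    if not mask.any():
        return 0.0                               # empty subsets receive the worst fitness
    scores = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=0),
                             X[:, mask], y, cv=5)
    return float(scores.mean())                  # fitness to be maximised by the swarm
```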
Previous studies have proposed an objective, non-invasive approach to assist in diagnosing neurological diseases such as Alzheimer's and Parkinson's diseases by asking patients to perform certain drawing tasks over a given figure. However, the rating of those drawing test results is still very subjective, as it relies on manual measurements. By extracting features of the drawn figure from the raw data generated by the digitized tablet that patients draw on, we can use supervised learning to train an evolutionary algorithm on those extracted data and thereby evolve an automated classifier to analyse and classify those drawings accurately. Cartesian Genetic Programming (CGP) is an improved version of conventional Genetic Programming (GP). As GP adopts a tree structure, a redundancy issue exists: the tree develops more nodes as the GP evolves by mutation and crossover. CGP addresses this issue by using a fixed number of nodes and arities and evolving by mutation only. The outcome of this research is a highly efficient, accurate, automated classifier that can not only classify clinical drawing test results with up to 80% accuracy, but also assist clinicians and medical experts in investigating how those features are used by the algorithm and how each component impacts a patient's cognitive function.
We consider the problem of visualizing the population dynamics along an evolutionary run using a dimensionality reduction technique for mapping individuals from the original search space to a 2-D space. We quantitatively assess four of these techniques in terms of their ability to preserve useful information about (a) population movements and (b) exploration-exploitation trade-off. We propose two compact visualizations aimed at highlighting these two aspects of population dynamics and evaluate them qualitatively. The results are very promising as the proposed framework is indeed able to represent crucial properties of population dynamics in a way that is both highly informative and simple to understand.
The electrical grid is undergoing an unprecedented evolution driven mainly by the adoption of smart grid technologies. The high penetration of distributed energy resources, including renewables and electric vehicles, promises several benefits to the different market actors and consumers, but at the same time imposes grid integration challenges that must adequately be addressed. In this paper, we explore and propose potential business models (BMs) in the context of distribution networks with high penetration of electric vehicles (EVs). The analysis is linked to the CENERGETIC project (Coordinated ENErgy Resource manaGEment under uncerTainty considering electrIc vehiCles and demand flexibility in distribution networks). Due to the complex mechanisms needed to fulfill the interactions between stakeholders in such a scenario, computational intelligence (CI) techniques are envisaged as a viable option to provide efficient solutions to the optimization problems that might arise by the adoption of innovative BMs. After a brief review on evolutionary computation (EC) applied to the optimization problems in distribution networks with high penetration of EVs, we conclude that EC methods can be suited to implement the proposed business models in our future CENERGETIC project and beyond.
The Rotated Klee-Minty Problem (RKMP) represents an advancement of the well-known linearly constrained Klee-Minty Problem that was introduced to illustrate the worst-case running time of the Simplex algorithm. Keeping the linearity as well as the complexity of the original Klee-Minty Problem, the Rotated Klee-Minty Problem remedies potential biases with respect to Coordinate Search and turns out to be a suitable constrained test environment for randomized search heuristics. The present paper is concerned with the comparison of recent evolutionary algorithm variants for constrained optimization on this respective test bed. The considered algorithm variants include the most successful participants of the CEC Competition on Single Objective Real-parameter Optimization and other selected strategies that are not directly applicable to the CEC test suite. Taking into account the diverse constraint handling approaches, the performance results of all search heuristics are interpreted. It turns out that most strategies that have been successful in the CEC competitions have severe problems on the RKMP.
This work presents the integration of the recently released benchmark suite MLDA into Nevergrad, a likewise recently released platform for derivative-free optimization. Benchmarking evolutionary and other optimization methods on this collection enables us to learn how algorithms deal with problems that are often treated by means of standard methods like clustering or gradient descent. As available computation power nowadays allows for running much 'slower' methods without noticing a performance difference, it is an open question which of these standard methods may be replaced by derivative-free and (in terms of quality) better-performing optimization algorithms. Additionally, most MLDA problems are suitable for landscape analysis and other means of understanding problem difficulty or algorithm behavior, due to their tangible nature.
We present the open-source reimplementation of MLDA inside the Nevergrad platform and further discuss some first findings, which result from exploratory experiments with this platform. These include superior performance of advanced quasi-random sequences in some highly multimodal cases (even in non-parallel optimization), great performance of CMA for the perceptron and the Sammon tasks, success of DE on clustering problems, and straightforward implementations of highly competitive algorithm selection models by means of competence maps.
In deep learning and physical science problems, there is a growing need for better optimization methods capable of working in very high dimensional settings. Though the use of approximated Hessians and co-variance matrices can accelerate the optimization process, these methods do not always scale well to high dimensional settings. In an attempt to meet these needs, in this paper, we propose an optimization method, called Adaptive Two Mode (ATM), that does not use any DxD objects, but rather relies on the interplay of isotropic and directional search modes. It can adapt to different optimization problems, by the use of an online parameter tuning scheme, that allocates more resources to better performing versions of the algorithm. To test the performance of this method, the Adaptive Two Mode algorithm was benchmarked on the noiseless BBOB-2009 testbed. Our results show that it is capable of solving 23/24 of the functions in 2D and can solve higher dimensional problems that do not require many changes in the direction of the search. However, it underperforms on problems in which the function to be minimized changes rapidly in non-separable directions, yet mildly in others.
One of the main goals of the COCO platform is to produce, collect, and make available benchmarking performance data sets of optimization algorithms and, more concretely, algorithm implementations. For the recently proposed biobjective bbob-biobj test suite, fewer than 20 algorithms have been benchmarked so far, but many more are available to the public. We therefore aim in this paper to benchmark several available multiobjective optimization algorithms on the bbob-biobj test suite and discuss their performance. We focus here on algorithms implemented in the platypus framework (in Python), whose main advantage is its ease of use without the need to set up many algorithm parameters.
Uniform Random Search is considered the simplest of all randomized search strategies and thus a natural baseline in benchmarking. Yet, in continuous domain it has its search domain width as a parameter that potentially has a strong effect on its performance. In this paper, we investigate this effect on the well-known 24 functions from the bbob test suite by varying the sample domain of the algorithm ([-α, α]^n for α ∈ {0.5, 1, 2, 3, 4, 5, 6, 10, 20} and n the search space dimension). Though the optima of the bbob testbed are randomly chosen in [-4, 4]^n (with the exception of the linear function f5), the best strategy depends on the search space dimension and the chosen budget. Small budgets and larger dimensions favor smaller domain widths.
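The baseline studied above is easy to state precisely; a minimal sketch with the domain width α as an explicit parameter is given below. The sphere function and the concrete budgets are purely illustrative, not part of the study's setup.

```python
import numpy as np

def uniform_random_search(f, dim, alpha, budget, rng=None):
    """Sample `budget` points uniformly in [-alpha, alpha]^dim and keep the best."""
    rng = rng if rng is not None else np.random.default_rng()
    best_x, best_f = None, np.inf
    for _ in range(budget):
        x = rng.uniform(-alpha, alpha, size=dim)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# e.g. the sphere function in 5-D with two different domain widths
sphere = lambda x: float(np.sum(x ** 2))
print(uniform_random_search(sphere, dim=5, alpha=1, budget=1000)[1])
print(uniform_random_search(sphere, dim=5, alpha=5, budget=1000)[1])
```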
In this paper, we propose a comparative benchmark of MO-CMA-ES, COMO-CMA-ES (recently introduced in [12]) and NSGA-II, using the COCO framework for performance assessment and the bi-objective test suite bbob-biobj. For a fixed number of points p, COMO-CMA-ES approximates an optimal p-distribution of the Hypervolume Indicator. While not designed to perform on archive-based assessment, i.e. with respect to all points evaluated so far by the algorithm, COMO-CMA-ES behaves well on the COCO platform. The experiments are done in a true black-box spirit by using a minimal setting relative to the information shared by the 55 problems of the bbob-biobj testbed.
We evaluate in this paper the GNN-CMA-ES algorithm on the BBOB noiseless testbed. The GNN-CMA-ES algorithm was recently proposed as a plug-in extension to CMA-ES, introducing the possibility to train flexible search distributions, in contrast to standard search distributions (such as the multivariate Gaussian). By comparing GNN-CMA-ES and CMA-ES, we show the benefits of this extension on some unimodal functions as well as on a variety of multimodal functions. We also identify a family of unimodal functions where GNN-CMA-ES can degrade the performances of CMA-ES and discuss the possible reasons behind this behavior.
In this paper we benchmark five variants of CMA-ES for optimization in large dimension on the novel large scale testbed of COCO under default or modified parameter settings. In particular, we compare the performance of the separable CMA-ES, of VD-CMA-ES and VkD-CMA-ES, of two implementations of the Limited Memory CMA-ES and of the Rank m Evolution Strategy, RmES. For VkD-CMA-ES we perform experiments with different complexity models of the search distribution and for RmES we study the impact of the number of evolution paths employed by the algorithm. The quasi-Newton L-BFGS-B algorithm is also benchmarked and we investigate the effect of choosing the maximum number of variable metric corrections for the Hessian approximation. As baseline comparison, we provide results of CMA-ES up to dimension 320.
In this article we benchmark eight multivariate local solvers as well as the global Differential Evolution algorithm from the Python SciPy library on the BBOB noiseless testbed. We experiment with different parameter settings and termination conditions of the solvers. More focus is given to the L-BFGS-B and Nelder-Mead algorithms. For the first we investigate the effect of the maximum number of variable metric corrections used for the Hessian approximation and show that larger values than the default are of advantage. For the second we investigate the effect of adaptation of parameters, which is proved crucial for the performance of the method with increasing dimensionality.
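The two solver settings highlighted above correspond to standard options of SciPy's minimize interface; a minimal sketch follows, with the test function, the dimension and the concrete option values chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# illustrative test function: the Rosenbrock function in 10 dimensions
rosen = lambda x: sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1 - x[:-1]) ** 2)
x0 = np.full(10, 3.0)

# L-BFGS-B: 'maxcor' controls the number of variable metric corrections kept
# for the limited-memory Hessian approximation (the SciPy default is 10).
res_lbfgsb = minimize(rosen, x0, method="L-BFGS-B", options={"maxcor": 50})

# Nelder-Mead: 'adaptive' rescales the reflection/expansion/contraction
# parameters with the dimension, which tends to help in higher dimensions.
res_nm = minimize(rosen, x0, method="Nelder-Mead", options={"adaptive": True})

print(res_lbfgsb.fun, res_nm.fun)
```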
This paper presents a novel optimisation approach, called Structured-Chromosome Genetic Algorithm (SCGA), that addresses the issue of handling variable-size design space optimisation problems. This is based on variants of standard genetic operators able to handle structured search spaces. The potential of the presented methodology is shown by solving the problem of defining observation campaigns for tracking space objects from a network of tracking stations. The presented approach aims at supporting the space sector in response to the constantly increasing population size in the around-Earth environment. The test case consists in finding the observation scheduling that minimises the uncertainty in the final state estimation of a very low Earth satellite operating in a highly perturbed dynamical environment. This is evaluated by coupling the optimiser with an estimation routine based on a sequential filtering approach that estimates the satellite state distribution conditional on received indirect measurements. The solutions found by employing SCGA are finally compared to the ones achieved using more traditional approaches. Namely, the problem has been reformulated to be faced using standard Genetic Algorithm and another variable-size optimiser, the "Hidden-genes" Genetic Algorithm variant.
Residual stresses, both macroscopic and microscopic, originate during conventional metallurgical processes. Knowing their magnitude and distribution is of great importance in the structural design of applications where fatigue, stress corrosion or thermal cycling occur (e.g., in the aerospace industry). The importance of these stresses is reflected in the large number of articles published in recent years, mainly focused on studying macroscopic stresses. However, there are no experimental studies that quantify the magnitude of microscopic triaxial stresses. This lack is due in part to the limitations of diffraction techniques (neutrons and synchrotron radiation). Since the measurement volume is much larger than the scale of variation of these microscopic stresses, their calculation is greatly complicated, because the methods used in the case of macroscopic stresses are not valid for microscopic ones. Furthermore, there is no reliable procedure to obtain the relaxed lattice parameter value, a key factor in the calculation of residual stresses. The aim of this paper is to present the main ideas oriented towards developing a methodology for mapping microscopic stresses, particularly in aluminium alloys such as those commonly used in the aerospace industry. The procedure will use experimental diffraction results obtained from large European facilities, mainly by neutron diffraction. This information will be analyzed using evolutionary algorithms, computational techniques that handle a large number of variables. The procedure will be based on the analysis of the shift of the diffraction peaks and, fundamentally, their broadening. For simplicity, non-heat-treatable alloys will be used as they do not experience lattice parameter variation with heat treatments.
A number of applications to interplanetary trajectories have been recently proposed based on deep networks. These approaches often rely on the availability of a large number of optimal trajectories to learn from. In this paper we introduce a new method to quickly create millions of optimal spacecraft trajectories from a single nominal trajectory. Apart from the generation of the nominal trajectory no additional optimal control problems need to be solved as all the trajectories, by construction, satisfy Pontryagin's minimum principle and the relevant transversality conditions. We then consider deep feed forward neural networks and benchmark three learning methods on the created dataset: policy imitation, value function learning and value function gradient learning. Our results are shown for the case of the interplanetary trajectory optimization problem of reaching Venus orbit, with the nominal trajectory starting from the Earth. We find that both policy imitation and value function gradient learning are able to learn the optimal state feedback, while in the case of value function learning the optimal policy is not captured, only the final value of the optimal propellant mass is.
Multimodal optimization has been shown to be a complex paradigm underlying real-world problems arising in many practical applications, with particular prevalence in physics-related domains. Among them, a plethora of cases within the computational design of aerospace structures can be modeled as multimodal optimization problems, such as the aerodynamic optimization of airfoils and wings. This work aims at presenting a new research direction towards efficiently tackling this kind of optimization problem, which pursues the discovery of the multiple (at least locally optimal) solutions of a given optimization problem. Specifically, we propose to exploit the concept behind the so-called Novelty Search mechanism and embed it into the self-adaptive Differential Evolution algorithm so as to gain an increased level of controlled diversity during the search process. We assess the performance of the proposed solver over the well-known CEC'2013 suite of multimodal test functions. The outcomes of the designed experimentation support our claim that Novelty Search is a promising approach for heuristically addressing multimodal problems.
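As a hedged illustration of the mechanism described above, the following sketch blends a novelty score, computed as the mean distance to the k nearest neighbours in an archive of visited solutions, into the survivor selection of a plain DE/rand/1/bin loop. The weighting parameter alpha, the archive policy, and the use of non-adaptive DE (rather than the self-adaptive variant used in the paper) are simplifying assumptions for illustration only.

```python
import numpy as np

def novelty(x, archive, k=5):
    """Mean Euclidean distance from x to its k nearest archive members."""
    if not archive:
        return 0.0
    d = np.sort(np.linalg.norm(np.asarray(archive) - x, axis=1))
    return d[:k].mean()

def de_with_novelty(f, bounds, pop_size=40, gens=200, F=0.5, CR=0.9, alpha=0.3, seed=0):
    """DE/rand/1/bin where survivor selection mixes fitness and novelty (minimisation)."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    lo, hi = np.array(bounds).T
    pop = rng.uniform(lo, hi, (pop_size, dim))
    fit = np.array([f(x) for x in pop])
    archive = list(pop)                       # grows with every accepted trial
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = f(trial)
            # Score = fitness minus a novelty bonus; lower score is better.
            s_trial = f_trial - alpha * novelty(trial, archive)
            s_cur = fit[i] - alpha * novelty(pop[i], archive)
            if s_trial <= s_cur:
                pop[i], fit[i] = trial, f_trial
                archive.append(trial)
    return pop, fit
```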
In this paper, we investigated and compared the performance of various constrained surrogate-based optimization (SBO) techniques on solving low-fidelity, low-speed airfoil design problems. We aim to better understand the strengths and weaknesses of various constrained SBO algorithms in handling non-algebraic real-world problems. Low-fidelity problems are chosen since they allow us to perform multiple independent runs of the optimization algorithms while still remaining in the domain of non-algebraic real-world problems. To be specific, the optimization methods that we compared are Kriging-based efficient global optimization (EGO) with the probability of feasibility (PoF), ConstrLMSRBF, COBRA, and COBYLA. Results on four airfoil design problems show that ConstrLMSRBF is the most robust method in terms of convergence and consistency of performance. On the other hand, EGO-PoF found high-quality solutions on two airfoil problems, but its robustness decreases as the dimensionality increases. We also observe that COBRA is significantly better than EGO-PoF on one high-dimensional problem, but its general performance is not as good as that of ConstrLMSRBF. Finally, COBYLA is the worst performer, which implies that methods based on linear interpolation are not accurate enough for the problems considered in this paper.
This paper describes a method to solve Multi-objective Dynamic Travelling Salesman Problems. The problems are formulated as multi-objective hybrid optimal control problems, where the choice of the target destination for each phase is an integer variable. The resulting problem thus has a combinatorial nature in addition to being a multi-objective optimal control problem. The overall solution approach is based on a combination of the Multi Agent Collaborative Search, a population-based memetic multi-objective optimisation algorithm, and the Direct Finite Elements Transcription, a direct method for optimal control problems. A relaxation approach is employed to transform the mixed-integer problem into a purely continuous problem, and a set of smooth constraints is added in order to ensure that the relaxed variables of the final solution assume integer values. A special set of smooth constraints is introduced in order to treat the mutually exclusive choices of the targets for each phase. The method is tested on two problems: the first is a motorised Travelling Salesman Problem described in the literature, the second is a space application in which a satellite has to de-orbit multiple debris objects. For the first problem, the approach generates better solutions than those reported in the literature.
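To make the relaxation idea concrete, here is a minimal sketch, under assumed notation, of how relaxed binary choice variables b_i in [0, 1] can be driven towards integer values and mutually exclusive choices by smooth constraints; the actual constraint formulation in the paper may differ.

```python
def relaxation_constraints(b, eps=1e-3):
    """Smooth constraints on relaxed binary choice variables b_i in [0, 1].

    integrality : sum_i b_i * (1 - b_i) <= eps   (each b_i pushed to 0 or 1)
    exclusivity : sum_i b_i == 1                 (exactly one target per phase)
    Returns (integrality residual, exclusivity residual); both are ~0 at a
    feasible, integer-valued solution.
    """
    integrality = sum(x * (1 - x) for x in b) - eps
    exclusivity = sum(b) - 1.0
    return max(0.0, integrality), exclusivity

# Example: a nearly integral choice of the first target out of three.
print(relaxation_constraints([0.999, 0.001, 0.0]))
```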
Aiming at the problem of oversubscription of data relay access requests from user satellites in future Space-Based Information Systems, the resource scheduling optimization problem for a data relay satellite system with hybrid microwave and laser links is studied. The characteristics of the hybrid links are analyzed. A multi-objective programming model of the static resource scheduling constraint satisfaction problem is established, and a hybrid optimization algorithm integrating artificial immune strategies, niching ideas and an improved genetic algorithm is put forward to solve the scheduling model. Simulation results show that the hybrid optimization algorithm optimizes the model quickly and performs well in terms of global optimization ability and convergence. The results validate that the static resource scheduling model can accurately describe the resource scheduling problem of a relay satellite system with hybrid microwave and laser links, multi-tasking and multiple antenna types.
Quantum programs are difficult for humans to develop due to their complex semantics, which are rooted in quantum physics. It is therefore preferable to write specifications and then use techniques such as genetic programming (GP) to generate quantum programs instead. We present a new genetic programming system for quantum circuits which can evolve solutions to the full-adder and quantum Fourier transform problems in fewer generations than previous work, despite using a general set of gates. This means that prior knowledge of the solution is no longer required in order to choose a specialised gate set based on it.
Limited-arity unbiased black-box complexity has proven to be a successful tool for understanding the working principles of randomized search heuristics and has delivered insights for developing new efficient algorithms. While good upper bounds for simple problems were found a long time ago, there are still no matching lower bounds.
On the road towards closing this gap, we introduce the notion of limited-memory, limited-arity unbiased black-box complexity. We show that some efficient binary unbiased algorithms (almost) satisfy the memory-2 requirement, and we present an algorithm to compute, for a given problem size, the exact lower bound on the runtime of any memory-m k-ary algorithm on any unimodal function.
The performance comparison of multi-objective evolutionary algorithms (MOEAs) has been a broadly studied research area. For almost two decades, quality indicators (QIs) have been employed to quantitatively compare the Pareto front approximations produced by MOEAs. QIs are set functions that assign a real value, depending on specific preferences, to such approximation sets. Mainly, QIs aim to measure the capacity of MOEAs to generate nondominated solutions, the diversity of such solutions, and their convergence to the true Pareto front. Regarding convergence QIs, the Pareto-compliance property is crucial to properly assess the performance of MOEAs. However, in the specialized literature, the only Pareto-compliant QI is the hypervolume indicator. In this paper, we propose a methodology to construct new Pareto-compliant indicators based on the combination of QIs. Our preliminary experimental results show that the proposed framework to construct Pareto-compliant QIs introduces new preferences over the Pareto front approximations.
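As a toy illustration of combining quality indicators, and not necessarily the construction proposed in the paper, the sketch below scores a two-dimensional minimisation front by a weighted sum of its hypervolume and a (negated) spacing-style diversity term; the weights and the choice of member indicators are assumptions.

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Hypervolume of a 2-D minimisation front w.r.t. reference point ref."""
    pts = np.array(sorted(front, key=lambda p: p[0]))
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:                       # count only nondominated progress
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

def spacing(front):
    """Spread of nearest-neighbour (L1) distances; lower means more even."""
    pts = np.asarray(front, dtype=float)
    d = np.array([np.min(np.abs(pts - p).sum(axis=1)[np.arange(len(pts)) != i])
                  for i, p in enumerate(pts)])
    return d.std()

def combined_indicator(front, ref, w_hv=1.0, w_sp=0.1):
    """Higher is better: reward hypervolume, penalise uneven spacing."""
    return w_hv * hypervolume_2d(front, ref) - w_sp * spacing(front)

# Example: compare two approximation sets of a bi-objective problem.
A = [(0.1, 0.9), (0.4, 0.5), (0.8, 0.2)]
B = [(0.2, 0.8), (0.5, 0.6), (0.9, 0.3)]
print(combined_indicator(A, ref=(1.1, 1.1)), combined_indicator(B, ref=(1.1, 1.1)))
```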
Lexicase selection has been proven highly successful for finding effective solutions to problems in genetic programming, especially for test-based problems where there are many distinct test cases that must all be passed. However, lexicase (as with most selection schemes) requires all prospective solutions to be evaluated against most test cases each generation, which can be computationally expensive. Here, we propose reducing the number of per-generation evaluations required by applying random subsampling: either by using a subset of test cases each generation (down-sampling) or by assigning test cases to subgroups of the population (cohort assignment). Tests are randomly reassigned each generation, and candidate solutions are only ever evaluated on test cases that they are assigned to, radically reducing the total number of evaluations needed while ensuring that each lineage eventually encounters all test cases. We tested these lexicase variants on five different program synthesis problems, across a range of down-sampling levels and cohort sizes. We demonstrate that these simple techniques to reduce the number of per-generation evaluations in lexicase can substantially improve overall performance for equivalent computational effort.
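A minimal sketch of the down-sampling variant described above: each generation, selection operates on a random subset of the test cases, so candidates are only evaluated on that subset. The error-function interface and the subsample ratio are assumptions for illustration.

```python
import random

def downsampled_lexicase(population, test_cases, error_fn, subsample=0.25, rng=random):
    """Select one parent with lexicase selection on a random subset of test cases.

    population : list of candidate programs
    test_cases : list of test cases
    error_fn(candidate, case) -> error on that case (0 means the case is passed)
    """
    # Down-sampling: only this generation's subset of cases is ever evaluated.
    cases = rng.sample(test_cases, max(1, int(subsample * len(test_cases))))
    rng.shuffle(cases)
    candidates = list(population)
    for case in cases:
        errors = [error_fn(c, case) for c in candidates]
        best = min(errors)
        candidates = [c for c, e in zip(candidates, errors) if e == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)
```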
Black-box optimization of a previously unknown problem can often prove to be a demanding task. In order for the optimization process to be as efficient as possible, one must first recognize the nature of the problem at hand and then proceed to choose the algorithm exhibiting the best performance for that type of problem. The problem characterization is done via underlying fitness landscape features, which make it possible to identify similarities and differences between various problems.
In this paper we present first steps towards an adaptive landscape analysis. Our approach is aimed at taking a closer look into how features evolve during the optimization process and whether this information can be used to discriminate between different problems. The motivation of our work is to understand if and how one could exploit the information provided by the features to improve on dynamic algorithm selection and configuration. Put differently, our goal is to leverage landscape analysis to adjust the choice of the algorithm on the fly, i.e., during the optimization process itself.
In this work we provide a theoretical and empirical study of the (1 + (λ,λ)) EA on the LeadingOnes problem. We prove an upper bound of O(n²) fitness evaluations on the expected runtime for all population sizes λ < n. This asymptotic bound does not depend on the parameter λ.
We show via experiments that the value of λ has a small influence on the runtime (less than a factor of two). The value of λ that optimizes the runtime is small relative to n. We propose an extension of the existing (1 + (λ, λ)) EA by using different population sizes in the mutation and in the crossover phase of the algorithm and show via experiments that this modification can outperform the original algorithm by a small constant factor.
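For reference, a hedged sketch of the (1 + (λ,λ)) EA on LeadingOnes with the proposed extension of separate offspring population sizes for the mutation and crossover phases; taking the mutation rate as λ_m/n and the crossover bias as 1/λ_m is an assumption, and the experimental setup in the paper may differ.

```python
import random

def leading_ones(x):
    """Number of leading 1-bits."""
    count = 0
    for bit in x:
        if bit == 0:
            break
        count += 1
    return count

def one_plus_ll_ea(n, lam_mut, lam_cross, max_evals=10**6, rng=random):
    """(1 + (λ,λ)) EA with separate offspring sizes in the two phases."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fx, evals = leading_ones(x), 1
    while fx < n and evals < max_evals:
        # Mutation phase: flip the same number ell of random bits in each offspring.
        ell = sum(rng.random() < lam_mut / n for _ in range(n))   # ~ Bin(n, λ_m/n)
        best_mut, best_mut_f = None, -1
        for _ in range(lam_mut):
            y = x[:]
            for i in rng.sample(range(n), ell):
                y[i] ^= 1
            fy = leading_ones(y)
            evals += 1
            if fy > best_mut_f:
                best_mut, best_mut_f = y, fy
        # Crossover phase: biased uniform crossover between x and the best mutant.
        best_cross, best_cross_f = x, fx
        for _ in range(lam_cross):
            z = [b if rng.random() < 1 / lam_mut else a for a, b in zip(x, best_mut)]
            fz = leading_ones(z)
            evals += 1
            if fz >= best_cross_f:
                best_cross, best_cross_f = z, fz
        x, fx = best_cross, best_cross_f
    return evals

print(one_plus_ll_ea(n=100, lam_mut=8, lam_cross=8))
```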
Previous work presented a technique called evolving self-taught neural networks: neural networks that can teach themselves, intrinsically motivated, without external supervision or reward [3]. In an autonomous multi-agent setting in which the agent is initially set to know little or nothing about its environment, self-teaching was shown to give rise to intelligence, whereas an evolutionary algorithm alone fails since it has no way to search without gradient information. In this paper, we conduct another comparative experiment in which the foraging agent is designed to be more aware of its environment beforehand. Experimental results show that with this more environment-aware design, evolution alone is able to search. Yet the combination of evolution and self-teaching still outperforms evolution alone. Indications for future work on evolving intelligence are also presented.
Dynamic Compartmental Models (DCMs) can be used to study the population dynamics of Multi- and Many-objective Evolutionary Algorithms (MOEAs). These models track the composition of the instantaneous population by grouping individuals into compartments and capture their behavior in a set of values, creating a compact representation for the analysis and comparison of algorithms. Furthermore, the use of DCMs is not limited to analysis: by creating models of the same algorithm with different configurations, it is possible to extract new models by interpolation and use them to explore fine-grained configurations lying between the ones used as a base. We illustrate the use of the model on several Multi- and Many-objective algorithms, run on enumerable MNK-Landscapes instances with 6 objectives for the analysis and 5 objectives when the model is used as a configuration tool.
Insight into the characteristics of an optimization problem is highly important for selecting and configuring the right algorithm. Several techniques, called features, have been defined for analyzing the fitness landscape of a problem. Despite their successes, our understanding of which features are actually relevant for discriminating between different optimization problems is rather weak, since in most applications the features are used in a black-box manner. Another aspect that has been ignored in the exploratory landscape analysis literature is the robustness of the feature computation against the randomness of the sample points from which the feature values are estimated. The influence of the number of sample points used for the estimation has likewise been ignored. In this paper, we study these three aspects: the robustness against the random sampling, the influence of the number of sample points, and the expressiveness in terms of ability to discriminate problems. We perform such an analysis for 7 out of the 17 feature sets covered by the flacco package. Our test bed is the 24 noiseless BBOB functions. We show that some of these features seem well suited for discriminating between the problems and are quite robust, whereas others lack robustness and/or expressiveness, and are therefore less suitable for an automated landscape-aware algorithm selection/configuration approach.
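To illustrate the kind of robustness study discussed (without relying on the flacco package itself), the following sketch repeatedly estimates a single landscape feature, here the classical dispersion feature, from independent random samples of several sizes and reports the standard deviation of the estimates; the feature choice, search domain and sample sizes are assumptions.

```python
import numpy as np

def dispersion_feature(f, dim, n_samples, q=0.1, rng=None):
    """Dispersion: mean pairwise distance of the best q-fraction of points,
    divided by the mean pairwise distance of all sampled points."""
    rng = rng or np.random.default_rng()
    X = rng.uniform(-5, 5, (n_samples, dim))
    y = np.apply_along_axis(f, 1, X)
    best = X[np.argsort(y)[: max(2, int(q * n_samples))]]

    def mean_pairwise(A):
        d = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)
        return d[np.triu_indices(len(A), k=1)].mean()

    return mean_pairwise(best) / mean_pairwise(X)

def robustness_study(f, dim, sizes=(100, 400, 1600), reps=30, seed=0):
    """Std. dev. of the feature estimate over repeated random samplings."""
    rng = np.random.default_rng(seed)
    return {n: np.std([dispersion_feature(f, dim, n, rng=rng) for _ in range(reps)])
            for n in sizes}

sphere = lambda x: float(np.sum(x ** 2))
print(robustness_study(sphere, dim=5))
```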
This work deals with an NP-hard problem in graphs known as the weighted independent domination problem. We propose a biased random-key genetic algorithm for solving this problem. The most important part of the proposed algorithm is a decoder that translates any vector of real values into a valid solution to the tackled problem. The experimental results, in comparison to a state-of-the-art population-based iterated greedy algorithm from the literature, show that our proposed approach has advantages over the state-of-the-art algorithm in the context of denser graphs in which edges have higher weights than vertices.
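A hedged sketch of the kind of random-key decoder described above: the chromosome, a vector of values in [0, 1], is read as a priority ordering of the vertices, from which a greedy pass builds a maximal independent set (hence an independent dominating set); its cost follows the usual weighted independent domination definition (vertex weights of the chosen set plus, for every vertex outside it, the cheapest edge into the set). The exact greedy rule used in the paper may differ.

```python
import numpy as np

def decode(keys, adj, vertex_w, edge_w):
    """Translate a random-key vector into an independent dominating set.

    keys     : array of values in [0, 1], one per vertex (the chromosome)
    adj      : dict vertex -> set of neighbours
    vertex_w : dict vertex -> weight
    edge_w   : dict frozenset({u, v}) -> weight
    Returns (solution set D, cost).
    """
    order = np.argsort(keys)                    # lower key = higher priority
    D = set()
    for v in order:
        v = int(v)
        if not (adj[v] & D):                    # keep D independent
            D.add(v)
    # Cost: weights of D plus, for each vertex outside D, its cheapest edge into D.
    cost = sum(vertex_w[v] for v in D)
    for u in adj:
        if u not in D:
            cost += min(edge_w[frozenset({u, v})] for v in adj[u] & D)
    return D, cost

# Tiny example graph: path 0-1-2 with unit edge weights.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
vw = {0: 2.0, 1: 1.0, 2: 2.0}
ew = {frozenset({0, 1}): 1.0, frozenset({1, 2}): 1.0}
print(decode(np.array([0.7, 0.1, 0.9]), adj, vw, ew))   # vertex 1 has top priority
```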
With improvements in selection methods and genetic operators, Genetic Programming (GP) has been able to solve many software synthesis problems. However, so far, the primary focus of GP has been on improving success rates (the fraction of runs that succeed in finding a solution). Less attention has been paid to other important characteristics and quality measures of human-written programs. One such quality measure is modularity. Since the introduction of Automatically Defined Functions (ADFs) by John Koza, most efforts involving modularity in GP have been directed towards pre-programming modularity into the GP system, rather than measuring it for evolved programs. Modularity has played a central role in evolutionary biology. To study its effects on the evolution of software, however, we need a quantitative formulation of modularity. In this paper, we present two platform-independent modularity metrics, namely reuse and repetition, that make use of the information contained in the execution traces of the programs. We describe the process of calculating these metrics for any evolved program, using problems that have been solved with the PushGP system as examples. We also discuss some mechanisms for integrating these metrics into the evolution framework itself.
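The following snippet gives simple, illustrative proxies computed from an execution trace; they are not the reuse and repetition definitions used in the paper, only an indication of how trace-based metrics of this kind can be computed.

```python
from collections import Counter

def trace_metrics(trace):
    """Illustrative proxies (not the paper's exact definitions) for modularity
    metrics derived from an execution trace, given as an ordered list of
    identifiers of the executed instructions/blocks.

    reuse      : average number of executions per distinct instruction
    repetition : fraction of the trace occupied by its most-executed instruction
    """
    counts = Counter(trace)
    reuse = len(trace) / len(counts)
    repetition = max(counts.values()) / len(trace)
    return reuse, repetition

# Example trace of executed block ids for an evolved program.
print(trace_metrics(["init", "loop", "body", "body", "body", "loop", "body", "end"]))
```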
Multi-objective optimization problems in the real world often involve a decision maker who has certain preferences for the objective functions. When such preferences can be expressed as a reference point, the goal of optimization changes from generating a complete set of Pareto-optimal solutions to generating a small set of non-dominated solutions close to the reference point. Reference-point based optimization algorithms are used for this purpose. The preferences of the decision maker in the objective space can be interpreted as knowledge in the decision space. Extracting this knowledge iteratively from the solutions generated during optimization, and feeding it back into the optimization algorithm, can in principle improve convergence towards the reference point. Since the knowledge is extracted during runtime, this approach is termed online knowledge-driven optimization. In this paper, a recent knowledge discovery technique called flexible pattern mining is used to extract explicit rules that are used to generate new solutions in R-NSGA-II. The performance of the proposed FPM-R-NSGA-II is demonstrated on 3, 5 and 10 objective DTLZ problems. In addition to converging to a set of preferred solutions, FPM-R-NSGA-II also converges to a set of explicit rules which describe the decision maker's preferences in the decision space.
This paper addresses the problem of optimizing a Demand Responsive Transport (DRT) service. A DRT is a flexible transportation service that provides on-demand transport for users who formulate requests specifying desired locations and times of pick-up and delivery. The vehicle routing and scheduling procedures are performed based on a set of requests. This problem is modeled as a multi-objective Dial-a-Ride Problem (DARP), in which a set of objectives related to costs and user inconvenience is optimized while respecting a set of constraints imposed by the passengers and vehicles, such as time windows and capacity. The resulting model is solved by means of three Multi-objective Evolutionary Algorithms (MOEAs) associated with feasibility-preserving operators. Computational experiments were performed on benchmark instances and the results were analyzed by means of performance quality indicators widely used for comparing multi-objective algorithms. The proposed approaches prove efficient and achieve higher performance than another algorithm from the literature when optimizing this DRT service.
We conduct a fixed-target runtime analysis of the (1 + 1) EA with resampling on the OneMax and BinVal problems. For OneMax, our fixed-target upper bound refines the previously known bound. Our fixed-target lower bound for OneMax is the first of its kind. We also consider linear functions and show that the traditional approaches via drift analysis cannot easily be extended to yield fixed-target results. However, for the particular case of BinVal, a relatively precise fixed-target bound is obtained.
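For concreteness, a small sketch of the algorithm analysed here: the (1+1) EA whose standard bit mutation is resampled whenever it would flip no bits, instrumented to record fixed-target hitting times on OneMax. The instrumentation format and target values are assumptions for illustration.

```python
import random

def onemax(x):
    return sum(x)

def one_plus_one_ea_resampling(n, targets, rng=random):
    """(1+1) EA with resampling: mutations flipping zero bits are redrawn.

    Returns {target: first evaluation at which fitness >= target}.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx, evals = onemax(x), 1
    hit = {t: 1 for t in targets if fx >= t}
    while fx < n:
        # Standard bit mutation with rate 1/n, resampled until a bit flips.
        flips = []
        while not flips:
            flips = [i for i in range(n) if rng.random() < 1 / n]
        y = x[:]
        for i in flips:
            y[i] ^= 1
        fy = onemax(y)
        evals += 1
        if fy >= fx:
            x, fx = y, fy
        for t in targets:
            if t not in hit and fx >= t:
                hit[t] = evals
    return hit

print(one_plus_one_ea_resampling(100, targets=[60, 80, 100]))
```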
Many real-world optimization problems involve both multiple objectives and constraints. Although constraint handling in multiobjective optimization has been considered in the literature, there is still a high demand for more advanced and versatile constraint handling techniques (CHTs) in real-world applications. For this reason, we propose a general approach to combine multiple CHTs into an ensemble-based method, providing a framework to easily construct new CHTs from existing ones. The approach is evaluated on nine test problems from the literature using an ensemble of four widely-used CHTs. The experimental results show that the ensemble is more robust than single CHTs and performs at least as well as the best single CHTs on all the test problems. Moreover, a positive synergistic effect of the ensemble is demonstrated on three problems.
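As a hedged illustration of the ensemble idea, and not the specific construction evaluated in the paper, the sketch below aggregates the pairwise comparisons of three simple CHTs (the feasibility rule, a static penalty and an ε-relaxed feasibility rule) by majority vote; the member CHTs, their parameters and the voting scheme are assumptions.

```python
def violation(g_values):
    """Total constraint violation, with g_i(x) <= 0 meaning feasible."""
    return sum(max(0.0, g) for g in g_values)

# Each CHT returns True if candidate a is preferred over candidate b for
# minimisation, where a and b are (objective value, total violation) pairs.
def feasibility_rule(a, b):
    fa, va = a
    fb, vb = b
    if va == 0 and vb == 0:
        return fa <= fb
    if va == 0 or vb == 0:
        return va == 0
    return va <= vb

def static_penalty(a, b, r=100.0):
    return a[0] + r * a[1] <= b[0] + r * b[1]

def eps_constraint(a, b, eps=0.1):
    # Violations up to eps are treated as feasible before applying the rule.
    va = a[1] if a[1] > eps else 0.0
    vb = b[1] if b[1] > eps else 0.0
    return feasibility_rule((a[0], va), (b[0], vb))

def ensemble_prefers(a, b, chts=(feasibility_rule, static_penalty, eps_constraint)):
    """Majority vote of the member CHTs."""
    votes = sum(cht(a, b) for cht in chts)
    return votes * 2 > len(chts)

# Example: a slightly infeasible but much better solution vs. a feasible one.
a = (1.0, 0.05)   # (objective, violation)
b = (5.0, 0.0)
print(ensemble_prefers(a, b))
```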