 Full Citation in the ACM Digital Library
               Full Citation in the ACM Digital Library
               None of today's AI systems or approaches comes anywhere close to capturing the common sense of a toddler, or even a 3-month old infant. I will talk about some of the challenges facing conventional machine learning paradigms, such as end-to-end unsupervised learning in deep networks and deep reinforcement learning, and discuss some initial, small steps we have taken with an alternative cognitively-inspired AI approach. This requires us to develop a different engineering toolset, based on probabilistic programs, game-style simulation programs as general-purpose startup software (or "the game engine in the head"), and learning as programming (or "the child as hacker").
A major challenge of contemporary statistical inference is the large-scale limit, where one wants to discover the values of many hidden parameters, using large amount of data. In recent years, ideas from statistical physics of disordered systems have helped to develop new algorithms for important inference problems, ranging from community detection to compressed sensing, machine learning (notably neural networks), tomography and generalized linear regression. The talk will review these developments and explain how they can be used, to develop new types of algorithms and identify phase transitions.
Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI Spring") and periods of disappointment, loss of confidence, and reduced funding ("AI Winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this talk I will discuss some fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I will also speculate on what is needed for the grand challenge of making AI systems more robust, general, and adaptable --- in short, more intelligent.
Ant colony optimizers have been successfully used as general-purpose optimization heuristics. Due to the complicated nature of the random processes that describe the runs of ACO algorithms, the mathematical understanding of these algorithms is much less developed than that of other nature-inspired heuristics. In this first runtime analysis of a basic ACO algorithm on a classic multimodal benchmark, we analyze the runtime of the 2-MMASib on jump functions. For moderate jump sizes k ≤ α0 ln n, α0 > 0 a constant, we prove a runtime of order [EQUATION], when the evaporation factor ρ satisfies ρ ≤ Cn-1/2 ln(n)-1 for a sufficiently small constant C. For ρ = Θ(n-1/2 ln(n)-1), we thus obtain a runtime of O(n ln(n)). This result shows that simple ACO algorithms can cope much better with local optima than many evolutionary algorithms, which need Ω(nk) time.
In this work, we are interested in studying the parallel drone scheduling traveling salesman problem (PDSTSP), where deliveries are split between a truck and a fleet of drones. The truck performs a common delivery tour, while the drones are forced to perform back and forth trips between customers and a depot. The objective is to minimize the completion time coming back to the depot of all the vehicles. We present a hybrid ant colony optimization (HACO) metaheuristic to solve the problem. Our algorithm is based on an idea from the literature that represents a PDSTSP solution as a permutation of all customers. And then a dynamic programming is used to decompose the customer sequence into a tour for the truck and trips for the drones. We propose a new dynamic programming combined with other problem-tailored components to efficiently solve the problem. When being tested on benchmark instances from the literature, the HACO algorithm outperforms state-of-the-art algorithms in terms of both running time and solution quality. More remarkably, we find 23 new best known solutions out of 90 instances considered.
The paradox of choice refers to the observation that numerous choices can have a detrimental effect on the quality of decision making. We study this effect in swarms in the context of a resource foraging task. We simulate the evolution of swarms gathering energy from a number of energy sources distributed in a two-dimensional environment. As the number of sources increases, the evolution of the swarm results in reduced levels of efficiency, despite the sources covering more space. Our results indicate that this effect arises because the simultaneous detection of multiple sources is incompatible with an evolutionary scheme that favours greedy energy consumption. In particular, the communication among the agents tends to reduce their efficiency by precluding the evolution of a clear preference for an increasing number of options. The overabundance of explicit information in the swarm about fitness-related options cannot be exploited by the agents lacking complex planning capabilities. If the sensor ranges evolve in addition to the behaviour of the agents, a preference for reduced choice results, and the average range approaches zero as the number of sources increases. Our study thus presents a minimal model for the paradox of choice, which implies several options of experimental verification.
Particle Swarm Optimization (PSO) is a heuristic optimization technique where particles explore optima of the fitness landscape defined in the search space. It is a practical approach but vulnerable to incorrect particle movement parameter tuning. Therefore, stochastic models of a particle and a swarm are subject to analysis. In this paper, we study the stability properties of particles controlled by a universal update equation proposed by Cleghorn and Engelbrecht. We propose a new concept of particle state, namely the stasis state when the difference between two consecutive particle location variance values is not changing much after a sufficient time. We also propose a convergence time measure, representing a minimal number of steps a particle location variance needs to reach stasis. Finally, we visualize the convergence time characteristics obtained in simulations.
Computational swarm intelligence has been demonstrably shown to efficiently solve high-dimensional optimization problems due to its flexibility, robustness, and (low) computational cost. Despite these features, swarm-based algorithms are black boxes whose dynamics may be hard to understand. In this paper, we delve into the Fish School Search (FSS) algorithm by looking at how fish interact within the fish school. We find that the network emerging from these interactions is structurally invariant to the optimization of three benchmark functions: Rastrigin, Rosenbrock and Schwefel. However, at the same time, our results also reveal that the level of social interactions among the fish depends on the problem. We show that the absence of highly-influential fish leads to a slow-paced convergence in FSS and that the changes in the intensity of social interactions enable good performance on both unimodal and multimodal problems. Finally, we examine two other swarm-based algorithms---the Artificial Bee Colony (ABC) and Particle Swarm Optimization (PSO) algorithms---and find that for the same three benchmark functions, the structural invariance characteristic only occurs in the FSS algorithm. We argue that FSS, ABC, and PSO have distinctive signatures of interaction structure and flow.
In collective decision-making, individuals in a swarm reach consensus on a decision using only local interactions without any centralized control. In the context of the best-of-n problem - characterized by n discrete alternatives - it has been shown that consensus to the best option can be reached if individuals disseminate that option more than the other options. Besides being used as a mechanism to modulate positive feedback, long dissemination times could potentially also be used in an adversarial way, whereby adversarial swarms could infiltrate the system and propagate bad decisions using aggressive dissemination strategies. Motivated by the above scenario, in this paper we propose a bio-inspired defence strategy that allows the swarm to be resilient against options that can be disseminated for longer times. This strategy mainly consists in reducing the mobility of the agents that are associated to options disseminated for a shorter amount of time, allowing the swarm to converge to this option. We study the effectiveness of this strategy using two classical decision mechanisms, the voter model and the majority rule, showing that the majority rule is necessary in our setting for this strategy to work. The strategy has also been validated on a real Kilobots experiment.
Knapsack problem with a single continuous variable (KPC) is a natural extension of the standard 0-1 knapsack problem. In the KPC, the capacity of the knapsack is not fixed, so it becomes more difficult to solve. Although several metaheuristics have been proposed to solve the KPC, their performances are not stable enough. This paper proposes a hybrid ant colony optimization (HACO) algorithm to solve the KPC. In the HACO algorithm, a simplified ACO algorithm is designed to search the optimal subset of items, and the mutation operator of differential evolution algorithm is used to search the optimal value of continuous variable. The parameter of the mutation operator is adaptively configured via an ACO algorithm. In the simplified ACO algorithm, an independent selection strategy and a pheromone-based selection strategy are designed to speed up the algorithm and simplify the tuning of parameters respectively. Personal best solutions constructed by each ant are used to deposit pheromone. Furthermore, a value-based greedy optimization strategy is introduced to enhance the solutions. The HACO algorithm is experimentally analyzed on 40 KPC instances. The experimental results show that the HACO algorithm is better than other competitors in terms of obtaining the best solution and the mean solution.
Generative Adversarial Networks (GANs) can generate levels for a variety of games. This paper focuses on combining GAN-generated segments in a snaking pattern to create levels for Mega Man. Adjacent segments in such levels can be orthogonally adjacent in any direction, meaning that an otherwise fine segment might impose a barrier between its neighbor depending on what sorts of segments in the training set are being most closely emulated: horizontal, vertical, or corner segments. To pick appropriate segments, multiple GANs were trained on different types of segments to ensure better flow between segments. Flow was further improved by evolving the latent vectors for the segments being joined in the level to maximize the length of the level's solution path. Using multiple GANs to represent different types of segments results in significantly longer solution paths than using one GAN for all segment types, and a human subject study verifies that these levels are more fun and have more human-like design than levels produced by one GAN.
Quality-Diversity algorithms search for large collections of diverse and high-performing solutions, rather than just for a single solution like typical optimisation methods. They are specially adapted for multi-modal problems that can be solved in many different ways, such as complex reinforcement learning or robotics tasks. However, these approaches are highly dependent on the choice of feature descriptors (FDs) quantifying the similarity in behaviour of the solutions. While FDs usually needs to be hand-designed, recent studies have proposed ways to define them automatically by using feature extraction techniques, such as PCA or Auto-Encoders, to learn a representation of the problem from previously explored solutions. Here, we extend these approaches to more complex problems which cannot be efficiently explored by relying only on a single representation but require instead a set of diverse and complementary representations. We describe MC-AURORA, a Quality-Diversity approach that optimises simultaneously several collections of solutions, each with a different set of FDs, which are, in turn, defined automatically by an ensemble of modular auto-encoders. We show that this approach produces solutions that are more diverse than those produced by single-representation approaches.
Quality-Diversity (QD) optimisation is a new family of learning algorithms that aims at generating collections of diverse and high-performing solutions. Among those algorithms, the recently introduced Covariance Matrix Adaptation MAP-Elites (CMA-ME) algorithm proposes the concept of emitters, which uses a predefined heuristic to drive the algorithm's exploration. This algorithm was shown to outperform MAP-Elites, a popular QD algorithm that has demonstrated promising results in numerous applications. In this paper, we introduce Multi-Emitter MAP-Elites (ME-MAP-Elites), an algorithm that directly extends CMA-ME and improves its quality, diversity and data efficiency. It leverages the diversity of a heterogeneous set of emitters, in which each emitter type improves the optimisation process in different ways. A bandit algorithm dynamically finds the best selection of emitters depending on the current situation. We evaluate the performance of ME-MAP-Elites on six tasks, ranging from standard optimisation problems (in 100 dimensions) to complex locomotion tasks in robotics. Our comparisons against CMA-ME and MAP-Elites show that ME-MAP-Elites is faster at providing collections of solutions that are significantly more diverse and higher performing. Moreover, in cases where no fruitful synergy can be found between the different emitters, ME-MAP-Elites is equivalent to the best of the compared algorithms.
There is evidence of a relationship between the dynamics of resource availability and the evolution of cooperative behaviour in complex networks. Previous studies have used mathematical models, agent-based models, and studies of hunter-gatherer societies to investigate the causal mechanisms behind this relationship. Here, we present a novel, agent-based software system, built using Unity 3D, which we employ to investigate the adaptation of food sharing networks to fluctuating resource availability. As a benefit of using Unity, our system possesses an easily customisable, visually compelling interface where evolution can be observed in real-time. Across four types of populations, under three environmental conditions, we performed a quantitative analysis of the evolving structure of social interactions. A biologically-inspired gene-sequencing function translates an arbitrarily extendable genome into phenotypic behaviour. Our results contribute to the understanding of how resource availability affects the evolutionary path taken by artificial societies. It emerges that environmental conditions have a greater impact on social evolution compared to the initial genetic configurations of each society. In particular, we find that scenarios of periodically fluctuating resources lead to the evolution of stable, tightly organised societies, which form small, local, mutualistic food-sharing networks.
Jamming Grippers are a novel class of soft robotic actuators that can robustly grasp and manipulate objects of arbitrary shape. They are formed by placing a granular material within a flexible skin connected to a vacuum pump and function by pressing the un-jammed gripper against a target object and evacuating the air to transition the material to a jammed (solid) state, gripping the target object. However, due to the complex interactions between grain morphology and target object shape, much uncertainty still remains regarding optimal constituent grain shapes for specific gripping applications. We address this challenge by utilising a modern Evolutionary Algorithm, NSGA-III, combined with a Discrete Element Method soft robot model to perform a multi-objective optimisation of grain morphology for use in jamming grippers for a range of target object sizes. Our approach optimises the microscopic properties of the system to elicit bespoke functional granular material performance driven by the complex relationship between the individual particle morphologies and the related emergent behaviour of the bulk state. Results establish the important contribution of grain morphology to gripper performance and the critical role of local surface curvature and the length scale governed by the relative sizes of the grains and target object.
Tangled program graphs (TPG) support emergent modularity by first identifying subsets of programs that can usefully coexist (a team/ graph node) and then identifying the circumstance under which to reference other teams (arc adaptation). Variation operators manipulate the content of teams and arcs. This introduces cycles into the TPG structures. Previously, this effect was eradicated at run time by marking nodes while evaluating TPG individuals. In this work, a new marking heuristic is introduced, that of arc (learner) marking. This means that nodes can be revisited, but not the same arcs. We investigate the impact of this through 18 titles from the Arcade Learning Environment. The performance and complexity of the policies appear to be similar, but with specific tasks (game titles) resulting in preferences for one scheme over another.
The evolution of symbolic communication is a longstanding open research question in biology. While some theories suggest that it originated from sub-symbolic communication (i.e., iconic or indexical), little experimental evidence exists on how organisms can actually evolve to define a shared set of symbols with unique interpretable meaning, thus being capable of encoding and decoding discrete information. Here, we use a simple synthetic model composed of sender and receiver agents controlled by Continuous-Time Recurrent Neural Networks, which are optimized by means of neuro-evolution. We characterize signal decoding as either regression or classification, with limited and unlimited signal amplitude. First, we show how this choice affects the complexity of the evolutionary search, and leads to different levels of generalization. We then assess the effect of noise, and test the evolved signaling system in a referential game. In various settings, we observe agents evolving to share a dictionary of symbols, with each symbol spontaneously associated to a 1-D unique signal. Finally, we analyze the constellation of signals associated to the evolved signaling systems and note that in most cases these resemble a Pulse Amplitude Modulation system.
In many natural environments, there are different forms of living creatures that successfully accomplish the same task while being diverse in shape and behavior. This biodiversity is what made life capable of adapting to disrupting changes. Being able to reproduce biodiversity in non-biological agents, while still optimizing them for a particular task, might increase their applicability to scenarios where human response to unexpected changes is not possible.
In this work, we focus on Voxel-based Soft Robots (VSRs), a form of robots that grants great freedom in the design of both body and controller and is hence promising in terms of biodiversity. We use evolutionary computation for optimizing, at the same time, body and controller of VSRs for the task of locomotion. We investigate experimentally whether two key factors---evolutionary algorithm (EA) and representation---impact the emergence of biodiversity and if this occurs at the expense of effectiveness. We devise a way for measuring biodiversity, systematically characterizing the robots shape and behavior, and apply it to the VSRs evolved with three EAs and two representations.
The experimental results suggest that the representation matters more than the EA and that there is not a clear trade-off between diversity and effectiveness.
An open question for both natural and artificial evolutionary systems is how, and under what environmental and evolutionary conditions complexity evolves. This study investigates the impact of increasingly complex task environments on the evolution of robot complexity. Specifically, the impact of evolving body-brain couplings on locomotive task performance, where robot evolution was directed by either body-brain exploration (novelty search) or objective-based (fitness function) evolutionary search. Results indicated that novelty search enabled the evolution of increased robot body-brain complexity and efficacy given specific environment conditions. The key contribution is thus the demonstration that body-brain exploration is suitable for evolving robot complexity that enables high fitness robots in specific environments.
Autonomous robots are increasingly used in remote and hazardous environments, where damage to sensory-actuator systems cannot be easily repaired. Such robots must therefore have controllers that continue to function effectively given unexpected malfunctions and damage to robot morphology. This study applies the Intelligent Trial and Error (IT&E) algorithm to adapt hexapod robot control to various leg failures and demonstrates the IT&E map-size parameter as a critical parameter in influencing IT&E adaptive task performance. We evaluate robot adaptation for multiple leg failures on two different map-sizes in simulation and validate evolved controllers on a physical hexapod robot. Results demonstrate a trade-off between adapted gait speed and adaptation duration, dependent on adaptation task complexity (leg damage incurred), where map-size is crucial for generating behavioural diversity required for adaptation.
Reward-based optimization algorithms require both exploration, to find rewards, and exploitation, to maximize performance. The need for efficient exploration is even more significant in sparse reward settings, in which performance feedback is given sparingly, thus rendering it unsuitable for guiding the search process. In this work, we introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm, capable of efficiently exploring a search space, as well as optimizing rewards found in potentially disparate areas. Contrary to existing emitters-based approaches, SERENE separates the search space exploration and reward exploitation into two alternating processes. The first process performs exploration through Novelty Search, a divergent search algorithm. The second one exploits discovered reward areas through emitters, i.e. local instances of population-based optimization algorithms. A meta-scheduler allocates a global computational budget by alternating between the two processes, ensuring the discovery and efficient exploitation of disjoint reward areas. SERENE returns both a collection of diverse solutions covering the search space and a collection of high-performing solutions for each distinct reward area. We evaluate SERENE on various sparse reward environments and show it compares favorably to existing baselines.
Evolving effective coordination strategies in tightly coupled multi-agent settings with sparse team fitness evaluations is challenging. It relies on multiple agents simultaneously stumbling upon the goal state to generate a learnable feedback signal. In such settings, estimating an agent's contribution to the overall team performance is extremely difficult, leading to a well-known structural credit assignment problem. This problem is further exacerbated when agents must complete sub-tasks with added spatial and temporal constraints, and different sub-tasks may require different local skills. We introduce MAEDyS, Multiagent Evolution via Dynamic Skill Selection, a hybrid bi-level optimization framework that augments evolutionary methods with policy gradient methods to generate effective coordination policies. MAEDyS learns to dynamically switch between multiple local skills towards optimizing the team fitness. It adopts fast policy gradients to learn several local skills using dense local rewards. It utilizes an evolutionary process to optimize the delayed team fitness by recruiting the most optimal skill at any given time. The ability to switch between various local skills during an episode eliminates the need for designing heuristic mixing functions. We evaluate MAEDyS in complex multiagent coordination environments with spatial and temporal constraints and show that it outperforms prior methods.
As open-ended learning based on divergent search algorithms such as Novelty Search (NS) draws more and more attention from the research community, it is natural to expect that its application to increasingly complex real-world problems will require the exploration to operate in higher dimensional Behavior Spaces (BSs) which will not necessarily be Euclidean. Novelty Search traditionally relies on k-nearest neighbours search and an archive of previously visited behavior descriptors which are assumed to live in a Euclidean space. This is problematic because of a number of issues. On one hand, Euclidean distance and Nearest-neighbour search are known to behave differently and become less meaningful in high dimensional spaces. On the other hand, the archive has to be bounded since, memory considerations aside, the computational complexity of finding nearest neighbours in that archive grows linearithmically with its size. A sub-optimal bound can result in "cycling" in the behavior space, which inhibits the progress of the exploration. Furthermore, the performance of NS depends on a number of algorithmic choices and hyperparameters, such as the strategies to add or remove elements to the archive and the number of neighbours to use in k-nn search. In this paper, we discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS), which does not require an archive, makes no assumption on the metrics that can be defined in the behavior space and does not rely on nearest neighbours search. We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.
A core challenge of evolutionary search is the need to balance between exploration of the search space and exploitation of highly fit regions. Quality-diversity search has explicitly walked this tightrope between a population's diversity and its quality. This paper extends a popular quality-diversity search algorithm, MAP-Elites, by treating the selection of parents as a multi-armed bandit problem. Using variations of the upper-confidence bound to select parents from under-explored but potentially rewarding areas of the search space can accelerate the discovery of new regions as well as improve its archive's total quality. The paper tests an indirect measure of quality for parent selection: the survival rate of a parent's offspring. Results show that maintaining a balance between exploration and exploitation leads to the most diverse and high-quality set of solutions in three different testbeds.
Designing optimal soft modular robots is difficult, due to non-trivial interactions between morphology and controller. Evolutionary algorithms (EAs), combined with physical simulators, represent a valid tool to overcome this issue. In this work, we investigate algorithmic solutions to improve the Quality Diversity of co-evolved designs of Tensegrity Soft Modular Robots (TSMRs) for two robotic tasks, namely goal reaching and squeezing trough a narrow passage. To this aim, we use three different EAs, i.e., MAP-Elites and two custom algorithms: one based on Viability Evolution (ViE) and NEAT (ViE-NEAT), the other named Double Map MAP-Elites (DM-ME) and devised to seek diversity while co-evolving robot morphologies and neural network (NN)-based controllers. In detail, DM-ME extends MAP-Elites in that it uses two distinct feature maps, referring to morphologies and controllers respectively, and integrates a mechanism to automatically define the NN-related feature descriptor. Considering the fitness, in the goal-reaching task ViE-NEAT outperforms MAP-Elites and results equivalent to DM-ME. Instead, when considering diversity in terms of "illumination" of the feature space, DM-ME outperforms the other two algorithms on both tasks, providing a richer pool of possible robotic designs, whereas ViE-NEAT shows comparable performance to MAP-Elites on goal reaching, although it does not exploit any map.
In the area of evolutionary computation the calculation of diverse sets of high-quality solutions to a given optimization problem has gained momentum in recent years under the term evolutionary diversity optimization. Theoretical insights into the working principles of baseline evolutionary algorithms for diversity optimization are still rare. In this paper we study the well-known Minimum Spanning Tree problem (MST) in the context of diversity optimization where population diversity is measured by the sum of pairwise edge overlaps. Theoretical results provide insights into the fitness landscape of the MST diversity optimization problem pointing out that even for a population of μ = 2 fitness plateaus (of constant length) can be reached, but nevertheless diverse sets can be calculated in polynomial time. We supplement our theoretical results with a series of experiments for the unconstrained and constraint case where all solutions need to fulfill a minimal quality threshold. Our results show that a simple (μ + 1)-EA can effectively compute a diversified population of spanning trees of high quality.
In contrast with random uniform instances, industrial SAT instances of large size are solvable today by state-of-the-art algorithms. It is believed that this is the consequence of the non-random structure of the distribution of variables into clauses. In order to produce benchmark instances resembling those of real-world formulas with a given structure, generative models have been proposed. In this paper we study the MAX-3SAT problem with model-generated instances having a power-law distribution. Specifically, we target the regions in which computational difficulty undergoes an easy/hard phase transition as a function of clause density and of the power-law exponent. Our approach makes use of a sampling technique to build a graph model (a local optima network) in which nodes are local optima and directed edges are transitions between optima basins. The objective is to relate the structure of the instance fitness landscape with problem difficulty through the transition. We succeed in associating the transition with straightforward network metrics, thus providing a novel and original fitness landscape view of the computational features of the power-law model and its phase transition.
Efficient hill climbers are at the heart of the latest gray-box optimization techniques, where some structural information about the optimization problem is available. Focusing on NK-landscapes as a challenging class of k-bounded pseudo-boolean functions, we propose a quality-enhanced and a time-accelerated hill climber. Our investigations are based on the idea of performing several simultaneous moves at each iteration of the search process. This is enabled using graph coloring to structure the interacting variables of an NK-landscape, and to identify subsets of possibly improving independent moves in the Hamming distance 1 neighborhood. Besides being extremely competitive with respect to the state-of-the-art first- and best- ascent serial variants, our initial design exposes a natural degree of parallelism allowing us to convert our serial algorithm into a parallel hill climber without further design efforts. As such, we also provide a multi-threaded implementation using up to 10 shared-memory CPU-cores. Using a range of large-scale random NK-landscapes with up to 106 variables, we provide evidence on the efficiency and effectiveness of the proposed hill climber and we highlight the strength of our parallel design in attaining substantial acceleration factors for the largest experimented functions.
Expensive black-box combinatorial optimization problems arise in practice when the objective function is evaluated by means of a simulator or a real-world experiment. Since each fitness evaluation is expensive in terms of time or resources, the number of possible evaluations is typically several orders of magnitude smaller than in non-expensive problems. Classical optimization methods are not useful in this scenario. In this paper, we propose and analyze UMM, an estimation-of-distribution (EDA) algorithm based on a Mallows probabilistic model and unbalanced rank aggregation (uBorda). Experimental results on black-box versions of LOP and PFSP show that UMM outperforms the solutions obtained by CEGO, a Bayesian optimization algorithm for combinatorial optimization. Nevertheless, a slight modification to CEGO, based on the different interpretations for rankings and orderings, significantly improves its performance, thus producing solutions that are slightly better than those of UMM and dramatically better than the original version. Another benefit of UMM is that its computational complexity increases linearly with both the number of function evaluations and the permutation size, which results in computation times an order of magnitude shorter than CEGO, making it specially useful when both computation time and number of evaluations are limited.
In this article, we propose an indicator-based branch and bound (I-BB) approach for multi-objective combinatorial optimization that uses a best-first search strategy. In particular, assuming maximizing objectives, the next node to be processed is chosen with respect to the quality of its upper bound. This quality is given by a binary quality indicator, such as the binary hypervolume or the ε-indicator, with respect to the archive of solutions maintained by the branch and bound algorithm. Although the I-BB will eventually identify the efficient set, we are particularly interested in analyzing its anytime behavior as a heuristic. Our experimental results, conducted on a multi-objective knapsack problem with 2, 3, 5, and 7 objectives, indicate that the I-BB can often outperform the naive depth-first and breadth-first search strategies, both in terms of runtime and anytime performance. The improvement is especially significant when the branching order for the decision variables is random, which suggests that the I-BB is particularly relevant when more favorable (problem-dependent) branching orders are not available.
This paper studies a method of generating hard instances of the Inventory Routing Problem (IRP) using evolutionary algorithms. The difficulty of the generated instances is measured by the time used by the CPLEX solver to solve a given instance. In order to ensure that feasible problem instances are generated, the Infeasibility Driven Evolutionary Algorithm (IDEA) is used along with seeding of the population with perturbed copies of an instance taken from a well-known set of benchmark instances. In the experiments a significant increase in problem solving time was observed with respect to IRP instances proposed in the literature. Generated IRP instances do not show any unusual properties that would make them unrealistic and they are made available as benchmarks for optimization algorithms. In the paper an attempt was made to compare generated problem instances and to draw conclusions regarding the structure of these instances, however, this has proven to be a difficult task.
Iterative Partial Transcription (IPT) is an important recombination operator for Traveling Salesman Problem (TSP), and is a key component in the LKH inexact solver for the TSP. IPT shares many characteristics with Partition Crossover for the TSP. However, the standard implementation of IPT searches for all common subchains of sizes from 4 to n/2 between two parent tours and thus has a time complexity of O(n2) in the worst case. In this paper, an efficient implementation of IPT is proposed that uses a special data structure called the Extended Edge Table (EET) to find the recombination components. We prove that the proposed technique is approximately equivalent to IPT and has a time complexity of O(n). The performance of the two implementations of IPT is compared based on some random and benchmark TSP instances to establish the efficiency of the proposed algorithm.
Submodular functions allow to model many real-world optimisation problems. This paper introduces approaches for computing diverse sets of high quality solutions for submodular optimisation problems with uniform and knapsack constraints. We first present diversifying greedy sampling approaches and analyse them with respect to the diversity measured by entropy and the approximation quality of the obtained solutions. Afterwards, we introduce an evolutionary diversity optimisation (EDO) approach to further improve diversity of the set of solutions. We carry out experimental investigations on popular submodular benchmark problems and analyse trade-offs in terms of solution quality and diversity of the resulting solution sets.
There has been intensive research on tiebreakers for insertion-based constructive heuristics in the permutation flow shop scheduling problem when the objective is to minimize the makespan. In this paper we analyze the space of all possible tie-breaking rules for the most common insertion orders, and evaluate the efficacy of existing tiebreakers based on this analysis. We find that an optimal tie breaker would produce results that are about 1% better than current best tie breakers. We propose a constructive heuristic based on a truncated cyclic best-first search in the space of all tie breakers and show that it closely approximates the solutions that such an optimal tie breaker can achieve.
In local search algorithms, the pivoting rule determines which neighboring solution to select and thus strongly influences the behavior of the algorithm and its capacity to sample good-quality local optima. The classical pivoting rules are first and best improvement, with alternative rules such as worst improvement and maximum expansion recently studied on hill-climbing algorithms. This article conducts a thorough empirical comparison of five pivoting rules (best, first, worst, approximated worst and maximum expansion) on two benchmark combinatorial problems, NK landscapes and the unconstrained binary quadratic problem (UBQP), with varied sizes and ruggedness. We present both a performance analysis of the alternative pivoting rules within an iterated local search (ILS) framework and a fitness landscape analysis and visualization using local optima networks. Our results reveal that the performance of the pivoting rules within an ILS framework may differ from their performance as single climbers and that worst improvement and maximum expansion can outperform classical pivoting rules.
Genetic Programming Hyper-Heuristic (GPHH) is a promising technique to automatically evolve effective routing policies to handle the uncertain environment in the Uncertain Capacitated Arc Routing Problem (UCARP). Previous studies mainly focus on the effectiveness of the evolved routing policies, but the size is ignored. This paper aims to develop new GPHH methods to optimise the effectiveness and the size simultaneously. There are two challenges. First, it is much easier for GP to generate small but ineffective individuals than effective ones, thus the search can be easily stuck with small but ineffective individuals. Second, the effectiveness evaluation in GPHH is stochastic, making it challenging to identify and retain effective individuals. To address these issues, we develop a Two-Stage Multi-Objective GP algorithm with Archive (TSNSGPII-a). The two-stage framework addresses the bias towards the size. The external archive stores potentially effective individuals that may be lost during the evolution, and reuses them to generate offspring. The experimental results show that TSNSGPII-a can obtain significantly better routing policies than the existing state-of-the-art approaches in terms of both effectiveness and size. If selecting the most effective routing policy from the Pareto front, TSNSGPII-a can obtain significantly smaller routing policies with statistically comparable or significantly better effectiveness.
Genetic Algorithms (GA) explore the search space of an objective function via mutation and recombination. Crucial for the search dynamics is the maintenance of diversity in the population.
Inspired by tabu search we add a mechanism to an elitist GA's selection step to ensure diversification by excluding already visited areas of the search space. To this end, we use Bloom filters as a probabilistic data structure to store a (quasi-)infinite history.
We discuss how this approach fits into the niching idea for GAs, finds an analogy in generational correlation via epi-genetics in biology, and how the approach can be regarded as a co-evolutionary edge case of previous niching techniques.
Furthermore, we apply this new technique to an NP-hard combinatorial optimization problem (ground states of Ising spin glasses). We find by large-scale hyperparameter scans that our elitist GA with quasi-infinite memory consistently outperforms its respective standard GA variant without a Bloom filter.
Metalearning of deep neural network (DNN) architectures and hyperparameters has become an increasingly important area of research. Loss functions are a type of metaknowledge that is crucial to effective training of DNNs, however, their potential role in metalearning has not yet been fully explored. Whereas early work focused on genetic programming (GP) on tree representations, this paper proposes continuous CMA-ES optimization of multivariate Taylor polynomial parameterizations. This approach, TaylorGLO, makes it possible to represent and search useful loss functions more effectively. In MNIST, CIFAR-10, and SVHN benchmark tasks, TaylorGLO finds new loss functions that outperform the standard cross-entropy loss as well as novel loss functions previously discovered through GP, in fewer generations. These functions serve to regularize the learning task by discouraging overfitting to the labels, which is particularly useful in tasks where limited training data is available. The results thus demonstrate that loss function optimization is a productive new avenue for metalearning.
In cluster analysis, the automatic clustering problem refers to the determination of both the appropriate number of clusters and the corresponding natural partitioning. This can be addressed as an optimization problem in which a cluster validity index (CVI) is used as a fitness function to evaluate the quality of potential solutions. Different CVIs have been proposed in the literature, aiming to identify adequate cluster solutions in terms of intracluster cohesion and intercluster separation. However, it is important to identify the scenarios in which these CVIs perform well and their limitations. This paper evaluates the effectiveness of 22 different CVIs used as fitness functions in an evolutionary clustering algorithm named ACDE based on differential evolution. Several synthetic datasets are considered: linearly separable data having both well-separated and overlapped clusters, and non-linearly separable data having arbitrarily-shaped clusters. Besides, real-life datasets are also considered. The experimental results indicate that the Silhouette index consistently reached an acceptable performance in linearly separable data. Furthermore, the indices Calinski-Harabasz, Davies-Bouldin, and generalized Dunn obtained an adequate clustering performance in synthetic and real-life datasets. Notably, all the evaluated CVIs performed poorly in clustering the non-linearly separable data because of the assumptions about data distributions.
Metalearning of deep neural network (DNN) architectures and hyperparameters has become an increasingly important area of research. At the same time, network regularization has been recognized as a crucial dimension to effective training of DNNs. However, the role of metalearning in establishing effective regularization has not yet been fully explored. There is recent evidence that loss-function optimization could play this role, however it is computationally impractical as an outer loop to full training. This paper presents an algorithm called Evolutionary Population-Based Training (EPBT) that interleaves the training of a DNN's weights with the metalearning of loss functions. They are parameterized using multivariate Taylor expansions that EPBT can directly optimize. Such simultaneous adaptation of weights and loss functions can be deceptive, and therefore EPBT uses a quality-diversity heuristic called Novelty Pulsation as well as knowledge distillation to prevent overfitting during training. On the CIFAR-10 and SVHN image classification benchmarks, EPBT results in faster, more accurate learning. The discovered hyperparameters adapt to the training process and serve to regularize the learning task by discouraging overfitting to the labels. EPBT thus demonstrates a practical instantiation of regularization metalearning based on simultaneous training.
The XCS classifier system adaptively controls a rule-generality of a rule-condition through a rule-discovery process. However, there is no proof that the rule-generality can eventually converge to its optimum value even under some ideal assumptions. This paper conducts a convergence analysis of the rule-generality on the rule-discovery process with the ternary alphabet coding. Our analysis provides the first proof that an average rule-generality of rules in a population can converge to its optimum value under some assumptions. This proof can be used to mathematically conclude that the XCS framework has a natural pressure to explore rules toward optimum rules if XCS satisfies our derived conditions. In addition, our theoretical result returns a rough setting-up guideline for the maximum population size, the mutation rate, and the GA threshold, improving the convergence speed of the rule-generality and the XCS performance.
In Multi-label (ML) classification, each instance is associated with more than one class label. Incorporating the label correlations into the model is one of the increasingly studied areas in the last decade, mostly due to its potential in training more accurate predictive models and dealing with noisy/missing labels. Previously, multi-label learning classifier systems have been proposed that incorporate the high-order label correlations into the model through the label powerset (LP) technique. However, such a strategy cannot take advantage of the valuable statistical information in the label space to make more accurate inferences. Such information becomes even more crucial in ML image classification problems where the number of labels can be very large. In this paper, we propose a multi-label learning classifier system that leverages a structured representation for the labels through undirected graphs to utilize the label similarities when evolving rules. More specifically, we propose a novel scheme for covering classifier actions, as well as a method to calculate ML prediction arrays. The effectiveness of this method is demonstrated by experimenting on multiple benchmark datasets and comparing the results with multiple well-known ML classification algorithms.
In classification, when class overlap is intertwined with the issue of class imbalance, it is often challenging to discover useful patterns because of an ambiguous boundary between the majority class and the minority class. This becomes more difficult if the data is high-dimensional. To date, very few pieces of work have investigated how the class overlap issue can be effectively addressed or alleviated in classification with high-dimensional unbalanced data. In this paper, we propose a new genetic programming based method, which is able to automatically and directly detect borderline instances, in order to address the class overlap issue in classification with high-dimensional unbalanced data. In the proposed method, each individual has two trees to be trained together based on different classification rules. The proposed method is examined and compared with baseline methods on high-dimensional unbalanced datasets. Experimental results show that the proposed method achieves better classification performance than the baseline methods in almost all cases.
We put forward a novel learning methodology for ensembles of decision trees based on a genetic algorithm that is able to train a decision tree for maximizing both its accuracy and its robustness to adversarial perturbations. This learning algorithm internally leverages a complete formal verification technique for robustness properties of decision trees based on abstract interpretation, a well-known static program analysis technique. We implemented this genetic adversarial training algorithm in a tool called MetaSilvae and we experimentally evaluated it on some standard reference datasets used in adversarial training. The experimental results show that MetaSilvae is able to train robust models that compete with and often improve on the current state-of-the-art of adversarial training of decision trees while being much more compact and therefore interpretable and efficient tree models.
Automated machine learning (AutoML) strives for automatically constructing and configuring compositions of machine learning algorithms, called pipelines, with the goal to optimize a suitable performance measure on a concrete learning task. So far, most AutoML tools are focused on standard problem classes, such as classification and regression. In the field of predictive maintenance, especially the estimation of remaining useful lifetime (RUL), the task of AutoML becomes more complex. In particular, a good feature representation for multivariate sensor data is essential to achieve good performance. Due to the need for methods generating feature representations, the search space of candidate pipelines enlarges. Moreover, the runtime of a single pipeline increases substantially. In this paper, we tackle these problems by partitioning the search space into two sub-spaces, one for feature extraction methods and one for regression methods, and employ cooperative coevolution for searching a good combination. Thereby, we benefit from the fact that the generated feature representations can be cached, whence the evaluation of multiple regressors based on the same feature representation speeds up, allowing the evaluation of more candidate pipelines. Experimentally, we show that our coevolutionary strategy performs superior to the baselines.
Generative adversarial networks (GANs) exhibit training pathologies that can lead to convergence-related degenerative behaviors, whereas spatially-distributed, coevolutionary algorithms (CEAs) for GAN training, e.g. Lipizzaner, are empirically robust to them. The robustness arises from diversity that occurs by training populations of generators and discriminators in each cell of a toroidal grid. Communication, where signals in the form of parameters of the best GAN in a cell propagate in four directions: North, South, West and East, also plays a role, by communicating adaptations that are both new and fit. We propose Lipi-Ring, a distributed CEA like Lipizzaner, except that it uses a different spatial topology, i.e. a ring. Our central question is whether the different directionality of signal propagation (effectively migration to one or more neighbors on each side of a cell) meets or exceeds the performance quality and training efficiency of Lipizzaner. Experimental analysis on different datasets (i.e, MNIST, CelebA, and COVID-19 chest X-ray images) shows that there are no significant differences between the performances of the trained generative models by both methods. However, Lipi-Ring significantly reduces the computational time (14.2%... 41.2%). Thus, Lipi-Ring offers an alternative to Lipizzaner when the computational cost of training matters.
Graph neural networks (GNNs) have been proposed for a wide range of graph-related learning tasks. In particular, in recent years, an increasing number of GNN systems were applied to predict molecular properties. However, a direct impediment is to select appropriate hyperparameters to achieve satisfactory performance with lower computational cost. Meanwhile, many molecular datasets are far smaller than many other datasets in typical deep learning applications. Most hyperparameter optimization (HPO) methods have not been explored in terms of their efficiencies on such small datasets in the molecular domain. In this paper, we conducted a theoretical analysis of common and specific features for two state-of-the-art and popular algorithms for HPO: TPE and CMA-ES, and we compared them with random search (RS), which is used as a baseline. Experimental studies are carried out on several benchmarks in MoleculeNet, from different perspectives to investigate the impact of RS, TPE, and CMA-ES on HPO of GNNs for molecular property prediction. In our experiments, we concluded that RS, TPE, and CMA-ES have their individual advantages in tackling different specific molecular problems. Finally, we believe our work will motivate further research on GNN as applied to molecular machine learning problems in chemistry and materials sciences.
Pareto compliance is a critical property of quality indicators (QIs) focused on measuring convergence to the Pareto front. This property allows a QI not to contradict the order imposed by the Pareto dominance relation. Hence, Pareto-compliant QIs are remarkable when comparing approximation sets of multi-objective evolutionary algorithms (MOEAs) since they do not produce misleading conclusions. However, the practical effect of Pareto-compliant QIs as the backbone of MOEAs' survival mechanisms is not entirely understood. In this paper, we study the practical effect of IGD++ (which is a combination of IGD+ and the hypervolume indicator), IGD+, and IGD, which are Pareto-compliant, weakly Pareto-compliant, and not Pareto-compliant QIs, respectively. To this aim, we analyze the convergence and diversity properties of steady-state MOEAs based on the previously mentioned QIs throughout the whole evolutionary process. Our experimental results showed that, in general, the practical effect of a Pareto-compliant QI in a selection mechanism is not very significant concerning weaker QIs, taking into account the whole evolutionary process.
Normalization is an important algorithmic component for multiobjective evolutionary algorithms (MOEAs). Different normalization methods have been proposed in the literature. Recently, several studies have been conducted to examine the effects of normalization methods. However, the existing evaluation methods for investigating the effects of normalization are limited due to their drawbacks. In this paper, we discuss the limitations of the existing evaluation methods. A new metric has been proposed to facilitate the investigation of normalization methods. Our analysis clearly shows the superiority of the proposed metric over the existing methods. We also use the proposed metric to compare three popular normalization methods on problems with different Pareto front shapes.
We propose a new algorithm for the extreme hypervolume contributor/contribution problem, i.e. the problem of finding the point with minimum/maximum contribution to the hypervolume and (optionally) the value of this extreme contribution. Our algorithm is motivated by the Improved Quick Hypervolume (IQHV) which works in a divide and conquer manner, and uses lower and upper bounds of the contribution to speed up calculations. It may be applied directly in multiobjective optimization methods that select solutions based on the extreme contribution to the hypervolume, or within the greedy algorithms for hypervolume subset selection problem often used in EMO. The proposed algorithm is the first dedicated algorithm for the maximum contribution problem and in the reported computational experiment it outperforms on most data sets state-of-the-art IWFG algorithm for the minimum contribution problem. We show that the proposed algorithm uses the lower and upper bounds very well by comparing it to a reference algorithm having an access to an oracle providing the best contributor in advance. We propose also a modification of the general IQHV algorithm scheme in which the order of objectives processing which influences construction of sub-problems is heuristically adapted and show that this modification improves the efficiency of the algorithm.
In this paper, we demonstrate the application of features from landscape analysis, initially proposed for multi-objective combinatorial optimisation, to a benchmark set of 1 200 randomly-generated multiobjective interpolated continuous optimisation problems (MO-ICOPs). We also explore the benefits of evaluating the considered landscape features on the basis of a fixed-size sampling of the search space. This allows fine control over cost when aiming for an efficient application of feature-based automated performance prediction and algorithm selection. While previous work shows that the parameters used to generate MO-ICOPs are able to discriminate the convergence behaviour of four state-of-the-art multi-objective evolutionary algorithms, our experiments reveal that the proposed (black-box) landscape features used as predictors deliver a similar accuracy when combined with a classification model. In addition, we analyse the relative importance of each feature for performance prediction and algorithm selection.
The hypervolume indicator is widely used by multi-objective optimization algorithms and for assessing their performance. We investigate a set of p vectors in the biobjective space that maximizes the hypervolume indicator with respect to some reference point, referred to as p-optimal distribution. We prove explicit lower and upper bounds on the gap between the hypervolumes of the p-optimal distribution and the p-optimal distribution (the Pareto front) as a function of p, of the reference point, and of some Lipschitz constants. On a wide class of functions, this optimality gap can not be smaller than p(1/p), thereby establishing a bound on the optimal convergence speed of any algorithm. For functions with either bilipschitz or convex Pareto fronts, we also establish an upper bound and the gap is hence Θ(1/p). The presented bounds are not only asymptotic. In particular, functions with a linear Pareto front have the normalized exact gap of 1/(p + 1) for any reference point dominating the nadir point.
We empirically investigate on a small set of Pareto fronts the exact optimality gap for values of p up to 1000 and find in all cases a dependency resembling 1/(p + CONST).
In this paper, we revisit the distance-based subset selection (DSS) algorithm in evolutionary multi-objective optimization. First, we show one drawback of the DSS algorithm, i.e., a uniformly distributed solution set cannot always be selected. Then, we show that this drawback can be overcome by maximizing the uniformity level of the selected solution set, which is defined by the minimum distance between two solutions in the solution set. Furthermore, we prove that the DSS algorithm is a greedy inclusion algorithm with respect to the maximization of the uniformity level. Based on this conclusion, we generalize DSS as a subset selection problem where the objective is to maximize the uniformity level of the subset. In addition to the greedy inclusion DSS algorithm, a greedy removal algorithm and an iterative algorithm are proposed for the generalized DSS problem. We also extend the Euclidean distance in the original DSS to other widely-used and user-defined distances. We conduct extensive experiments on solution sets over different types of Pareto fronts to compare the three DSS algorithms with different distances. Our results suggest the usefulness of the generalized DSS for selecting a uniform subset. The effect of using different distances on the selected subsets is also analyzed.
Hypervolume subset selection (HSS) aims to select a subset from a candidate solution set so that the hypervolume of the selected subset is maximized. Due to its NP-hardness nature, the greedy algorithms are the most efficient for solving HSS in many-objective optimization. However, when the number of objectives is large, the calculation of the hypervolume contribution in the greedy HSS is time-consuming, which makes the greedy HSS inefficient. To solve this issue, in this paper we propose a greedy approximated HSS algorithm. The main idea is to use an R2-based hypervolume contribution approximation method in the greedy HSS. In the algorithm implementation, a utility tensor structure is introduced to facilitate the calculation of the hypervolume contribution approximation. In addition, the tensor information in the last step is utilized in the current step to accelerate the calculation. We also apply the lazy strategy in the proposed algorithm to further improve its efficiency. We test the greedy approximated HSS algorithm on 3-10 objective candidate solution sets. The experimental results show that the proposed algorithm is much faster than the state-of-the-art greedy HSS algorithm in many-objective optimization while their hypervolume performance is almost the same.
Improvements to the design of interactive Evolutionary Multiobjective Algorithms (iEMOAs) are unlikely without quantitative assessment of their behaviour in realistic settings. Experiments with human decision-makers (DMs) are of limited scope due to the difficulty of isolating individual biases and replicating the experiment with enough subjects, and enough times, to obtain confidence in the results. Simulation studies may help to overcome these issues, but they require the use of realistic simulations of decision-makers. Machine decision-makers (MDMs) provide a way to carry out such simulation studies, however, studies so far have relied on simple utility functions. In this paper, we analyse and compare two state-of-the-art iEMOAs by means of a MDM that uses a sigmoid-shaped utility function. This sigmoid utility function is based on psychologically realistic models from behavioural economics, and replicates several realistic human behaviours. Our findings are that, on a variety of well-known benchmarks with two and three objectives, the two iEMOAs do not consistently recover the most-preferred points. We hope that these findings provide an impetus for more directed design and analysis of future iEMOAs.
This work proposes a Bayesian optimisation with Gaussian Process approach to learn decision maker (DM) preferences in the attribute search space of a multi-objective optimisation problem (MOP). The DM is consulted periodically during optimisation of the problem and asked to provide their preference over a series of pairwise comparisons of candidate solutions. After each consultation, the most preferred solution is used as the reference point in an appropriate multiobjective optimisation evolutionary algorithm (MOEA). The rationale for using Bayesian optimisation is to identify the most preferred location in the decision search space with the least number of DM queries, thereby minimising DM cognitive burden and fatigue. This enables non-expert DMs to be involved in the optimisation process and make more informed decisions. We further reduce the number of preference queries required, by progressively redefining the Bayesian search space to reflect the MOEA's decision bounds as it converges toward the Pareto Front. We demonstrate how this approach can locate a reference point close to an unknown preferred location on the Pareto Front, of both benchmark and real-world problems with relatively few pairwise comparisons.
We propose a novel algorithm, called iMOEA-HA, for interactive evolutionary multiple objective optimization. It combines, in an original way, the state-of-the-art paradigms postulated in evolutionary multiple objective optimization and multiple criteria decision aiding. When it comes to the incorporated evolutionary framework, iMOEA-HA implements a steady-state, quasi decomposition-based model with a restricted mating pool. To drive the population towards the Decision Maker's (DM's) most preferred region in the Pareto front, iMOEA-HA exploits the DM's pairwise comparisons of solutions from the current population and assesses all solutions as imposed by their holistic acceptability indices. These acceptabilities are built on the probabilities of attaining a certain number of the most preferred ranks. We evaluate the proposed algorithm's performance on a vast set of optimization problems with different properties and numbers of objectives. Moreover, we compare iMOEA-HA with existing interactive evolutionary algorithms that can be deemed its predecessors, proving its high effectiveness and competitiveness.
The quality of solutions in multiobjective evolutionary algorithms (MOEAs) is usually evaluated by objective functions. However, function evaluations (FEs) are usually time-consuming in real-world problems. A large number of FEs limit the application of MOEAs. In this paper, we propose a fuzzy classifier-based selection strategy to reduce the number of FEs of MOEAs. First, all evaluated solutions in previous generations are used to build a fuzzy classifier. Second, the built fuzzy classifier is used to predict each unevaluated solution's label and its membership degree. The reproduction procedure is repeated to generate enough offspring solutions (classified as positive by the classifier). Next, unevaluated solutions are sorted based on their membership degrees in descending order. The same number of solutions as the population size are selected from the top of the sorted unevaluated solutions. Then, the best half of the chosen solutions are selected and stored in the new population without evaluations. The other half solutions are evaluated. Finally, the evaluated solutions are used together with evaluated current solutions for environmental selection to form another half of the new population. The proposed strategy is integrated into two MOEAs. Our experimental results demonstrate the effectiveness of the proposed strategy on reducing FEs.
A major approach to saddle point optimization minx maxy f(x, y) is a gradient based approach as is popularized by generative adversarial networks (GANs). In contrast, we analyze an alternative approach relying only on an oracle that solves a minimization problem approximately. Our approach locates approximate solutions x′ and y′ to minx′ f(x′, y) and maxy′ f(x, y′) at a given point (x, y) and updates (x, y) toward these approximate solutions (x′, y′) with a learning rate η. On locally strong convex-concave smooth functions, we derive conditions on η to exhibit linear convergence to a local saddle point, which reveals a possible shortcoming of recently developed robust adversarial reinforcement learning algorithms. We develop a heuristic approach to adapt η derivative-free and implement zero-order and first-order minimization algorithms. Numerical experiments are conducted to show the tightness of the theoretical results as well as the usefulness of the η adaptation mechanism.
Differential MAP-Elites is a novel algorithm that combines the illumination capacity of CVT-MAP-Elites with the continuous-space optimization capacity of Differential Evolution. The algorithm is motivated by observations that illumination algorithms, and quality-diversity algorithms in general, offer qualitatively new capabilities and applications for evolutionary computation yet are in their original versions relatively unsophisticated optimizers. The basic Differential MAP-Elites algorithm, introduced for the first time here, is relatively simple in that it simply combines the operators from Differential Evolution with the map structure of CVT-MAP-Elites. Experiments based on 25 numerical optimization problems suggest that Differential MAP-Elites clearly outperforms CVT-MAP-Elites, finding better-quality and more diverse solutions.
In this study, we analyze behaviours of the well-known CMA-ES by extracting the time-series features on its dynamic strategy parameters. An extensive experiment was conducted on twelve CMA-ES variants and 24 test problems taken from the BBOB (Black-Box Optimization Bench-marking) testbed, where we used two different cutoff times to stop those variants. We utilized the tsfresh package for extracting the features and performed the feature selection procedure using the Boruta algorithm, resulting in 32 features to distinguish either CMA-ES variants or the problems. After measuring the number of predefined targets reached by those variants, we contrive to predict those measured values on each test problem using the feature. From our analysis, we saw that the features can classify the CMA-ES variants, or the function groups decently, and show a potential for predicting the performance of those variants. We conducted a hierarchical clustering analysis on the test problems and noticed a drastic change in the clustering outcome when comparing the longer cutoff time to the shorter one, indicating a huge change in search behaviour of the algorithm. In general, we found that with longer time series, the predictive power of the time series features increase.
In this paper, we investigate a non-elitist Evolution Strategy designed to handle black-box constraints by an adaptive Augmented Lagrangian penalty approach, AL-(μ/μw, λ)-CMA-ES, on problems with up to 28 constraints. Based on stability and performance observations, we propose an improved default parameter setting. We exhibit failure cases of the Augmented Lagrangian technique and show how surrogate modeling of the constraints can overcome some difficulties. Several variants of AL-CMA-ES are compared on a set of nonlinear constrained problems from the literature. Simple adaptive penalty techniques serve as a baseline for comparison.
Surrogate regression models have been shown as a valuable technique in evolutionary optimization to save evaluations of expensive black-box objective functions. Each surrogate modelling method has two complementary components: the employed model and the control of when to evaluate the model and when the true objective function, aka evolution control. They are often tightly interconnected, which causes difficulties in understanding the impact of each component on the algorithm performance. To contribute to such understanding, we analyse what constitutes the evolution control of three surrogate-assisted versions of the state-of-the-art algorithm for continuous black-box optimization --- the Covariance Matrix Adaptation Evolution Strategy. We implement and empirically compare all possible combinations of the regression models employed in those methods with the three evolution controls encountered in them. An experimental investigation of all those combinations allowed us to asses the influence of the models and their evolution control separately. The experiments are performed on the noiseless and noisy benchmarks of the Comparing-Continuous-Optimisers platform and a real-world simulation benchmark, all in the expensive scenario, where only a small budget of evaluations is available.
An evolution strategy design is presented that allows for an evolution on general quadratic manifolds. That is, it covers elliptic, parabolic, and hyperbolic equality constraints. The peculiarity of the presented algorithm design is that it is an interior point method. It evaluates the objective function only for feasible search parameter vectors and it evolves itself on the nonlinear constraint manifold. This is achieved by a closed form transformation of an individual's parameter vector, which is in contrast to iterative repair mechanisms. Results of different experiments are presented. A test problem consisting of a spherical objective function and a single hyperbolic/parabolic equality constraint is used. It is designed to be scalable in the dimension and it is used to compare the performance of the developed algorithm with other optimization methods supporting constraints. The experiments show the effectiveness of the proposed algorithm on the considered problems.
Although exploratory landscape analysis (ELA) has shown its effectiveness in various applications, most previous studies focused only on low- and moderate-dimensional problems. Thus, little is known about the scalability of the ELA approach for large-scale optimization. In this context, first, this paper analyzes the computational cost of features in the flacco package. Our results reveal that two important feature classes (ela_level and ela_meta) cannot be applied to large-scale optimization due to their high computational cost. To improve the scalability of the ELA approach, this paper proposes a dimensionality reduction framework that computes features in a reduced lower-dimensional space than the original solution space. We demonstrate that the proposed framework can drastically reduce the computation time of ela_level and ela_meta for large dimensions. In addition, the proposed framework can make the cell-mapping feature classes scalable for large-scale optimization. Our results also show that features computed by the proposed framework are beneficial for predicting the high-level properties of the 24 large-scale BBOB functions.
In practise, it is often desirable to provide the decision-maker with a rich set of diverse solutions of decent quality instead of just a single solution. In this paper we study evolutionary diversity optimization for the knapsack problem (KP). Our goal is to evolve a population of solutions that all have a profit of at least (1 - ε) · OPT, where OPT is the value of an optimal solution. Furthermore, they should differ in structure with respect to an entropy-based diversity measure. To this end we propose a simple (μ + 1)-EA with initial approximate solutions calculated by a well-known FPTAS for the KP. We investigate the effect of different standard mutation operators and introduce biased mutation and crossover which puts strong probability on flipping bits of low and/or high frequency within the population. An experimental study on different instances and settings shows that the proposed mutation operators in most cases perform slightly inferior in the long term, but show strong benefits if the number of function evaluations is severely limited.
In the last few decades, evolutionary algorithms were successfully applied numerous times for creating Boolean functions with good cryptographic properties. Still, the applicability of such approaches was always limited as the cryptographic community knows how to construct suitable Boolean functions with deterministic algebraic constructions. Thus, evolutionary results so far helped to increase the confidence that evolutionary techniques have a role in cryptography, but at the same time, the results themselves were seldom used.
This paper considers a novel problem using evolutionary algorithms to improve Boolean functions obtained through algebraic constructions. To this end, we consider a recent generalization of Hidden Weight Boolean Function construction, and we show that evolutionary algorithms can significantly improve the cryptographic properties of the functions. Our results show that the genetic algorithm performs by far the best of all the considered algorithms and improves the nonlinearity property in all Boolean function sizes. As there are no known algebraic techniques to reach the same goal, we consider this application a step forward in accepting evolutionary algorithms as a powerful tool in the cryptography domain.
Generating diverse populations of high quality solutions has gained interest as a promising extension to the traditional optimization tasks. We contribute to this line of research by studying evolutionary diversity optimization for two of the most prominent permutation problems, namely the Traveling Salesperson Problem (TSP) and Quadratic Assignment Problem (QAP). We explore the worst-case performance of a simple mutation-only evolutionary algorithm with different mutation operators, using an established diversity measure. Theoretical results show most mutation operators for both problems ensure production of maximally diverse populations of sufficiently small size within cubic expected run-time. We perform experiments on QAPLIB instances in unconstrained and constrained settings, and reveal much more optimistic practical performances. Our results should serve as a baseline for future studies.
We propose a novel surrogate-assisted Evolutionary Algorithm for solving expensive combinatorial optimization problems. We integrate a surrogate model, which is used for fitness value estimation, into a state-of-the-art P3-like variant of the Gene-Pool Optimal Mixing Algorithm (GOMEA) and adapt the resulting algorithm for solving non-binary combinatorial problems. We test the proposed algorithm on an ensemble learning problem. Ensembling several models is a common Machine Learning technique to achieve better performance. We consider ensembles of several models trained on disjoint subsets of a dataset. Finding the best dataset partitioning is naturally a combinatorial non-binary optimization problem. Fitness function evaluations can be extremely expensive if complex models, such as Deep Neural Networks, are used as learners in an ensemble. Therefore, the number of fitness function evaluations is typically limited, necessitating expensive optimization techniques. In our experiments we use five classification datasets from the OpenML-CC18 benchmark and Support-vector Machines as learners in an ensemble. The proposed algorithm demonstrates better performance than alternative approaches, including Bayesian optimization algorithms. It manages to find better solutions using just several thousand fitness function evaluations for an ensemble learning problem with up to 500 variables.
Symbolic regression aims to hypothesize a functional relationship involving explanatory variables and one or more dependent variables, based on examples of the desired input-output behavior. Genetic programming is a meta-heuristic commonly used in the literature to achieve this goal. Even though Symbolic Regression is sometimes associated with the potential of generating interpretable expressions, there is no guarantee that the returned function will not contain complicated constructs or even bloat. The Interaction-Transformation (IT) representation was recently proposed to alleviate this issue by constraining the search space to expressions following a simple and comprehensive pattern. In this paper, we resort to Simulated Annealing to search for a symbolic expression using the IT representation. Simulated Annealing exhibits an intrinsic ability to escape from poor local minima, which is demonstrated here to yield competitive results, particularly in terms of generalization, when compared with state-of-the-art Symbolic Regression techniques, that depend on population-based meta-heuristics, and committees of learning machines.
Computing diverse sets of high-quality solutions has gained increasing attention among the evolutionary computation community in recent years. It allows practitioners to choose from a set of high-quality alternatives. In this paper, we employ a population diversity measure, called the high-order entropy measure, in an evolutionary algorithm to compute a diverse set of high-quality solutions for the Traveling Salesperson Problem. In contrast to previous studies, our approach allows diversifying segments of tours containing several edges based on the entropy measure. We examine the resulting evolutionary diversity optimisation approach precisely in terms of the final set of solutions and theoretical properties. Experimental results show significant improvements compared to a recently proposed edge-based diversity optimisation approach when working with a large population of solutions or long segments.
Problem decomposition is an important part of many state-of-the-art Evolutionary Algorithms (EAs). The quality of the decomposition may be decisive for the EA effectiveness and efficiency. Therefore, in this paper, we focus on the recent proposition of Linkage Learning based on Local Optimization (3LO). 3LO is an empirical linkage learning (ELL) technique and is proven never to report the false linkage. False linkage is one of the possible linkage defects and occurs when linkage marks two independent genes as a dependent. Although thanks to the problem decomposition quality, the use of 3LO may lead to excellent results, its main disadvantage is its high computational cost. This disadvantage makes 3LO not applicable to state-of-the-art EAs that originally employed Statistical-based Linkage Learning (SLL) and frequently update the linkage information. Therefore, we propose the Direct Linkage Empirical Discovery technique (DLED) that preserves 3LO advantages, reduces its costs, and we prove that it is precise in recognizing the direct linkage. The concept of direct linkage, which we identify in this paper, is related to the quality of the decomposition of overlapping problems. The results show that incorporating DLED in three significantly different state-of-the-art EAs may lead to promising results.
In this paper we address the Euclidean Steiner tree problem in the plane in the presence of soft and solid polygonal obstacles. The Euclidean Steiner tree problem is a well-known NP-hard problem with different applications in network design. Given a set of terminal nodes in the plane the aim is to find a shortest-length interconnection of the terminals allowing further nodes, so-called Steiner points, to be added. In many real-life scenarios there are further constraints that need to be considered. Regions in the plane that cannot be traversed or can only be traversed at a higher cost can be approximated by polygonal areas that either need to be avoided (solid obstacles) or come with a higher cost of traversing (soft obstacles). We propose a genetic algorithm that uses problem-specific representation and operators to solve this problem and show that the algorithm can solve various test scenarios of different sizes. The presented approach appears to outperform current heuristic approaches for the Steiner tree problem with soft obstacles and was evaluated on larger test instances as well.
Partition crossover (PX) is an efficient recombination operator for gray-box optimization. PX is applied in problems where the objective function can be written as a sum of subfunctions fl(.). In PX, the variable interaction graph (VIG) is decomposed by removing vertices with common variables. Parent variables are inherited together during recombination if they are part of the same connected recombining component of the decomposed VIG. A new way of generating the recombination graph is proposed here. The VIG is decomposed by removing edges associated with subfunctions fl(.) that have similar evaluation for combinations of variables inherited from the parents. By doing so, the partial evaluations of fl(.) are taken into account when decomposing the VIG. This allows the use of partition crossover in continuous optimization. Results of experiments where local optima are recombined indicate that more recombining components are found. When the proposed epsilon-PX (ePX) is compared with other recombination operators in Genetic Algorithms and Differential Evolution, better performance is obtained when the epistasis degree is low.
A parallel ensemble of Genetic Algorithms for the Traveling Salesman Problem (TSP) is proposed. Different TSP solvers perform efficiently on different instance types. However, finding the best solver for all instances is challenging. A hybrid of the Mixing Genetic Algorithm (MGA) and Edge Assembly Crossover (EAX) has been shown to perform well on hard instances. The MGA uses Generalized Partition Crossover (GPX) to find the best and worst out of 2k possible solutions, where k is a decomposition factor of two-parent tours. MGA mixes the edges without any loss of diversity in the population. The best individuals move to the top of the population. The worst individuals are filtered to the bottom of the population. Previously, MGA was applied to TSP instances with less than 4,500 vertices. In this article, various Island Model implementations of MGA are introduced to handle larger problem sizes. The island model uses two mixing policies - migration, which does not lose diversity, and replacement, which loses some population diversity. The islands are configured in two patterns - a ring and a hypercube. An ensemble running multiple versions of an hybrid of MGA and EAX algorithms yields excellent performance for problems as large as 85,900.
In Evolutionary Computation, it is informative to ask what happens when well known benchmarks and bit representations are transformed into quadratic pseudo-Boolean optimization problems. Such transforms are commonly used in quantum computing in order to reduce nonlinearity to k-bounded interactions. First we show that Gray code representations are transformed into linear encodings with quadratic constraints. Next we look at Long Path problems which are constructed so that bit flip local search requires exponential time to converge to a global or local optimum. We show that Long Path problems are similar to reflected Gray codes in both construction and complexity. Finally, a basic form of the "Needle in a haystack" problem is transformed into a problem that can be optimally solved in linear time.
In the last two decades, significant effort has been made to solve computationally expensive optimization problems using surrogate models. Regardless of whether surrogates are the primary drivers of an algorithm or improve the convergence of an existing method, most proposed concepts are rather specific and not very generalizable. Some important considerations are selecting a baseline optimization algorithm, a suitable surrogate methodology, and the surrogate's involvement in the overall algorithm design. This paper proposes a probabilistic surrogate-assisted framework (PSAF), demonstrating its applicability to a broad category of single-objective optimization methods. The framework injects knowledge from a surrogate into an existing algorithm through a tournament-based procedure and continuing the optimization run on the surrogate's predictions. The surrogate's involvement is determined by updating a replacement probability based on the accuracy from past iterations. A study of four well-known population-based optimization algorithms with and without the proposed probabilistic surrogate assistance indicates its usefulness in achieving a better convergence. The proposed framework enables the incorporation of surrogates into an existing optimization algorithm and, thus, paves the way for new surrogate-assisted algorithms dealing with challenges in less frequently addressed computationally expensive functions, such as different variable types, large dimensional problems, multiple objectives, and constraints.
Most evolutionary algorithms have parameters, which allow a great flexibility in controlling their behavior and adapting them to new problems. To achieve the best performance, it is often needed to control some of the parameters during optimization, which gave rise to various parameter control methods. In recent works, however, similar advantages have been shown, and even proven, for sampling parameter values from certain, often heavy-tailed, fixed distributions. This produced a family of algorithms currently known as "fast evolution strategies" and "fast genetic algorithms".
However, only little is known so far about the influence of these distributions on the performance of evolutionary algorithms, and about the relationships between (dynamic) parameter control and (static) parameter sampling. We contribute to the body of knowledge by presenting an algorithm that computes the optimal static distributions, which describe the mutation operator used in the well-known simple (1+λ) evolutionary algorithm on a classic benchmark problem OneMax. We show that, for large enough population sizes, such optimal distributions may be surprisingly complicated and counter-intuitive. We investigate certain properties of these distributions, and also evaluate the performance regrets of the (1 + λ) evolutionary algorithm using standard mutation operators.
Accurately predicting the performance of different optimization algorithms for previously unseen problem instances is crucial for high-performing algorithm selection and configuration techniques. In the context of numerical optimization, supervised regression approaches built on top of exploratory landscape analysis are becoming very popular. From the point of view of Machine Learning (ML), however, the approaches are often rather naïve, using default regression or classification techniques without proper investigation of the suitability of the ML tools. With this work, we bring to the attention of our community the possibility to personalize regression models to specific types of optimization problems. Instead of aiming for a single model that works well across a whole set of possibly diverse problems, our personalized regression approach acknowledges that different models may suite different types of problems. Going one step further, we also investigate the impact of selecting not a single regression model per problem, but personalized ensembles. We test our approach on predicting the performance of numerical optimization heuristics on the BBOB benchmark collection.
We consider multi-solution optimization and generative models for the generation of diverse artifacts and the discovery of novel solutions. In cases where the domain's factors of variation are unknown or too complex to encode manually, generative models can provide a learned latent space to approximate these factors. When used as a search space, however, the range and diversity of possible outputs are limited to the expressivity and generative capabilities of the learned model. We compare the output diversity of a quality diversity evolutionary search performed in two different search spaces: 1) a predefined parameterized space and 2) the latent space of a variational autoencoder model. We find that the search on an explicit parametric encoding creates more diverse artifact sets than searching the latent space. A learned model is better at interpolating between known data points than at extrapolating or expanding towards unseen examples. We recommend using a generative model's latent space primarily to measure similarity between artifacts rather than for search and generation. Whenever a parametric encoding is obtainable, it should be preferred over a learned representation as it produces a higher diversity of solutions.
Automated algorithm selection and configuration methods that build on exploratory landscape analysis (ELA) are becoming very popular in Evolutionary Computation. However, despite a significantly growing number of applications, the underlying machine learning models are often chosen in an ad-hoc manner.
We show in this work that three classical regression methods are able to achieve meaningful results for ELA-based algorithm selection. For those three models - random forests, decision trees, and bagging decision trees - the quality of the regression models is highly impacted by the chosen hyper-parameters. This has significant effects also on the quality of the algorithm selectors that are built on top of these regressions.
By comparing a total number of 30 different models, each coupled with 2 complementary regression strategies, we derive guidelines for the tuning of the regression models and provide general recommendations for a more systematic use of classical machine learning models in landscape-aware algorithm selection. We point out that a choice of the machine learning model merits to be carefully undertaken and further investigated.
We handle min-max black-box optimization problems in which the scenario variable to be maximized is discrete and the design variable to be minimized is a continuous vector. To reduce the number of objective function calls, which are assumed to be computationally expensive, we propose an approach that samples a subset of scenarios at each iteration to approximate the worst-case objective function and apply the covariance matrix adaptation evolution strategy to the approximated worst-case objective function. In addition, we develop an adaptation mechanism for the probability of sampling each scenario. Moreover, we introduce the notion of support scenarios to characterize min-max optimization problems with discrete scenario variables and design test problems with various characteristics of support scenarios. Empirical evaluations reveal that the proposed approach learns to sample the set of support scenarios, being more efficient than sampling all the scenarios, especially when the available scenarios outnumber the support scenarios.
This paper proposes a method to improve the CPU utilization of parallel differential evolution (PDE) by incorporating the interleaving generation mechanism. Previous research proposed the interleaving generation evolutionary algorithm (IGEA) and its improved variants (iIGEA). IGEA reduces the computation time by generating new offspring, which parents have been determined even when all individuals have not evaluated. However, the previous research only used a simple EA method, which is not suitable for practical use. For this issue, this paper explores the applicability of IGEA and iIGEA to practical EA methods. In particular, we choose differential evolution (DE), which is widely used in real-world applications, and propose IGDE and its improved variant, iIGDE. We conduct experiments to investigate the effectiveness of IGDE with several features of the evaluation time on a simulated parallel computing environment. The experimental results reveal that the IGDE variants have higher CPU utilization than a simple PDE and reduce the computation time required for optimization. Besides, iIGDE outperforms the original IGDE for all features of the evaluation time.
The evolution of advanced persistent threats (APTs) spurs us to explore computational models of coevolutionary dynamics arising from efforts to secure cyber systems from them. In a first for evolutionary algorithms, we incorporate known threats and vulnerabilities into a stylized "competition" that pits cyber attack patterns against mitigations. Variations of attack patterns that are drawn from the public CAPEC catalog offering Common Attack Pattern Enumeration and Classifications. Mitigations take two forms: software updates or monitoring, and the software that is mitigated is identified by drawing from the public CVE dictionary of Common Vulnerabilities and Exposures. In another first, we quantify the outcome of a competition by incorporating the public Common Vulnerability Scoring System - CVSS. We align three abstract models of population-level dynamics where APTs interact with defenses with three competitive, coevolutionary algorithm variants that use the competition. A comparative study shows that the way a defensive population preferentially acts, e.g. shifting to mitigating recent attack patterns, results in different evolutionary outcomes, expressed as different dominant attack patterns and mitigations.
In the domain of partial classification, recent studies about multiobjective local search (MOLS) have led to new algorithms offering high performance, particularly when the data are imbalanced. In the presence of such data, the class distribution is highly skewed and the user is often interested in the least frequent class. Making further improvements certainly requires exploiting complementary solving techniques (notably, for the rule mining problem). As Constraint Programming (CP) has been shown to be effective on various combinatorial problems, it is one such promising complementary approach. In this paper, we propose a new hybrid combination, based on MOLS and CP that are quite orthogonal. Indeed, CP is a complete approach based on powerful filtering techniques whereas MOLS is an incomplete approach based on Pareto dominance. Experimental results on real imbalanced datasets show that our hybrid approach is statistically more efficient than a simple MOLS algorithm on both training and tests instances, in particular, on partial classification problems containing many attributes.
As computing power grows, the automated specialization and design of evolutionary algorithms (EAs) to tune their performance to individual problem classes becomes more attractive. To this end, a significant amount of research has been conducted in recent decades, utilizing a wide range of techniques. However, few techniques have been devised which automate the design of the overall structure of an EA. Most EA implementations rely solely on the traditional evolutionary cycle of parent selection, reproduction, and survival selection, and those with unique structures typically replace this with another static, hand-made cycle. Existing techniques for modifying the evolutionary structure use representations which are either loosely structured and highly stochastic, or which are constrained and unable to easily evolve complicated pathways. The ability to easily evolve complex evolutionary pathways would greatly expand the heuristic space which can be explored during specialization, potentially allowing for the representation of EAs which outperform the traditional cycle. This work proposes a methodology for the automated design of the evolutionary process by introducing a novel directed-graph-based representation, created to be mutable and flexible, permitting a black-box designer to produce reusable, high-performance EAs. Experiments show that our methodology can produce high-performance EAs demonstrating intelligible strategies.
Many real-world applications have the time-linkage property, and the only theoretical analysis is recently given by Zheng, et al. (TEVC 2021) on their proposed time-linkage OneMax problem, OneMax(0,1n). However, only two elitist algorithms (1 + 1) EA and (μ + 1) EA are analyzed, and it is unknown whether the non-elitism mechanism could help to escape the local optima existed in OneMax(0,1n). In general, there are few theoretical results on the benefits of the non-elitism in evolutionary algorithms. In this work, we analyze on the influence of the non-elitism via comparing the performance of the elitist (1 + λ) EA and its non-elitist counterpart (1, λ) EA. We prove that with probability 1 - o(1) (1 + λ) EA will get stuck in the local optima and cannot find the global optimum, but with probability 1, (1, λ) EA can reach the global optimum and its expected runtime is O(n3+c log n) with [EQUATION] for the constant c ≥ 1. Noting that a smaller offspring size is helpful for escaping from the local optima, we further resort to the compact genetic algorithm where only two individuals are sampled to update the probabilistic model, and prove its expected runtime of O(n3 log n). Our computational experiments also verify the efficiency of the two non-elitist algorithms.
In explainable AI, one aspect of a prediction's explanation is to measure each predictor's importance to the decision process. The importance can measure how much variation a predictor promotes locally or how much the predictor contributes to the deviation from a reference point (Shapley value). If we have the ground truth analytical model, we can calculate the former using the Partial Effect, calculated as the predictor's partial derivative. Also, we can estimate the latter by calculating the average partial effect multiplied by the difference between the predictor and the reference value. Symbolic Regression is a gray-box model for regression problems that returns an analytical model approximating the input data. Although it is often associated with interpretability, few works explore this property. This paper will investigate the use of Partial Effect with the analytical models generated by the Interaction-Transformation Evolutionary Algorithm symbolic regressor (ITEA). We show that the regression models returned by ITEA coupled with Partial Effect provide the closest explanations to the ground-truth and a close approximation to Shapley values. These results open up new opportunities to explain symbolic regression models compared to the approximations provided by model-agnostic approaches.
Uncertain Capacitated Arc Routing Problem (UCARP) is an NP-hard optimisation problem with many applications in logistics domains. Genetic Programming (GP) is capable of evolving routing policies to handle the uncertain environment of UCARP. There are many different but related UCARP domains in the real world to be solved (e.g. winter gritting and waste collection for different cities). Instead of training a routing policy for each of them, we can use the multi-task learning paradigm to improve the training effectiveness by sharing the common knowledge among the related UCARP domains. Previous studies showed that GP population for solving UCARP loses diversity during its evolution, which decreases the effectiveness of knowledge sharing. To address this issue, in this work we propose a novel multi-task GP approach that takes the uniqueness of transferable knowledge, as well as its quality, into consideration. Additionally, the transferred knowledge is utilised in a manner that improves diversity. We investigated the performance of the proposed method with several experimental studies and demonstrated that the designed knowledge transfer mechanism can significantly improve the performance of GP for solving UCARP.
Genetic Programming (GP) is known to suffer from the burden of being computationally expensive by design. While, over the years, many techniques have been developed to mitigate this issue, data vectorization, in particular, is arguably still the most attractive strategy due to the parallel nature of GP. In this work, we employ a series of benchmarks meant to compare both the performance and evolution capabilities of different vectorized and iterative implementation approaches across several existing frameworks. Namely, TensorGP, a novel open-source engine written in Python, is shown to greatly benefit from the TensorFlow library to accelerate the domain evaluation phase in GP. The presented performance benchmarks demonstrate that the TensorGP engine manages to pull ahead, with relative speedups above two orders of magnitude for problems with a higher number of fitness cases. Additionally, as a consequence of being able to compute larger domains, we argue that TensorGP performance gains aid the discovery of more accurate candidate solutions.
The Zoetrope Genetic Programming (ZGP) algorithm is based on an original representation for mathematical expressions, targeting evolutionary symbolic regression. The zoetropic representation uses repeated fusion operations between partial expressions, starting from the terminal set. Repeated fusions within an individual gradually generate more complex expressions, ending up in what can be viewed as new features. These features are then linearly combined to best fit the training data. ZGP individuals then undergo specific crossover and mutation operators, and selection takes place between parents and offspring. ZGP is validated using a large number of public domain regression datasets, and compared to other symbolic regression algorithms, as well as to traditional machine learning algorithms. ZGP reaches state-of-the-art performance with respect to both types of algorithms, and demonstrates a low computational time compared to other symbolic regression approaches.
For the past six years, researchers in genetic programming and other program synthesis disciplines have used the General Program Synthesis Benchmark Suite to benchmark many aspects of automatic program synthesis systems. These problems have been used to make notable progress toward the goal of general program synthesis: automatically creating the types of software that human programmers code. Many of the systems that have attempted the problems in the original benchmark suite have used it to demonstrate performance improvements granted through new techniques. Over time, the suite has gradually become outdated, hindering the accurate measurement of further improvements. The field needs a new set of more difficult benchmark problems to move beyond what was previously possible.
In this paper, we describe the 25 new general program synthesis benchmark problems that make up PSB2, a new benchmark suite. These problems are curated from a variety of sources, including programming katas and college courses. We selected these problems to be more difficult than those in the original suite, and give results using PushGP showing this increase in difficulty. These new problems give plenty of room for improvement, pointing the way for the next six or more years of general program synthesis research.
We investigate the use of Genetic Programming (GP) as a convolutional predictor for missing pixels in images. The training phase is performed by sweeping a sliding window over an image, where the pixels on the border represent the inputs of a GP tree. The output of the tree is taken as the predicted value for the central pixel. We consider two topologies for the sliding window, namely the Moore and the Von Neumann neighborhood. The best GP tree scoring the lowest prediction error over the training set is then used to predict the pixels in the test set. We experimentally assess our approach through two experiments. In the first one, we train a GP tree over a subset of 1000 complete images from the MNIST dataset. The results show that GP can learn the distribution of the pixels with respect to a simple baseline predictor, with no significant differences observed between the two neighborhoods. In the second experiment, we train a GP convolutional predictor on two degraded images, removing around 20% of their pixels. In this case, we observe that the Moore neighborhood works better, although the Von Neumann neighborhood allows for a larger training set.
DNA microarray data contain valuable biological information, which makes it of great importance in disease analysis and cancer diagnosis. However, the classification of microarray data is still a challenging task because of the high dimension and small sample size, especially for multiclass data accompanied by class imbalance. In this paper, we propose a cooperative coevolutionary multiobjective genetic programming (CC-MOGP) for microarray data classification. It converts a multiclass problem into a set of tractable binary problems and coevolves the corresponding population. And a cooperative coevolutionary Pareto archived evolution strategy (CC-PAES) is employed to approximate the Pareto front. During this procedure, we propose a synergy test method to assist in guiding the coevolution between populations. Experimental results on 8 multiclass microarray data show that CC-MOGP can obtain competitive prediction accuracy compared with several state-of-art evolutionary computation and traditional methods.
In the multi-class classification problem GP plays an important role when combined with other non-GP classifiers. However, when GP performs the actual classification (without relying on other classifiers) its classification accuracy is low. This is especially true when the number of classes is high. In this paper, we present DTC, a GP classifier that leverages the effectiveness of the dynamic target approach to evolve a set of discriminant functions (one for each class). Notably, DTC is the first GP classifier that defines the fitness of individuals by using the synergistic combination of linear scaling and the hinge-loss function (commonly used by SVM). Differently, most previous GP classifiers use the number of correct classifications to drive the evolution. We compare DTC with eight state-of-art multi-class classification techniques (e.g., RF, RS, MLP, and SVM) on eight popular datasets. The results show that DTC achieves competitive classification accuracy even with 15 classes, without relying on other classifiers.
The generalizability of programs synthesized by genetic programming (GP) to unseen test cases is one of the main challenges of GP-based program synthesis. Recent work showed that increasing the amount of training data improves the generalizability of the programs synthesized by GP. However, generating training data is usually an expensive task as the output value for every training case must be calculated manually by the user. Therefore, this work suggests an approximation of the expected generalization ability of solution candidates found by GP. To obtain candidate solutions that all solve the training cases, but are structurally different, a GP run is not stopped after the first solution is found that solves all training instances but search continues for more generations. For all found candidate solutions (solving all training cases), we calculate the behavioral vector for a set of randomly generated additional inputs. The proportion of the number of different found candidate solutions generating the same behavioral vector with highest frequency compared to all other found candidate solutions with different behavior can serve as an approximation for the generalizability of the found solutions. The paper presents experimental results for a number of standard program synthesis problems confirming the high prediction accuracy.
Learning ensembles by bagging can substantially improve the generalization performance of low-bias, high-variance estimators, including those evolved by Genetic Programming (GP). To be efficient, modern GP algorithms for evolving (bagging) ensembles typically rely on several (often inter-connected) mechanisms and respective hyper-parameters, ultimately compromising ease of use. In this paper, we provide experimental evidence that such complexity might not be warranted. We show that minor changes to fitness evaluation and selection are sufficient to make a simple and otherwise-traditional GP algorithm evolve ensembles efficiently. The key to our proposal is to exploit the way bagging works to compute, for each individual in the population, multiple fitness values (instead of one) at a cost that is only marginally higher than the one of a normal fitness evaluation. Experimental comparisons on classification and regression tasks taken and reproduced from prior studies show that our algorithm fares very well against state-of-the-art ensemble and non-ensemble GP algorithms. We further provide insights into the proposed approach by (i) scaling the ensemble size, (ii) ablating the changes to selection, (iii) observing the evolvability induced by traditional subtree variation.
Code: https://github.com/marcovirgolin/2SEGP.
Recent research on genotype-phenotype (G-P) maps in natural evolution has contributed significantly to our understanding of neutrality, redundancy, robustness, and evolvability. Here we investigate the properties of the digital logic gate G-P map and show that this map shares many of the common properties of natural G-P maps, with the exception of a positive relationship between evolvability and robustness. Our results show that in some cases robustness and evolvability may be negatively related as a result of the method used to approximate evolvability. We give two definitions of circuit complexity and empirically show that these two definitions are closely related. This study leads to a better understanding of the relationships between redundancy, robustness, and evolvability in genotype-phenotype maps. We further investigate these results in the context of complexity and show the relationships between phenotypic complexity and phenotypic redundancy, robustness and evolvability.
The diversity between individual learners in an ensemble is known to influence its performance. However, there is no standard agreement on how diversity should be defined, and thus how to exploit it to construct a high-performing classifier. We propose two new behavioural diversity metrics based on the divergence of errors between models. Following a neuroevolution approach, these metrics are then used to guide a novelty search algorithm to search a space of neural architectures and discover behaviourally diverse classifiers, iteratively adding the models with high diversity score to an ensemble. The parameters of each ANN are tuned individually with a standard gradient descent procedure. We test our approach on three benchmark datasets from Computer Vision --- CIFAR-10, CIFAR-100, and SVHN --- and find that the ensembles generated significantly outperform ensembles created without explicitly searching for diversity and that the error diversity metrics we propose lead to better results than others in the literature. We conclude that our empirical results signpost an improved approach to promoting diversity in ensemble learning, identifying what sort of diversity is most relevant and proposing an algorithm that explicitly searches for it without selecting for accuracy.
We demonstrate the training of Spiking Neural Networks (SNN) in a novel multi-agent Evolutionary Robotics (ER) framework inspired by competitive evolutionary environments in nature. The topology of a SNN along with morphological parameters of the bot it controls in the ER environment is together treated as a phenotype. Rules of the framework select certain bots and their SNNs for reproduction and others for elimination based on their efficacy in capturing food in a competitive environment. While the bots and their SNNs are not explicitly trained to survive or reproduce using loss functions, these drives emerge implicitly as they evolve to hunt food and survive. Their efficiency in capturing food exhibits the evolutionary signature of punctuated equilibrium. We use this signature to compare the performances of two evolutionary inheritance algorithms on the phenotypes, Mutation and Crossover with Mutation, using ensembles of 100 experiments for each algorithm. We find that Crossover with Mutation promotes 40% faster learning in the SNN than mere Mutation with a statistically significant margin.
Quality-Diversity optimization algorithms such as MAP-Elites, aim to generate collections of both diverse and high-performing solutions to an optimization problem. MAP-Elites has shown promising results in a variety of applications. In particular in evolutionary robotics tasks targeting the generation of behavioral repertoires that highlight the versatility of robots. However, for most robotics applications MAP-Elites is limited to using simple open-loop or low-dimensional controllers. Here we present Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites), a novel algorithm that enables MAP-Elites to efficiently evolve large neural network controllers by introducing a gradient-based variation operator inspired by Deep Reinforcement Learning. This operator leverages gradient estimates obtained from a critic neural network to rapidly find higher-performing solutions and is paired with a traditional genetic variation to maintain a divergent search behavior. The synergy of these operators makes PGA-MAP-Elites an efficient yet powerful algorithm for finding diverse and high-performing behaviors. We evaluate our method on four different tasks for building behavioral repertoires that use deep neural network controllers. The results show that PGA-MAP-Elites significantly improves the quality of the generated repertoires compared to existing methods.
Neural Architecture Search (NAS) is the name given to a set of methods designed to automatically configure the layout of neural networks. Their success on Convolutional Neural Networks inspired its use on optimizing other types of neural network architectures, including Graph Neural Networks (GNNs). GNNs have been extensively applied over several collections of real-world data, achieving state-of-the-art results in tasks such as circuit design, molecular structure generation and anomaly detection. Many GNN models have been recently proposed, and choosing the best model for each problem has become a cumbersome and error-prone task. Aiming to alleviate this problem, recent works have proposed strategies for applying NAS to GNN models. However, different search methods converge relatively fast in the search for a good architecture, which raises questions about the structure of the problem. In this work we use Fitness Landscape Analysis (FLA) measures to characterize the search space explored by NAS methods for GNNs. We sample almost 90k different architectures that cover most of the fitness range, and represent them using both a one-hot encoding and an embedding representation. Results of the fitness distance correlation and dispersion metrics show the fitness landscape is easy to be explored, and presents low neutrality.
Neural networks with temporal characteristics such as asynchronous spiking have made much progress towards biologically plausible artificial intelligence. However, genetic approaches for evolving network structures in this area are still relatively unexplored. In this paper, we examine a specific variant of time-dependent spiking neural networks (NN) in which the spatial and temporal relationships between neurons affect output. First, we built and customized a standard NN implementation to more closely model the time-delay characteristics of biological neurons. Next, we tested this with simulated tasks such as food foraging and image recognition, demonstrating success in multiple domains. We then developed a genetic representation for the network that allows for both scalable network size and compatibility with genetic crossover operations. Finally, we analyzed the effects of genetic crossover algorithms compared to random mutations on the food foraging task. Results showed that crossover operations based on node usage converge on a local maximum more quickly than random mutations, but suffer from genetic defects that reduce overall population performance.
Generalization to out-of-distribution (OOD) circumstances after training remains a challenge for artificial agents. To improve the robustness displayed by plastic Hebbian neural networks, we evolve a set of Hebbian learning rules, where multiple connections are assigned to a single rule. Inspired by the biological phenomenon of the genomic bottleneck, we show that by allowing multiple connections in the network to share the same local learning rule, it is possible to drastically reduce the number of trainable parameters, while obtaining a more robust agent. During evolution, by iteratively using simple K-Means clustering to combine rules, our Evolve & Merge approach is able to reduce the number of trainable parameters from 61,440 to 1,920, while at the same time improving robustness, all without increasing the number of generations used. While optimization of the agents is done on a standard quadruped robot morphology, we evaluate the agents' performances on slight morphology modifications in a total of 30 unseen morphologies. Our results add to the discussion on generalization, overfitting and OOD adaptation. To create agents that can adapt to a wider array of unexpected situations, Hebbian learning combined with a regularising "genomic bottleneck" could be a promising research direction.
Neuroevolution is an alternative to gradient-based optimisation that has the potential to avoid local minima and allows parallelisation. The main limiting factor is that usually it does not scale well with parameter space dimensionality. Inspired by recent work examining neural network intrinsic dimension and loss landscapes, we hypothesise that there exists a low-dimensional manifold, embedded in the policy network parameter space, around which a high-density of diverse and useful policies are located. This paper proposes a novel method for diversity-based policy search via Neuroevolution, that leverages learned representations of the policy network parameters, by performing policy search in this learned representation space. Our method relies on the Quality-Diversity (QD) framework which provides a principled approach to policy search, and maintains a collection of diverse policies, used as a dataset for learning policy representations. Further, we use the Jacobian of the inverse-mapping function to guide the search in the representation space. This ensures that the generated samples remain in the high-density regions, after mapping back to the original space. Finally, we evaluate our contributions on four continuous-control tasks in simulated environments, and compare to diversity-based baselines.
Previous evolution based architecture search require high computational resources resulting in large search time. In this work, we propose a novel way of applying a simple genetic algorithm to the neural architecture search problem called EvNAS (Evolving Neural Architecture using One Shot Model) which reduces the search time significantly while still achieving better result than previous evolution based methods. The architectures are represented by architecture parameter of one shot model which results in the weight sharing among the given population of architectures and also weight inheritance from one generation to the next generation of architectures. We use the accuracy of partially trained architecture on validation data as a prediction of its fitness to reduce the search time. We also propose a decoding technique for the architecture parameter which is used to divert majority of the gradient information towards the given architecture and is also used for improving the fitness prediction of the given architecture from the one shot model during the search process. EvNAS searches for architecture on CIFAR-10 for 3.83 GPU day on a single GPU with top-1 test error 2.47%, which is then transferred to CIFAR-100 and ImageNet achieving top-1 error 16.37% and top-5 error 7.4% respectively.
A major limitation to the optimization of artificial neural networks (ANN) with evolutionary methods lies in the high dimensionality of the search space, the number of weights growing quadratically with the size of the network. This leads to expensive training costs, especially in evolution strategies which rely on matrices whose sizes grow with the number of genes. We introduce a geometric encoding for neural network evolution (GENE) as a representation of ANN parameters in a smaller space that scales linearly with the number of neurons, allowing for efficient parameter search. Each neuron of the network is encoded as a point in a latent space and the weight of a connection between two neurons is computed as the distance between them. The coordinates of all neurons are then optimized with evolution strategies in a reduced search space while not limiting network fitness and possibly improving search.
The core networks of current telecommunication infrastructures are typically engineered as multi-layer networks. The uppermost layer is defined by the virtual topology, which determines the logical connections between core network routers. This topology is realized by optical paths in the lower layer, which is defined by optical fiber connections between network nodes. Minimizing the hardware cost incurred by these optical paths for a given set of traffic demands is a common combinatorial optimization problem in network planning, often approached by Mixed-Integer Linear Programming. However, increasing network densities and the introduction of additional constraints will impact tractability of future network problems.
In order to provide a more scalable method, we suggest a Genetic Algorithm-based approach that optimizes the virtual topology and subsequently derives the remaining parameters from it. Our genetic encoding utilizes a combination of spanning trees and augmentation links to quickly form meaningful topologies. We compare the results of our approach to known linear programming solutions in simple scenarios and to a competing heuristic based on Simulated Annealing in large-scale problems.
Traffic congestion is a major issue that can be solved by suggesting drivers alternative routes they are willing to take. This concept has been formalized as a strategic routing problem in which a single alternative route is suggested to an existing one. We extend this formalization and introduce the Multiple-Routes problem, which is given a start and destination and aims at finding up to n different routes that the drivers strategically disperse over, minimizing the overall travel time of the system.
Due to the NP-hard nature of the problem, we introduce the Multiple-Routes evolutionary algorithm (MREA) as a heuristic solver. We study several mutation and crossover operators and evaluate them on real-world data of Berlin, Germany. We find that a combination of all operators yields the best result, improving the overall travel time by a factor between 1.8 and 3, in the median, compared to all drivers taking the fastest route. For the base case n = 2, we compare our MREA to the highly tailored optimal solver by Bläsius et al. [ATMOS 2020] and show that, in the median, our approach finds solutions of quality at least 99.69% of an optimal solution while only requiring 40 % of the time.
Decision trees (DTs) are popular techniques in the field of eXplainable Artificial Intelligence. Despite their effectiveness in solving various classification problems, they are not compatible with modern biological data generated with high-throughput technologies. This work aims to combine evolutionary induced DT with a recently developed concept designed directly for gene expression data, called Relative eXpression Analysis (RXA). We propose a new solution, termed Evolutionary Heterogeneous Decision Tree (EvoHDTree), which uses both classical univariate and bivariate tests that focus on the relative ordering and weight relationships between the genes in the splitting nodes. The search for the decision tree structure, node representation, and splits is performed globally by the evolutionary algorithm.
To meet the huge computational demands, we enriched our solution with more than a dozen specialized variants of recombination operators, GPU-computed local search components, OpenMP parallelization, and built-in gene ranking to improve evolutionary convergence. Experiments performed on cancer-related gene expression-based data show that the proposed solution finds accurate and much simpler interactions between genes. Importantly, the patterns discovered by EvoHDTree are easy to understand and to some extent supported by biological evidence in the literature.
In the field of 3D Human Performance Capture, a high-quality 3D scan of the performer is rigged and skinned to an animatable 3D template mesh that is subsequently fitted to the captured performance's RGB-D data. Template fitting is accomplished via solving for the template's pose parameters that better explain the performance data at each recorded frame. In this paper, we challenge open implementations of zeroth-order optimizers to solve the template fitting problem in a human performance capture dataset. The objective function that we employ approximates, the otherwise costly to evaluate, 3D RMS hausdorff distance between the animated template and the 3D mesh reconstructed from the depth data (target mesh) at an individual recorded frame. We distinguish and benchmark the optimizers, in three different real-world scenarios, two of which are based on the geometric proximity of the template to the target in individual frames, while in the third one we fit the template sequentially to all target frames of the recorded sequence. Conclusions of this work can serve as a reference for future optimizer implementations and our findings can server as a baseline for future multi-objective optimization approaches. We make part of our benchmark and experiment setup publicly available (https://github.com/VCL3D/nevergrad, https://github.com/VCL3D/PerformanceCapture/releases/).
Portfolio optimization is a control problem whose objective is to find the optimal strategy for the process of selecting the proportions of assets that can provide the maximum return. Conventional approaches formulate the problem as a single Markov decision process and apply reinforcement learning methods to provide solutions. However, it is well known that financial markets involve non-stationary processes, leading to violations of this assumption in these methods. In this work, we reformulate the portfolio optimization problem to deal with the non-stationary nature of financial markets. In our approach, we divide a long-term process into multiple short-term processes to adapt to context changes and consider the portfolio optimization problem as a multitask control problem. Thereafter, we propose an evolutionary meta reinforcement learning approach to search for an initial policy that can quickly adapt to the upcoming target tasks. We model the policies as convolutional networks that can score the match of the patterns in market data charts. Finally, we test our approach using real-world cryptographic currency data and show that it adapts well to the changes in the market and leads to better profitability.
Optimal transmission switching (OTS) is a new practice in power systems and can improve the economics of electric power systems integrated with renewable resources such as wind. In OTS modeling binary decision variables are added to the optimal power flow (OPF) problem to represent on and off switching status of lines. This extension to alternative current optimal power flow (ACOPF) problem results in a mixed integer nonlinear program (MINLP) which is not guaranteed to be solved optimally by existing solution methods and also requires excessive computation times for large real systems. In this paper we develop a genetic algorithm (GA) for ACOPF based OTS problem. In our GA approach we benefit from the structure of power transmission network and develop a line scoring method and a graphical distance based local improvement technique to better search the solution space. We compare our proposed genetic algorithm with two greedy heuristics on test power systems with renewable resources of energy. The results show that our proposed approach finds more economic solutions especially in larger power systems.
Primer sets are short DNA sequences of 18-22 base pairs, that can be used to verify the presence of a virus, and designed to attach to a specific part of a viral DNA. Designing a primer set requires choosing a region of DNA, avoiding the possibility of hybridization to a similar sequence, as well as considering its GC content and Tm (melting temperature). Coronaviruses, such as SARS-CoV-2, have a considerably large genome (around 30 thousand nucleotides) when compared to other viruses. With the rapid rise and spread of SARS-CoV-2 variants, it has become a priority to breach our lack of specific primers available for diagnosis of this new variants. Here, we propose an evolutionary-based approach to primer design, able to rapidly deliver a high-quality primer set for a target sequence of the virus variant. Starting from viral sequences collected from open repositories, the proposed approach is proven able to uncover a specific primer set for the B.1.1.7 SARS-CoV-2 variant. Only recently identified, B.1.1.7 is already considered potentially dangerous, as it presents a considerably higher transmissibility when compared to other variants.
The automatic itinerary planning service requires to generate multiple-day schedules automatically under user-specified POIs and constraints. Knowing as an NP-Hard optimization problem, the task is commonly solved by (meta-)heuristic algorithms such as the genetic algorithms (GAs). However, considering the concurrent requests received by a web server in practice, the time efficiency of the existing itinerary planners can be rather unsatisfactory. To address the issue, this paper proposes a computational approach that hybridizing a GA with the reinforcement learning (RL) technology. The benefit is that we no longer need to re-execute the GA for each new request arrived. Instead, the approach keeps the historical solutions in track and maintains a RL agent to sequentially decide how to handle each new request. Experimental results show that the proposed approach is able to stably provide high-quality solutions, while greatly reducing the average time overhead of the web server.
Networking devices deployed in ultra-scale data centers must run perfectly and in real-time. The networking device performance is tuned using the device configuration registers. The optimal configuration is derived from the network topology and traffic patterns. As a result, it is not possible to specify a single configuration that fits all scenarios, and manual tuning is required in order to optimize the devices' performance. Such tuning slows down data center deployments and consumes massive resources. Moreover, as traffic patterns change, the original tuning becomes obsolete and causes degraded performance. This necessitates expensive retuning, which in some cases is infeasible.
In this work, we present ZTT: a continuously running Genetic Algorithm that can be used for online, automatic tuning of the networking device parameters. ZTT is adaptive, fast to respond and have low computational costs required for running on a networking device. We test ZTT in a diversity of real-world traffic scenarios and show that it is able to obtain a significant performance boost over static configurations suggested by experts. We also demonstrate that ZTT is able to outperform alternative search algorithms like Simulated Annealing and Recursive Random Search even when those are adapted to better match the task at hand.
Estimating a person's age from a facial image is a challenging problem with clinical applications. Several medical aesthetics treatments have been developed that alter the skin texture and other facial features, with the goal of potentially improving patient's appearance and perceived age. In this paper, this effect was evaluated using evolutionary neural networks with uncertainty estimation. First, a realistic dataset was obtained from clinical studies that makes it possible to estimate age more reliably than e.g. datasets of celebrity images. Second, a neuroevolution approach was developed that customizes the architecture, learning, and data augmentation hyperparameters and the loss function to this task. Using state-of-the-art computer vision architectures as a starting point, evolution improved their original accuracy significantly, eventually outperforming the best human optimizations in this task. Third, the reliability of the age predictions was estimated using RIO, a Gaussian-Process-based uncertainty model. Evaluation on a real-world Botox treatment dataset shows that the treatment has a quantifiable result: The patients' estimated age is reduced significantly compared to placebo treatments. The study thus shows how AI can be harnessed in a new role: To provide an objective quantitative measure of a subjective perception, in this case the proposed effectiveness of medical aesthetics treatments.
Computerized-adaptive testing (CAT) is a form of assessment in which items/questions are administered based upon a test taker's ability (i.e., their estimated proficiency, such as a knowledge, skill, or personality characteristic). CAT is regularly used in psychological studies, medical exams, and standardized testing to reduce test length and improve measurement accuracy and precision. A key challenge in CAT is item selection. Past algorithms have been designed based on criteria such as item difficulty and maximum Fisher information. However, these only consider a fixed-length test, which may result in it being longer or less precise. To address this problem, we formulate a new multi-objective optimization problem to model the trade-off between test length and precision. A binary population-based genetic algorithm, NSGA-II, is used to obtain the set of Pareto-optimal solutions by maximizing precision and minimizing the number of questions. We evaluate our approach with a simulated study using four standard personality assessments. We also investigate the influence of test response types (e.g., binary versus categorical response) and number of variables (i.e., the number of possible items) on performance. The results obtained show multi-objective optimization can be used in CAT to minimize overall test length and improve measurement precision and overall accuracy.
Inverse folding an RNA secondary structure of length L can be understood as searching in space S = {A, C, G, U}L, one or many RNA sequence(s) that fold(s) into that structure. Solving this problem is a prerequisite for RNA design and is still a real challenge in bio-engineering. To improve the RNA inverse folding, we investigate an evolutionary algorithm implementing local mutations taking into account the nucleotide and canonical base pair distributions. We implement our algorithm as a simple open-source python program called aRNAque1. The empirical results show that aRNAque can solve some target structures previously unsolvable by the existing evolutionary algorithms benchmarked in this work. On the two benchmark datasets, ≈ 72% of the EteRNA100 dataset (respectively ≈ 92% using the Turner1999 energy parameter sets) and ≈ 87% of the non-EteRNA100 dataset and the designed sequences are of better quality. In comparison to existing tools, our EA implementation also shows a significant computational time improvement.
Evolutionary multi-objective optimization (EMO) is often used to deal with practical problems in engineering design to identify products that exhibit the best possible compromise between multiple conflicting performance criteria. Much of the literature in EMO considers algorithmic developments and benchmarking problems involving a single concept only. However, in practice, there could be many applications where the solution of a given problem may be represented using multiple concepts, and optimizing each one of them individually to obtain the overall Pareto front may be inefficient. To address this gap, in this study, we firstly develop computer-aided models of multiple concepts for a simulation-based problem which can serve as a good application of multi-concept optimization. The problem involves the design of lattice structures with different types of unit cells (each constituting a concept), which can be customized to suit a range of real-world applications such as design of structures with minimal thermal expansion, low weight and high rigidity etc. Furthermore, we develop baseline evolutionary strategies to search across multiple concepts simultaneously. Empirical experiments show that the presented strategies are able to outperform conventional single concept-based approach. Moreover, some challenges are also uncovered which prompts the need for further developments.
Services such as garbage collection, road gritting, street sweeping, and power line inspection can each be formulated as a capacitated arc routing problem (CARP). The traditional formulation of CARP has the goal of minimizing the total cost of the routes making up a solution. Recently, operators of such services require routes that are balanced and visually attractive in addition to low cost. Routes that are balanced are about equal in length and provide fair work assignments. Visually attractive routes are subjective, but they usually involve non-crossing routes that provide well defined service areas. These additional features are important because they address operational complexities that arise from using the routes in practice. This paper presents MA-ABC, a <u>m</u>emetic <u>a</u>lgorithm to find solutions for CARP that maximize route <u>a</u>ttractiveness and <u>b</u>alance, while minimizing total <u>c</u>ost. A novel fitness function combines route overlap with route contiguity to assess route attractiveness. MA-ABC is the first to incorporate attractiveness in a three-objective search for heuristic solutions for CARP. Experimental results on CARP benchmark instances show that MA-ABC finds a diverse set of heuristic solutions at the Pareto front, providing a wide choice for service operators to tradeoff design objectives.
Video game level generation based on machine learning (ML), in particular, deep generative models, has attracted attention as a technique to automate level generation. However, applications of existing ML-based level generations are mostly limited to tile-based level representation. When ML techniques are applied to game domains with non-tile-based level representation, such as Angry Birds, where objects in a level are specified by real-valued parameters, ML often fails to generate playable levels. In this study, we develop a deep-generative-model-based level generation for the game domain of Angry Birds. To overcome these drawbacks, we propose a sequential encoding of a level and process it as text data, whereas existing approaches employ a tile-based encoding and process it as an image. Experiments show that the proposed level generator drastically improves the stability and diversity of generated levels compared with existing approaches. We apply latent variable evolution with the proposed generator to control the feature of a generated level computed through an AI agent's play, while keeping the level stable and natural.
The goal of music segmentation is to identify boundaries between parts of music pieces which are perceived as entities. Segment boundaries often go along with a change in musical properties including instrumentation, key, and tempo (or a combination thereof). One can consider different types (or classes) of boundaries according to these musical properties. In contrast to existing datasets with missing specifications of changing properties for annotated boundaries, we have created a set of artificial music tracks with precise annotations for boundaries of different types. This allows for a profound analysis and interpretation of annotated and predicted boundaries and a more exhaustive comparison of different segmentation algorithms. For this scenario, we formulate a novel multi-objective optimisation task that identifies boundaries of only a specific type. The optimisation is conducted by means of evolutionary multi-objective feature selection and a novelty-based segmentation approach. Furthermore, we provide lists of audio features from non-dominated fronts which most significantly contribute to the estimation of given boundaries (the first objective) and most significantly reduce the performance of the prediction of other boundaries (the second objective).
Finding efficient production schedules for automotive paint shops is a challenging task and several paint shop problem variants have been investigated in the past. In this work we focus on a recently introduced real-life paint shop scheduling problem appearing in the automotive supply industry where car parts, which need to be painted, are placed upon carrier devices. These carriers are placed on a conveyor belt and moved into painting cabins, where robots apply the paint. The aim is to find an optimized production schedule for the painting of car parts.
In this paper, we propose a memetic algorithm to solve this problem. An initial population is generated, followed by the constant evolution of generations. Selection, crossover, mutation, and local improvement operators are applied in each generation. We design three novel crossover operators that consider problem-specific knowledge. Finally, we carefully configure our algorithm, including automated and manual parameter tuning.
Using a set of available real-life benchmark instances from the literature, we perform an extensive experimental evaluation of our algorithm. The experimental results show that our memetic algorithm yields competitive results for small- and medium-sized instances and is able to set new upper bounds for some of the problem instances.
The stockpile blending problem seeks to determine how many tonnages of ore that stockpiles provide and to which parcels. This scheduling model maximizes the volume of valuable material in the final production subject to resource capacities, constraints for mill machines, and customer requirements. Motivated by the uncertainty in the geologic input date which affects optimization, we consider the stockpile blending problem with uncertainty in material grades. A non-linear continuous optimization model developed here to integrate uncertain variables to optimization. We introduce chance constraints that are used to guarantee the constraint is violated with a small probability to tackle the stochastic material grades. We investigate a well-known approach in this paper, which is used to solve optimization problems over continuous space, namely the differential evolution (DE) algorithm. In the experiment section, we compare the performance of the algorithm with the deterministic model and three chance constraint models by using a synthetic benchmark. We also evaluate the effectiveness of different chance constraints.
Search-based test generation commonly uses fitness functions based on branch distances, i.e., estimations of how close conditional statements in a program are to evaluating to true or to false. When conditional statements depend on Boolean variables or Boolean-valued methods, the branch distance metric is unable to provide any guidance to the search, causing challenging plateaus in the fitness landscape. A commonly proposed solution is to apply testability transformations, which transform the program in a way that avoids conditional statements from depending on Boolean values. In this paper we introduce the concept of Certainty Booleans, which encode how certain a true or false Boolean value is. Using these Certainty Booleans, a basic testability transformation allows to restore gradients in the fitness landscape for Boolean branches, even when Boolean values are the result of complex interprocedural calculations. Evaluation on a set of complex Java classes and the EvoSuite test generator shows that this testability transformation substantially alters the fitness landscape for Boolean branches, and the altered fitness landscape leads to performance improvements. However, Boolean branches turn out to be much rarer than anticipated, such that the overall effects on code coverage are minimal.
Concurrent model synchronisation, i.e. the (bidirectional) propagation of updates between two models, is an important problem in the area of model-driven engineering (MDE). Compared to other consistency management tasks, synchronising concurrent updates is especially challenging as they can be conflicting, such that restoring a consistent state is not possible when all updates must be considered. Recent approaches create a search space of possible solutions and determine the optimum solution via exact methods, such as integer linear programming (ILP), via a configurable, scalarised objective function that takes conflicting goals into account. However, the determination of suitable configuration parameters and runtime efficiency improvements are still an open issue, which is commonly addressed by using heuristics instead of exact methods. We investigate on whether it is beneficial to apply heuristics to solve concurrent model synchronisation problems. First, a multiobjective evolutionary algorithm is used for small instances for which all pareto-optimal solutions can be presented to a user to select the best one. Second, for larger models, we propose a method to determine suitable weightings for aggregating all objectives into a single function. Finally, these insights are used to recommend a strategy for determining solutions of satisfying quality within an acceptable amount of time.
Due to the complexity of designing vehicle products and the inherent uncertainties in their operating environments, ensuring the safety of their Advanced Driver Assistance Systems (ADASs) becomes crucial. Especially, very minor changes to a vehicle design, for instance due to production errors or component degradation, might lead to failures of ADASs and, therefore, catastrophic consequences such as collision occurrences. Motivated by this, we propose a multi-objective search-based approach (employing NSGA-II) to find minimum changes to the configuration of a set of configurable parameters of a vehicle design, such that the collision probability is maximized, consequently leading to a reversal change in its safety. We conducted experiments, in a vehicle driving simulator, to evaluate the effectiveness of our approach. Results show that our approach with NSGA-II significantly outperforms the random search. Moreover, based on the detailed analyses of the results, we identify some parameters for which minor changes to their values lead the vehicle into collisions, and demonstrated the importance of studying the configuration of multiple parameters in a single search and the impact of their interactions on causing collisions.
Most evolutionary algorithms have multiple parameters and their values drastically affect the performance. Due to the often complicated interplay of the parameters, setting these values right for a particular problem is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during the run of the algorithm since then a dynamic parameter choice is necessary.
In this work, we propose a lazy but effective solution, namely choosing all parameter values in each iteration randomly from a suitably scaled power-law distribution. We demonstrate the effectiveness of this approach via runtime analyses of the (1 + (λ, λ)) genetic algorithm with all three parameters chosen in this manner. We show that this algorithm on the one hand can imitate simple hill-climbers like the (1+1) EA, giving the same asymptotic runtime on some simple problems. On the other hand, this algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove a performance guarantee that is comparable to, and sometimes even better than, the best performance known for static parameters. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.
Jump functions are the most studied non-unimodal benchmark in the theory of evolutionary algorithms (EAs). They have significantly improved our understanding of how EAs escape from local optima. However, their particular structure - to leave the local optimum the EA can only jump directly to the global optimum - raises the question of how representative the recent findings are.
For this reason, we propose an extended class Jumpk,δ of jump functions that incorporate a valley of low fitness of width δ starting at distance k from the global optimum. We prove that several previous results extend to this more general class: for all k = o (n1/3) and δ ≤ k, the optimal mutation rate for the (1 + 1) EA is [EQUATION], and the fast (1 + 1) EA runs faster than the classical (1 + 1) EA by a factor super-exponential in δ. However, we also observe that some known results do not generalize: the randomized local search algorithm with stagnation detection, which is faster than the fast (1 + 1) EA by a factor polynomial in k on Jumpk, is slower by a factor polynomial in n on some Jumpk,δ instances.
Computationally, the new class allows experiments with wider fitness valleys, especially when they lie further away from the global optimum.
It is largely unknown how the runtime of evolutionary algorithms depends on fitness landscape characteristics for broad classes of problems. Runtime guarantees for complex and multi-modal problems where EAs are typically applied are rarely available.
We present a parameterised problem class SparseLocalOptα,ε where the class with parameters α, ϵ ∈ [0, 1] contains all fitness landscapes with deceptive regions of sparsity ε and fitness valleys of density α. We study how the runtime of EAs depends on these fitness landscape parameters.
We find that for any constant density and sparsity α, ε ∈ (0, 1), SparseLocalOptα,ε has exponential elitist (μ + λ) black-box complexity, implying that a wide range of elitist EAs fail even for mildly deceptive and multi-modal landscapes. In contrast, we derive a set of sufficient conditions for non-elitist EAs to optimise any problem in SparseLocalOptα,ε in expected polynomial time for broad values of α and ε. These conditions can be satisfied for tournament selection and linear ranking selection, but not for (μ, λ)-selection.
One of the first and easy to use techniques for proving run time bounds for evolutionary algorithms is the so-called method of fitness levels by Wegener. It uses a partition of the search space into a sequence of levels which are traversed by the algorithm in increasing order, possibly skipping levels. An easy, but often strong upper bound for the run time can then be derived by adding the reciprocals of the probabilities to leave the levels (or upper bounds for these). Unfortunately, a similarly effective method for proving lower bounds has not yet been established. The strongest such method, proposed by Sudholt (2013), requires a careful choice of the viscosity parameters γi,j, 0 ≤ i ≤ j ≤ n.
In this paper we present two new variants of the method, one for upper and one for lower bounds. Besides the level leaving probabilities, they only rely on the probabilities that levels are visited at all. We show that these can be computed or estimated without greater difficulties and apply our method to reprove the following known results in an easy and natural way. (i) The precise run time of the (1 + 1) EA on LeadingOnes. (ii) A lower bound for the run time of the (1 + 1) EA on OneMax, tight apart from an O(n) term. (iii) A lower bound for the run time of the (1 + 1) EA on long k-paths.
Recent theoretical studies have shown that self-adjusting mechanisms can provably outperform the best static parameters in evolutionary algorithms on discrete problems. However, the majority of these studies concerned elitist algorithms and we do not have a clear answer on whether the same mechanisms can be applied for non-elitist algorithms.
We study one of the best-known parameter control mechanisms, the one-fifth success rule, to control the offspring population size λ in the non-elitist (1, λ) EA. It is known that the (1, λ) EA has a sharp threshold with respect to the choice of λ where the runtime on OneMax changes from polynomial to exponential time. Hence, it is not clear whether parameter control mechanisms are able to find and maintain suitable values of λ.
We show that the answer crucially depends on the success rate s (i. e. a one-(s + 1)-th success rule). We prove that, if the success rate is appropriately small, the self-adjusting (1, λ) EA optimises OneMax in O(n) expected generations and O(n log n) expected evaluations. A small success rate is crucial: we also show that if the success rate is too large, the algorithm has an exponential runtime on OneMax.
Real-world optimisation problems often involve uncertainties. In the past decade, several rigorous analysis results for evolutionary algorithms (EAs) on discrete problems show that EAs can cope with low-level uncertainties, and sometimes benefit from uncertainties. Using non-elitist EAs with large population size is a promising approach to handle higher levels of uncertainties. However, the performance of non-elitist EAs in some common fitness-uncertainty scenarios is still unknown.
We analyse the runtime of non-elitist EAs on OneMax and LeadingOnes under prior and posterior noise models, and the dynamic binary value problem (DynBV). Our analyses are more extensive and precise than previous analyses of non-elitist EAs. In several settings, we prove that the non-elitist EAs beat the current state of the art results. Previous work shows that the population size and mutation rate can dramatically impact the performance of non-elitist EAs. The optimal choices of these parameters depend on the level of uncertainties in the fitness functions. We provide more precise guidance on how to choose mutation rate and population size as a function of the level of uncertainties.
The (1+1)-evolution strategy (ES) with success-based step-size adaptation is analyzed on a general convex quadratic function and its monotone transformation, that is, f(x) = g((x - x*)TH(x - x*)), where g: R → R is a strictly increasing function, H is a positive-definite symmetric matrix, and x* ∈ Rd is the optimal solution of f. The convergence rate, that is, the decrease rate of the distance from a search point mt to the optimal solution x*, is proven to be in O(exp(-L/Tr(H))), where L is the smallest eigenvalue of H and Tr(H) is the trace of H. This result generalizes the known rate of O(exp(-1/d)) for the case of H = Id (Id is the identity matrix of dimension d) and O(exp(-1/(d · ξ))) for the case of H = diag(ξ · Id/2, Id/2). To the best of our knowledge, this is the first study in which the convergence rate of the (1+1)-ES is derived explicitly and rigorously on a general convex quadratic function, which depicts the impact of the distribution of the eigenvalues in the Hessian H on the optimization and not only the impact of the condition number of H.
Stagnation detection has been proposed as a mechanism for randomized search heuristics to escape from local optima by automatically increasing the size of the neighborhood to find the so-called gap size, i. e., the distance to the next improvement. Its usefulness has mostly been considered in simple multimodal landscapes with few local optima that could be crossed one after another. In multimodal landscapes with a more complex location of optima of similar gap size, stagnation detection suffers from the fact that the neighborhood size is frequently reset to 1 without using gap sizes that were promising in the past.
In this paper, we investigate a new mechanism called radius memory which can be added to stagnation detection to control the search radius more carefully by giving preference to values that were successful in the past. We implement this idea in an algorithm called SD-RLSm and show compared to previous variants of stagnation detection that it yields speed-ups for linear functions under uniform constraints and the minimum spanning tree problem. Moreover, its running time does not significantly deteriorate on unimodal functions and a generalization of the Jump benchmark. Finally, we present experimental results carried out to study SD-RLSm and compare it with other algorithms.
Addressing a complex real-world optimization problem is a challenging task. The chance-constrained knapsack problem with correlated uniform weights plays an important role in the case where dependent stochastic components are considered. We perform runtime analysis of a randomized search algorithm (RSA) and a basic evolutionary algorithm (EA) for the chance-constrained knapsack problem with correlated uniform weights. We prove bounds for both algorithms for producing a feasible solution. Furthermore, we investigate the behaviour of the algorithms and carry out analyses on two settings: uniform profit value and the setting in which every group shares an arbitrary profit profile. We provide insight into the structure of these problems and show how the weight correlations and the different profit profiles influence the runtime behavior of both algorithms in the chance-constrained setting.