A High-Performance, Pipelined, FPGA-Based Genetic Algorithm Machine (2001)
Barry Shackleford, Greg Snider, Richard Carter, Etsuko Okushi, Mitsuhiro Yasuda, Katsuhiko Seo, Hiroto Yasuura
A Paper Review by Griffin Lacey

OUTLINE
• INTRODUCTION
• ALGORITHM
• ARCHITECTURE
• PERFORMANCE
• RESULTS
• CONTRIBUTIONS
• CRITIQUE

INTRODUCTION
• What is a genetic algorithm (GA)?
  ▫ Search heuristic
  ▫ Mimics the process of natural evolution
  ▫ Starts with a random population of candidate solutions
  ▫ Terminates when the desired fitness level is achieved

WHAT ARE THE PROBLEMS WITH GA?
• One major drawback
  ▫ Slow execution speed when implemented on a general-purpose processor (GPP)
• How to overcome this?
  ▫ Parallel processing
• Implement as hardware pipeline on FPGA
  ▫ Parent selection
  ▫ Crossover
  ▫ Mutation
  ▫ Survival
• How to program the GA?
  ▫ Design a pipelined fitness function for the problem to be solved

ALGORITHM

ALGORITHM NOTATION
• One-dimensional population array
  ▫ Each entry contains cdata and cfitness
• Functions
  ▫ Fitness(cdata)
  ▫ Crossover(cut_prob, p1data, p2data)
  ▫ Mutation(mutation_prob, cdata)

ALGORITHM EXPLANATION
1. Randomly generated population is assigned fitness values
2. Randomly select parents
  • parent2 <- parent1
  • parent1 <- random
3. Child created via crossover function
4. Child then exposed to mutation
5. Child evaluated by fitness function
6. If child is fitter than one of the parents -> replacement
7. Eventually survival rate diminishes to zero
Pseudo-code for a steady-state GA that is readily implementable in hardware

ALGORITHM RATIONALE
1. Population storage
  ▫ Steady-state allows the population array to be implemented as a single memory
2. Parent selection
  ▫ By replacing the old parent with the new parent, only one clock cycle is needed per parent pair
3. Crossover and mutation
  ▫ Performed every clock cycle
4. Survival-driven evolution
  ▫ Evolution promoted through survival

ALGORITHM VALIDITY
• Have compromises been made that damage the functional integrity of the GA?
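The steady-state loop from the ALGORITHM EXPLANATION slide can be modeled in software. The sketch below is only an illustration, not the authors' hardware design. Several details are assumptions that the slides do not specify: the toy fitness function `one_max`, the per-bit interpretation of `cut_prob`, the rule of replacing the less fit parent, and all default parameter values.

```python
import random

def one_max(bits):
    """Toy fitness (assumption, not from the paper): count of 1 bits."""
    return sum(bits)

def crossover(cut_prob, p1data, p2data, rng):
    """Copy bits from one parent, switching parents with probability
    cut_prob at each position (assumed meaning of cut_prob)."""
    src, other = p1data, p2data
    child = []
    for i in range(len(p1data)):
        if rng.random() < cut_prob:
            src, other = other, src
        child.append(src[i])
    return child

def mutation(mutation_prob, cdata, rng):
    """Flip each bit independently with probability mutation_prob."""
    return [b ^ 1 if rng.random() < mutation_prob else b for b in cdata]

def steady_state_ga(fitness, n_bits=32, pop_size=64, cut_prob=0.1,
                    mutation_prob=0.01, max_children=5000, seed=0):
    rng = random.Random(seed)
    # Step 1: random population; each entry holds (cdata, cfitness).
    pop = []
    for _ in range(pop_size):
        data = [rng.randint(0, 1) for _ in range(n_bits)]
        pop.append((data, fitness(data)))
    p1 = rng.randrange(pop_size)
    for _ in range(max_children):
        # Step 2: the old parent1 becomes parent2; pick a fresh parent1.
        p2 = p1
        p1 = rng.randrange(pop_size)
        # Steps 3-5: crossover, mutation, fitness evaluation.
        child = crossover(cut_prob, pop[p1][0], pop[p2][0], rng)
        child = mutation(mutation_prob, child, rng)
        cfit = fitness(child)
        # Step 6 (assumed rule): child replaces the less fit parent
        # only if the child is strictly fitter than that parent.
        worse = p1 if pop[p1][1] <= pop[p2][1] else p2
        if cfit > pop[worse][1]:
            pop[worse] = (child, cfit)
    # Return the fittest entry in the final population.
    return max(pop, key=lambda entry: entry[1])

best_data, best_fitness = steady_state_ga(one_max)
```

Because a new parent is drawn only once per iteration and the previous one is reused, each child needs a single population read, which mirrors the one-clock-cycle parent selection described in the rationale.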
Royal Road Function
• Optimum solution achieved after ≈ 6,000 crossovers
• Speedup of 10x over the GA used in the Royal Road experiment

ARCHITECTURE
• 6-stage pipeline
• Equal processing time for each stage

DATAPATH
• Significant portion of the GA circuitry
• In each bit-slice, there are:
  ▫ 5 logic functions
  ▫ 7 flip-flops
• Cost is 8 LUTs per bit-slice
  ▫ Under assumption of 2-output LUT
• Total cost in LUTs is 8nd
• Parent Registers
  ▫ Hold signal prevents re-entry
• Crossover
  ▫ Crossover template controls cutpoint variation
• Mutation
  ▫ Controlled by AND, XOR
• Child Registers
  ▫ Connected to fitness function and population memory

CROSSOVER

MUTATION

PERFORMANCE
• Time for each stage:
  ▫ fc = clock frequency
• Net throughput:
  ▫ Nf = number of function units
  ▫ Ii = initiation interval of fitness function

RESULTS
                          First Prototype               Second Prototype       Theoretical
Problem Type              Set-Covering                  Protein Folding        Protein Folding
FPGA Implementation       6 Aptix AXB-MP3 FPGAs, 1 MHz  Xilinx XCV300, 66 MHz  Xilinx XCV3200E
Software Implementation   Workstation, 100 MHz          Pentium II, 366 MHz    Workstation, 100 MHz
Speedup                   2,200x                        320x                   9,600x

CONTRIBUTIONS
• Bit-slice design that is amenable to FPGA implementation
• A net child chromosome generation rate of one per clock cycle is obtained

CRITIQUE
• The paper does not emphasize what makes this algorithm superior to others
• Theoretical speedup of 9,600x relies on a large FPGA and many complex fitness functions, but genetic algorithms do not scale well with complexity
• More diagrams and figures would help explain the concepts

QUESTIONS?
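The crossover and mutation stages on the DATAPATH slide can be illustrated with plain bitwise operations. This is a rough software sketch, not the paper's circuit. Chromosomes are modeled as Python integers, and the function names, the single-cutpoint template, and the use of two random words ANDed together to form the flip mask are all assumptions made for illustration.

```python
def template_for_cut(cut, n_bits):
    """Hypothetical helper: template with 1s in bit positions >= cut,
    so those bits come from parent 1 and the rest from parent 2."""
    full = (1 << n_bits) - 1
    return full & ~((1 << cut) - 1)

def crossover_slice(p1, p2, template, n_bits):
    """Per-bit multiplexer: a 1 in the template selects the bit from p1,
    a 0 selects the bit from p2."""
    full = (1 << n_bits) - 1
    return ((p1 & template) | (p2 & ~template)) & full

def mutate_slice(child, rand_a, rand_b):
    """AND two random words to thin out the 1s (assumed way of lowering
    the flip probability), then XOR to flip the selected bits."""
    flip_mask = rand_a & rand_b
    return child ^ flip_mask

template = template_for_cut(4, 8)                       # 0b11110000
child = crossover_slice(0b10101010, 0b01010101, template, 8)
mutated = mutate_slice(child, 0b00001111, 0b00000101)
```

Each function touches every bit position independently, which is why the same logic can be replicated once per bit-slice and evaluated in a single clock cycle.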