Community Detection Using Genetic Algorithm and Cuckoo Search

Koorosh Moslemi · July 29, 2022

This project was one of my assignments for an AI course at AUT. In this project, one had to find communities in a network using the Genetic Algorithm and Cuckoo Search. I will mainly describe steps specific to this problem rather than explaining the algorithms.

First things first, let’s look at a sample input graph:

Finding Communities Using Genetic Algorithm:

  1. Genetic Representation and Fitness Function:
    • I used a locus-based representation for this problem which is illustrated as follows:
    • Moreover, I used the modularity measure as a fitness function.
  2. Selection:
  3. Crossover and Mutation:
    • A uniform crossover was used as follows:
    • It is worth mentioning that the initial population is not completely random. Instead, the process considers the original network’s connections, meaning that individuals avoid uninteresting divisions containing unconnected nodes. In this way, the search space is restricted, and it is likely to converge faster. We call these individuals safe. As a result, given two safe parents, a safe child is generated in a uniform crossover. Furthermore, in the mutation step, we make sure safety is regarded.

Finally, after 50 iterations with 200 initial populations, the following communities were found. In this division, the value of the fitness function is 0.41978961209730353, and the mutation rate is 0.4.

Cuckoo Search is based on three idealized rules:

  1. Each cuckoo lays one egg at a time and dumps its egg in a randomly chosen nest;
  2. The best nests with high-quality eggs will carry over to the next generation;
  3. The number of available host nests is fixed, and the egg laid by a cuckoo is discovered by the host bird with a probability between 0 and 1. In this case, the host bird can throw the egg away and abandon the nest, and build a completely new nest.

The pseudo-code is as follows:

Generation of new solutions is performed using a Levy flight, which is a random walk in which the step lengths have a Levy distribution.

The original Cuckoo Search(CS) is for continuous space. Thus, we need to change the nest location updating strategy and the abandon operator in discrete form because of the discrete nature of community detection. Here is one solution:

According to this paper:

After 30 iterations with 200 initial populations, the following communities were found. In this division, the value of the fitness function is 0.4197896120973035.

All of the parameter choices and the source code are available in this repository.

References:

  1. Pizzuti Clara. GA-Net: A Genetic Algorithm for Community Detection in Social Networks.
  2. Xin-She Yang and Suash Deb. Cuckoo Search via Lévy flights.
  3. Xu Zhou, Yanheng Liu and Bin Li. A multi-objective discrete cuckoo search algorithm with local search for community detection in complex networks.