Formally, a bipartite network can be defined as G=X, Y, E
, where X and Y are two nonoverlapping sets of nodes and E is the set of links that only exist between X and Y, i.e., e=
x, y
∈E where x∈X and y∈Y. For a bipartite network, the number of all possible links is |X|·|Y| and we denote these links as U. Thus, a link prediction problem is to identify which links are likely missing in the set U?E.
Link prediction algorithms, specifically similarity-based methods, are used that first compute the similarity of every non-connected pair of nodes. Based on the similarity values, it can generate a ranked list of missing links with decreasing scores for recommendation. One way to compute the similarity between pairs of nodes is via a random walk. Another way of measuring similarity is based on comparing the neighborhoods of two nodes, including common neighbors, jaccard coefficient, adamic-adar coefficient, and preferential attachment.