Motivated by the relevance of clustering or transitivity to a variety of network applications, we study the Triangle Interdiction Problem (TIP), which is to find a minimum-size set of edges that intersects all triangles of a network. As existing approximation algorithms for this NP-hard problem either do not scale well to massive networks or have poor solution quality, we formulate two algorithms, TARL and DART, with worst-case guarantees 5/2 and 3 with respect to optimal, respectively. Furthermore, DART is able to efficiently maintain its worst-case guarantee under dynamic edge insertion and removal to the network. In our comprehensive experimental evaluation, we demonstrate that DART is able to run on networks with billions of triangles within 2 hours and is able to dynamically update its solution in microseconds.Slides (Presentation at IEEE ICDM 2017)
Provided source code
for DART1, DART2, TARL, KORTSARZ, and the primal-dual algorithm
to approximate the TIP problem. Code for DART1,DART2,PRIMAL-DUAL is in
TARL and KORTSARZ are implemented in
glpk_solver.cpp. An example program demonstrating
the algorithms is provided in
sudo apt-get install libglpk-dev
make in the root directory of the source code,
which produces compiled program
tip. An already compiled binary, compiled on
an intel i7 amd64 processor is provided.
The primary input file format is a simple, undirected edge-list format in plain text, with the number of nodes (max. node id) on the first line, and each undirected edge listed on the following lines.Example usage:
To run DART1 on Youtube: ./tip -G ./datasets/com-youtube.ungraph.txt -D To run DART2 on Youtube: ./tip -G ./datasets/com-youtube.ungraph.txt -E To run TARL on Gnutella: ./tip -G ./datasets/p2p-Gnutella08.txt -T To run KORTSARZ on Gnutella: ./tip -G ./datasets/p2p-Gnutella08.txt -K To run PRIMAL-DUAL on Gnutella: ./tip -G ./datasets/p2p-Gnutella08.txt -P
tipwill log information to the standard output as it runs. Once the specified algorithm completes, it will report the size of solution S, the running time in seconds, and the peak memory usage in MB.
Options: -G [graph filename in edge list format] -g [graph filename in binary edge list format] -o [output filename] (output file in xml format) -D (run DART) -E (run DART (second implementation)) -T (run TARL) -K (run 2-approx. of Kortsarz et al.) -O (run optimal solution via GNU GLPK) -P (run primal-dual algorithm) -p (Preprocess graph and count triangles only) -A [m_add] (run DART, then adaptively add [m_add] random edges to the network) -R [m_remove] (run DART, then adaptively remove [m_remove] random edges from the network) -S [m_addremove] (run DART, then adaptively add [m_addremove] random edges to the network, and then remove the same edges) -t [time limit in hours, default is 4]