## Datasets for Optimization Problems

Lehrstuhl II for Mathematics, Focus: Discrete Optimization
RWTH Aachen University

### Overview

On this webpage, we present data for various problems in the field of discrete mathematics. This data has been used for computational experiments in different research works. Where possible, we provide background information on the different datasets.

### Network Design with Compression (NDPC)

The Network Design Problem with compression (NDPC) is an extension of the classic Network Design Problem (NDP). For further information, we refer to the following papers:

Extended Cutset Inequalities for the Network Power Consumption Problem, Koster, Phan and Tieves (link).
Design and management of networks with low power consumption, Phan (link).

The following data was originally created for the classic Network Design Problem and was used in the context of the paper

Cutset Inequalities for Robust Network Design, Koster, Kutschka and Raack (link).

In this work, the topologies "abilene", "geant", "germany17" and "germany50" have been used. They are taken from the SNDLIB (link), we refer to the same website for a description of the file format. The data can easily be expanded to a complete data set for the NDPC problem.

In the Ph.D. thesis of M. Tieves

Discrete and Robust Optimization Approaches to Network Design with Compression and Virtual Network Embedding, Tieves (to appear),

the data has been expanded for the NDPC problem. The additional data was created at random, we refer to the above link for further information. The corresponding datasets for the deterministic problem with average and peak data are provided below. The files are in AMPL readable format.

Explanation/Format:
Nodes: Contained in the set V.
Edges: Contained in the set E.
Costs: b (compressors) and c (edge capacities).
Edge Capacity: k.
Commodities: Set Q, For q in Q: dest[q,i]=1 if node i is q's source node, ..=-1 if it is q's sink. d[q] describes the demand volume.

In the Ph.D. thesis of M. Tieves, the robust NDPC problem is also considered. Find below the corresponding dataset (again in AMPL readle format).

Explanation/Format:
The format is the same as for the deterministic case. Additionally, d_nom[..] describes the nominal volume of a commodity and d_dev[..] describes the maximum deviation of a commodity. Similar, the scal_nom[...] and scal_dev[..] values refer the the nominal value and the maximum deviation of the compression factor.

### Virtual Network Embedding (VNE)

The Virtual Network Embedding Problem (VNE) asks, given a physical (substrate) network with node and link resources, how many virtual networks with node and link demands can be realized on the substrate network. For further informationen, we refer to the VINO project (link) and to the following two papers:

Virtual network embedding under uncertainty: Exact and heuristic approaches, Coniglio, Koster and Tieves (link)
Data Uncertainty in Virtual Network Embedding: Robust Optimization and Protection Levels, Coniglio, Koster and Tieves (link)

as well as to the Ph.D. thesis of M. Tieves:

Discrete and Robust Optimization Approaches to Network Design with Compression and Virtual Network Embedding, Tieves (to appear).

In these works, two different datasets have been used.

The first dataset contains small to medium sized instances. The substrat networks of these instances have been taken from the SNDLIB (link), they corespond to the directed topologies "abilene", "atlanta", "nobel-us" and "polska". The corresponding virtual networks (5 to 32 requests, identifier "-r5-" to "-r32-" in the file name) contain 12 virtual nodes each. The topologies and the demand values of the virtual networks have been generated at random. The files are in an AMPL readable format.

The second dataset contains large instances. The substrate networks of these instances have been taken from the Internet Topology Zoo (link), they correspond to the directed topologies "bellsouth, "cernet", "cogentco", "deltacom", "digex", "fatman", "intellifiber" and "redbestel". The corresponding virtual networks (5 to 50 requests, identifier "-r5-" to "-r50-" in the file name) contain 12 virtual nodes each. The topologies and the demand values of the virtual networks have been generated at random. The files are in an AMPL readable format.

Explanation/Format:
Substrate-Network node (param V): "node, capacity"
Substrate-Networkz Arc (param A): "node, node, capacity"
Number of virtual networks: "r = ..."
Number of historic data tarces: "t= ..."
Virtual network "i", number of nodes: "vn[i]:= ..."
Virtual network "i" profit: "p[i]:= ..."
Virtual network "i" topology: "an[i,j,k]:= 1" (arc (j,k) exists for the virt. network i).
Virtual network "i", locality conditions for node j: V_local[i,j] := ... (all nodes on which j may be embedded)
Virtual network "i" historic demand in traffic trace t for the virtual arc (j,k): "d_historical[i,t,j,k]:= ...".
Virtual network "i" historic demand in traffic trace t for the virtual node k: "w_historical[i,t,k]:= ...".
d_nominal and w_nominal describe the average demands over all historic traces, d_deviation and w_deviation refer to the maximum deviations from these averages. All additional data has not been used in the papers mentioned above.

### Equitable Graph Coloring

We refer to the project webpage (link).