www.nw.neclab.eu

About Us Topics Projects Jobs



Internet AS-level topology construction & analysis
About the project
Type of Project:

This project is partly sponsored by the European Commission "Information Society Technologies" (IST), 7th Framework through the Trilogy project and receiving substantial support from NEC´s IT Research Division in St. Augustin. All data displayed on this website has been produced using one of their SX-series vector computers. We would like to thank UCLA´s Internet Research Lab, the RouteView´s Project and RIPE´s RIS for providing the data we are using for our computations!

 

UPDATE: Due to the current unavailability of the SX Vector Comupter, we were forced to pause this activity. We are working on a solution right now and will resume our activities once an alternative is in place.

Motivation and Project Goals
The Internet is constantly growing. Not only the number of end user devices such as desktop computers, laptops and cell phones is increasing rapidly but also the number of networks connecting to the Net is growing fast. These growth trends not only challenge the scalability of Internet technology but measuring the Internet, modeling it and understanding its evolution is becoming increasingly complex as well. This project is therefore aiming at two things: continuously analyzing the quality of the Internet´s fabric - the AS-level topology - and providing a simplified representation of it for researchers that need to simulate or evaluate a new technology or algorithm on top of the Internet´s AS-level graph.

Many fields in Internet research are dependant on available topology data to perform realistic evaluations. The mere availability of topology data it appears is not the biggest problem to solve any more. Projects such as RouteViews [1], RIPE RIS [2] or DIMES [3] to name just three provide extensive sets of topology data. And although these data sets do not provide a complete view of the Internet´s topology, they approximate the Internet´s underlying graph to an increasingly satisfying manner although they will never be able to claim completeness. While data collection is constantly improving, researchers have to put a lot of effort into extracting the relevant data and transforming that data into something they can make use of. This includes e.g. parsing data such as routing tables, constructing graph representations of the AS-level connectivity and performing time consuming computations to analyze the data.

The first contribution of this project is therefore to provide an AS-level topology as an easy to use pair of matrices, one approximating the distances between any given AS pair, the other one representing the next hop matrix to reconstruct AS paths. As part of this effort, we analyze the accuracy of the provided model by comparing the paths in those matrices against a large set of routing table data from [1] and [2]. The matrices themselves are based on data provided by the Internet Research Lab. In addition we classify the ASes as tier-1, tier-2 and tier-3 ASes and make these data sets also available. For details on the construction process and the analysis please refer to [5]. We re-compute all of this once a month and make the updated and a number of older data sets available. A description of the data and used formats can be found here.

A second contribution of this project is to constantly monitor the "quality" of the current Internet routing system. Based on the data from [4] we construct the best possible routing topology in terms of AS path length, which is the metric for path "goodness" as seen by BGP. Such a topology would be constructed in the absence of routing policy. We compare this idealized topology against a vast amount of routing table data and draw a number of conclusions from it. This policy-free BGP topology is also provided as a pair of matrices.


The data can be retrieved here:
Please, if you make use of this data let us know. We are interested in learning what other people are using our data for. We are also open for suggestions that help us to improve our analysis of the data and generate better models. To contact us, please use our contact form. Thanks!

If you would like to cite the data source in a publication please use [5] as the reference.


Statistics for 2008: | 09 | 10 | 11 | 12 |

Statistics for 2009: | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 |


Cumulative statistics
The plots below depict the development of some characteristics of the Internet's AS-level topology. Each individual data point in the plots comes with a more elaborate analysis which can be accessed through the links above. The first two graphs illustrate the development of the node degree in the graph, where a node represents an AS, based on the data from the Internet Research Lab. The first graph show the maximum degree whereas the second graph shows the average degree. A degree distribution is provided for each month through the links above.


Maximum AS degree

Average AS degree

The next two graphs show some statistics about the distances between ASes in the Internet. The distances shown are those taken from the shortest path matrix, i.e. they are not the ones that can be extracted from routing tables but the ones that a policy-free shortest AS path routing protocol would construct. Again, more elaborate per month statistics can be accessed through the links above.


Maximum distance

Average distance

The next two graphs show the distances we observed in the routing tables used for our analysis. One has to bear in mind that the distances in the routing tables are somewhat different in nature than the distances given above. The distances here are from very specific observation points, many closer to the core than to the edge of the AS-level topology. This results into a slightly biased view. Distances observed from the core of the Internet are on average shorter than when observed from the edges. Furthermore, the comparably small amount of tables used (compared to the total number of ASes) strengthens this trend.


Maximum distance

Average distance

Finally, the last set of graphs show the AS path inflation which we calculate by comparing the paths found in the routing tables against the corresponding paths in the shortest path matrix (think of comparing BGP today against a it's policy-free counterpart). The set of graphs below show the maximum path inflation, the average path inflation and the fration of paths with no observable inflation.


Maximum path diversity

Average path diversity

Fraction of paths with no path diversity

Finally, the last set of graphs show the AS path inflation which we calculate by comparing the paths found in the routing tables against the corresponding paths in the shortest path matrix (think of comparing BGP today against a it's policy-free counterpart). The set of graphs below show the maximum path inflation, the average path inflation and the fration of paths with no observable inflation.


Maximum path inflation

Average path inflation

Fraction of paths with no path inflation



Data files and formats:
We make the following files available for download:

1. nextHops.tar.gz & nextHopWeighted.tar.gz - contains the next hop matrix of the AS-level topology
Each line in the file represents the next hop on a path from the AS associated with the row towards the ASes associated with the column (the asMapping.txt file above contains that row, column mapping). Columns are separated by tabs. A value of -1 indicates that the source equals the destination. The first file (nextHops.tar.gz) contains that information for the policy-free topology, the second file contains the next hop matrix for our Internet-like model.

2. shortestPaths.tar.gz & shortestPathsWeighted.tar.gz - contains the next hop matrix of the shortest path AS-level graph
Each line in the file corresponds to the distance in AS hops between the AS associated with the row and the ASes associated with the columns (the asMapping.txt file above contains that row, column mapping). Columns are separated by tabs. The first file (shortestPaths.tar.gz) contains that information for the policy-free topology, the second file contains the distance matrix for our Internet-like model.

3. asMapping.txt - contains the list of AS numbers found in the matrices
This file is a simple list of AS numbers, one per line. Each line corresponds to the row and column in the matrix file. In other words if line 10 in the asMapping.txt file contains the number 701, then the row and column in the matrix files refers to AS 701.

4. tierXASList.txt - contains the list of AS numbers classified as belonging to tier X
The X in the above filename is either a 1, 2 or 3. Each line in the files contains an AS Number. Each AS found in that file has been classified to belong to tier X. The classification works as follows. First, we identify the AS with the highest degree, i.e. with the maximum number of neighboring ASes. This one we classify as a tier 1 AS. Then we find the AS with the next most number of neighbors. In case it is connected to ALL the already classified tier 1 ASes, it is also classified as a tier 1 AS. This process stops once an AS is found which is not connected to all already classified tier 1 ASes. Then, we classify all remaining ASes as tier 2 in case it has a connection to at least one of the tier 1 ASes identified in the previous step. Finally, all remaining ASes are classified as tier 3.


Links to related projects and further information:

[1] RouteViews Project, http://www.routeviews.org
[2] RIPE RIS, http://www.ripe.net/ris
[3] The DIMES Project, http://www.netdimes.org
[4] UCLA´s Internet Research Lab, Topology Data, http://irl.cs.ucla.edu/topology/
[5] Rolf Winter, "Modeling the Internet Routing Topology with a Known Degree of Accuracy - in less han 24h", ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation 2009 (PADS 2009), please use the to receive a version of it


Note: These pages are auto-generated. In case they contain errors please contact the NEC Network Labs using the . Note further: The here provided data and analysis can only be used for non-commercial purposes. Individuals and businesses that wish to use our data for any commercial, for-profit activity must have prior written consent from the NEC Network Laboratories. The information provided here comes with absolutely NO WARRANTY. There is no guarantee that the information is accurate to any particular degree, has any value, or is fit for any purpose. In particular, we disclaim responsibility for any consequences arising from use of our data.

Publications

Publication R. Winter, "Modeling the Internet Routing Topology with a Known Degree of Accuracy - in less han 24h" ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation 2009 (PADS 2009)

Contact
For more information on the Trilogy project please visit the project website, consult NEC's own project description or use our .