Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2011 Dec 13;108(50):19985-9.
doi: 10.1073/pnas.1115960108. Epub 2011 Nov 21.

Evolution of a modular software network

Affiliations

Evolution of a modular software network

Miguel A Fortuna et al. Proc Natl Acad Sci U S A. .

Abstract

"Evolution behaves like a tinkerer" (François Jacob, Science, 1977). Software systems provide a singular opportunity to understand biological processes using concepts from network theory. The Debian GNU/Linux operating system allows us to explore the evolution of a complex network in a unique way. The modular design detected during its growth is based on the reuse of existing code in order to minimize costs during programming. The increase of modularity experienced by the system over time has not counterbalanced the increase in incompatibilities between software packages within modules. This negative effect is far from being a failure of design. A random process of package installation shows that the higher the modularity, the larger the fraction of packages working properly in a local computer. The decrease in the relative number of conflicts between packages from different modules avoids a failure in the functionality of one package spreading throughout the entire system. Some potential analogies with the evolutionary and ecological processes determining the structure of ecological networks of interacting species are discussed.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Fig. 1.
Dependencies and conflicts between packages during the installation of the Debian GNU/Linux operating system. Package i depends on package j (green arrows) if package j has to be installed first on the computer for i to work. Package i has a conflict with package j (red arrows) if package i cannot be installed if j is already on the computer. Packages, represented by nodes, are available for installation from the online servers or repositories (indicated in the figure by the cloud). The character of the interaction between packages determines which ones can be eventually installed on the computer. In this specific example, green nodes (1–4) represent packages already installed on the computer. For the network of packages in the cloud, only the package represented by the yellow node (5) can be installed on the computer. Package 6 has a conflict with an already installed package (2), and the remaining ones (7–10) depend directly or indirectly on it. In this schematic local installation process, only half of the available packages can be installed on the computer. Different temporal sequences in the order of package installation will result in different sets of installed packages or, in other words, functionalities of the operating system (i.e., fraction of installed packages of the total number of available packages).
Fig. 2.
Fig. 2.
Schematic representation of the growth of the Debian GNU/Linux operating system through its first major releases. Circles depict releases and are arranged following the temporal sequence (from left to right). Their area is proportional to the logarithm of the number of packages in each release. Three arrows represent the transition between releases: The outgoing arrow indicates the number of packages that are deprecated from one release to the other; the incoming arrows represent the number of packages that give rise to the next release (some of them are updated from the previous release and the others are new packages). The number on the last node indicates the number of packages of the last analyzed release.
Fig. 3.
Fig. 3.
Cumulative degree distribution of the number of incoming (solid lines) and outgoing (dashed lines) dependencies for the software packages of the first and last releases (on top and on bottom, respectively) of the Debian GNU/Linux operating system analyzed here. The figures depict the probability, P(k), for a package to depend on or to be needed by at least 1,2,3,…,k packages to work. Both axes are in logarithmic scale. In all cases, the best fit for the outgoing dependencies is an exponential function, whereas for the incoming dependencies is a power law (see also SI Appendix, Fig. S1).
Fig. 4.
Fig. 4.
Evolution of the modular structure of the network of dependencies between packages of the Debian GNU/Linux operating system. Packages are represented by nodes. A green arrow from package i to package j indicates that package i depends on package j, and a red arrow indicates that package i has a conflict with package j. Packages within a module (depicted by a big circle) have many dependencies between themselves and only a few with packages from other modules. During the growth of the operating system, the modular structure of the network of dependencies has increased: (I) The new packages added in successive releases depended mainly on previously existing packages within the same module, and hence, the size of the modules created in earlier releases increased over time; (ii) the number of modules also increased, although the new modules consisted only of a few new packages; and (iii) the relative number of dependencies between packages from different modules decreased. Moreover, the relative number of conflicts between packages from different modules decreased, whereas those within modules increased through the different releases of the operating system.
Fig. 5.
Fig. 5.
Changes of the modular structure (measured as the Z score of the modularity compared to a randomization of the modular structure) and functionality (measured as the Z score of the fraction of packages installed in a local computer compared to that installed from a randomization of the modular structure) for the network of dependencies of the releases of Debian GNU/Linux operating system analyzed here. The positive effects of the modular structure on the functionality shows up strongly in the last three releases (linear increase), although the exponential increase of the modularity happens in the first ones.

References

    1. Strogatz SH. Exploring complex networks. Nature. 2001;410:268–276. - PubMed
    1. Watts DJ, Strogatz SH. Collective dynamics of “small-world” networks. Nature. 1998;393:440–442. - PubMed
    1. Barabási A-L, Albert R. Emerging of scaling in random networks. Science. 1999;286:509–512. - PubMed
    1. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási A-L. The large-scale organization of metabolic networks. Nature. 2000;407:651–654. - PubMed
    1. Liljeros F, Edling CR, Amaral LAN, Stanley HE, Aberg Y. The web of human sexual contacts. Nature. 2001;411:907–908. - PubMed

Publication types