Breaking Bad? Semantic Versioning and Impact of Breaking Changes in Maven Central
The current content was originally published in the crossminer/maracas repository. All the presented results were computed with the corresponding tool.
The content presented in this website accompanies the paper “Breaking Bad? Semantic Versioning and Impact of Breaking Changes in Maven Central” authored by Lina Ochoa, Thomas Degueule, Jean-Rémy Falleri, and Jurgen Vinju. The paper was submitted and accepted in the Journal of Empirical Software Engineering (EMSE’21). This study is an external and differentiated replication study of the paper “Semantic Versioning and Impact of Breaking Changes in the Maven Repository” presented by Steven Raemaekers, Arie van Deursen, and Joost Visser.
Research Questions & Hypotheses
- Q1: How are semantic versioning principles applied in the Maven Central
repository in terms of BCs?
- H1: BCs are widespread without regard for semantic versioning principles.
- Q2: To what extent has the adherence to semantic versioning principles increased over time?
- H2: The adherence to semantic versioning principles has increased over time.
- Q3: What is the impact of BCs on clients?
- H3: BCs have a significant impact in terms of compilation errors in client systems.
Corpora
- Maven Dependency Dataset (MDD): This corpus is a snapshot of Maven Central Repository (MCR). It was introduced by Raemaekers et al. in this paper and it can be downloaded from here.
- Maven Dependency Graph (MDG): This corpus is a snapshot of MCR 2018. It was introduced by Benelalla et al. in this paper and it can be found at Zenodo.
Conclusions
- Q1: We conclude that although semver principles are not always strictly applied (20.1% of non-major releases are breaking), they are largely followed: 83.4% of all upgrades comply with semver regarding backwards compatibility guarantees, and the differences between semver levels are notable. Not only do minor and patch releases break less often than major releases, they also introduce fewer BCs. This leads us to reject H1.
- Q2: Our results confirm the results of the original study. They also show that the improvement over time is much higher than initially reported
for the 2005–2011 period. The tendency persists in the 2011–2018 period, although the slope is less steep. Thus, we cannot reject H2.
- Q3: We observe that in most cases breaking declarations are not used by client projects, which instead yields a low number of broken clients (7.9% for all releases). The number is even lower in the case of minor and patch upgrades. However, when a breaking declaration is used by a client, there is a high chance that it will be impacted. These results contrast with those of the original study and lead us to reject H3.
Takeaway
- The introduction of BCs is inherent to software evolution and cannot always be avoided.
- Introducing BCs is tolerable as long as they are properly announced in advance to not take clients by surprise (use versioning conventions or code-level mechanisms).
- In MCR, most releases comply with semver requirements and avoid BCs in non-major releases. Besides, the situation has significantly improved over time.
- Each ecosystem has its own policy regarding versioning conventions and the treatment of BCs. Client developers should thus pay attention to ecosystem-specific guidelines and pick an ecosystem that advocates a strict policy to minimize the risk of being impacted by unwanted changes.
- There is an open opportunity to clarify the role of code-level mechanisms and their relation with semver.
- Programming languages can incorporate better mechanisms to delimit public APIs.
- Researchers should also strive to design and implement benchmarks to
compare tools related to library evolution objectively.
- When analyzing software evolution, the design of a study protocol and the creation of the underlying datasets should be carefully performed. Sampling bias is a recurrent threat to validity that can hurt the interpretation of the results.
Annexes
Breaking Changes and Detections
In the study, we consider 31 types of breaking changes as reported on the Java Language Specification Java SE 8 Edition, and the blog post “Evolving Java-based APIs” authored by Jim Des Rivières.
We rely on JApiCmp to extract these changes from Java bytecode.
- Interface added
- Interface removed
- Class less accessible
- Class no longer public
- Class now abstract
- Class now final
- Class removed
- Class type changed
- Constructor less accessible
- Constructor removed
- Field less accessible
- Field more accessible
- Field no longer static
- Field now final
- Field now static
- Field removed
- Field type changed
- Method abstract added to class
- Method abstract now default
- Method added to interface
- Method less accessible
- Method more accessible
- Method new default
- Method no longer static
- Method now abstract
- Method now final
- Method now static
- Method removed
- Method return type changed
- Superclass added
- Superclass removed
Details on how we detect these breaking changes are provided here.
Maracas
Datasets and Notebooks
The main datasets and notebooks used for this study are contained in the maven-api-dataset repository (warning: > 2GB). This repository contains:
- Jupyter notebooks for every research question and both corpora (MDD and MDG). They can be used to play with the data, re-generate the paper’s figures, and run the statistical tests.
- The code used to compute the datasets, deltas, and detections.
- CSV files containing the annotations extracted from the top-1000 most popular libraries on Maven Central, the 148,253 artifacts extracted from the MDD, and the most popular version suffixes in the MDG.
- CSV files containing the raw data used to answer every research question, automatically generated by our analysis scripts, imported, and analyzed in the notebooks.