Some stats on the evolution of the WordPress codebase
May 13, 2015
The WordPress project advances thanks to the work of the many people who, every single day, report bugs, contribute patches or help in many other ways. I believe it’s important that from time to time we take a step back from this frenzied day-to-day activity and look at how the project is evolving, to make sure the data trends confirm we’re going in the right direction.
Today, I’ll be providing some metrics and charts focusing on the evolution of the WordPress codebase. The data has been calculated using Gitana, a SQL-based Git repository inspector. Valerio Cosentino, the main developer behind Gitana, has been kind enough to apply Gitana to the WordPress Git mirror on GitHub.
Keep in mind that this is just the tip of the iceberg. There is plenty of interesting data that could be calculated (based not only on the repository but also on Trac discussions; see, e.g., some of the things we can analyze for GitHub projects). If you have a specific question or request in mind, please fire away and we’ll see what we can do. There are quite a lot of software visualization tools that could be applied to WordPress to get a different view on the project!
Time between versions
The public WordPress Roadmap says that “After the 2.1 release, we decided to adopt a regular release schedule every 3-4 months”. Well, I’d say it took some time to adjust, but looking at the latest releases we’re getting good at that.
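For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of how the gap between consecutive releases could be computed. It is not how Gitana does it; the function name `days_between_releases` is my own, and the versions and dates in the example are placeholders rather than real WordPress release dates.

```python
from datetime import date


def days_between_releases(releases):
    """Return (previous_version, next_version, days) for each consecutive pair.

    `releases` maps a version label to its release date.
    """
    # Order the releases chronologically before pairing them up.
    ordered = sorted(releases.items(), key=lambda item: item[1])
    return [
        (prev_name, next_name, (next_date - prev_date).days)
        for (prev_name, prev_date), (next_name, next_date) in zip(ordered, ordered[1:])
    ]


# Placeholder versions and dates, NOT real WordPress release dates.
example_releases = {
    "x.0": date(2015, 1, 1),
    "x.1": date(2015, 4, 1),
    "x.2": date(2015, 7, 15),
}
gaps = days_between_releases(example_releases)
for prev_name, next_name, days in gaps:
    print(f"{prev_name} -> {next_name}: {days} days")
```

A 3-4 month schedule would show up here as gaps of roughly 90 to 120 days.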
Number of commits per version
How much work goes into each WordPress version? Quite a lot, I’d say: the following chart shows that each version involves more than 1000 commits to the codebase. The number of commits per version is becoming more uniform (which makes sense given that the release cycle is also becoming more regular).
And what about the files included in each version? The data shows that:
The number of files shows slight but steady growth (a typical “disease” of any project).
Once a file gets into the codebase, it doesn’t get out. Only a very small percentage of commit actions on files are deletions, somewhat more are insertions, and most are modifications of existing files. Note that the percentages refer not to the total number of files but to the number of actions on files. Looking at this data I do wonder whether all the files are really necessary (but this is something only a mostly manual analysis could answer).
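To make the “actions on files” idea concrete, here is a rough sketch that computes the share of each action type. It assumes the input is text in the shape produced by `git log --name-status --pretty=format:` (one status letter, a tab, then the path), which is not necessarily how Gitana computes it. The helper name `action_percentages` and the sample paths other than version.php are made up for illustration.

```python
from collections import Counter


def action_percentages(name_status_output):
    """Return the share (in %) of each file action found in name-status output."""
    counts = Counter()
    for line in name_status_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank lines and anything that is not a status line
        # Keep only the first letter: A(dded), M(odified), D(eleted), R(enamed)...
        counts[parts[0][0]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {action: 100.0 * n / total for action, n in counts.items()}


# Hypothetical sample: four actions on files spread over two commits.
sample = (
    "M\twp-includes/version.php\n"
    "A\twp-includes/new-file.php\n"
    "\n"
    "M\twp-includes/version.php\n"
    "D\twp-includes/old-file.php\n"
)
shares = action_percentages(sample)
print(shares)
```

Note that version.php counts twice in the sample: the percentages are over actions, not over distinct files.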
As a curiosity, below you’ll find the top 10 files according to the number of modifications. I guess it should not come as a surprise that the most modified file is version.php. By the way, there are over 100 files that have not been modified a single time since they were added.
[Table: top 10 files by the number of times the file has been modified]
The success of an open project largely depends on its capacity to attract a significant number of contributors who decide to devote some of their time to the project. I believe a large number of such contributions should come from “external” people (as opposed to the “internal” people, i.e. lead developers and other core people with commit access rights).
The following chart shows the accumulation of commits in the WordPress codebase, separating commits between the two groups of contributors (since patches from external contributors are committed by core contributors, the distinction is made by detecting commits that use the @props tag to thank the original submitter).
WordPress scores quite well in this “openness” metric, especially when compared with other projects. Still, there are plenty of opportunities for improvement here.
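The detection described above could be sketched roughly as follows. The exact matching rule used for the charts is not given here, so the regular expression is my assumption: it treats any commit message containing the word “props”, with or without a leading “@”, as an external contribution. The function names and sample messages are hypothetical.

```python
import re

# Assumed rule: "props" or "@props" as a standalone word, in any letter case.
PROPS_PATTERN = re.compile(r"(?<!\w)@?props\b", re.IGNORECASE)


def is_external(message):
    """Return True if the commit message credits an original submitter."""
    return PROPS_PATTERN.search(message) is not None


def cumulative_counts(messages):
    """Return running (internal, external) totals after each commit, in order."""
    internal = external = 0
    series = []
    for message in messages:
        if is_external(message):
            external += 1
        else:
            internal += 1
        series.append((internal, external))
    return series


# Hypothetical commit messages, oldest first.
messages = [
    "Fix typo in comment.",
    "Better escaping. Props alice.",
    "Bump version.",
    "Add filter, @props bob",
]
series = cumulative_counts(messages)
print(series)
```

A rule this simple can misclassify commits, for instance when a core committer is thanked by another core committer, so the resulting numbers are only an approximation.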
But even if WordPress is the result of the hard work of many people, there are always some people that are more deeply involved than others. So I think it’s fair to finish the post thanking the top 5 committers of the project. We show both the top 5 committers counting only commits with their own contributions and the top 5 committers counting only commits with external contributions (both are necessary: without people taking the time to review and guide external contributors, the pace of WordPress would be much slower).
I find it impressive that these five people are behind more than 50% of the total project commits (although this can also be considered a risk, based on the bus factor metric).
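The concentration figure can be checked with a few lines of code. This is only a sketch under the assumption that we have one committer name per commit; the function name `top_committer_share` and the sample names are placeholders, and it measures commit concentration only, not a full bus factor analysis.

```python
from collections import Counter


def top_committer_share(committers, n=5):
    """Return the top-n committers and their combined share (in %) of all commits.

    `committers` holds one committer name per commit.
    """
    counts = Counter(committers)
    if not counts:
        return [], 0.0
    top = counts.most_common(n)
    share = 100.0 * sum(count for _, count in top) / sum(counts.values())
    return top, share


# Placeholder data: 10 commits spread over three hypothetical committers.
committers = ["a"] * 6 + ["b"] * 3 + ["c"]
top, share = top_committer_share(committers, n=2)
print(top, share)
```

The higher the share held by a handful of people, the more the project depends on them staying around.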
Featured image adapted from fyrenwater.