The Demise of Open Source Hosting Providers Codehaus and Google Code

At the turn of the millennium, a new breed of open-source hosting platforms was created to provide free hosting for open-source projects. The inaugural hosting service was SourceForge, created by VA Linux as a means to host open-source projects in 1999, to support their VA Linux product created in 1993. The repository provided a location for developers to host code (with CVS), along with an issue tracking system, mailing lists and hosting for downloads. By the end of 2001, over 30,000 projects were hosted on SourceForge. By 2006 the number of projects had grown to 100k, and adding Google Ads provided a means of income to support the hosting site. 2006 also saw Subversion being added to the platform.

Whilst the back-end code for SourceForge was initially available under an open-source license, the software went closed source for a few years before being sold to Collab.Net in 2007, where it went on to become TeamForge. The hosting platform continued to make money through advertising, and ownership changed a few times. The software was re-written in 2010 and has subsequently been open-sourced under an Apache license and is now branded Apache Allura.

Other open-source forges were created; Codehaus (pronounced Code House) was one popular site, created in 2003, a few years after SourceForge. After starting with projects such as Jaxen and PicoContainer, its popularity grew, especially in the Java and Maven community, as Jason van Zyl – the creator of Maven – popularised it to the point where many Maven plug-ins are hosted on or have documentation on Codehaus. The Groovy language, up until its recent transition to Apache, was documented on groovy.codehaus.org.

However, in April 2008 a new startup launched GitHub, a source code hosting platform based entirely around the Git version control system. This popularised the distributed version control system Git, created a few years earlier in 2005. The power and elegance of DVCS eclipsed centralised version control systems, and in a relatively short space of time projects migrated from CVS and Subversion to Git. Although SourceForge added support for Git in March 2009, the early mover advantage of GitHub saw users flock to it – over 40k repositories by the end of 2008, and 77k repositories by March 2009. GitHub continued to grow, and had 1m repositories by mid-2010, 2m repositories by mid-2011 followed by 10m repositories by the end of 2013. By way of comparison, SourceForge claimed 430k projects at the time of writing.

GitHub became larger than SourceForge in 2011 according to Redmonk, and although SourceForge claimed to still be relevant, it was clear which way things were going. Part of this may be due to the way SourceForge populates ads on the site; adverts that included a 'Download' button but led to virus installs resulted in an acknowledgement in 2010 that things needed to change, followed by exactly the same acknowledgement in 2015. It also didn't help that the downloads service started serving bundleware with downloads, which was identified by the GIMP developers and was the reason they switched back to FTP. This then became a driver to host projects elsewhere, which helped GitHub's further growth (GIMP is now available on GitHub).

Other DVCS hosting providers were created in a similar time frame (like BitBucket and eventually Google Code) but they bet on the wrong DVCS implementation, which lacked the performance and simplicity of Git. By the time Google Code added Git support in 2011 it was too little, too late, and the repository never saw the meteoric growth that the early adopters enjoyed.

For all these reasons, traditional repository hosting services like Codehaus (which added Git support before SourceForge, in 2008) saw a lack of traffic, and GitHub became the new go-to site for repository creation and subsequently migration. In an announcement posted on the site at the start of March, the Codehaus website noted that:

The time has come to end the era of Codehaus.

With increasing diversity in opensource hosting platforms like Github and Bitbucket - who are meeting the needs of 1000s of projects - it makes sense to end the opensource hosting services of Codehaus.

Codehaus has operated at a loss for several years now (we're not powered by venture capital), and can not compete with the army of developers and integrated product offerings that are now commonplace.

The platform was to be terminated at the end of February 2015, however SonarQube has graciously offered to sponsor Codehaus for a few months to aid in the transition. Projects and services will be progressively taken offline from April 2nd 2015 onwards.  Inactive projects will be migrated first, followed by active projects that are ready to migrate early. Most projects and services will be terminated around May 17th 2015 unless otherwise agreed. 

Dustin Marx writes in "Codehaus: The Once Great House of Code Has Fallen":

Codehaus has played a significant role in the world of Java development. An interesting and brief retrospective on Codehaus's glory days can be found in the one of the comments on the Java sub-reddit thread "Codehaus, birthplace of many Java OSS projects, coming to an end." A bit more history of Codehaus can be found, for now, in Codehaus | About | History. Some Java-related projects hosted on Codehaus include AspectWerkz, Castor, PicoContainer, and XStream. Well-known Java-related projects that were formerly on Codehaus before moving somewhere else include JMock, Mule, Jackson, and XDoclet. Until recently, Groovy and its documentation were accessed at http://groovy.codehaus.org/ (now accessed via http://groovy-lang.org/). [Ed: Groovy is now moving to Apache]

Just as GeoCities was overtaken by other web hosting sites and social media, and just as Dr. Dobb's was overtaken by a plethora of online content covering everything from low-level details to high-level breadth, Codehaus was overtaken by SourceForge and Google Code and eventually GitHub has overtaken all of them. 

As if responding to the closure, Google announced that it too was shutting down Google Code, after having previously disabled downloads for Google Code in May 2013:

When we started the Google Code project hosting service in 2006, the world of project hosting was limited. We were worried about reliability and stagnation, so we took action by giving the open source community another option to choose from. Since then, we’ve seen a wide variety of better project hosting services such as GitHub and Bitbucket bloom. Many projects moved away from Google Code to those other systems. To meet developers where they are, we ourselves migrated nearly a thousand of our own open source projects from Google Code to GitHub.

As developers migrated away from Google Code, a growing share of the remaining projects were spam or abuse. Lately, the administrative load has consisted almost exclusively of abuse management. After profiling non-abusive activity on Google Code, it has become clear to us that the service simply isn’t needed anymore.

March 12, 2015 - New project creation disabled.
August 24, 2015 - The site goes read-only. You can still checkout/view project source, issues, and wikis.
January 25, 2016 - The project hosting service is closed.

SourceForge has responded to the upcoming closure of Google Code with advice on how to move from Google Code to SourceForge, including a Google Code importer that will migrate not only the code but also the issues, wiki and other metadata, whilst Google themselves have created a service for exporting projects from Google Code to GitHub.

With Microsoft opening some of its code on GitHub as well as developing code in the clear like TypeScript, msbuild and dotnet, hosting providers like CodePlex may be next in line for consolidation.

Although Subversion itself hasn't migrated to Git yet, it's clear that DVCS has won out in mindshare and specifically Git has become the repository manager of choice for all but a few projects (OpenJDK is one of the few that are on Mercurial). Together with new tools like Atlassian Stash (the enterprise version equivalent to Bitbucket), Gerrit (and its enterprise supported version at GerritHub) and GitLab (along with enterprise versions), this suggests that the next decade of source hosting forges will be distributed and social by default.

Project owners with repositories on Codehaus should migrate before May, whilst project owners with repositories on Google Code should migrate before January next year.
