Integrating Subversion, Trac, and Mailman into a development environment: part 1, design
July 1, 2010
Many software development projects are hosted in a 'forge', a web application that provides tools such as source code management, mailing lists, and a bug tracker. We were recently asked to provide such an environment, based on Trac, Mailman, and Subversion. We thought it might be informative to document our findings.
The Netherlands Bioinformatics Centre (NBIC) hosts a number of software development projects. Hosting these projects was the responsibility of SARA, the Dutch national High Performance Computing and e-Science Support Center. SARA has asked us to design a project development environment that meets a few requirements:
- Source code management through Subversion, and at a later stage also Git.
- Project management using Trac for documentation, bug and request tracking and possibly additional functions.
- Mailing lists for users and developers using Mailman.
- A system that integrates these components and adds functions for project management and user registration.
- By default, access to a project is limited, but project managers should be able to easily grant extended access to a registered user.
The individual components (Trac, Mailman, and Subversion) can be considered best-of-breed for their respective tasks, and combined they should provide a very powerful platform. A system that integrates source code management, issue tracking, and mailing lists is commonly known as a 'forge'. Well-known forges include SourceForge and Google Code. In fact, NBIC was already using GForge for their software projects. The expectation is that the system will eventually host perhaps two hundred projects, but certainly not thousands.
Besides the requirements above, we decided to add a few requirements of our own that we considered essential:
- There should be one single storage for authentication data, to avoid inconsistencies.
- It should be possible to install several project development environments on a single server. Conversely, while the system is not intended for vast numbers of projects, scalability should be taken into account, in the sense that it should be possible to distribute the system over several servers.
- Projects should be 'exportable' in such a way that a project may be migrated to another system, be it a similar environment or separate SCM, Mailman and Trac installations.
- For the sake of stability, project managers should not be able to add extensions such as Trac plugins or Subversion hooks. These matters are handled by the system administrators.
Besides these points, there are the usual considerations about system security and data integrity.
Based on the requirements we've decided that authentication will be handled by HTTP Basic Authentication, since that is supported by Subversion and Trac right out of the box. The problem of broadcasting account data in plain text (ok, a base64 encoded header, but let's not split hairs) with each request will be resolved by restricting access to the system to SSL. Access privileges such as Subversion commit privileges will be controlled by grouping users.
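To make the point about the base64 header concrete, here is a minimal Python sketch (not part of our tooling; the function names are made up for illustration). It builds a Basic Authentication header and then recovers the credentials from it without any key, which is why the connection itself has to be encrypted.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_auth(header: str) -> tuple[str, str]:
    """Recover the credentials from the header value; no secret is needed."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The user name cannot contain a colon, so split on the first one only.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    header = basic_auth_header("alice", "s3cret")
    print(header)
    print(decode_basic_auth(header))
```

Anyone who can read the request can run the second function, so base64 offers no confidentiality of its own.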
The hardest nut to crack for this system is the way data are stored, or actually, bridging the different storage systems that the components use. Subversion stores repository state in its own directories, Mailman is notoriously tricky to manage, with configuration and subscribers stored in Python pickle files, and Trac uses both a configuration file and a database. Ideally, we bring together as much data into a single storage system as is possible and practical, without having to modify the software itself. We very much prefer to get software updates through the system package managers, so modifications to system software are rarely an option. So we've decided that, at least for now, we let Mailman and Subversion work with their conventional ways of storing data, and we use additional tools to interface with them.
For database storage we've decided to go with PostgreSQL. The public schema is used to store authentication data and some generic project data, and each instance of Trac gets its own schema as a backend. This allows us to store all projects in a single database, which is easier to maintain and access. If at a later stage other systems require a database backend, these can likely also find a place in the project schemas. Apache can be configured to use PostgreSQL for authentication, with groups that are created per project.
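The schema-per-project layout could be sketched roughly as follows. This is an illustration in Python, not our actual installation tool: the database name, user, and host are hypothetical defaults, and the rule for valid project names is an assumption. The `?schema=` parameter in the connection string is the option Trac documents for pointing a PostgreSQL-backed environment at a specific schema.

```python
import re

# Assumption: project identifiers are short lowercase names, so they can be
# used as PostgreSQL schema names without further escaping.
VALID_PROJECT = re.compile(r"[a-z][a-z0-9_]{0,30}")


def check_project_name(project: str) -> str:
    """Reject anything that is not safe to use as a schema name."""
    if not VALID_PROJECT.fullmatch(project):
        raise ValueError(f"unsafe project name: {project!r}")
    return project


def schema_statement(project: str) -> str:
    """SQL that creates the schema holding this project's Trac tables."""
    return f'CREATE SCHEMA "{check_project_name(project)}";'


def trac_database_url(project: str, db_user: str = "forge",
                      db_host: str = "localhost", db_name: str = "forge") -> str:
    """Connection string for the [trac] database setting of one project.

    The user, host, and database names are hypothetical defaults.
    """
    check_project_name(project)
    return f"postgres://{db_user}@{db_host}/{db_name}?schema={project}"


if __name__ == "__main__":
    print(schema_statement("demo"))
    print(trac_database_url("demo"))
```

Every project then shares one database and one set of credentials, while its Trac tables stay separated from those of other projects.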
The whole installation of an environment is placed in a single directory, in which the following subdirectories are present:
- cache/ - the location where various tools can write temporary files.
- cli/ - a directory containing command-line tools for managing the system.
- downloads/ - the location where downloadable files are stored. Each project has a subdirectory here.
- http/ - the document root of the web environment. Each project has its Trac environment here.
- http/admin/ - the location of the web-based application that is used to manage projects and users.
- http/index/ - the frontend application of the environment with a project index.
- http/style/ - the location of resources such as style sheets and images that allow branding of the environment.
- lib/ - the location where libraries and packages can be put when these cannot be installed systemwide.
- logs/ - the location of the server logs.
- templates/ - the directory containing templates for various configurations and other documents.
- svn/ - the root directory of all Subversion repositories. Each project has its own repository here.
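The layout above can be created with a few lines of code. The following Python sketch is only an illustration of the directory structure (our own tools are not written in Python, and the function names are hypothetical). It creates the skeleton under a given root and returns the per-project locations mentioned in the list.

```python
import tempfile
from pathlib import Path

# The subdirectories listed above, relative to the environment root.
SUBDIRECTORIES = [
    "cache", "cli", "downloads", "http", "http/admin", "http/index",
    "http/style", "lib", "logs", "templates", "svn",
]


def create_environment(root: Path) -> list[Path]:
    """Create the directory skeleton; existing directories are left alone."""
    created = []
    for name in SUBDIRECTORIES:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def project_paths(root: Path, project: str) -> dict[str, Path]:
    """Locations that belong to a single project inside the environment."""
    return {
        "downloads": root / "downloads" / project,
        "trac": root / "http" / project,
        "svn": root / "svn" / project,
    }


if __name__ == "__main__":
    # Use a throwaway directory so the example does not touch a real system.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "forge"
        create_environment(root)
        for kind, path in project_paths(root, "demo").items():
            print(kind, path.relative_to(root))
```

Because everything lives under one root, installing a second environment on the same server comes down to choosing a different root directory.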
There were a number of components that needed to be built:
- an index application that serves as the front end to the environment. This should also provide means for registration and password recovery.
- a management application that is used to administer users and projects, and in particular, control which users have extended access to which projects.
- tools to synchronize Mailman with the information in the user database.
- an installation tool that will set up all the resources for a project.
We're currently using Bash for the set of command line tools, and PHP for the web applications.
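The core of synchronizing Mailman with the user database is a comparison of two sets of addresses. The sketch below shows that step in Python purely for illustration; it is not our Bash tooling, the function name is hypothetical, and treating addresses as case-insensitive is an assumption.

```python
from typing import Iterable


def normalize(addresses: Iterable[str]) -> set[str]:
    """Strip whitespace and lowercase addresses (an assumed convention)."""
    return {a.strip().lower() for a in addresses if a.strip()}


def plan_sync(database_members: Iterable[str],
              list_subscribers: Iterable[str]) -> tuple[list[str], list[str]]:
    """Return (to_add, to_remove) so that the list matches the database."""
    wanted = normalize(database_members)
    current = normalize(list_subscribers)
    to_add = sorted(wanted - current)      # in the database, not on the list
    to_remove = sorted(current - wanted)   # on the list, not in the database
    return to_add, to_remove


if __name__ == "__main__":
    # Hypothetical data: what the user database says versus what Mailman has.
    to_add, to_remove = plan_sync(
        ["Alice@example.org", "bob@example.org"],
        ["bob@example.org", "carol@example.org"],
    )
    print("add:", to_add)
    print("remove:", to_remove)
```

In a setup like ours, the current subscribers could come from Mailman's `list_members` script, and the two resulting lists could be passed to `add_members` and `remove_members`, so that Mailman's own storage is never modified directly.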
In the followups to this article we will address the way we set up Apache, how we integrated Mailman, and what administrative tools we developed. If all goes as planned, by then we will have set up a project for this tool, and we can open it up for contributions.