[greenstone-devel] Tough questions about SVN

From Oran Fry
Date Thu, 12 Apr 2007 16:29:38 +1200
Subject [greenstone-devel] Tough questions about SVN
Hi all,

Oran here again, with part two of the previous message. Read this part
if you want to help with the move by offering us advice, or if you are
just generally interested in the details of the move to SVN.

In the process of moving to SVN there are a few problems which I have
had to solve, or which I am still looking for solutions to. It would be
good to get some feedback on some of the solutions I have come up with,
as well as some suggestions.

The problem, in general:

Greenstone makes use of CVS. Instructions in manuals and wikis explain
how to download source code using CVS, and Greenstone makefiles and
build scripts even download source code from CVS automatically. When the
Greenstone source code repository is moved to SVN, these instructions
and build scripts will need to be updated. This might present a few
roadblocks as there will not always be an equivalent SVN command for
each CVS command that is currently used.

Furthermore, SVN lacks some of the features of CVS and replaces them
with alternatives. The features of SVN demand a different sort of
repository organisation, which the repository will have to be made to
conform to. This means that more source code modification may be required.

There is a particular problem I have been mulling over for a while which
I will discuss in more detail:

Greenstone2 and Greenstone3 share some code. Up until now this has been
made possible by a CVS module, gs2build, which pulls out all the bits of
Greenstone2 that are needed for collection building, so that Greenstone3
can use them. When you download Greenstone3 and run the ant build
commands, this module is downloaded and put into your Greenstone3
working copy. The problem lies in the fact that SVN doesn't have
modules, but 'externals' instead. An 'external' resides in a particular
directory of an SVN repository, and is basically a pointer to an entire
directory elsewhere in the repository (or a directory in an external
repository, hence the name). When you check out a directory that contains
an external, your working copy will contain a subdirectory with the
contents of the directory that the external points to. An external can
only refer to an entire directory, not select files from a directory. And
the external files can only be placed in a subdirectory of your working
copy, not the root.
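
To make the externals constraint concrete, here is a rough sketch in Python
that only builds the text of an svn:externals property and the matching
"svn propset" command, without running anything. The URL and the helper
names are made up for illustration and are not part of Greenstone.

```python
def externals_line(local_dir, url):
    """Build one line of an svn:externals property value.

    Uses the "local_dir URL" form. The target has to be a subdirectory
    of the working copy, never the working copy root itself.
    """
    if local_dir in ("", ".") or local_dir.startswith("/"):
        raise ValueError("an external must be placed in a subdirectory")
    return f"{local_dir} {url}"


def propset_command(parent_dir, lines):
    """Return the argument list that would set svn:externals on parent_dir."""
    return ["svn", "propset", "svn:externals", "\n".join(lines), parent_dir]


# Hypothetical repository URL, used only for illustration.
SHARED_URL = "http://svn.example.org/greenstone/trunk/gs2build"

line = externals_line("gs2build", SHARED_URL)
cmd = propset_command(".", [line])
print(cmd)
```

Note that the external names a whole directory; there is no way in this
property to list individual files, which is the limitation described above.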

Given these constraints I have bent my brain trying to find a use of
externals which would achieve the same effect as the CVS module. Finally
I think I have come up with a solution which would be both easy to
implement and sufficiently elegant. Basically the solution is to neglect
to set up any SVN externals at all, and leave the Greenstone3 ant build
commands to pick-and-choose the right files from the repository. Of
course, this will require the Greenstone3 build.xml file to be modified.
But I figure that's ok since it will have to be modified anyway; it
contains CVS checkout commands which need replacing with SVN checkout
commands. Right now, to find out if a particular file is considered
'shared' you would need to check the CVS modules file as well as the
Greenstone3 build.xml file, which means that information is spread
across those two files. So my proposal is to shift the information
solely to the Greenstone3 build.xml file. This would give us more
freedom regarding the organisation of the repository (and perhaps would
have been the best idea to start with!).
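
As a rough sketch of the pick-and-choose idea (written in Python rather
than Ant, purely to illustrate it), the list of shared paths could live in
one place and be turned into one "svn export" command per path. The
repository URL, the paths and the function name below are all hypothetical;
the real list would come from the current CVS modules file and build.xml.
I use "svn export" here as an assumption, since it accepts single files; an
exported copy is not a working copy, so "svn checkout" of whole directories
may be preferable where that matters.

```python
import posixpath

# Hypothetical values, for illustration only.
REPO_URL = "http://svn.example.org/greenstone/trunk/gsdl"
SHARED_PATHS = ["bin/script", "perllib", "etc"]


def export_commands(repo_url, shared_paths, dest_root):
    """Return one 'svn export' argument list per shared path.

    Each path is fetched from repo_url and placed under dest_root at the
    same relative location. Parent directories of nested targets may need
    to be created before the commands are run.
    """
    commands = []
    for path in shared_paths:
        source = repo_url.rstrip("/") + "/" + path
        target = posixpath.join(dest_root, path)
        commands.append(["svn", "export", source, target])
    return commands


commands = export_commands(REPO_URL, SHARED_PATHS, "gs2build")
for command in commands:
    print(" ".join(command))
```

The point of the sketch is only that the definition of 'shared' sits in a
single list, instead of being split between two files.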

Conveniently, this freedom might also soften the blow of another problem
which I have been grappling with:

By convention, a Subversion project consists of three directories,
called 'trunk', 'branches' and 'tags'. The main version of the project
lives in 'trunk', branches of the project people are working on live in
'branches' and snapshots of the project at a particular time live in
'tags'. It is necessary then to separate the repository into different
'projects'. The simplest answer would be to consider Greenstone to be a
single project, and have 'trunk', 'branches' and 'tags' in the root of
the repository directory (this is the default for the cvs2svn conversion
program). Or it might be better to distinguish between Greenstone2 and
Greenstone3, and have two directories 'greenstone2' and 'greenstone3'
(and perhaps a third directory 'shared') in the root of the repository,
and within each of those have 'trunk', 'branches' and 'tags'. Or it
might be better to split the repository into even more projects than this.
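
To compare the options side by side, here is a small Python sketch that
lists the top-level directories each layout would produce. The function
name is hypothetical; the project names are the ones suggested above.

```python
STANDARD_DIRS = ("trunk", "branches", "tags")


def repository_layout(projects=()):
    """List the directories for a single-project or multi-project layout.

    With no projects, the standard directories sit in the repository root.
    Otherwise each project gets its own trunk, branches and tags.
    """
    if not projects:
        return list(STANDARD_DIRS)
    return [f"{project}/{name}" for project in projects for name in STANDARD_DIRS]


single = repository_layout()
split = repository_layout(["greenstone2", "greenstone3", "shared"])
print(single)
print(split)
```

With the split layout, a branch or tag applies to one project at a time,
whereas the single-project layout branches and tags everything together.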

Any feedback is appreciated on these matters.

Kind regards and happy coding!