cvs2svn FAQ

How-to:

  1. How can I convert my CVS repository one module at a time?
  2. How can I convert part of a CVS repository?
  3. How can I convert separate projects in my CVS repository into a single Subversion repository?
  4. How can I convert project foo so that trunk/tags/branches are inside of foo?
  5. I want a single project but tag-rewriting rules that vary by subdirectory. Can this be done?
  6. How can I convert a CVSNT repository?

Problems:

  1. I get an error "A CVS repository cannot contain both repo/path/file.txt,v and repo/path/Attic/file.txt,v". What can I do?
  2. Using cvs2svn 1.3.x, I get an error "The command '['co', '-q', '-x,v', '-p1.1', '-kk', '/home/cvsroot/myfile,v']' failed" in pass 8.

Getting help:

  1. How do I get help?
  2. How do I report a bug?
  3. How can I produce a useful test case?

How-to:

How can I convert my CVS repository one module at a time?

If you need to convert certain CVS modules (in one large repository) to Subversion now and other modules later, you may want to convert your repository one module at a time. This situation is typically encountered in large organizations where each project has a separate lifecycle and schedule, and a one-step conversion process is not practical.

First you have to decide whether you want to put your converted projects into a single Subversion repository or multiple ones. This decision mostly depends on the degree of coupling between the projects and is beyond the scope of this FAQ. See the Subversion book for a discussion of repository organization.

If you decide to convert your projects into separate Subversion repositories, then please follow the instructions in How can I convert part of a CVS repository? once for each repository.

If you decide to put more than one CVS project into a single Subversion repository, then please follow the instructions in How can I convert separate projects in my CVS repository into a single Subversion repository?.

How can I convert part of a CVS repository?

This is easy: simply run cvs2svn normally, passing it the path of the project subdirectory within the CVS repository. Since cvs2svn ignores any files outside of the path it is given, other projects within the CVS repository will be excluded from the conversion.

Example: You have a CVS repository at path /path/cvsrepo with projects in subdirectories /path/cvsrepo/foo and /path/cvsrepo/bar, and you want to create a new Subversion repository at /path/foo-svn that includes only the foo project:

    $ cvs2svn -s /path/foo-svn /path/cvsrepo/foo

How can I convert separate projects in my CVS repository into a single Subversion repository?

cvs2svn supports multiproject conversions, but you have to use the options file method to start the conversion. In your options file, you simply call ctx.add_project() once for each sub-project in your repository. For example, if your CVS repository has the layout:

  /project_a
  /project_b

and you want your Subversion repository to be laid out like this:

   project_a/
      trunk/
         ...
      branches/
         ...
      tags/
         ...
   project_b/
      trunk/
         ...
      branches/
         ...
      tags/
         ...

then you need to have a section like this in your options file:

ctx.add_project(
    Project(
        'my/cvsrepo/project_a',
        'project_a/trunk',
        'project_a/branches',
        'project_a/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )
ctx.add_project(
    Project(
        'my/cvsrepo/project_b',
        'project_b/trunk',
        'project_b/branches',
        'project_b/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )
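
If there are many sub-projects, the repeated ctx.add_project() calls can be generated with a loop, since the options file is ordinary Python. The following is only a sketch: project_layout() is a hypothetical helper (not part of cvs2svn), and 'my/cvsrepo' is the placeholder path used above. The final loop is left as a comment because ctx and Project are only defined inside a cvs2svn options file.

```python
def project_layout(cvs_root, name):
    """Return (cvs_path, trunk, branches, tags) for one sub-project.

    Hypothetical helper for illustration; it only builds path strings.
    """
    return (
        '%s/%s' % (cvs_root, name),
        '%s/trunk' % name,
        '%s/branches' % name,
        '%s/tags' % name,
    )


# One layout tuple per sub-project, matching the example layout above.
layouts = [
    project_layout('my/cvsrepo', name)
    for name in ['project_a', 'project_b']
]

# Inside an options file, where ctx and Project are available:
#
# for cvs_path, trunk, branches, tags in layouts:
#     ctx.add_project(
#         Project(cvs_path, trunk, branches, tags, symbol_transforms=[]))
```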

How can I convert project foo so that trunk/tags/branches are inside of foo?

If foo is the only project that you want to convert, then either run cvs2svn like this:

   $ cvs2svn --trunk=foo/trunk --branches=foo/branches --tags=foo/tags CVSREPO/foo

or use an options file that defines a project like this:

ctx.add_project(
    Project(
        'my/cvsrepo/foo',
        'foo/trunk',
        'foo/branches',
        'foo/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )

If foo is not the only project that you want to convert, then you need to do a multiproject conversion; see How can I convert separate projects in my CVS repository into a single Subversion repository? for more information.

I want a single project but tag-rewriting rules that vary by subdirectory. Can this be done?

This is an example of how the cvs2svn conversion can be customized using Python.

Suppose you want to write symbol transform rules that are more complicated than "replace REGEXP with PATTERN". This can easily be done by adding just a little bit of Python code to your options file.

When a symbol is encountered, cvs2svn iterates through the list of SymbolTransform objects defined for the project. For each one, it calls symbol_transform.transform(cvs_file, symbol_name). That method can return any legal symbol name, which will be used in the conversion instead of the original name.

To use this feature, you will have to use an options file to start the conversion. You then write a new SymbolTransform class that inherits from RegexpSymbolTransform but checks the path before deciding whether to transform the symbol. Add the following to your options file:

from cvs2svn_lib.symbol_transform import RegexpSymbolTransform

class MySymbolTransform(RegexpSymbolTransform):
    def __init__(self, path, pattern, replacement):
        """Transform only symbols that occur within the specified PATH."""

        self.path = path
        RegexpSymbolTransform.__init__(self, pattern, replacement)

    def transform(self, cvs_file, symbol_name):
        # Is the file within the path we are interested in?
        if cvs_file.cvs_path.startswith(self.path + '/'):
            # Yes -> Allow RegexpSymbolTransform to transform the symbol:
            return RegexpSymbolTransform.transform(
                    self, cvs_file, symbol_name)
        else:
            # No -> Return the symbol unchanged:
            return symbol_name

# Note that we use a Python loop to fill the list of symbol_transforms:
symbol_transforms = []
for subdir in ['project1', 'project2', 'project3']:
    symbol_transforms.append(
        MySymbolTransform(
            subdir,
            r'^release-(\d+)_(\d+)$',
            r'%s-release-\1.\2' % subdir))

# Now register the project, using our own symbol transforms:
ctx.add_project(
    Project(
        'your_cvs_path',
        'trunk',
        'branches',
        'tags',
        symbol_transforms=symbol_transforms))

This example causes any symbol under "project1" that looks like "release-3_12" to be transformed into a symbol named "project1-release-3.12", whereas if the same symbol appears under "project2" it will be transformed into "project2-release-3.12".
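
Before starting a long conversion, it can help to try the pattern and replacement on a few sample names outside of cvs2svn. The sketch below assumes that RegexpSymbolTransform behaves like a plain regular-expression substitution on the symbol name. preview_transform() is a hypothetical stand-in written for this FAQ, not a cvs2svn function, and the file paths are made up.

```python
import re

PATTERN = r'^release-(\d+)_(\d+)$'
SUBDIRS = ['project1', 'project2', 'project3']


def preview_transform(cvs_path, symbol_name, subdirs=SUBDIRS, pattern=PATTERN):
    """Approximate the per-subdirectory rewrite for a single symbol.

    cvs_path is the file's path relative to the project root.
    """
    for subdir in subdirs:
        # Same path test as MySymbolTransform.transform():
        if cvs_path.startswith(subdir + '/'):
            replacement = r'%s-release-\1.\2' % subdir
            return re.sub(pattern, replacement, symbol_name)
    # The file is outside every listed subdirectory: leave the name alone.
    return symbol_name


print(preview_transform('project1/src/main.c', 'release-3_12'))
print(preview_transform('project2/README', 'release-3_12'))
print(preview_transform('docs/README', 'release-3_12'))
```

Symbols that do not match the pattern, and files outside the listed subdirectories, come back unchanged.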

How can I convert a CVSNT repository?

CVSNT is a version control system that started out by adding support for running CVS under Windows NT. Since then it has made numerous extensions to the RCS file format, to the point where CVS compatibility does not imply CVSNT compatibility with any degree of certainty.

cvs2svn might happen to successfully convert a CVSNT repository, especially if the repository has never had any CVSNT-only features used on it, but this use is not supported and should not be expected to work.

If you want to experiment with converting a CVSNT repository, then please consider the following suggestions:

Patches to support the conversion of CVSNT repositories would, of course, be welcome.


Problems:

I get an error "A CVS repository cannot contain both repo/path/file.txt,v and repo/path/Attic/file.txt,v". What can I do?

Background: Normally, if you have a file called path/file.txt in your project, CVS stores its history in a file called repo/path/file.txt,v. But if file.txt is deleted on the main line of development, CVS moves its history file to a special Attic subdirectory: repo/path/Attic/file.txt,v. (If the file is recreated, then it is moved back out of the Attic subdirectory.) Your repository should never contain both of these files at the same time.

This cvs2svn error message thus indicates a mild form of corruption in your CVS repository. The file has two conflicting histories, and even CVS does not know the correct history of path/file.txt. The corruption was probably created by using tools other than CVS to back up or manipulate the files in your repository. With a little work you can learn more about the two histories by viewing each of the file.txt,v files in a text editor.

There are four straightforward approaches to fixing the repository corruption, but each has potential disadvantages. Remember to make a backup before starting. Never run cvs2svn on a live CVS repository--always work on a copy of your repository.

  1. Remove the Attic version of the file and restart the conversion. Sometimes it represents an old version of the file that was deleted long ago, and it won't be missed. But this completely discards one of the file's histories, probably causing file.txt to be missing in older historical revisions. (For what it's worth, this is probably how CVS would behave in this situation.)
          # You did make a backup, right?
          $ rm repo/path/Attic/file.txt,v
        
  2. Remove the non-Attic version of the file and restart the conversion. This might be appropriate if the non-Attic version has less important content than the Attic version. But this completely discards one of the file's histories, probably causing file.txt to be missing in recent historical revisions.
          # You did make a backup, right?
          $ rm repo/path/file.txt,v
        
  3. Rename the Attic version of the file and restart the conversion. This avoids losing history, but it changes the name of the Attic version of the file to file-from-Attic.txt whenever it appeared, and might thereby cause revisions to be broken.
          # You did make a backup, right?
          $ mv repo/path/Attic/file.txt,v repo/path/Attic/file-from-Attic.txt,v
        
  4. Rename the non-Attic version of the file and restart the conversion. This avoids losing history, but it changes the name of the non-Attic version of the file to file-not-from-Attic.txt whenever it appeared, and might thereby cause revisions to be broken.
          # You did make a backup, right?
          $ mv repo/path/file.txt,v repo/path/file-not-from-Attic.txt,v
        

If you run cvs2svn on a case-insensitive operating system, it is possible to get this error even if the name of the file in the Attic differs in case from the name of the file outside of the Attic. This could happen, for example, if the CVS repository was served from a case-sensitive operating system at some time. A workaround for this problem is to copy the CVS repository to a case-sensitive operating system and convert it there.
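
To find every affected file in one go, rather than fixing them one conversion attempt at a time, a copy of the repository can be scanned for ",v" files that exist both inside and outside of an Attic directory. This is a minimal sketch, not a cvs2svn tool: find_attic_conflicts() is a hypothetical name, and the ignore_case flag is an assumption added to cover the case-insensitive situation described above.

```python
import os


def find_attic_conflicts(repo_root, ignore_case=False):
    """Return sorted (non_attic_path, attic_path) pairs that collide."""
    conflicts = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        if os.path.basename(dirpath) != 'Attic':
            continue
        parent = os.path.dirname(dirpath)
        # RCS files that live directly beside the Attic directory:
        siblings = [
            n for n in os.listdir(parent)
            if n.endswith(',v') and os.path.isfile(os.path.join(parent, n))
        ]
        for name in filenames:
            if not name.endswith(',v'):
                continue
            for sibling in siblings:
                if ignore_case:
                    same = sibling.lower() == name.lower()
                else:
                    same = sibling == name
                if same:
                    conflicts.append((os.path.join(parent, sibling),
                                      os.path.join(dirpath, name)))
    return sorted(conflicts)


# Example (run against a copy of the repository, never the live one):
#
# for outside, inside in find_attic_conflicts('/path/to/copy/of/repo'):
#     print('%s conflicts with %s' % (outside, inside))
```

The script only reports conflicts. Which of the four approaches to apply to each file is still a decision for you to make.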

Using cvs2svn 1.3.x, I get an error "The command '['co', '-q', '-x,v', '-p1.1', '-kk', '/home/cvsroot/myfile,v']' failed" in pass 8.

By default, cvs2svn uses the "co" program from RCS to read the contents of files in your archive. (See the requirements section of the documentation.) This error most likely means that "co" could not be run on your system. The solution to this problem is either to install RCS, or to ensure that CVS is installed and use cvs2svn's --use-cvs option.
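
A quick way to see which of the two programs is available is to look for them on your PATH. The sketch below is a standalone check, not part of cvs2svn. missing_tools() is a hypothetical helper, and it relies on shutil.which(), which needs Python 3.3 or later, so it may have to be run with a different Python interpreter than the one used for cvs2svn itself.

```python
import shutil


def missing_tools(tools=('co', 'cvs')):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if 'co' in missing and 'cvs' in missing:
    print('Neither co (RCS) nor cvs was found; install one of them.')
elif 'co' in missing:
    print('co (RCS) was not found; install RCS or use --use-cvs.')
else:
    print('co was found on PATH.')
```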

Getting help:

How do I get help?

There are several sources of help for cvs2svn:

How do I report a bug?

cvs2svn is an open source project that is largely developed and supported by volunteers in their free time. Therefore please try to help out by reporting bugs in a way that will enable us to help you efficiently.

The first question is whether the problem you are experiencing is caused by a cvs2svn bug at all. A large fraction of reported "bugs" are caused by problems with the user's CVS repository, especially trying to convert a CVSNT repository with cvs2svn. Please also double-check the manual to be sure that you are using the command-line options correctly.

A good way to locate potential repository corruption is to use the shrink_test_case.py script (which is located in the contrib directory of the cvs2svn source tree). This script tries to find the minimum subset of files in your repository that still shows the same problem. Warning: Only apply this script to a backup copy of your repository, as it destroys the repository that it operates on! Often this script can narrow the problem down to a single file which, as often as not, is corrupt in some way. Even if the problem is not in your repository, the shrunk-down test case will be useful for reporting the bug. Please see "How can I produce a useful test case?" and the comments at the top of shrink_test_case.py for information about how to use this script.

Assuming that you still think you have found a bug, the next step is to investigate whether the bug is already known. Please look through the issue tracker for bugs that sound familiar. If the bug is already known, then there is no need to report it (though possibly you could contribute a useful test case or a workaround).

If your bug seems new, then the best thing to do is report it via email to the dev@cvs2svn.tigris.org mailing list. Be sure to include the following information in your message:

  1. Exactly what version of cvs2svn are you using? If you are not using an official release, please tell us what branch and revision number from the svn archive you are using. If you have modified cvs2svn, please tell us exactly what you have changed.
  2. What platform are you using (Linux, BSD, Windows, etc.)?
  3. What is the exact command line that you used to start the conversion? If you used the --options option, please attach a copy of the options file that you used.
  4. What happened when you ran the program? Why do you think the behavior was wrong? Include transcripts and/or error output if available.
  5. If at all possible, include a test case repository that we can use to reproduce the problem. See "How can I produce a useful test case?" for more information. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you.

How can I produce a useful test case?

If you need to report a bug, it is extremely helpful if you can include a test repository with your bug report. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you. This section describes ways to overcome the most common problems that people have in producing a useful test case. When you have a reasonable-sized test case (say under 1 MB--the smaller the better), you can just tar it up and attach it to the email in which you report the bug.

If the repository is too big and/or contains proprietary information

You don't want to send us your proprietary information, and we don't want to receive it either. Short of open-sourcing your software, here is a way to strip out most of the proprietary information and simultaneously reduce the size of the archive tremendously.

The destroy_repository.py script tries to delete as much information as possible out of your repository while still preserving its basic structure (and therefore hopefully any cvs2svn bugs). Specifically, it tries to delete all file descriptions and text content, all nontrivial log messages, and all author names. (It does not affect the directory and file names or the number and dates of revisions to those files.)

  1. This procedure will destroy the repository that it is applied to, so be sure to make a backup copy of your repository and work with the backup!
  2. Make sure you have the destroy_repository.py script. If you don't already have it, you should download the source code for cvs2svn (there is no need to install it). The script is located in the contrib subdirectory.
  3. Run destroy_repository.py by typing
    # You did make a backup, right?
    /path/to/contrib/destroy_repository.py /path/to/copy/of/repo
    
  4. Verify that the "destroyed" archive does not include any information that you consider proprietary. Your data security is ultimately your responsibility, and we make no guarantees that the destroy_repository.py script works correctly. You can open the *,v files using a text editor to see what they contain.
  5. Try converting the "destroyed" repository using cvs2svn, and ensure that the bug still exists. Take a note of the exact cvs2svn command line that you used and include it along with a tarball of the "destroyed" repository with your bug report.
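
The manual check in step 4 can be supplemented by searching the "destroyed" *,v files for strings that you know must not leak, such as product code names or author names. This is only a sketch under that assumption: find_leftover_strings() is a hypothetical helper, not part of cvs2svn, and a clean result does not prove that nothing proprietary remains. Note that directory and file names are not changed by destroy_repository.py and are not checked here.

```python
import os


def find_leftover_strings(repo_root, needles):
    """Return sorted (path, needle) pairs for needles found in *,v files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        for name in filenames:
            if not name.endswith(',v'):
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes: RCS files may contain arbitrary encodings.
            with open(path, 'rb') as rcs_file:
                data = rcs_file.read()
            for needle in needles:
                if needle.encode('utf-8') in data:
                    hits.append((path, needle))
    return sorted(hits)


# Example, with made-up search strings:
#
# for path, needle in find_leftover_strings('/path/to/copy/of/repo',
#                                           ['SecretProduct', 'jsmith']):
#     print('%s still contains %r' % (path, needle))
```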

The repository is still too large

This step is a tiny bit more work, so if your repository is already small enough to send you can skip this step. But this step helps narrow down the problem (maybe even point you to a corrupt file in your repository!) so it is still recommended.

The shrink_test_case.py script tries to delete as many files and directories from your repository as possible while preserving the cvs2svn bug. To use this command, you need to write a little test script that tries to convert your repository and checks whether the bug is still present. The script should exit successfully (e.g., "exit 0") if the bug is still present, and fail (e.g., "exit 1") if the bug has disappeared. The form of the test script depends on the bug that you saw, but it can be as simple as something like this:

#! /bin/sh

cvs2svn --dry-run /path/to/copy/of/repo 2>&1 | grep -q 'KeyError'

If the bug is more subtle, then the test script obviously needs to be more involved.
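
For example, a test script can be written in Python when a single grep is not enough. The sketch below repeats the check from the shell example above; the command line and the 'KeyError' marker are the same placeholders, and the function names are hypothetical. It assumes that shrink_test_case.py only looks at the exit status of the test script, as described above.

```python
#! /usr/bin/env python

import subprocess
import sys

# Placeholders: adjust the command and the marker to match your bug.
COMMAND = ['cvs2svn', '--dry-run', '/path/to/copy/of/repo']
MARKER = 'KeyError'


def output_shows_bug(output, marker=MARKER):
    """Decide from the captured output whether the bug occurred."""
    return marker in output


def bug_still_present(command=COMMAND, marker=MARKER):
    """Run the conversion and report whether the bug is still visible."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.communicate()[0].decode('utf-8', 'replace')
    return output_shows_bug(output, marker)


# When this file is used as the test script, end it with:
#
# sys.exit(0 if bug_still_present() else 1)
```

Putting the decision in output_shows_bug() keeps the more involved logic (several markers, checks on the exit code, and so on) in one place where it can be tried out on saved output before the repository is shrunk.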

Once the test script is ready, you can shrink your repository via the following steps:

  1. This procedure will destroy the repository that it is applied to, so be sure to make a backup copy of your repository and work with the backup!
  2. Make sure you have the shrink_test_case.py script. If you don't already have it, you should download the source code for cvs2svn (there is no need to install it). The script is located in the contrib subdirectory.
  3. Run shrink_test_case.py by typing
    # You did make a backup, right?
    /path/to/contrib/shrink_test_case.py /path/to/copy/of/repo testscript.sh

    where testscript.sh is the name of the test script described above. shrink_test_case.py will execute testscript.sh many times, each time using a subset of the original repository.
  4. If the shrunken repository only consists of one or two files, look inside the files with a text editor to see whether they are corrupted in any obvious way. (Many so-called cvs2svn "bugs" are actually the result of a corrupt CVS repository.)
  5. Try converting the "shrunk" repository using cvs2svn, to make sure that the original bug still exists. Take a note of the exact cvs2svn command line that you used, and include it along with a tarball of the "shrunk" repository with your bug report.