Using the Subversion Client API, Part 1

by Garrett Rooney
04/24/2003

Subversion, as you probably already know, is a version control system written from scratch to replace CVS, the most popular open source version control system. While there are many reasons to choose Subversion, one of the most interesting is that Subversion has been designed and implemented as a collection of reusable libraries, written in C. This allows your programs to use the same functionality found in the command line Subversion client without having to call out to the command line client, execute commands, or parse output. This article briefly reviews the Subversion libraries, explains some of their data structures, and demonstrates the use of the Subversion client APIs in other programs.

Getting Started

Before you can jump into the code, you need to install Subversion. This article was written with release 0.20.0 of Subversion in mind. It would be best if you had that version. The installation instructions are available in the INSTALL file. If you don't like to compile your own software, you can try a binary distribution of Subversion.

If you're using an older version of Subversion, it's a good idea to upgrade at least to 0.20.0. If you're using a more current version, just watch your step. The Subversion project has not yet released version 1.0, the APIs are not yet fixed, and things may change. To get a good idea of the changes between 0.20.0 and the version you have, look at the CHANGES file in the Subversion tarball, specifically the "Developer-visible changes" sections. The general concepts discussed in this article will still apply to any version of Subversion.

Once you've installed Subversion, you'll need to become familiar with its general use. This article assumes some basic knowledge of how Subversion works. If you've never used it before, take a break and learn how before reading any further. Some good resources for this are Rafael Garcia-Suarez's great articles on Single User Subversion and Multiuser Subversion.

What's The Point Anyway?

You may be thinking that using the Subversion libraries directly will add a bit of complexity to your life as a software developer. You'll need to make your build process find the proper libraries and pass the correct flags to your compiler to link to them, not to mention learning a whole new API--there's a lot to do! I'd be surprised if you haven't at least thought about giving up now and simply writing a little wrapper library that calls the Subversion command line client.

The Basic Building Blocks

Like most other software systems, Subversion is built on a number of smaller bits of code. In order to use Subversion's API well, you'll need to understand its underlying libraries.

APR

To ensure maximum portability across a wide range of operating systems, Subversion is built on APR, the Apache Portable Runtime, which is the Apache Software Foundation's portability layer. The APR developers use doxygen markup to comment their code, so their documentation is available online. In addition to abstracting away the various platform-specific bits of functionality Subversion needs, APR provides a set of basic data structures such as hash tables and memory pools. We'll cover memory pools first, as they're probably less familiar.

Before we get into any APR data types, you'll have to learn how to initialize and shut down APR. Simply call apr_initialize before calling any APR (or Subversion) functions and use atexit or some other means to arrange for apr_terminate to be called at shutdown.

Rather than manually allocating and deallocating memory with malloc and free, Subversion uses APR's memory pools to manage memory. Create a pool using svn_pool_create, which is actually a thin wrapper around apr_pool_create, with a simpler interface and a few Subversion debugging tricks. Allocate memory from the pool with functions like apr_palloc and apr_pcalloc. You don't need to worry about freeing the memory. When you're done with everything you allocated out of that pool, just destroy the pool with svn_pool_destroy (again, a thin wrapper around apr_pool_destroy). It will free the memory for you.

This is kind of cool, since you only need to worry about freeing memory once, but it's nothing to write home about. The real benefit comes when you take advantage of chaining pools together in a hierarchy. You can create subpools inside your main pool (or inside other subpools, ad infinitum), and clear them with svn_pool_clear. Clearing a pool releases everything allocated from it but keeps the pool's memory around for the next allocation. This lets you avoid making the operating system allocate more memory for you, and can give a nice performance boost in some situations.

Unfortunately, you need to be careful with pools, because you can easily get into situations where a pool is growing without bound as you allocate from it within a loop. To avoid this situation, you have to use common sense. Create a subpool before the loop and clear it each time through the loop. To avoid losing access to things you allocate inside the loop, duplicate them into the parent pool--like this:

char **
function (int iterations, apr_pool_t *pool)
{
  /* allocate an iteration pool. */
  apr_pool_t *subpool = svn_pool_create (pool);

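  /* allocate zeroed memory to hold our results, sized in pointers
     rather than bytes, with one extra slot left as a NULL terminator. */
  char **array = apr_pcalloc (pool, (iterations + 1) * sizeof (char *));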
  int i;

  for (i = 0; i < iterations; ++i)
    {
      char * result = some_function_that_takes_a_pool (i, subpool);

      /* duplicate the result into our main pool. */
      array[i] = apr_pstrdup (pool, result);

      svn_pool_clear (subpool);
    }

  /* clean up after ourselves. */
  svn_pool_destroy (subpool);

  /* return our results, safely allocated in our main pool. */
  return array;
}

As long as you're careful with how you use pools, you'll find that they greatly simplify the logic of your code. You can stop worrying about memory management and start concentrating on what your code actually does.

Another common APR data type used in Subversion is the apr_hash_t. This is just a standard hash table, designed to work with APR pools. It uses void pointers for its keys and values, so you can stick whatever you want in it, as long as you're careful to remember the type so you can cast the contents appropriately when you retrieve values.

Error Handling

In addition to the various data structures it inherits from APR, Subversion has several fundamental data types of its own. The most important of these is svn_error_t, used everywhere in the Subversion API.

Rather than returning a generic error code to indicate that a function has failed, Subversion uses its own "exception" object, svn_error_t, as the return value for all its functions that can fail. If a Subversion API function succeeds, it returns the value SVN_NO_ERROR (which is actually 0, to simplify error checking). If it fails, it returns a new svn_error_t. Each svn_error_t contains an apr_status_t--either the return value of the underlying APR function that failed or a Subversion-specific error code.

All Subversion error codes are defined in svn_error_codes.h. Besides its error code, each svn_error_t holds a const char * that describes what precisely went wrong, a pointer to another error (as svn_error_ts can be chained together), and the pool that the error was allocated from. When Subversion returns an error, you need to handle it, usually with svn_error_clear, in order to free the memory associated with the error and any other errors in its chain. All of the other error-handling functions are declared and documented in svn_error.h. Here's an example of a function that handles an svn_error_t.

void
handle_error (svn_error_t *error)
{
  svn_error_t *itr = error;

  while (itr)
    {
      char buffer[256] = { 0 };

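      /* report the error we're currently visiting in the chain, not
         just the first one. */
      printf ("source error code: %d (%s)\n",
              itr->apr_err,
              svn_strerror (itr->apr_err, buffer, sizeof (buffer)));

      /* an error's message may be NULL, so don't hand it to %s blindly. */
      printf ("error description: %s\n",
              itr->message ? itr->message : "(no message)");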

      if (itr->child)
        printf ("\n");

      itr = itr->child;
    }

  svn_error_clear (error);
}

You might notice that many function calls are wrapped in the SVN_ERR macro. This is just a quick way of saying that if the function returns SVN_NO_ERROR, we should continue on, but if it returns anything else, we return the error to our calling function, propagating the error up the call stack to be handled elsewhere. As long as your functions also return svn_error_t *s, you can use this macro.
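
To make that early return concrete without linking against Subversion, here's a self-contained stand-in. Everything named demo_ or DEMO_ is hypothetical: demo_error_t is a stripped-down substitute for svn_error_t, and DEMO_ERR only approximates the idea behind SVN_ERR, so don't take it as Subversion's actual definition.

```c
#include <stddef.h>

/* stand-in for svn_error_t; the real struct carries more fields. */
typedef struct demo_error_t
{
  int code;
  const char *message;
} demo_error_t;

/* stand-in for SVN_NO_ERROR. */
#define DEMO_NO_ERROR ((demo_error_t *) 0)

/* stand-in for SVN_ERR: evaluate the expression once, and if it
   produced an error, return that error from the enclosing function. */
#define DEMO_ERR(expr)                          \
  do {                                          \
    demo_error_t *demo_err__temp = (expr);      \
    if (demo_err__temp)                         \
      return demo_err__temp;                    \
  } while (0)

static demo_error_t disk_full = { 28, "no space left on device" };

/* hypothetical step that fails only when asked to. */
static demo_error_t *
write_step (int should_fail)
{
  return should_fail ? &disk_full : DEMO_NO_ERROR;
}

/* run two steps in a row.  *steps_done records how many of them
   completed before we returned. */
demo_error_t *
run_steps (int fail_second, int *steps_done)
{
  *steps_done = 0;

  DEMO_ERR (write_step (0));
  ++*steps_done;

  /* if this step fails, we return right here and skip the rest. */
  DEMO_ERR (write_step (fail_second));
  ++*steps_done;

  return DEMO_NO_ERROR;
}
```

When the second step fails, run_steps returns the error with only one step counted, and its caller gets to decide whether to handle the error or pass it further up the stack.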

© 2009, O'Reilly Media, Inc.