Wrapping C APIs in Python

When working in Python, there are times when you need more speed than it can deliver. These times to reach for C/C++. The interop between the languages is clean, minimal, and very fast. Here, I’ll give a quick example with a little commentary to assist with understanding the basics.

First, the basic problem. Let’s say that you’re working in Python, but there’s an excellent library implementing some critical functionality in C that you need. How do you compose it into your application? Fortunately, Python includes ctypes right out of the box. To demonstrate, I’ll show how I’ve directly surfaced the C API of the fantastic libgeohash for consumption in Python.

To achieve this, first we’ll need a binary for the target platform. For libgeohash, we can produce a shared object that is suitable for import into Python as follows.

$ gcc -fPIC -shared -o libgeohash.so geohash.c

Once we have our binary, we can proceed to pull it’s API up into Python. The process involves registering it with a call to ctypes.CDLL() and then each function to be used must be imported into the system with code specifying the fn name, arg types, and, the return type. I prefer to do this with a convenience function, a wrapper, that sets these all at once.

import ctypes

_geohash = ctypes.CDLL('./libgeohash.so')

def wrap_function(lib, funcname, restype, argtypes):
    """Simplify wrapping ctypes functions"""
    func = lib.__getattr__(funcname)
    func.restype = restype
    func.argtypes = argtypes
    return func

Now let’s say that we want to import the following function from geohash.h.

/*
 * Creates a the hash at the specified precision.  If
 * precision is set to 0 or less than it defaults to 12.
 */
extern char* geohash_encode(
    double lat, 
    double lng, 
    int precision
);

See that it takes 2 doubles and an int as arguments and returns a string in the form of a char pointer. We can represent these in Python as follows using our wrapper.

geohash_encode = wrap_function(
    _geohash, 
    'geohash_encode', 
    ctypes.c_char_p, 
    (ctypes.c_double, ctypes.c_double, ctypes.c_int)
)

Note that we pass the reference to the imported .so _geohash, the name of the target fn 'geohash_encode', and ctype type specifications for the args and return val. Since we are receiving a string back from this call, we use ctypes.c_char_p, which coerces the data at the address pointer back out as a python string. And of course the inputs follow the same convention, (ctypes.c_double, ctypes.c_double, ctypes.c_int).

Now we’re able to invoke with function directly from Python on the target platform.

geohash_encode(41.41845703125, 2.17529296875, 5)

# 'sp3e9'

This makes it simple and easy to pull just about any functionality from C directly into Python.

But what about more complex types such as C structs? It’s not a problem, because ctypes provides a base class that handles the low-level details. In order to map a struct to a Python class, simply inherit the corresponding Python type from ctypes.Structure and then specify the properties you want mapped using the _fields[] attribute.

Suppose we want to wrap the following function from libgeohash.h that returns a complex type.

// Metric in meters
typedef struct GeoBoxDimensionStruct {
	
	double height;
	double width;

} GeoBoxDimension;

/*
 * Returns the width and height of a precision value.
 */
extern GeoBoxDimension geohash_dimensions_for_precision(
    int precision
);

To represent the reference to the C struct returned, we create a Python class of the same shape, i.e. containing the same fields, and wrap the target function using our new GeoBoxDimension class as the return type.

class GeoBoxDimension(ctypes.Structure):
    _fields_ = [
        ('height', ctypes.c_double),
        ('width', ctypes.c_double)
    ]

geohash_dimensions_for_precision = wrap_function(
    _geohash, 
    'geohash_dimensions_for_precision', 
    GeoBoxDimension, 
    [ctypes.c_int]
)

geohash_dimensions_for_precision(6)

# (0.0054931640625, 0.010986328125)

The process for specifying composite types as function args is the same. With this simple and direct approach, we can hook any code from C libraries and bring it directly into Python. It’s notable the lack of cruft required to bring the languages together. In particular this makes an ideal solution for having the flexibility and joy of Python melded with the blazing speed of C/C++ with little to no fuss. A full wrapper for libgeohash can be found in my fork here.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: