Looking for new projects

Aug 19, 2024

Hello. This post isn’t really an essay, it’s an ad, but it’s an ad for me, so fairly on brand for this newsletter. I’m also going to use it to talk about a bunch of stuff I don’t normally talk about here, so hopefully it will be interesting even if you don’t care about the ad part.

Short version: As you probably know,[1] I’m actually a software developer. I’m currently looking for new technical projects. If you work in software in some way,[2] and you have any projects you think you’d find me useful on, please get in touch at david@drmaciver.com, or book a free call with me to talk about it.

Brief background: I’m an experienced software developer and consultant. I most recently worked at Anthropic, on a mix of model evaluations and data sources.[3] Before that, I did a variety of consulting, and did independent R&D developing cutting-edge open source software testing tools, including Hypothesis (which, if you’re using Python, even if you don’t use it directly, many libraries you depend on probably do). Before that I worked at Google, and before that a variety of startups.

I am looking for short to medium term (say, a few days to three months) software contracting and consulting projects. I’m also open to offers of employment if I think there’s a great fit, but it’s not primarily what I’m looking for, as I have a slow-burn project of my own working on synthetic training data for LLMs that I want to keep working on, and am looking to extend my runway.

What you should hire me for

Think of me as a Solver for hire. If you’ve got a specific hard problem that you need help with, you can bring me in and let me bulldoze through it.

It’s fine if this problem isn’t well defined yet - I’m also happy to help with initial problem definition.

Here are what I think of as some good model projects:

Broad mandate troubleshooter - coming onto a software project with a general mandate of “make this project better”. This would require cooperation from the people already on the project, and would involve a mix of development work, process suggestions, and pairing with people to help them out.
“Pair programmer for hire” if you want someone to provide a consult on some technical work you’re doing. For example, I’ve previously helped a client debug a Python C extension they were trying to write that kept crashing.[4]
Either of the above with a specific focus on improving your software testing, or with using Hypothesis or other property-based testing libraries better.
Bringing me on in the early days of a project and helping do problem definition, set technical directions, and generally help figure out what is needed for it to succeed.
Well-defined technical problems where you need a high quality solution to something you don’t know how to solve. For example, a previous project I’ve worked on that turned out very well involved taking two data sources and trying to find a best fit alignment of them that replaced the overwhelming majority of a labour intensive manual process of fitting the two together.
Bounded R&D, possibly leading to a project of the above form or possibly leading to a report on what I’ve discovered and making suggestions for what to do next. Good examples of this would be investigating how LLMs could be useful in the context of an existing problem or product, or using off-the-shelf solver technologies to solve complex planning problems that occur in your software.
If you have software that handles some complex format, and you regularly need or want minimal reproducing examples in order to report or understand bugs in it, I’m very open to helping you integrate shrinkray into your workflow and doing paid development work to improve it for your use case.

This isn’t an exhaustive list, just a set of ideas for the sorts of things I can help with.

What I’m really good at

I think the things that distinguish me most from other highly competent staff-level software engineers are:

I am one of the world experts on property-based testing, and the world expert in test-case reduction.[5] I wrote Hypothesis and shrinkray, both of which are the leading tools of their kind. If you don’t know what these are, I’ll explain in a moment, but the relevant facts are that I build very good tooling for developers that is both extremely usable and advances the state of the art. These areas also correspond to quite broadly useful skills, and in general I am very good at taking ideas that have been developed in research prototypes and turning them into practical usable tools.
I’ve worked at Anthropic, and have a good sense of LLMs from that. I’m very far from a machine learning expert, but I know a fair bit about the practical engineering side of interacting with LLMs, and if you need help on that (or, more specifically, synthetic training data or LLM evaluations), I’m extremely qualified to help.
I am the sort of person who writes this newsletter, with all the strengths and attitudes that implies. I care a lot about problem solving, helping people improve, communication, and how we can produce high quality work. I’ve consulted and coached on issues related to process, helping developers acquire soft skills, software quality, etc., and even where I’m not directly consulting on these issues they still heavily inform everything about how I work.

What you shouldn’t hire me for

The main things I would like to avoid are:

Very small amounts of work. For example, I’ve previously offered once-a-week coaching sessions, and I no longer do because it turns out to be too disruptive to my schedule and my ability to work on other things. I’m happy to do as little as a day at a time of work, but my ideal project length is anywhere between a week and a few months.
Work that is primarily process-focused rather than technical. I’m very happy to do some of this as part of a larger project, but if I’m involved in a software project in some way I want my primary focus to be writing software.
Projects that I consider unethical. I don’t have an exhaustive list of these, but for example I am not interested in working on gambling, military, or surveillance applications.[6]

In general, if you’re not sure if any of these apply to your project, by all means err on the side of asking. The worst that will happen is that I’ll politely decline.

More about my technical specialisations

Property-based testing is a type of testing that uses generated data to test your software, so it covers a much broader range of scenarios than traditional example-based tests. The property-based testing library I wrote, Hypothesis, is probably the most advanced property-based testing library in the world,[7] largely because of my focus on user experience and my developing an entirely new core approach in order to fix many of the things people previously found annoying or difficult about property-based testing.
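
To make the idea concrete, here is a minimal hand-rolled sketch of a property-based test using only the Python standard library. It is not Hypothesis and does not use its API: the run-length encoder, the `random_text` generator, and `check_round_trip` are all hypothetical names invented for illustration. The property being checked is that decoding an encoded string gives back the original string, for many generated inputs rather than a handful of hand-picked ones.

```python
import random


def run_length_encode(text):
    """Encode a string as a list of (character, run length) pairs."""
    pairs = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)
        else:
            pairs.append((ch, 1))
    return pairs


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def random_text(rng, max_len=20):
    """Hypothetical generator: short strings over a tiny alphabet,
    so that repeated runs (the interesting case) are common."""
    length = rng.randint(0, max_len)
    return "".join(rng.choice("ab ") for _ in range(length))


def check_round_trip(trials=500, seed=0):
    """The property: decode(encode(text)) == text for every generated text."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(trials):
        text = random_text(rng)
        assert run_length_decode(run_length_encode(text)) == text, repr(text)
    return trials


if __name__ == "__main__":
    print(f"round trip held for {check_round_trip()} generated inputs")
```

A real property-based testing library adds a great deal on top of this, such as much richer data generation and shrinking failing inputs down to small examples, which is where most of the usability comes from.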

My work on Hypothesis also gave me very strong opinions on how to design good libraries for usability, with clear error messages and APIs that are easy to use and hard to get wrong. If you have some tool or library that is hard to use well and you want to improve that, I’m very able to help with this.

Property-based testing style approaches have also proven promising for evaluating LLMs. I’ve done some preliminary open source work on this, although I’m not currently actively working on that angle.[8]

This background also means I have a lot of general purpose knowledge on how to generate diverse, high quality data for any application, not just software testing. For example, I’ve got an ongoing project to generate high quality synthetic data for training LLMs. This is currently a bit of a work in progress and would be easier to do with someone to partner with, but is my current plan for medium-term profitability.

If you’re interested in using synthetic data for evaluating or training LLMs specifically, please get in touch, but also if you’ve got some other use case for data generation I’d be interested in talking about it.

The other half of my technical specialisation is test-case reduction, which is about taking test cases (documents, structured objects, etc) which trigger a bug (or exhibit some other interesting property) and automatically transforming them into a minimal reproducible example of that bug. I learned a lot about this in the course of Hypothesis, and took the lessons from that and developed shrinkray, which is probably the most powerful general purpose test-case reducer around - it has the best approach to parallelism, and supports a much wider variety of formats out of the box than the alternatives.
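
As a rough illustration of what a test-case reducer does, here is a minimal sketch of a greedy chunk-deletion reducer in Python. This is not how shrinkray works internally, and `reduce_test_case` and `is_interesting` are hypothetical names chosen for this example. It assumes the test case can be treated as a sequence, and that you can supply a predicate saying whether a candidate still triggers the bug.

```python
def reduce_test_case(data, is_interesting):
    """Greedily delete chunks of `data` while `is_interesting` stays true.

    Starts with large chunks and halves the chunk size, then keeps making
    passes at chunk size 1 until no single element can be deleted.
    """
    current = list(data)
    if not is_interesting(current):
        raise ValueError("the starting test case must be interesting")

    chunk = max(len(current) // 2, 1)
    while True:
        changed = False
        i = 0
        while i < len(current):
            # Try removing the chunk that starts at position i.
            candidate = current[:i] + current[i + chunk:]
            if is_interesting(candidate):
                current = candidate  # keep the smaller case, retry at same i
                changed = True
            else:
                i += chunk  # this chunk is needed, move on
        if chunk > 1:
            chunk //= 2
        elif not changed:
            break  # a full pass at size 1 deleted nothing: we are done
    return current


if __name__ == "__main__":
    # Pretend the "bug" triggers whenever both 3 and 7 are present.
    def triggers_bug(xs):
        return 3 in xs and 7 in xs

    print(reduce_test_case(range(10), triggers_bug))  # prints [3, 7]
```

Real reducers differ mostly in which transformations they try beyond deletion, how they exploit the structure of the format, and how they order and parallelise the attempts so that something good enough arrives quickly.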

If you need anything related to test-case reduction, I’d definitely be delighted to talk to you, but I think another way of looking at this is that I’m good at black-box optimisation problems with a heavy heuristic component and where it’s more important to get something good-enough fast. Test-case reduction just happens to be a particular example of this that I’ve spent a lot of time on, but if you’ve got a problem of roughly that shape that isn’t test-case reduction, I’m keen to broaden my horizons.

On top of this I also just have a relatively large eclectic toolkit of computer science interests and technical knowledge, so if you’ve got some hard to crack problem there’s a decent chance I know an interesting tool that might help to solve it.

What now?

If any of this sounds interesting to you, please drop me an email at david@drmaciver.com or book a free call with me to have a no-strings-attached chat about it. You don’t need to have a really concrete idea of a project or way to bring me on. If you’re at all interested, even if it’s for something very different than the sort of things I’ve outlined here, I’m happy to chat to you speculatively and see if we can figure out a way to work together.

[1] I’m never sure who does and doesn’t know. I don’t talk about software development much on the public internet these days, but I sure sound like a software developer. Still, people miss this surprisingly often.

[2] I’m not uninterested if you’ve got a non-software project you’d like my help on, but I think I would be unlikely to be a good choice for it. Still, if there’s something else you’d like to work with me on, do get in touch and we can at least talk about it and see if there’s a good fit after all.

[3] Exact details mostly under NDA.

[4] This wasn’t actually very lucrative for me as I solved their problem in half an hour, but I’m interested in more extended engagements of this form.

[5] I think it’s fair to say that I’m the world expert on this - my opinion is that it’s either me or John Regehr, and John’s opinion is that between the two of us it’s probably me. I do think that if you want a test-case reducer, mine’s the best, but part of that is that I built on John’s work.

[6] There’s a complex second-order question here where because I develop open source tools, people in these industries absolutely do use my work. I consider this mostly OK. If you work in one of these industries and want to fund something open source and generally useful and don’t require me to publicly state that you funded the work, I’m willing to consider it.

[7] Certainly the most advanced open source one. A case could be made for Erlang’s QuickCheck being more advanced in some ways. It supports features that Hypothesis does not, but the converse is also true. Hypothesis is significantly more widely used and significantly more user friendly, but Erlang QuickCheck has more support for in-depth model based testing and concurrency.

[8] The open source LLMs were too bad to bother with and running them against commercial LLMs was, at the time, a bit expensive. I should revisit this now that Claude Sonnet is more reasonably priced and the open source LLMs are better, though I’m not wildly optimistic.

Overthinking Everything
