Post a Comment On: cbloom rants

"03-06-13 - Sympathy for the Library Writer"

11 Comments -

Blogger Stephan said...

Having to cater to external clients and not being able to control how a library is used can also dramatically increase the testing and documentation effort, at least if you want to do it *right*.

March 11, 2013 at 9:11 AM

Blogger cbloom said...

Yeah.

It's something where clever design (more clever than me) could help you a lot.

You want to make individual components that are simple and well tested and can be put together without breaking each other.

That way you make N components and your testing load is N.

If you instead cram those features together with interactions, you have a 2^N size load.

(a rare case where we can actually say it's "exponentially" larger)

I often get fooled into doing it the wrong way in search of efficiency.

Say for example you want to do something like a PNG compressor. You have any number of pixel formats, filters, and back-end compressors. The bad API is like :

void DoPNGlikeThingy( void * pixels, int format, int filter, int compressor );

you've multiplied the API space up massively and now it's for practical purposes too big to test the whole parameter space.

Instead you could do it like :

void TransformToStandardRGB( void * pixels, int format );
void DoFilters( void * pixels, int filter );
void DoCompress( void * pixels, int compressor );

Now you can test each piece individually over all its options, and ensure they only interact through the simple well-defined channel of the pixel data.

It's x+y+z tests instead of x*y*z.
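To make the x+y+z point concrete, here is a minimal sketch in C of three toy stages that only talk to each other through the pixel bytes. Everything in it is an assumption for illustration: the enum names, the lower-case function names, the two options per stage, and the toy "sub" filter and run-length coder are stand-ins, not anything from a real PNG codec or from the API sketched above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option sets; real formats, filters and coders would differ. */
typedef enum { FORMAT_RGB, FORMAT_BGR, FORMAT_COUNT } PixelFormat;
typedef enum { FILTER_NONE, FILTER_SUB, FILTER_COUNT } FilterKind;
typedef enum { COMPRESS_STORE, COMPRESS_RLE, COMPRESS_COUNT } CompressKind;

/* Stage 1: reorder channels in place so later stages always see RGB. */
static void transform_to_standard_rgb(uint8_t *pixels, size_t num_pixels,
                                      PixelFormat format)
{
    assert(pixels != NULL);
    if (format != FORMAT_BGR) return;
    for (size_t i = 0; i < num_pixels; i++) {
        uint8_t tmp = pixels[3 * i];
        pixels[3 * i] = pixels[3 * i + 2];
        pixels[3 * i + 2] = tmp;
    }
}

/* Stage 2: toy "sub" filter, each byte minus the same channel of the
   previous pixel. Walks backwards so it always reads unfiltered bytes. */
static void do_filter(uint8_t *pixels, size_t num_bytes, FilterKind filter)
{
    assert(pixels != NULL);
    if (filter != FILTER_SUB) return;
    for (size_t i = num_bytes; i-- > 3; )
        pixels[i] = (uint8_t)(pixels[i] - pixels[i - 3]);
}

/* Stage 3: store or byte-level RLE as (run, value) pairs.
   'out' must hold at least 2 * num_bytes. Returns bytes written. */
static size_t do_compress(const uint8_t *in, size_t num_bytes,
                          CompressKind kind, uint8_t *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    if (kind == COMPRESS_STORE) {
        for (size_t i = 0; i < num_bytes; i++) out[n++] = in[i];
        return n;
    }
    for (size_t i = 0; i < num_bytes; ) {
        uint8_t value = in[i];
        size_t run = 1;
        while (i + run < num_bytes && in[i + run] == value && run < 255) run++;
        out[n++] = (uint8_t)run;
        out[n++] = value;
        i += run;
    }
    return n;
}

/* Test-case counts for the two designs: x+y+z versus x*y*z. */
static int tests_if_separate(void)
{
    return FORMAT_COUNT + FILTER_COUNT + COMPRESS_COUNT;
}

static int tests_if_combined(void)
{
    return FORMAT_COUNT * FORMAT_COUNT / FORMAT_COUNT * FILTER_COUNT * COMPRESS_COUNT;
}
```

With only two options per stage the gap is small, but each stage can be tested by itself with a handful of byte arrays, and adding a new filter adds one test rather than one test per format and compressor.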

The trap that nerds like me fall into is that you can be more efficient if you combine the steps; eg. trying to work on pixel rows one by one to keep them in cache. That efficiency gain is real, but it requires you to tangle up all your systems together and leads to a hopelessly huge testing space.

The reason this is so much worse for a library writer is you don't know which uses the game actually cares about. Maybe the pixels are always just 8-bit RGB and we don't need that flexible pixel format at all. Maybe we only use a few compress modes. Then we can reduce the API and only optimize and test the cases we care about. And we can measure our performance in our final usage scenario which means no futzing around with synthetic testing. Ahh! So much better.

March 11, 2013 at 9:35 AM

Blogger won3d said...

Two Casey-related questions:

Any reflections on that library design talk he gave years ago?

Ever think about using some kind of metaprogramming/code-gen to give you some kind of leverage to deal with x*y*z style configuration spaces?

March 11, 2013 at 11:16 AM

Blogger Stephan said...

Developing a software library for external use is definitely more difficult than developing other kinds of software, both technically and commercially. The surprisingly large effort required to "polish up and package" some internal library into something that you can sell to clients is probably the reason why so few libraries are developed in this way. Arguably, most commercial work on non-internal software libraries is done to support a platform (Windows, OSX/iOS, Intel Processors, NVidia GPUs, some cloud hosting platform, etc) or because companies publish and contribute to open source libraries in the hope that they can gain some karma and profit from others' contributions.

Intuitively, this situation has the feel of a market failure, because you'd think that it would be better for the industry at large if there was enough commercial incentive for the best developers to focus on creating high-quality, state-of-the-art libraries for everybody to use, instead of reinventing the wheel internally at some company for the hundredth time. Economically speaking one could probably make an argument for a market failure based on the transaction costs involved in software library licensing and the presence of positive external effects associated with high-quality software libraries.

I don't know how your company intends to market the library, but maybe it would make sense to sell source code licences. This would allow you to say: "We have optimized and tested this library for use cases X, Y and Z, as demonstrated in the sample code. If you have different requirements, you can adapt the source code and pay us to help you. Also, if the documentation isn't perfect, just look at the highly readable source code."

March 11, 2013 at 2:11 PM

Blogger cbloom said...

"Any reflections on that library design talk he gave years ago?"

Good question. I saw it when it was given but TBH didn't pay too much attention because I wasn't writing APIs at the time. Just had a look back at it now.

I agree with the basics, which is like: don't impose your systems on the client, work with theirs; don't impose a big retained-mode design; let them use their own systems for IO/memory/etc.

There are some issues where I think he doesn't emphasize the negative enough.

Granularity :

exposing the micro-ops inside your larger operations is nice if clients actually need it. But it has a lot of negatives; it is sort of exposing how your internals work. It's the opposite of opacity and encapsulation.

The ideal for me is to expose only the fewest, highest-level functions possible. In an ideal world the library API would always just be

void magicfunc(void);

that does exactly what the user wants. That way I have very little coupling to their code and I can change the internals without breaking them. The more you start revealing the granular internal bits, the more your details are unchangeable.

Redundancy :

Some redundancy is indeed nice for the user, and when I write APIs for myself I like lots of redundancy.

(for example in cblib you can do strlen on a char* or a wchar* or a String, and all the file IO routines take all of those types, and etc. Redundancy is nice.)

But I'm not so sure the library should be the one offering the nice redundancy; I now sort of think that should be left up to clients to do with their own wrappers.

One problem with redundancy is just that it makes the API bigger, which means more docs, more testing, more maintenance.

But the big problem is that redundancy is confusing. When you have lots of ways to do the same thing, the client doesn't know if one is "best" or where to start. I'm starting to lean towards orthogonality as the ideal for the library API.

BTW followup note in next comment cuz this is getting too long.

March 11, 2013 at 4:06 PM

Blogger cbloom said...

followup :

I have come to the thinking that every API that you ever use should always be wrapped.

Don't call stdlib directly. Don't call Win32 or whatever OS. Don't call granny functions. Make your own wrappers.

You can provide yourself nice redundancy, add assertions, fix some stupid naming boofs, and it helps enormously with porting.
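As a rough illustration of what such a wrapper layer can look like, here is a small C sketch around two stdlib string calls. The names `my_strlen` and `my_strcpy` and the choice of return value are hypothetical, made up for this example; they are not from cblib or any shipped library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical project-local wrappers; names are illustrative only. */

/* strlen that asserts on NULL instead of crashing somewhere inside libc. */
static size_t my_strlen(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Bounded copy that always null-terminates the destination.
   Returns 1 if the whole string fit, 0 if it was truncated. */
static int my_strcpy(char *dst, size_t dst_size, const char *src)
{
    size_t len, n;
    assert(dst != NULL && src != NULL && dst_size > 0);
    len = strlen(src);
    n = (len < dst_size - 1) ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return len < dst_size;
}
```

The rest of the codebase then calls `my_strcpy` and never touches `strcpy` directly, so a port or a policy change (say, logging every truncation) happens in one place.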

When I'm thinking about API design these days, I'm thinking that the API I provide should be sort of hard to use; it should be minimal, orthogonal, as small and clear as possible, and that to make it friendlier to use it should be wrapped on the client side.

I don't think that's actually possible for me at RAD because clients don't think the same way I do, and perhaps more importantly because it creates a barrier to entry which is something that's absolutely crucial to avoid.

March 11, 2013 at 4:10 PM

Blogger cbloom said...

BTW it would be super interesting to have Dave do an "API design revisited" talk after living with Granny all these years.

Sadly the world is a fucking useless shitty place and interesting honest discussions like that can never happen because we're all a bunch of sensitive babies. Boo.

March 11, 2013 at 4:13 PM

Blogger Aaron said...

"I don't think that's actually possible for me at RAD because clients don't think the same way I do, and perhaps more importantly because it creates a barrier to entry which is something that's absolutely crucial to avoid."

What if you ship the tight core, but also ship a good example wrapper as part of the product?

March 11, 2013 at 5:35 PM

Blogger Aaron said...

My own statement about 'examples' reminds me of another thing about 'Libraries'.

The examples are *everything*.

No one will read your documentation.

No one will attempt to actually write anything.

They will take your example, copy-paste it into their codebase, and fuck with it until it sorta works.

Examples should be basically the *very best* practice of doing everything. They should never be dumbed down and simple so people can understand them if that conflicts with making them the best way of doing things.

March 11, 2013 at 5:40 PM

Blogger cbloom said...

"What if you ship the tight core, but also ship a good example wrapper as part of the product?"

Yeah, I've been considering that. Make a small PITA official library API, and then have a nice bunch of wrappers on the outside, shipped as client-side example code.

I've sort of started that with some C++-ish wrappers in client-side example code (the API is all pure C) but haven't really made it official practice.
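A sketch of that split, kept in pure C on both sides for simplicity: a tiny core that never allocates and makes the caller supply every buffer, plus a friendlier wrapper that would ship as example code. All names here (`core_encode`, `core_encode_bound`, `example_encode_alloc`) are hypothetical, and the "encoder" is a toy that just stores the bytes behind a one-byte tag; it stands in for real compression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ---- hypothetical minimal core API (library side) ---- */

/* Worst-case output size for src_len input bytes. */
static size_t core_encode_bound(size_t src_len)
{
    return src_len + 1;
}

/* Toy encoder: writes a 1-byte tag, then the raw bytes.
   Never allocates. Returns bytes written, or 0 if dst is too small. */
static size_t core_encode(const void *src, size_t src_len,
                          void *dst, size_t dst_cap)
{
    unsigned char *out = dst;
    if (dst_cap < core_encode_bound(src_len)) return 0;
    out[0] = 0; /* tag 0 = stored */
    memcpy(out + 1, src, src_len);
    return src_len + 1;
}

/* ---- client-side example wrapper (shipped as sample code) ---- */

/* Allocates the output with malloc; the caller frees it.
   Returns NULL on failure. */
static void *example_encode_alloc(const void *src, size_t src_len,
                                  size_t *out_len)
{
    size_t cap = core_encode_bound(src_len);
    void *dst;
    assert(src != NULL && out_len != NULL);
    dst = malloc(cap);
    if (dst == NULL) return NULL;
    *out_len = core_encode(src, src_len, dst, cap);
    if (*out_len == 0) {
        free(dst);
        return NULL;
    }
    return dst;
}
```

The core stays small enough to test exhaustively, and a client who dislikes `malloc` can delete the wrapper and write their own on top of the same two core calls.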

March 12, 2013 at 8:24 AM

Blogger cbloom said...

"The examples are *everything*."

Actually what I'm finding is that customers are all different, and none of them is very comprehensive in their approach. Each person tends to have their one thing that they focus on, and they don't like to use other methods of learning.

That is, some people indeed just go to the examples. But other people seem to be reading docs and don't look at the examples at all. They'll send me questions with broken code snippets that are trying to do something straight out of the examples and I'm like "why don't you just copy-paste from the example that does that" and they didn't look at the examples at all.

I definitely agree with this though :

"Examples should be basically the *very best* practice of doing everything."

A lot of people will just copy-paste the example, so if the example is shitty performance then lots of people will get shitty performance. They'll blame the library and they'll be right to do so.

March 12, 2013 at 8:28 AM
