With a few exceptions (namely large objects and prepared-statement parameters), all data is transferred between the client and the server, and stored on the client side, in textual form.
General binary transfers will probably never be added. There are many reasons for this:
- It's a lot of work. All of the result class would have to be rewritten. Things get especially difficult if we are to support a choice between binary and textual data per column of a result set.
- TANSTAAFL: There Ain't No Such Thing as a Free Lunch. There would be some additional API complexity such as separate binary versions of functions, or a "binary mode" that your application would probably have to keep track of. We could support polymorphism so your program can treat binary and text-converted data more or less the same—but polymorphism comes with a performance cost of its own. And ultimately the handling of binary data would probably differ somewhat from textual data in any case.
- You'd still have to go through data conversions between the standardized format the database sends and your program's native types (and on the server end, possibly from the stored type to a type suitable for transfer). This may happen more often than you think because of differences in byte order, widths of integral types, and so on. It's probably a faster conversion than to/from text, but the cost is not completely eliminated.
- The necessary type information is not available at compile time, and unless we leave the conversion of binary data to your program—which is hard to get right—the library would have to look up that information every time you accessed a field. Another performance cost of binary transfers.
- Conversions to/from text are convenient and universal. Say, for example, that you are retrieving an integer from the database into a variable of type long in your program. That may require a conversion. But what if you wanted to read the same integer into a floating-point variable? If we support c different types on the client side, and s different types on the server side, we'd need to build a matrix of c*s possible conversions. Some of those would be supported, and some would probably be errors. If c=7 and s=10, for example, that makes 70, plus potentially a few more if a type's binary format changes between server versions. Or we could convert from one type to text, and from there on to the other type. That would make for only 2*(c+s) conversions (34 in the example), but at a greater runtime cost than you're paying now.
- Getting all these conversions exactly right on all platforms is not going to be easy. Extensive testing would be required for libpqxx, but if your application is portable, it too would have to be thoroughly tested against multiple compilers and/or platforms. The difficulty is compounded by the fact that binary data is harder to interpret when hunting down bugs: with textual data you can just print what you get and see for yourself what's going wrong.
- The performance cost of converting to and from text, and of transferring data in text form, is not very great compared to the unavoidable cost of requesting, looking up, retrieving, transmitting, and processing the data. For all our efforts, the most we could gain is a reduction of that one portion of the program's run time. It's science.
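To make the point about byte order and integer widths concrete, here is a minimal sketch of the two conversions for a single 32-bit integer field. Both helpers, `int_from_text` and `int_from_network_bytes`, are hypothetical names made up for this illustration; they are not libpqxx functions. The sketch assumes the binary value arrives as four bytes in network (big-endian) byte order and two's-complement form.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of libpqxx): parse a textual field into a
// 32-bit integer.  Returns false unless the whole string is a valid number
// that fits in the target type.
inline bool int_from_text(std::string_view text, std::int32_t &out)
{
  char const *const end = text.data() + text.size();
  auto const res = std::from_chars(text.data(), end, out);
  return res.ec == std::errc{} && res.ptr == end;
}

// Hypothetical helper (not part of libpqxx): decode a 4-byte integer that is
// assumed to be in network (big-endian) byte order.  Even in "binary" form the
// bytes must be reassembled explicitly, because the host's byte order may
// differ from the wire format.
inline std::int32_t int_from_network_bytes(unsigned char const (&b)[4])
{
  std::uint32_t const u = (std::uint32_t{b[0]} << 24) |
                          (std::uint32_t{b[1]} << 16) |
                          (std::uint32_t{b[2]} << 8) |
                          std::uint32_t{b[3]};
  // Copy the bit pattern into a signed integer without relying on
  // implementation-defined narrowing.
  std::int32_t v;
  std::memcpy(&v, &u, sizeof v);
  return v;
}
```

The binary path is shorter, but it is still a conversion, and it only works if the caller already knows the field is exactly a 32-bit integer. The text path can report a malformed or out-of-range value for any integral target type without that knowledge.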
That said, if anyone knows of a way that gets around all this, please get on the mailing lists and let us know!
