New features as of version 1.2.0
November 7th, 2010

This is a fairly large release:
  + adding C++ Ordered Dictionaries (OTab: works with Val)
  + adding C++ Tuples (Tup: works with Val)
  + adding C++ arbitrary sized integers (int_un and int_n: work with Val)
  + Python Pickling Protocol 0,1,2 Loader rewritten to be faster and more
    stable
  + adding support for nan, inf, and -inf in Python dictionaries
  + CHANGE: Array now prints out as Python Numeric repr would: See below
  + CHANGE: Valreader (Eval, etc.) sees "1L" as an int_n (used to be int_8)
  + CHANGE: large ints (70127340817203478192) become int_n instead of real_8
  + DEPRECATE: Because the array module in Python 2.7 changes how Pickling
    works from 2.6, we deprecate the ArrayDisposition AS_PYTHON_ARRAY: we
    can detect and load both versions (Python 2.6 and Python 2.7) of a
    pickled table easily, but we currently dump the arrays as Python 2.6
    does (not 2.7).  See below for more discussion.

Because Tup and OTab and BigInt are new features, there are some tools to
convert them back and forth (to Arr, Tab, Str respectively); however, we
have gone to some lengths to make sure that conversion happens automatically
in case you are talking to a "legacy" server/client that can't support OTab
or Tup.

+ Added: LoadValFromFile and DumpValToFile (and LoadValFromArray and
  DumpValToArray): these are convenience routines that let the user dump or
  load any Val from C++ with any of the supported serializations through
  one interface.  In other words, we can easily choose HOW to serialize to
  a file:

    DumpValToFile(some_val, "dumpfile.p0", SERIALIZE_P0);
    DumpValToFile(some_val, "dumpfile.p2", SERIALIZE_P2);
    DumpValToFile(some_val, "dumpfile.bintbl", SERIALIZE_M2K);
    DumpValToFile(some_val, "dumpfile.oc", SERIALIZE_OC);
    DumpValToFile(some_val, "dumpfile.oc", SERIALIZE_TEXT);
    DumpValToFile(some_val, "dumpfile.tab", SERIALIZE_PRETTY);

  All Midas thingees use the DumpVal/LoadVal routines now.
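  The "one interface, many serializations" idea can be sketched in Python
  with only the standard library.  This is an illustrative analogue, not the
  PicklingTools API: the names dump_val_to_bytes, load_val_from_bytes and
  the SER_* tags are hypothetical, Python 3 is assumed, and only pickle
  protocols 0 and 2 plus a repr-based text form are covered.

```python
import ast
import pickle

# Hypothetical tags standing in for SERIALIZE_P0, SERIALIZE_P2, SERIALIZE_TEXT.
SER_P0, SER_P2, SER_TEXT = "p0", "p2", "text"


def dump_val_to_bytes(value, ser):
    """Serialize value with the chosen scheme through a single entry point."""
    if ser == SER_P0:
        return pickle.dumps(value, protocol=0)
    if ser == SER_P2:
        return pickle.dumps(value, protocol=2)
    if ser == SER_TEXT:
        return repr(value).encode("ascii")
    raise ValueError("unknown serialization: %r" % (ser,))


def load_val_from_bytes(data, ser):
    """Inverse of dump_val_to_bytes for the same serialization tag."""
    if ser in (SER_P0, SER_P2):
        return pickle.loads(data)
    if ser == SER_TEXT:
        # literal_eval only accepts Python literals, so it is safe on text input.
        return ast.literal_eval(data.decode("ascii"))
    raise ValueError("unknown serialization: %r" % (ser,))


table = {"a": 1, "b": [1.5, "two", (3, 4)]}
for ser in (SER_P0, SER_P2, SER_TEXT):
    # Every serialization must round-trip the same table.
    assert load_val_from_bytes(dump_val_to_bytes(table, ser), ser) == table
```

  The caller picks the wire format with one argument and nothing else in
  the calling code changes, which is the property the DumpVal/LoadVal
  routines are described as providing.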
  There is also a test called chooseser_test.cc where we test all sorts of
  different serializations and make sure legacy conversion (see below)
  works.

+ Compatibility concerns: Throughout the baseline, the new PickleLoader (in
  pickleloader.h) has replaced the previous two Pickling loaders:
    - PythonDepicklerA for Pickling Protocol 0 - DEPRECATED
    - P2LoadVal for Pickling Protocol 2 - DEPRECATED
  In general, the new loader is faster, simpler, handles both protocols, and
  handles the new data structures of 1.2.0 (OTab, Tup, int_n).  By default,
  any place where you load P0 or P2 will use the new loader.  The original
  code is still included in the baseline if you need it, and you can set
  your serialization to SERIALIZE_P0_OLDIMPL or SERIALIZE_P2_OLDIMPL to pick
  up the old behavior, but in general you shouldn't need to (in fact, the
  older loaders don't support the new features of 1.2.0).

+ The M2k components do not currently understand int_n, OTab or Tup, as
  there is no equivalent structure in M2k.  Future releases will have M2k
  handle them and convert them.  In general, this shouldn't be a problem:
  ALWAYS choose compatibility mode when you talk to an M2k thingee, or
  simply don't pass it OTab/Tup/BigInt.

+ INTERFACE CHANGE and BUG FIX: C++ and Python dicts were previously not
  completely interchangeable!
  From C++: How POD arrays print out is different:

    Array<real_8> a = Eval("array([1,2,3], 'd')");
    cout << a << endl;   // as before: 1 2 3
    Val v = a;
    cout << v << endl;   // OLD: array([1 2 3])
                         // NEW as of 1.2.0: array([1,2,3], 'd')

  The new way DOES BREAK INTERFACE, but
    (a) The Python output and the C++ output are NOW the same (they
        weren't before)
    (b) From Python you can now input Numeric arrays and not lose precision
    (c) It is more consistent with repr
  The real reason is that you used to lose information, and you couldn't
  read Numerics from C++.
  Now, you can:

    # Python
    >>> t = { 'a': array([1,2,3], 'i') }
    >>> eval(repr(t)) == t
    True

    // C++
    Val t = Tab("{ 'a': array([1,2,3], 'i') }");
    cout << bool(Eval(Stringize(t)) == t) << endl;   // prints 1 (true)

  Before, the Python repr and C++ Stringize were NOT the same, and you
  couldn't pass their output between each other.  Now, you should be able
  to read C++ produced tables with Array and not lose (too much)
  information.

+ Also added support for nan, inf, +inf, -inf when real values print out,
  as well as when they are read from files.

+ Complete C++ rewrite of the loader for Pickling Protocol 0 and 2: smaller
  footprint (both runtime and codespace), easier to maintain and augment,
  just as fast.  There are still some unimplemented features, but in
  general it is better and much more complete.

+ Stringize for any integer type is now 2x faster

+ Added an Ordered AVLHashT (preserves insertion order, not sorted)
+ Added ordavlhasht_test and ordavlhash_test
+ Updated the occontainer_test slightly for OrdAVLHashT

+ Added OTab to Val framework (Ordered Tab, ordered by insertion order):
  this is essentially a Python (2.7 and up) OrderedDict.
+ Added otab_test

+ Added Tup to Val framework (essentially a Python tuple).
+ Also added tup_test

+ Added a Randomizer class (allows you to see all random numbers between 0
  and n-1 with O(1) space and amortized O(1) generation)
+ Added randomizer_test

+ Added conversion routines to occonvert.h to allow converting Tup to Arr
  and OTab to Tab for someone who needs backwards compatibility.
    + The top-level (shallow) converters: ConvertOTabToTab/ConvertTabToOTab
    + The recursive (deep-copy) converter: ConvertAllOTabTupToTabTup
+ Added conversion tests to otab_test, tab_test
+ Updated ConvertTabsToArr (etc.) to understand OTab and Tup.

+ Made it so MOST of the Pickling protocols understand OrderedDict/OTab and
  tuples/Tup.
  This includes:
    + A compatibility mode: The protocols previously approximated Tuples
      with Arr, so we have to make sure we can talk to old servers as well
      (and "force" the Tuple->Arr, OTab->Tab, BigInt->Str conversions).
    + Backwards compat to Text processing for C++
    - Backwards compat to Text processing for M2k OpalTables
    + Pickling Protocol 0 on C++ side
    + Pickling Protocol 2 on C++ side (w/ new loader, limited w/ old loader)
    + very limited Pickling Protocol -2 on C++ side (not supported anymore)
    + OpenContainers serialization on C++ side

  ALL generic serialization goes through chooseser.h (i.e., the
  DumpValToArray and LoadValFromArray interfaces), which has a default
  parameter where you can "request" conversion of OTab->Tab, Tup->Arr and
  BigInt->Str.  This gives us a higher degree of confidence that this works,
  because we can test it separately from the socket comms.

  Midas 2K: Since M2k doesn't have anything like OTab, BigInt, Tup, the
  serialization ALWAYS converts to Tab, Str, Arr.  In the future we will
  revisit the M2k components and use the new loader.  For now, we have left
  the old M2k components as is, but the raw C++ will always convert.

  Pickling Protocol -2 DOES NOT support any of the new features: as of this
  release we officially deprecate that code.

+ The text parsing now understands 2 forms of OrderedDict:

    OrderedDict([('a',1), ('b',2)])   # The long-winded, but Python compat. way
    o{ 'a':1, 'b':2 }                 # New idea
    [ 'a':1, 'b': 2 ]                 # TODO New idea ... maybe in future?

+ Augmentation: Added "remove" method to Val so Tab, Arr, and OTab
  understand it

+ Augmentation: Added swapInto method to all AVLTreeT, AVLHashT, OrdAVLHashT

+ Augmentation: Added ConvertOTabToTab and ConvertTabToOTab for easy
  conversions.  Also added the unsafe SwapInto, which allows conversion, but
  at the cost of the other data structure: once SwapInto is done, the other
  data structure is only good for destruction.
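  The compatibility conversions described above (OTab->Tab, Tup->Arr,
  BigInt->Str) can be illustrated with a small Python sketch.  It is not the
  C++ implementation: the function name to_legacy is hypothetical, Python 3
  is assumed, and "BigInt" is assumed here to mean any integer that does not
  fit in a signed or unsigned 64-bit value.

```python
from collections import OrderedDict

# Assumption: integers outside this range are treated as "BigInt".
INT64_MIN = -2**63
UINT64_MAX = 2**64 - 1


def to_legacy(value):
    """Deep-convert OrderedDict->dict, tuple->list and big int->str."""
    if isinstance(value, bool):
        return value                      # bool is an int subclass: leave it alone
    if isinstance(value, int):
        return value if INT64_MIN <= value <= UINT64_MAX else str(value)
    if isinstance(value, dict):           # also matches OrderedDict
        return {key: to_legacy(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_legacy(item) for item in value]
    return value                          # floats, strings, etc. pass through


new_style = OrderedDict([("a", (1, 2)), ("b", 70127340817203478192)])
legacy = to_legacy(new_style)

assert type(legacy) is dict
assert legacy == {"a": [1, 2], "b": "70127340817203478192"}
```

  The conversion is recursive, like the deep-copy converter mentioned
  above, so nested tuples and ordered dictionaries are handled as well.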
+ Augmentation: Minor revamps to make all three AVLX classes more consistent

+ Bug Fix: Made it so select traps exception instead of runtime_error only
  inside of MidasServer

+ Bug Fix: Early return inside of MidasSocket in case the file descriptor
  has already disappeared: this fix allows the user to call
  disconnectClient manually.

+ Bug Fix: Made it so AVLTreeT, AVLHashT, and OrdAVLHashT (and thus Tab and
  OTab) can use <, <=, >, >= without seg faulting: added to pre-existing
  tests as well.  Comparison is very similar to how Python does it.

+ Bug Fix: Pickling Protocol 0 had trouble loading tuples because it forgot
  to remove the PY_MARK from dict and list puts

+ Added documentation for the new OTab, Tup, int_n, DumpValToFile,
  LoadValFromFile, DumpValToArray and LoadValFromArray

+ Need some global message to show we are in "compatibility mode" when
  using OTab and Tup so it is very clear they are being converted for you:
  All the examples show this.

+ Allow the clients/servers to choose a compatibility mode: all examples in
  X-Midas and standalone.

+ Updated X-Midas and tried with X-Midas 4.4.4 and 4.6.3

+ Output chooses between OrderedDict and o{ }

+ Added ability to go back to OC_USE_OC_STRING (we had lost that ability a
  while ago), but its use requires setting a bunch of flags at once:

    -DOC_USE_OC_STRING -DOC_USE_OC_EXCEPTIONS -DOC_ONLY_NEEDED_STL
    -DOC_NEW_STYLE_INCLUDES

  Added a "commented out" version of this in the Makefile.Linux

+ Added arbitrary sized signed and unsigned integers: BigInt and BigUInt:
  the user uses int_n and int_un respectively (the best implementation is
  chosen for 32-bit or 64-bit machines).
+ Added a large suite of tests for BigInts.
+ Added int_un and int_n to the suite of values supported by Val, and
  some tests (esp. for comparing).

+ BUG FIX: complex compare (<, <=, etc.) used to seg-fault.
  Now, just like Python, they throw an exception.  We wanted them to "not
  compile", but all the type conversion did "too much", so we do what
  Python does: throw an exception.

+ Adding valreader_test, and changing so we recognize L as int_n

+ Added the ability to build a shared libptools.so

+ Added 'speed_test.cc' to the C++ area and 'speed_test.py' to the Python
  area so you can compare speeds of different serializations: both versions
  use "about" the same table, so you can compare the relative Python speed
  vs. the C++ speed as well as the different protocols.

  Current speeds in seconds (-O4 on 64-bit Fedora 13 machine):

                                C++      Python
                             g++ 4.4.4    2.6     # Comments
    ---------------------------------------------------------------------
    Pickle Text                  5.64     5.12    # About equiv
    Pickle Protocol 0           14.56    12.70    # Python slightly faster
    Pickle Protocol 2            1.36     8.05    # C++ significantly faster
    Pickle M2k                   3.03      N/A
    Pickle OpenContainers        1.31      N/A    # OC is fastest overall

    UnPickle Text               32.55    38.23    # About equiv
    UnPickle Protocol OLD 0     36.66     7.20    # Why OLD P0 is deprecated!
    UnPickle Protocol NEW 0      7.48     7.20    # About equiv
    UnPickle Protocol OLD 2      9.24     4.46    # Why OLD P2 is deprecated!
    UnPickle Protocol NEW 2      6.24     4.46    # Python still faster
    UnPickle M2k                 9.00      N/A
    UnPickle OpenContainers      4.12      N/A    # OC is fastest overall

  These numbers show we are, in general, on par with the Python
  serialization: in some cases we are faster (the dump for P2 is nearly 6x
  faster in C++ than in Python) and in some cases slower (load for P2).

+ Added UDP Clients and Servers from M2k that can talk to MidasYellers and
  MidasListeners.

+ Updated the M2k area to reflect that there are two units now: udpif and
  opalpython

+ Moved all the opalpython stuff to that dir, udpif to its own dir

+ Bug-fix and DEFAULT CHANGE for MidasServer shutdown: When a MidasServer
  does an open, it passes a timeout value which is how long the MidasServer
  should "watch" the socket port for activity before it looks around to see
  if it has been shut down.
  The default was None, which means it watches the socket forever: this is
  almost never what you want: you want the server to wake up occasionally to
  make sure you haven't been told to shut down.  If you wake up too often,
  it wastes CPU; too rarely, and you aren't responsive.  Changed the default
  to 1.0 (for 1.0 seconds).  This, incidentally, fixes the X-Midas server
  primitives (xmserver and permserver), which had trouble responding to ^C
  (they would apparently ignore it): Explicitly added the shutdown to both
  xmserver and permserver, and made sure that fixed the problem.

+ Fixed all the X-Midas primitives, server, and talker examples to be able
  to Ctrl-C out of them.  Mostly, this meant adding something like:

    import time
    while 1:
        time.sleep(1)
    a.shutdown()
    a.waitForMainLoopToFinish()

  Coupled with the MidasServer change (the open takes 1 as default, see
  above), all primitives can be escaped via a Ctrl-C.

+ Updated the pickleloader_test to test for user-defined classes and
  complicated data structures.

+ Moved IsLittleEndian to opencontainers from m2pythontools.h

+ Changed the MidasSocket to use the PickleLoader BY DEFAULT: this makes
  loading Pickling Protocol 0 and 2 much faster and more robust.

+ Moved all the Pickle Opcodes into cpickle.h, and changed the names ALL
  OVER THE BASELINE to be PY_NAME instead of just NAME.  This gets rid of
  some redundancy, as the names were also in m2pythonpickler.h

+ Made it so we can pickle BigInt in both Pickling Protocol 0 and 2

+ Updated valreader and valreader_test: whenever an L is at the end of an
  integer, it becomes an int_n or int_un.  THIS BREAKS BACKWARDS
  COMPATIBILITY: those used to be converted to real_8, but now they are
  converted to int_n.

+ Tested conversion to/from Val/int_n/int_un/real_8/plain int/string to make
  sure that all works as expected.
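  The Python side of the BigInt pickling and the trailing-L text form noted
  above can be checked with the standard pickle module.  This is a minimal
  sketch assuming Python 3, where the built-in int is arbitrary-sized and so
  plays the role of int_n; it does not exercise the C++ loader.

```python
import pickle

big = 70127340817203478192          # too large for any 64-bit integer

p0 = pickle.dumps(big, protocol=0)  # text-based protocol
p2 = pickle.dumps(big, protocol=2)  # binary protocol

# Protocol 0 writes the LONG opcode ("L"), the decimal digits, then a
# trailing "L": the same textual convention the valreader now recognizes.
assert p0.startswith(b"L")
assert b"70127340817203478192L" in p0

# Protocol 2 streams start with the PROTO opcode followed by the version.
assert p2[:2] == b"\x80\x02"

# Both protocols restore the exact integer, with no loss to floating point.
assert pickle.loads(p0) == big
assert pickle.loads(p2) == big
assert type(pickle.loads(p2)) is int
```

  Loading the value back as an exact integer rather than as a real_8-style
  float is the point of the change: a double cannot hold all the digits.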
+ Added tests for different conversions to/from BigInt/BigUInt and
  real_8/binary (streams)

+ X-Midas libraries.cfg: Newer X-Midas versions need to explicitly declare
  the library dependency of ptools on oc, so we explicitly added that to
  libraries.cfg

+ Added a decimal printer so you can print out int_n/int_un to any
  precision you want.

+ Added Pickling Protocol 0 (and presumably Protocol 1 now works too) to the
  new loader: updated tests and output.

+ Added the ability of the C++ Pickle Loader to handle user-defined REDUCEs
  and BUILDs: by default, a Pickled class becomes a dictionary.

+ The whole purpose of adding the ArrayDisposition AS_PYTHON_ARRAY was that,
  in Python 2.6, it was a binary dump: dumping arrays of POD (Plain Old
  Data, like real_4, int_8, complex_16) was blindingly fast (as it was
  basically a memcpy).  This was to help Python users who didn't
  necessarily have the Numeric module installed.  As of Python 2.7, however,
  the Pickling of Arrays has changed: it turns each element into a Python
  number and INDIVIDUALLY pickles each element (much like the AS_LIST
  option).  The new PickleLoader DOES DETECT AND WORK with both 2.6 and 2.7
  pickle streams, but we currently only dump as 2.6.  This change in Python
  2.7 (and also Python 3.x) defeats the whole purpose of supporting array:
  we wanted a binary protocol for dumping large amounts of binary data!  As
  of this release, we deprecate the AS_PYTHON_ARRAY serialization, but will
  keep it around for a number of releases.

+ BUG FIX: The M2k Components, when using OpenContainers serialization,
  miscomputed the size of an M2_TIME, causing an underestimate which would
  cause a seg fault.
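  The difference between a binary dump and per-element pickling, which
  motivates the AS_PYTHON_ARRAY deprecation above, can be seen with Python's
  own array module.  This is a sketch assuming Python 3 and the standard
  library only; it compares a raw buffer against a pickled list and does not
  reproduce the exact 2.6 or 2.7 pickle streams.

```python
import pickle
from array import array

samples = array("d", [1.0, 2.5, -3.25] * 1000)   # 3000 doubles

# Binary dump: one contiguous buffer, itemsize bytes per element,
# which is essentially a memcpy of the underlying storage.
raw = samples.tobytes()
assert len(raw) == samples.itemsize * len(samples)

# Per-element dump (much like AS_LIST): every number is pickled on its own,
# so each element carries an opcode in addition to its 8 bytes of payload.
as_list = pickle.dumps(samples.tolist(), protocol=2)
assert len(as_list) > len(raw)

# The raw buffer restores the array exactly.
restored = array("d")
restored.frombytes(raw)
assert restored == samples

# The per-element form round-trips too, just with more bytes and more work.
assert array("d", pickle.loads(as_list)) == samples
```

  For large arrays of POD the per-element form costs extra bytes and a
  Python-level object per number, which is why a true binary dump was the
  attraction of AS_PYTHON_ARRAY in the first place.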