New Features as of version 1.6.0 (August 2015)

BIG NEWS: Major updates across the libraries to support very large
serializations: greater than 2 GB!

New Python C Extension module supporting OC Serialization in Python.
Why?  Pickling can't support large serializations (>2 GB), but OC
Serialization can.  All of the MidasTalker, MidasServer, etc. can use
SERIALIZE_OC in Python and C++.  For example:

  >>> import numpy
  >>> from pyocser import ocdumps, ocloads   # Python C Extension module
  >>> huge_array = numpy.zeros(2**32+1, 'b')
  >>> serialized_string = ocdumps(huge_array)
  >>> get_back_huge_array = ocloads(serialized_string)

Support for cx_t<T> for all int types.  For DSP, complex ints can
sometimes be the way data is sampled off an antenna.

Top-Level

 * Added a timeout capability to TransactionLock
 * Added external break checker/timeouts for the shared memory routines
 * Added a fixed size allocator, and augmented StreamingPool so it can
   optionally use it
 * CircularQueue now cleans up after a get() to force reclamation
 * Cleaned up Array to allow 64-bit lengths
 * Minor change to the BigInt and BigUInt interface (sign) because of the
   Array fixes
 * Changed length()/entries() to return size_t throughout the baseline
 * Updated Val to support cx_t<> for int_1, ..., int_u8
 * Updated ocser, ocserialize, xmldumper, xmlloader, convert to support
   cx_t<>
 * Updated MidasTalker/MidasServer to allow sockets to carry very large
   messages
 * Updated the OpenContainers serialization to allow very large messages
   (>2**32 bytes)
 * Updated the PickleLoader and PickleDumper to allow very large messages
   (>2**32 bytes)
 * Added a new Python C Extension module "pyocser" for Python OC
   Serialization
   - This is needed to serialize very large objects
 * Updated OpalPythonDaemon and related code to work with large serialized
   data

Details:

* Python C Extension

  Because Python Pickling 2 cannot handle very large arrays or strings, we
  needed to create a serialization that works with very large arrays and
  strings (byte lengths of over 2**32).  To this end, we have added a new
  Python C Extension module to the baseline called 'pyocser' with two
  functions:

    ocdumps(po)  : Dump the Python object po, returning the serialized
                   string
    ocloads(str) : Load a Python object from the given string (str),
                   returning the deserialized Python object

  - pyocser.h,.cc    : Implementation of OC Serialization in Python
  - pyocsermodule.cc : The binding code to make a Python module
  - check_ocser.py   : A Python script to verify that ocdumps/ocloads work
  - psuedonumeric.h  : Since we may not have Numeric or numpy, pull in the
                       definition (refactored from pyobjconvert)
  - pyobjconverter.cc:
  - setup.py         : Added the new pyocser module setup and build

  A few fixes to occomplex.h, ocnumerictools.h, ocnumpytools.h and ocval.cc
  so that the Python C Extension module builds correctly on RedHat 5,
  32-bit, Python 2.4 systems.  Note that the default testing environment
  is RedHat 6, 64-bit, Python 2.6 systems.

* Python

  midaslistener_ex.py, midasserver_ex.py, permutation_client.py,
  permutation_server.py, midasyeller_ex.py, README:
   * Updates to show you can use OC Serialization (--ser=5) if you have
     the Python C Extension module pyocser built.  Also recommends NumPy
     (--arrdisp=4) going forward.  (See the sketch below.)
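  As a quick illustration of the --ser=5 option, here is a minimal sketch
  of a Python client selecting OC Serialization.  SERIALIZE_OC is the real
  constant used by the socket classes, but treat the exact constructor
  arguments, the port number, and the recv call as assumptions patterned
  on the midastalker_ex.py examples rather than a definitive reference:

    >>> from midastalker import *                 # MidasTalker + SERIALIZE_* constants
    >>> mt = MidasTalker("localhost", 8888, SERIALIZE_OC)  # same choice as --ser=5
    >>> mt.open()                                 # connect to the server
    >>> mt.send({'hdr': 'big payloads ok', 'count': 17})
    >>> reply = mt.recv(5.0)                      # wait up to 5 seconds for a response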
  midassocket.py:
   * Changes so SERIALIZE_OC is supported, plus the capability to try to
     give good error messages.
   * Changes so a message can have either a 4-byte header (normal messages
     with a length under 4 GB) or a 12-byte header (the escape 0xFFFFFFFF
     plus the length as an int_u8).
   * Updates to send and recv:
       send: Don't use sendall anymore: actually break the buffers apart,
             because sendall may run into problems with non-blocking
             sockets and with buffers that are too big.
       recv: Not allowed to use very large sizes, so we have to break the
             buffers up into chunks.

  midastalker.py:
   * Updates to the documentation for SERIALIZE_OC

  speed_test.py:
   * Updates to allow ocdumps and ocloads.  Surprisingly, ocdumps and
     ocloads are 50%-100% faster than pickle2/unpickle2.  (See the sketch
     at the end of these notes.)

* Updated to OpenContainers 1.8.1

  - ocport.h, Makefile, Makefile.Linux, Makefile.Linux.factored,
    Makefile.icc : Updated version numbers
  - ocproxy.h : Added a timeout capability for TransactionLock.  By
    default there is no timeout, so it can try to get in forever.  If the
    timeout expires, however, a runtime_error is thrown and the user can
    adjust accordingly.
  - ocspinfo.h, ocstreamingpool.h,.cc : Updated so they work with pools
    larger than 2 GB (some returned ints changed size to int_ptr, which is
    4 bytes on a 32-bit machine, 8 bytes on a 64-bit machine).  Also
    modified so a pool can have a fixed size allocator section, if
    desired.
  - ocfixedsizeallocator.h, fsa_test.cc, fsa_test.output : A small,
    low-overhead allocator for small pieces, and a test.
  - run_all : Added fsa_test
  - occircularbuffer.h : Previously a get left the old value in the buffer
    (fully constructed).  For the OCCQ, this meant a full copy of the
    packet stayed in the queue when it needed to be reclaimed immediately
    by the memory allocator; the slot is now cleaned up after a get().
  - ocarray.h, array_test.cc, array_test.output : length() and capacity()
    now return size_t.  Restructured Array so that length and capacity are
    size_t (potentially 64-bit quantities on 64-bit machines) so Array can
    hold very large sizes.  This meant restructuring Array: the
    reservedSpace field was replaced with a bit in the capacity, and the
    useNewAndDelete field was subsumed under the allocator.  All of this
    was necessary to keep sizeof(Array) at 32 while still allowing 64-bit
    quantities.  Updated the test to make sure setBit/getBit work.
  - ocbigint.h, ocbigint_test.cc, ocbigint_test.output : Because
    "reservedSpace" is gone from Array, we have to use the new Array
    interface (getBit/setBit), which only gives us one bit of information
    for the sign bit.  Also updated the code to use size_t for all the new
    sizes.  Updated the test to make sure negate works.
  - ocbiguint.h : Updated the code to propagate the extra bit along
    (mostly for BigUInt).  Updated to use size_t in a few places (because
    of Array's change).  Also moved length and expandTo to the class
    (instead of always deferring to the impl).
  - ocavlhasht.h, ocavltreet.h, ochashtablet.h, ocordavlhasht.h : Made
    entries() return a size_t instead of an int_u4 (which is what it is
    stored as internally anyway).  Also had the iterators use size_t when
    using an index to iterate.
  - ocval.h, ocval.cc : Everything that returns a length: updated the code
    to use size_t rather than int where appropriate.  Also cleaned up the
    iterators to use size_t.  Added a .length() method to Tab.
  - ocstring_impl.h : Use size_t for the internal length store, for large
    strings
  - occomplex.h : Updated for better support of printing cx_t, and added
    MOVEARRAYPOD for cx_t
  - ocnumerictools.h, ocnumpytools.h : NumPy and Numeric DO NOT support
    complex ints, so for output we just use the PTOOLS char tags (which
    are distinct from the Numeric tags).  Also updated the Array
    conversions to support cx_t.
  - ocproxy.h, ocproxy.cc : Updated to support cx_t
  - ocval.h, ocval.cc : Updated to support cx_t.  Updated TagFor to
    support all the new types, including Array.
  - complex_test.cc, complex_test.output, run_all : New test: make sure
    the cx_t support seems to be working
  - ocser.cc, ocserialize.cc, ocval.cc, ser_test.cc,.output, occonvert.h :
    Updated the built-in serializations to handle serialization of cx_t
    and Array<cx_t>.  Also updated to allow very large tables and arrays
    to be serialized (see the Python sketch after this list).  There are
    two sets of tags for most large containers:
      'n' 't' 'o' 'u' 'q' 'Q' for int_u4 lengths
      'N' 'T' 'O' 'U' 'y' 'Y' for int_u8 lengths
    The new tags are only seen by the serialization routines.  Also
    updated the tests to check that these still work.
  - ocserialize.cc, ocloadumpcontext.h : Factored out the load and dump
    contexts so the Python C Extension module can reuse them.
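  The 64-bit length work above is what lets multi-gigabyte containers
  round-trip end to end.  As a rough check from Python, in the spirit of
  check_ocser.py (a hedged sketch: it assumes the pyocser module is built,
  a 64-bit machine, and enough RAM to hold both the table and its
  serialized copy):

    >>> import numpy
    >>> from pyocser import ocdumps, ocloads
    >>> table = {'hdr': {'samples': 2**32 + 16},
    ...          'data': numpy.zeros(2**32 + 16, 'b')}
    >>> buf  = ocdumps(table)        # serialized string is itself > 4 GB
    >>> back = ocloads(buf)
    >>> len(back['data']) == len(table['data'])
    True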
* C++

  Added the ability to pass in an external error check (a pointer to a
  function) and to set a timeout.  Inside the shared memory code, many of
  the waits for pipes to be available/created loop with a small timeout.
  With this change, the user can (a) specify the timeout and (b) force
  immediate failure via an external break checker.  These changes are
  important for allowing the shared memory routines to do a clean shutdown
  in the face of error conditions.  Also allows the SHMMain to use the
  small allocators, so the little chunks of memory (mostly Vals) do not
  clutter memory so much.  Also defers to 16-byte alignment (SSE
  instructions, and especially FFTW, require 16-byte alignment).  Updated
  memdump to use size_t.

  - sharedmem.h,.cc :
  - shmboot.h,.cc   : See above.

  - xmlloader.h, xmldumper.h, xmlload_test.cc,.output : Updated the XML
    tools to make sure they work with cx_t

  Updated the socket code (MidasTalker and its ilk) to allow very large
  messages (greater than 2**32 bytes):

  - fdtools.h :
     * Added htonll/ntohll for large ints in Network Byte Order
     * Updated the methods ReadExact, WriteExact, clientReadExact, and
       readUntilFull, and the data members of the FDTools class, to use
       size_t instead of int
     * Updated the bytes-written/bytes-read routines so that if the data
       is larger than an int_u4, they write the special escape sequence
       0xFFFFFFFF and then another 8 bytes for the real byte length (see
       the sketch after this list)
  - midastalker.h : Updates to size_t for byte counts (from int)
  - midassocket.h : Updates so it uses 4 bytes for normal message counts;
    for messages over 0xFFFFFFFF bytes, it uses the 4-byte escape count
    0xFFFFFFFF and then an 8-byte length.  Also updates to size_t where
    needed.
  - xmldump_test.cc : Updated to size_t for byte counts
  - valprotocol.h, valprotocol.cc : Updated to use size_t for lengths
  - pickleloader_test.cc, .output : Updated for size_t for lengths
  - pickleloader.h : Significant updates to size_t for byte counts, to
    support very large pickles
  - p2_test.cc, .output, p2common.h : Updated for the size_t interface
    changes, and updated p2_test to be current and correct
  - chooseser.h : Updated to use size_t instead of int for all lengths
  - README : Shows that NumPy is preferred, and that there is a new Python
    C Extension module, pyocser (OC serialization in Python)
  - midastalker_ex2.cc : Left some code in to try very large arrays and
    strings, to exercise the new large-message facilities
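  The large-message header convention used by fdtools.h and midassocket.h
  above (and by midassocket.py) can be summarized in a small illustrative
  sketch.  This is not the library's code, just the described convention
  expressed in Python struct terms:

    import struct

    def encode_msg_length(nbytes):
        # Normal messages: a single 4-byte network-order length.
        if nbytes < 0xFFFFFFFF:
            return struct.pack('!I', nbytes)
        # Very large messages: the 4-byte escape 0xFFFFFFFF followed by
        # the real length as an 8-byte (int_u8) network-order value.
        return struct.pack('!IQ', 0xFFFFFFFF, nbytes)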
M2K:

  Updated a number of files so that the OpalPythonDaemon works with
  serialized data of greater than 2 GB.  NOTE!  These changes only work if
  you update the M2k Vector and string to use size_t for lengths and
  capacities.  By default, early MITE/M2k use unsigned int/int_u4, which
  does not work for data > 4 GB.  At this time, only OC (OpenContainers
  serialization) works with data greater than 4 GB.

  - m2openconser.cc : Updated the protocol to int_u8 for some lengths
  - m2opalmsgextnethdr.h,.cc : Substantial changes so it works with the
    underlying Midas 2k infrastructure and still works with int_u8 sizes
  - m2opalpythondaemon.cc : Uses size_t for lengths instead of int
  - m2opalpythontablewriter.h,.cc : Allows use of OC Serialization
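  Finally, a footnote on the speed_test.py observation above (ocdumps and
  ocloads measuring 50%-100% faster than Pickling Protocol 2): a rough,
  hedged way to reproduce the comparison on a Python 2 system with the
  pyocser module built and NumPy installed.  The array size and any
  timings are illustrative only:

    import time
    import numpy
    import cPickle
    from pyocser import ocdumps, ocloads

    data = numpy.zeros(10 * 1024 * 1024, 'd')      # ~80 MB of doubles

    t0 = time.time()
    ocloads(ocdumps(data))                         # OC serialize + deserialize
    print "OC round trip:      ", time.time() - t0, "seconds"

    t0 = time.time()
    cPickle.loads(cPickle.dumps(data, 2))          # Pickle Protocol 2 equivalent
    print "Pickle 2 round trip:", time.time() - t0, "seconds"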