PCC (the calling convention) and signatures were radically simplified and deoptimized a few years ago. Merge some of the old compile-time optimizations back in.

See CallingConventionsTasklist and CallingConventionsOverview for details of the refactor.

misc/timeline.md contains links to whiteknight's writeup on the PCC cleanup, benchmarks, and future ideas on PCC improvements.

Some of these changes were well motivated: switching set_returns with get_results, dispatching through the CallContext PMC everywhere, and creating fewer PMCs per call.

(Refactor CallSignature to store arguments as a typed array instead of a PMC array. We know the types from the signature, so we can store and retrieve the elements in a type-safe way. This reduces the number of PMCs created, since integer, number, and string arguments can be stored directly instead of being boxed in PMCs.)
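A minimal sketch of such a typed argument array, assuming one type character per argument taken from the signature. All names here (typed_args, arg_cell, args_set_int, and so on) are hypothetical and are not Parrot's actual API; parrot_pmc and parrot_str are opaque stand-ins for the real PMC and STRING types.

```c
#include <stddef.h>

#define MAX_ARGS 8                      /* arbitrary limit for this sketch */

typedef struct parrot_pmc parrot_pmc;   /* opaque stand-in for a PMC */
typedef struct parrot_str parrot_str;   /* opaque stand-in for a STRING */

/* One unboxed slot per argument; the signature says which member is live. */
typedef union {
    long        i;   /* 'I' integer */
    double      n;   /* 'N' number  */
    parrot_str *s;   /* 'S' string  */
    parrot_pmc *p;   /* 'P' PMC     */
} arg_cell;

typedef struct {
    size_t   argc;
    char     types[MAX_ARGS];   /* type character of each slot */
    arg_cell cells[MAX_ARGS];
} typed_args;

/* Set up the slots from the argument part of a signature, e.g. "IN".
 * Returns 0 on success, -1 on an unknown type or too many arguments. */
int args_init(typed_args *a, const char *types)
{
    size_t i;
    for (i = 0; types[i] != '\0'; i++) {
        char t = types[i];
        if (i == MAX_ARGS || (t != 'I' && t != 'N' && t != 'S' && t != 'P'))
            return -1;
        a->types[i] = t;
    }
    a->argc = i;
    return 0;
}

/* True if slot idx exists and was declared with the given type. */
static int slot_is(const typed_args *a, size_t idx, char type)
{
    return idx < a->argc && a->types[idx] == type;
}

/* Typed accessors: no PMC is allocated for integers or numbers. */
int args_set_int(typed_args *a, size_t idx, long v)
{
    if (!slot_is(a, idx, 'I')) return -1;
    a->cells[idx].i = v;
    return 0;
}

int args_get_int(const typed_args *a, size_t idx, long *out)
{
    if (!slot_is(a, idx, 'I')) return -1;
    *out = a->cells[idx].i;
    return 0;
}

int args_set_num(typed_args *a, size_t idx, double v)
{
    if (!slot_is(a, idx, 'N')) return -1;
    a->cells[idx].n = v;
    return 0;
}

int args_get_num(const typed_args *a, size_t idx, double *out)
{
    if (!slot_is(a, idx, 'N')) return -1;
    *out = a->cells[idx].n;
    return 0;
}
```

The type check in each accessor is what makes the storage type-safe: a mismatched access fails instead of reinterpreting the union.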

Some background

PMCs are compiled automatically from .pmc files to .c and .h files by a simple compiler written in Perl 5, Parrot::Pmc2c.

Each .pmc file defines a class with attributes and methods. Each method is optionally a multi-method, i.e. it has multiple signatures. Each signature is defined in an FFI-like way (within Parrot this is called NCI, the native call interface): a simple string names one of the 4 possible types for each argument and for the return value. S is a string, N a number (a double), I an integer, P a PMC (an object). E.g. "SS->P" defines a method accepting two strings and returning one PMC.
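To make the signature format concrete, here is a small sketch of a parser for strings such as "SS->P". It assumes exactly the form described above (argument type characters, "->", one return type character); parsed_sig, parse_sig and SIG_MAX_ARGS are hypothetical names for illustration and not part of Parrot.

```c
#include <stddef.h>
#include <string.h>

#define SIG_MAX_ARGS 16   /* arbitrary limit for this sketch */

/* The four types from the text; enum values are the signature characters. */
typedef enum {
    SIG_INT = 'I',
    SIG_NUM = 'N',
    SIG_STR = 'S',
    SIG_PMC = 'P'
} sig_type;

typedef struct {
    size_t   argc;
    sig_type args[SIG_MAX_ARGS];
    sig_type ret;
} parsed_sig;

static int is_sig_type(char c)
{
    return c == 'I' || c == 'N' || c == 'S' || c == 'P';
}

/* Parse "ARGS->RET" into out. Returns 0 on success, -1 if malformed. */
int parse_sig(const char *sig, parsed_sig *out)
{
    const char *arrow = strstr(sig, "->");
    const char *p;

    if (arrow == NULL)
        return -1;

    /* Everything before the arrow is one type character per argument. */
    out->argc = 0;
    for (p = sig; p < arrow; p++) {
        if (!is_sig_type(*p) || out->argc == SIG_MAX_ARGS)
            return -1;
        out->args[out->argc++] = (sig_type)*p;
    }

    /* Exactly one return type character must follow the arrow. */
    p = arrow + 2;
    if (!is_sig_type(p[0]) || p[1] != '\0')
        return -1;
    out->ret = (sig_type)p[0];
    return 0;
}
```

Parsing "SS->P" with this sketch yields two SIG_STR arguments and a SIG_PMC return type.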

Since Parrot supports dynamic languages, a method call must find the object it operates on at run-time. We therefore cannot optimize all method calls at compile-time, because the class of the object is not known then. (But we can cache the lookup and optimize for the common case.)
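One common way to cache the lookup for the common case is a per-call-site cache that remembers the last class seen and the method it resolved to. The sketch below is a generic illustration of that idea, not Parrot's dispatch code; klass, object, call_site, call_method and the sample methods twice and negate are all hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef long (*method_fn)(long self_value);

typedef struct { const char *name; method_fn fn; } method_slot;

typedef struct klass {
    const char        *name;
    const method_slot *methods;
    size_t             n_methods;
} klass;

typedef struct { const klass *cls; long value; } object;

/* One cache per call site; a call site always calls the same method name. */
typedef struct {
    const klass  *cached_cls;
    method_fn     cached_fn;
    unsigned long misses;      /* number of slow lookups performed */
} call_site;

/* Sample methods used to populate classes in the sketch. */
long twice(long v)  { return v * 2; }
long negate(long v) { return -v; }

/* Slow path: search the class's method table by name. */
static method_fn slow_lookup(const klass *cls, const char *name)
{
    size_t i;
    for (i = 0; i < cls->n_methods; i++)
        if (strcmp(cls->methods[i].name, name) == 0)
            return cls->methods[i].fn;
    return NULL;
}

/* Fast path when the receiver's class matches the cached one;
 * otherwise look the method up and refill the cache.
 * Returns 0 on success, -1 if the method does not exist. */
int call_method(call_site *site, const object *obj, const char *name,
                long *result)
{
    if (site->cached_cls != obj->cls) {
        method_fn fn = slow_lookup(obj->cls, name);
        if (fn == NULL)
            return -1;
        site->cached_cls = obj->cls;
        site->cached_fn  = fn;
        site->misses++;
    }
    *result = site->cached_fn(obj->value);
    return 0;
}
```

As long as consecutive calls at a site see objects of the same class, only the first call pays for the name lookup.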

The dispatch happens in src/call/pcc.c, and the args are handled in src/call/args.c. Currently all calls use a cached CallContext PMC for each signature, which uses a freshly allocated ResizablePMCArray to hold all args, and a FixedIntegerArray to hold all arg flags.

The old PCC held self and a fixed maximum number of args in pre-allocated registers in the Context object. The rearchitected CallContext PMC forbids direct access to the Context attributes; access must go indirectly through vtable methods. The args are now held in an array of autoboxed PMCs.

Possible improvements

  1. For cases with no optional or slurpy args this can be optimized to a FixedPMCArray whose size is known beforehand, so the sig array does not need to be realloc'ed at each call.

  2. For each call the signature string is parsed at run-time; the call then builds the CallContext PMC, the ResizablePMCArray for the args, and a FixedIntegerArray for the arg flags, and only then calls the method.

    Improve this by keeping the parsed signature in the CallContext PMC once it is created. Identical signatures can be shared in a global sig hash, since each signature string describes exactly one translation.

    { "SS->P" -> sigintarray1, "S->P" -> sigintarray2, ... }

  3. PMC methods are pre-compiled to NCI calls which use a similar workflow, parsing the string at run-time, translating the args and calling the extern method. So most method calls were changed to go via the NCI route in the generic case.

    This was improved with release 6.6.0, documented in "compile-time expand pcc params and set the return result" GH #1080. Method calls (via NCI) now set the args and return values directly at compile-time.

  4. For the remaining run-time handling of uncompiled methods: NCI sig handling is already better than PCC, e.g. Parrot_nci_sig_to_pcc() stack-allocates a static_buf[16] for short signature strings, and it also creates a FixedIntegerArray, not a ResizableIntegerArray.

  5. Export the needed CallContext methods to allow direct access to the needed fields

  6. Move the autoboxed PMC args, or at least the maximum allowed number of them, back to the static CallContext registers.

  7. Omit unnecessary GC write barriers and _orig functions for most VTABLE and some METHOD calls. Done with release 6.5.0, documented in "Optimize GC write barriers in the pmc's" GH #1069.

    For a typical vtable write method there are 3 generated C functions:

    • orig (fast): the PMC definition

    • native: orig + GC write barrier

      => _orig should be expanded here, and can be omitted

    • pcc (slow): handles sigs at run-time and via varargs

      => should assign each typed sig in the compiler and also expand the return value handling
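The global sig hash from item 2 could be sketched roughly as follows: each distinct signature string is parsed once, and later calls with the same string get the same shared entry back. This is only an illustration under assumptions. sig_entry, sig_lookup, sig_parse_count and the fixed-size chained hash table are hypothetical, and the stored "flags" are simply the type characters rather than Parrot's real argument flags.

```c
#include <stdlib.h>
#include <string.h>

#define SIG_CACHE_SLOTS 64   /* fixed table size for this sketch */

typedef struct sig_entry {
    char   *sig;             /* owned copy of the string, e.g. "SS->P" */
    int    *flags;           /* one entry per argument (here: the type char) */
    size_t  argc;
    struct sig_entry *next;  /* chain for hash collisions */
} sig_entry;

static sig_entry *sig_cache[SIG_CACHE_SLOTS];

/* Counts real parses, to show that repeated lookups do not re-parse. */
unsigned long sig_parse_count;

static unsigned long sig_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Parse the argument part of "ARGS->RET" into a new entry, NULL on error. */
static sig_entry *sig_parse(const char *sig)
{
    const char *arrow = strstr(sig, "->");
    size_t argc, i;
    sig_entry *e;

    if (arrow == NULL)
        return NULL;
    argc = (size_t)(arrow - sig);

    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->sig   = malloc(strlen(sig) + 1);
    e->flags = malloc((argc ? argc : 1) * sizeof *e->flags);
    if (e->sig == NULL || e->flags == NULL) {
        free(e->sig);
        free(e->flags);
        free(e);
        return NULL;
    }
    strcpy(e->sig, sig);
    for (i = 0; i < argc; i++)
        e->flags[i] = sig[i];
    e->argc = argc;
    e->next = NULL;
    sig_parse_count++;
    return e;
}

/* Return the shared entry for sig, parsing it only on first use. */
const sig_entry *sig_lookup(const char *sig)
{
    unsigned long slot = sig_hash(sig) % SIG_CACHE_SLOTS;
    sig_entry *e;

    for (e = sig_cache[slot]; e != NULL; e = e->next)
        if (strcmp(e->sig, sig) == 0)
            return e;

    e = sig_parse(sig);
    if (e != NULL) {
        e->next = sig_cache[slot];
        sig_cache[slot] = e;
    }
    return e;
}
```

Two lookups of "SS->P" return the same pointer, so a CallContext could hold a reference to the shared entry instead of rebuilding the flags array on every call. Entries are never freed in this sketch; a real implementation would need to decide who owns them.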
