Puk documentation

Documentation for Puk-2025-03-18+b4ed7d07041aecd589e8d5bb9d281654cd4543a7 (build runner-7cspzrfxd-project-25853-concurrent-0 Thu Nov 13 12:02:57 UTC 2025)

Overview

Puk is the foundation library of PadicoTM implementing base services. It contains the following:

Puk was designed to be used by PadicoTM but may be freely used in any project. It is currently used in PadicoTM, pioman, and NewMadeleine.

For installation instructions, see README.

Base tools

Tracing and logging

Tracing and logging are merged as a single facility: messages with a verbosity level and filtering capabilities.
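The idea of a single facility with verbosity levels and filtering can be sketched as follows. This is only an illustration in plain C: the names demo_log, demo_log_max and demo_log_tag are hypothetical and are not part of the Puk API, whose actual macros and filter syntax are not described here.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical verbosity levels; these are not Puk's actual names. */
enum demo_log_level { DEMO_LOG_ERROR = 0, DEMO_LOG_INFO = 1, DEMO_LOG_TRACE = 2 };

/* Current filter: highest level shown, and an optional tag to restrict output to. */
static enum demo_log_level demo_log_max = DEMO_LOG_INFO;
static const char *demo_log_tag = NULL; /* NULL means "all tags" */

/* Returns 1 if a message with this level and tag passes the filter. */
static int demo_log_enabled(enum demo_log_level level, const char *tag)
{
  if (level > demo_log_max)
    return 0;
  if (demo_log_tag != NULL && strcmp(demo_log_tag, tag) != 0)
    return 0;
  return 1;
}

/* Prints the message to stderr if it passes the filter; returns 1 if printed. */
static int demo_log(enum demo_log_level level, const char *tag, const char *fmt, ...)
{
  va_list ap;
  if (!demo_log_enabled(level, tag))
    return 0;
  va_start(ap, fmt);
  fprintf(stderr, "[%s:%d] ", tag, (int)level);
  vfprintf(stderr, fmt, ap);
  fputc('\n', stderr);
  va_end(ap);
  return 1;
}
```

With the default filter, demo_log(DEMO_LOG_ERROR, "net", "link down") is printed while a DEMO_LOG_TRACE message is dropped; raising demo_log_max turns the same call sites into traces.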

Profiling

Profiling variables, similar to MPI pvars, are implemented. These variables are typed and come with a description. Profiling support is always enabled. This interface makes it possible to define profiling variables in a uniform way; however, users are expected to increment the variables manually in their own code.

Additionally, memory profiling and profiling of Puk itself may be enabled through the configure flag --enable-profile.

Profiling variables are dumped at the end of execution by applying the filter defined in the PUK_PROFILE environment variable. Set PUK_PROFILE='*' to dump all variables. By default, nothing is dumped.
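As an illustration of the model (typed variables with a description, incremented manually, dumped through a filter), here is a minimal sketch. All demo_* names are hypothetical and do not belong to Puk. The sketch assumes that the filter is a shell-style glob matched with POSIX fnmatch(); the text above only shows the '*' case, so the real filter syntax may differ.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variable types; Puk's real types are not listed here. */
enum demo_pvar_type { DEMO_PVAR_COUNTER, DEMO_PVAR_BYTES };

struct demo_pvar {
  const char *name;
  const char *description;
  enum demo_pvar_type type;
  unsigned long long value; /* incremented manually by user code */
};

#define DEMO_PVAR_MAX 16
static struct demo_pvar *demo_pvars[DEMO_PVAR_MAX];
static int demo_pvar_count = 0;

/* Adds a variable to the registry; returns -1 if the registry is full. */
static int demo_pvar_register(struct demo_pvar *v)
{
  if (demo_pvar_count >= DEMO_PVAR_MAX)
    return -1;
  demo_pvars[demo_pvar_count++] = v;
  return 0;
}

/* Dumps variables whose name matches the glob filter; a NULL filter dumps
 * nothing. Returns the number of variables dumped. */
static int demo_pvar_dump(FILE *out, const char *filter)
{
  int dumped = 0;
  if (filter == NULL)
    return 0;
  for (int i = 0; i < demo_pvar_count; i++) {
    const struct demo_pvar *v = demo_pvars[i];
    if (fnmatch(filter, v->name, 0) == 0) {
      fprintf(out, "%s = %llu%s  # %s\n", v->name, v->value,
              v->type == DEMO_PVAR_BYTES ? " bytes" : "", v->description);
      dumped++;
    }
  }
  return dumped;
}
```

In a real program, the filter would typically come from getenv() and the dump would be registered with atexit(), which mirrors the "dumped at the end of execution" behavior described above.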

Allocators

The standard libc memory allocator malloc may be too slow for use on the critical path of HPC code. Puk provides optimized allocators for fixed-size objects:
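To show why a fixed-size allocator can be cheaper than malloc, here is a minimal free-list pool in plain C. It is a generic sketch of the technique, not Puk's allocator: the demo_pool names are hypothetical, the pool has a fixed capacity, is not thread-safe, and only guarantees pointer alignment for its slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_pool {
  size_t obj_size;  /* slot size, rounded up to pointer alignment */
  size_t capacity;  /* number of slots in the slab */
  void *slab;       /* single block obtained from malloc */
  void *free_list;  /* head of the singly linked list of free slots */
};

/* Allocates the slab and threads all slots onto the free list. */
static int demo_pool_init(struct demo_pool *p, size_t obj_size, size_t capacity)
{
  const size_t align = sizeof(void *);
  assert(capacity > 0);
  if (obj_size < align)
    obj_size = align; /* a free slot must be able to hold the next pointer */
  obj_size = (obj_size + align - 1) / align * align;
  p->slab = malloc(obj_size * capacity);
  if (p->slab == NULL)
    return -1;
  p->obj_size = obj_size;
  p->capacity = capacity;
  p->free_list = NULL;
  for (size_t i = capacity; i > 0; i--) {
    void *slot = (char *)p->slab + (i - 1) * obj_size;
    *(void **)slot = p->free_list;
    p->free_list = slot;
  }
  return 0;
}

/* Pops a slot in O(1); returns NULL when the pool is exhausted. */
static void *demo_pool_alloc(struct demo_pool *p)
{
  void *slot = p->free_list;
  if (slot != NULL)
    p->free_list = *(void **)slot;
  return slot;
}

/* Pushes a slot back in O(1); obj must come from the same pool. */
static void demo_pool_free(struct demo_pool *p, void *obj)
{
  *(void **)obj = p->free_list;
  p->free_list = obj;
}

static void demo_pool_destroy(struct demo_pool *p)
{
  free(p->slab);
  p->slab = NULL;
  p->free_list = NULL;
}
```

Allocation and release are a couple of pointer operations with no search and no size lookup, which is the property that matters on a critical path.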

XML parsing

XML is used in many places in Puk and PadicoTM. Puk relies on Expat and provides a higher-level wrapper around the XML parser:

Encoding and checksumming

Various checksums and encoding operators are implemented for internal use and are also available to users.

Base64 encoding is used internally to encode binary content that needs to be passed on the command line or in XML files.

  • file: Puk-base64.h
  • command-line tools: puk-base64-encode, puk-base64-decode
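For readers unfamiliar with the encoding, here is a minimal Base64 encoder in plain C. It is a sketch assuming the standard RFC 4648 alphabet with '=' padding; the functions declared in Puk-base64.h are not reproduced here, and the demo_base64_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Encodes len bytes from in into out as a NUL-terminated string.
 * out must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the NUL. */
static size_t demo_base64_encode(const unsigned char *in, size_t len, char *out)
{
  static const char tbl[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  size_t o = 0;
  for (size_t i = 0; i < len; i += 3) {
    const size_t rem = len - i; /* bytes left, including in[i] */
    unsigned long v = (unsigned long)in[i] << 16;
    if (rem > 1)
      v |= (unsigned long)in[i + 1] << 8;
    if (rem > 2)
      v |= (unsigned long)in[i + 2];
    /* each 24-bit group yields four 6-bit symbols; pad short groups with '=' */
    out[o++] = tbl[(v >> 18) & 0x3f];
    out[o++] = tbl[(v >> 12) & 0x3f];
    out[o++] = (rem > 1) ? tbl[(v >> 6) & 0x3f] : '=';
    out[o++] = (rem > 2) ? tbl[v & 0x3f] : '=';
  }
  out[o] = '\0';
  return o;
}

/* Convenience wrapper for C strings. */
static size_t demo_base64_encode_str(const char *s, char *out)
{
  return demo_base64_encode((const unsigned char *)s, strlen(s), out);
}
```

The output only contains letters, digits, '+', '/' and '=', which is why the encoding suits binary content embedded in XML; on a command line, '+', '/' and '=' remain ordinary characters for most shells.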

Checksumming and hashes are used for hash tables and for network checksums. Various algorithms are implemented: XOR, Adler, Fletcher, Jenkins, FNV1a, Knuth hash, Murmurhash2a, Paul Hsieh superfast, CRC (including SSE4.2 hardware acceleration), siphash.
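Two of the listed algorithms are short enough to show in full. The sketch below gives textbook 32-bit FNV-1a and Adler-32 in plain C; the demo_* names are hypothetical, and Puk's own function names, variants and word sizes may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime. */
static uint32_t demo_fnv1a_32(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint32_t h = 2166136261u; /* FNV offset basis */
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 16777619u; /* FNV prime; wraps modulo 2^32 */
  }
  return h;
}

/* Adler-32: two running sums modulo 65521, packed as (b << 16) | a. */
static uint32_t demo_adler32(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint32_t a = 1, b = 0;
  for (size_t i = 0; i < len; i++) {
    a = (a + p[i]) % 65521u;
    b = (b + a) % 65521u;
  }
  return (b << 16) | a;
}
```

FNV-1a is a typical choice for hash tables because it is cheap and mixes short keys reasonably well, while Adler-32 is a checksum meant to detect transmission errors rather than to spread keys across buckets.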

Data structures

Various data structures are implemented in a generic fashion. Most data structures are typed: a macro is used to generate a typed container for the given type of object to contain, as well as all the typed inline functions.
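The macro-based approach can be sketched as follows: one macro expands to a typed struct plus typed inline functions for a given element type. DEMO_VECT_TYPE and the generated names are hypothetical and only illustrate the technique; they are not Puk's container macros.

```c
#include <stdlib.h>

/* Generates struct NAME_vect and its typed inline functions for elements of type T. */
#define DEMO_VECT_TYPE(NAME, T)                                              \
  struct NAME##_vect { T *items; size_t size; size_t capacity; };            \
  static inline void NAME##_vect_init(struct NAME##_vect *v)                 \
  {                                                                          \
    v->items = NULL; v->size = 0; v->capacity = 0;                           \
  }                                                                          \
  /* appends item, doubling the storage when full; returns -1 on failure */  \
  static inline int NAME##_vect_push(struct NAME##_vect *v, T item)          \
  {                                                                          \
    if (v->size == v->capacity) {                                            \
      size_t cap = v->capacity ? 2 * v->capacity : 8;                        \
      T *p = realloc(v->items, cap * sizeof(T));                             \
      if (p == NULL) return -1;                                              \
      v->items = p; v->capacity = cap;                                       \
    }                                                                        \
    v->items[v->size++] = item;                                              \
    return 0;                                                                \
  }                                                                          \
  /* no bounds check: the caller must ensure i < v->size */                  \
  static inline T NAME##_vect_at(const struct NAME##_vect *v, size_t i)      \
  {                                                                          \
    return v->items[i];                                                      \
  }                                                                          \
  static inline void NAME##_vect_destroy(struct NAME##_vect *v)              \
  {                                                                          \
    free(v->items); v->items = NULL; v->size = 0; v->capacity = 0;           \
  }

/* Instantiate a vector of int named id_vect. */
DEMO_VECT_TYPE(id, int)
```

Because the element type is baked into the generated functions, passing the wrong type to id_vect_push is diagnosed by the compiler, which a void*-based container cannot do.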

Modules

Modules are libraries that can be dynamically loaded. They are linked using padico-mkmod, and installed in <prefix>/lib/padico/. They are the base of PadicoTM.

Modules may have dependencies on other modules and use their symbols directly. However, plain binaries cannot depend on a module and thus cannot use its symbols directly. They must use indirection or, more likely, components (see below). As a consequence, code that needs to depend on a module must itself be a module.

A module has a type, defined by a driver. The different drivers are:

  • binary: this is the most common type of module. It contains a dynamically loadable library (.so), loaded through dlopen().
  • builtin: a variant of the binary module that is not dynamically loadable, but compiled into the library instead.
  • pkg: a module that contains other modules.
  • multi: a module that contains other modules and is loaded in SPMD fashion on all nodes (when using PadicoTM).
  • sh: a module that contains a shell script instead of a library (when using PadicoTM).

When Puk is configured with the flag --enable-builtin, all binary modules are compiled as builtin instead. This decreases the number of files to load upon startup and thus improves startup time, especially for large deployments over NFS, but it reduces modularity (all modules must be loadable on all nodes). Since modules are usually built with their own Makefile, a separate build system must be used when building all modules as builtin: nmad and PadicoTM use mod.mk files that are included in the main Makefile to insert the given module into the main library, and per-module makefiles are ignored.

A binary or builtin module must define functions for:

  • init: called when the module is loaded
  • run: called when the module is run; only relevant for runnable modules that contain a main. Most modules leave it as NULL.
  • finalize: called before the module is unloaded.

Any of these functions may be left as NULL.
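The life cycle described above can be sketched with a plain struct of three optional hooks. This is not the Puk module API: the struct demo_module layout, the hook signature int (*)(void) and the convention that a NULL hook counts as success are all assumptions made for the illustration.

```c
#include <stddef.h>

typedef int (*demo_mod_func_t)(void);

/* Hypothetical module descriptor: a name and three optional hooks. */
struct demo_module {
  const char *name;
  demo_mod_func_t init;     /* called when the module is loaded */
  demo_mod_func_t run;      /* called when the module is run */
  demo_mod_func_t finalize; /* called before the module is unloaded */
};

/* Calls a hook if it is set; a NULL hook is treated as success. */
static int demo_call(demo_mod_func_t f)
{
  return (f != NULL) ? f() : 0;
}

/* Drives a module through init, run and finalize; returns the first error. */
static int demo_module_lifecycle(const struct demo_module *m)
{
  int rc = demo_call(m->init);
  if (rc != 0)
    return rc; /* a module that failed to initialize is neither run nor finalized */
  rc = demo_call(m->run);
  const int frc = demo_call(m->finalize);
  return (rc != 0) ? rc : frc;
}

/* Example module with no run hook; it records which hooks were called. */
static char demo_trace[8];
static int demo_trace_len = 0;

static int hello_init(void)     { demo_trace[demo_trace_len++] = 'i'; return 0; }
static int hello_finalize(void) { demo_trace[demo_trace_len++] = 'f'; return 0; }

static const struct demo_module hello_module = {
  "hello", hello_init, NULL, hello_finalize
};
```

Calling demo_module_lifecycle(&hello_module) invokes hello_init and hello_finalize in that order and skips the missing run hook.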

To define a binary module, use the macro PADICO_MODULE_DECLARE from Module.h. It must be called only once for the whole module. If the module consists of multiple source files, the other files may be hooked to the module using PADICO_MODULE_HOOK. For a builtin module, use PADICO_MODULE_BUILTIN instead of PADICO_MODULE_DECLARE. The module name must be a valid name for a C symbol.

In addition, a module may have configuration attributes (options) declared through PADICO_MODULE_ATTR. They contain:

  • a label
  • a type
  • the name of the environment variable that will be used to set its value
  • a plain text description
  • a default value
  • optional flags

See API: Puk options for more information on attributes.
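The fields listed above map naturally onto a small descriptor whose value is taken from the environment when the variable is set and from the default otherwise. The sketch below illustrates that resolution rule only; struct demo_attr and its functions are hypothetical, the optional flags are omitted, and the actual arguments of PADICO_MODULE_ATTR are documented in the Puk options API.

```c
#include <stdlib.h>

/* Hypothetical attribute types. */
enum demo_attr_type { DEMO_ATTR_INT, DEMO_ATTR_BOOL, DEMO_ATTR_STRING };

struct demo_attr {
  const char *label;
  enum demo_attr_type type;
  const char *env_var;       /* environment variable that sets the value */
  const char *description;   /* plain text description */
  const char *default_value; /* used when the variable is not set */
};

/* Picks the environment value if present, else the default. */
static const char *demo_attr_resolve(const struct demo_attr *a, const char *env_value)
{
  return (env_value != NULL) ? env_value : a->default_value;
}

/* Current textual value of the attribute, looked up in the environment. */
static const char *demo_attr_value(const struct demo_attr *a)
{
  return demo_attr_resolve(a, getenv(a->env_var));
}

/* Current value parsed as a base-10 integer (for DEMO_ATTR_INT attributes). */
static long demo_attr_int(const struct demo_attr *a)
{
  return strtol(demo_attr_value(a), NULL, 10);
}
```

Keeping the description and the default next to the variable name lets a tool list every option of a module with its documentation, without running the module.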

Resources:

Components

Puk defines its own model for software components. Components are runtime entities and must not be confused with modules. A component may be hosted by a module. A module containing a single component may be defined with PADICO_MODULE_COMPONENT.

Basic elements of components are:

  • interface (puk_iface_t): the type for an interface, consisting of a name and a driver (a struct that contains functions). Typed functions are generated when an interface is declared with PUK_IFACE_TYPE.
  • facet (puk_facet_t): an interface provided by a component; it has a type and a label.
  • receptacle (struct puk_uses_s): an interface used by a component; it has a type and a label.
  • component (puk_component_t): the component itself.
  • context (puk_context_t): configuration context for a component, i.e. connections for facets and receptacles, and values for attributes.
  • composite: an assembly of components, with their connections and configuration; the composite itself is seen from outside as a component.
  • instance (puk_instance_t): an instance of a component.

All components must define at least the PadicoComponent interface, which is used for instantiation.

To resolve indirections through an assembly, the following functions are available:

  • from outside of a component, find the entry for interface of type FOO of an instance:

    puk_instance_indirect_FOO(puk_instance_t, const char*label, struct puk_receptacle_FOO_s*r), where r contains a field for the driver (a struct that contains functions) and a field _status that contains the status of the instance for the given component.

  • from outside of a component, find the entry point for interface of type FOO, without instance:

    puk_component_get_driver_FOO(puk_component_t component, const char*label)

  • from inside a component, find who is connected to one of our receptacles of type FOO:

    puk_instance_context_indirect_FOO(puk_instance_t instance, const char*label, struct puk_receptacle_FOO_s*r), where instance is the inner instance (the puk_instance_t given to the component at instantiation).
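The driver-plus-status indirection behind these lookups can be sketched without the Puk headers. In the sketch below, every demo_* type and function is hypothetical, as is the "Accumulator" interface; it only mimics the pattern in which a lookup by interface type and label fills a receptacle with a driver (a struct of functions) and the instance status. The real puk_* types and the code generated by PUK_IFACE_TYPE are not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Driver for a hypothetical interface "Accumulator": a struct of functions. */
struct demo_accumulator_driver {
  int (*add)(void *status, int x);
};

/* A facet: an interface provided by a component, with a type name and a label. */
struct demo_facet {
  const char *iface_name;
  const char *label;
  const void *driver;
};

struct demo_component {
  const char *name;
  const struct demo_facet *facets;
  size_t n_facets;
};

struct demo_instance {
  const struct demo_component *component;
  void *status; /* per-instance state */
};

/* Filled by the lookup: the driver to call and the status to pass to it. */
struct demo_receptacle_accumulator {
  const struct demo_accumulator_driver *driver;
  void *_status;
};

/* Finds the driver of the facet with the given interface name and label. */
static const void *demo_component_get_driver(const struct demo_component *c,
                                             const char *iface, const char *label)
{
  for (size_t i = 0; i < c->n_facets; i++) {
    const struct demo_facet *f = &c->facets[i];
    if (strcmp(f->iface_name, iface) == 0 && strcmp(f->label, label) == 0)
      return f->driver;
  }
  return NULL;
}

/* Typed lookup from outside a component; returns 0 on success, -1 if not found. */
static int demo_instance_indirect_accumulator(const struct demo_instance *inst,
                                              const char *label,
                                              struct demo_receptacle_accumulator *r)
{
  r->driver = demo_component_get_driver(inst->component, "Accumulator", label);
  r->_status = inst->status;
  return (r->driver != NULL) ? 0 : -1;
}

/* Example component providing one "Accumulator" facet labeled "main". */
static int demo_adder_add(void *status, int x)
{
  int *total = status;
  *total += x;
  return *total;
}

static const struct demo_accumulator_driver demo_adder_driver = { demo_adder_add };
static const struct demo_facet demo_adder_facets[] = {
  { "Accumulator", "main", &demo_adder_driver }
};
static const struct demo_component demo_adder = { "adder", demo_adder_facets, 1 };
```

Once the receptacle is filled, a call has the form r.driver->add(r._status, 5), so the caller never needs to know which component sits behind the interface.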

Assembly of components (creating a composite) may be performed using the internal API. It is, however, less clumsy to use an XML description. For the syntax of XML assembly, see the example puk_test_component_context.c and its uses in PadicoTM/Services/NetSelector-besteffort/ns-besteffort.c.

Resources: