penlog ====== Specification ------------- The PENLog logging format is intended to be used as a generic and reusable data format for measurement data. The penlog format specifies an abstract data format consisting of various fields with data and metadata. The abstract penlog format can be mapped to multiple output formats, for instance ``json``, or ``hr``, … All available output formats are explained below. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. The penlog structured logging format consists of the following fields. Unset fields which are considered optional MUST be absent. ``component`` (string, OPTIONAL) The component, e.g. software module, which has issued the log message. In absence, an implementation SHOULD pull the content of the environment variable ``PENLOG_COMPONENT`` and MUST set it to ``root`` as a fallback. ``data`` (string, REQUIRED) The log message as an UTF-8 string. ``host`` (string, OPTIONAL) The hostname of the machine who generated the messages. This field is OPTIONAL, since it is missing in the human readable format. It is RECOMMENDED that implementations include this field, as it increases reproducability of logging data. ``id`` (string, OPTIONAL) A unique message identifier. ``line`` (string, OPTIONAL) Information about the file and line number where this log entry was generated. The information MUST be in the form ``filename:number``. ``filename`` can be an absolute or relative path, or a filename. ``priority`` (int, OPTIONAL) This field can be used to optionally set the priority. For priorities, the syslog priorities are used as defined by RFC5424. Implementations can indicate priorities by e.g. a separate color. ``stacktrace`` (string, OPTIONAL) Implementations can optionally include a stacktrace. This could be useful for debugging if fatal errors occur. Stacktraces are very specific to the used programming language, e.g. python or go. Thus, this field is just an unstructured string. ``tags`` (list[string], OPTIONAL) To each log entry a custom list of tags can be applied. For instance: ``["autogenerated", "pre-test", "post-test", …]``. Tags MAY be key value pairs, separated by ``=``. ``timestamp`` (string, REQUIRED) ISO8601 string of the current date. ``type`` (string, REQUIRED) The type field is a free field which can be used to assign a particular message type. Custom fields can be added freely, in other words, additional custom fields are OPTIONAL. Their post-processing and tooling around these custom fields is up to the developer and MUST be ignored by generic converters. JSON Format (json) ------------------ A penlog log file stored on disk is typically stored in the ``json`` output format. The tool ``hr(1)`` is intended to be used -- similar to ``cat(1)`` -- for viewing penlog data in the ``json`` output format. If encoding of a log message fails, the component MUST be set to ``JSON``, the type to ``ERROR``, and the error message MUST be included in ``data``. The ``json`` format consists of a verbatim sequence of the described JSON objects. Each JSON object MUST be present at one line, separated by ``\n`` (ascii 0x0a). In order to keep decoding simple and line based, no JSON arrays or virtual, endless JSON structures are employed. JSON pretty (json-pretty) ------------------------- The ``json`` format forces every JSON object to appear in a single line. The ``json-pretty`` format provides an indented, more readable json form for debugging purposes. The actual content of ``json`` and ``json-pretty`` is the same. It is adviced to use ``json`` for data processing pipelines due to less overhead. Human Readable Format (hr) -------------------------- The syntax of the human readable format looks like the following. Curly braces indicate a field from the JSON format. If a field is empty it expands to an zero length string; if ``id``, ``line``, ``tags``, or ``stacktrace`` are not availabe, the whole line is omitted. A verbatim curly brace brace is expressed with two ones: ``{{`` means ``{``:: {timestamp} {{{component}}} [{type}]: {prio-prefix} {data} -> id : {id} -> line: {line} -> tags: {tags} -> stacktrace: | {stacktrace} ``timestamp`` The RECOMMENDED timestamp format is Go's ``StampMilli`` format as defined to ``Jan _2 15:04:05.000``. ``component``, ``type`` The ``component`` and ``type`` fields MUST be padded or truncated that the colons, ``:``, in every single line are perfectly aligned. ``data`` The actual log message. It MAY be truncated to fit in the current terminal size. When it is truncated an ellipsis character (``…``) MUST be appended to indicate the truncation for the user, ``id`` The optional unique message identifier. ``line`` The optional filename and line number where this log entry origins from. ``stacktrace`` The optional stacktrace where this log entry origins from. ``prio-prefix`` An optional priority prefix. It is RECOMMENDED to indicate message priorities via colors. If colors are not available it MAY be desireable to indicate the priority via a short prefix. The prefixes are enclosed by brackets ``[`` and ``]``: ``E`` ``A``, ``C``, ``e``, ``w``, ``n``, ``i``, ``d``. These letters stand for: emergency, alert, critical, error, warning, notice, info, debug. ``tags`` The optional tags as comma separated values. Tiny Human Readable Format (hr-tiny) ------------------------------------ The ``hr-tiny`` format is the same as ``hr`` except that ``component`` and ``type`` are omitted:: Apr 2 12:48:08.906: Starting tshark with Apr 2 12:48:09.583: Doing stuff Example:: Apr 2 12:48:08.906 {scanner } [message]: Starting tshark with Apr 2 12:48:09.583 {moncay } [message]: Doing stuff If a JSON line cannot be decoded, the faulty text MUST be included in messages of type ``ERROR`` and component ``JSON``:: $ python -c "import foo" 2>&1 | hr Jun 16 08:19:01.305 {JSON } [ERROR ]: Traceback (most recent call last): Jun 16 08:19:01.305 {JSON } [ERROR ]: File "", line 1, in Jun 16 08:19:01.305 {JSON } [ERROR ]: ModuleNotFoundError: No module named 'foo' Environment Variables --------------------- The following environment variables MAY be understood by penlog implementations. The supported datatypes are ``string`` and ``bool``. A ``bool`` is a special string consisting of either ``t, T, true, TRUE, 1`` or ``f, F, false, FALSE, 0``. ``PENLOG_COMPONENT`` (string) If no component is set, the ``component`` field MAY be set via the ``PENLOG_COMPONENT`` variable at the scope of an operating system process. ``PENLOG_CAPTURE_LINES`` (bool) If this environment variable is set, implementations SHOULD emit filenames with line numbers via the ``line`` field. ``PENLOG_CAPTURE_STACKTRACES`` (bool) If this environment variable is set, implementations SHOULD provide stacktraces via the ``stacktrace`` field. ``PENLOG_OUTPUT`` (string) A switch for implementations to choose from several output forms. Available are: ``hr``, ``hr-tiny``, ``json``, ``json-pretty``, ``systemd``. ``PENLOG_LOGLEVEL`` (string) In order to limit the emitted logging messages, loglevels MAY be supported. If a library supports filtering based on loglevels, it MUST check this environment variable. The supported values are ``critical``, ``error``, ``warning``, ``notice``, ``info``, ``debug``, ``trace``. The default MUST be ``info``. A message MUST omitted if its ``priority`` field contains a value greater than ``PENLOG_LOGLEVEL``. A mapping between these strings and integer values is availabe in RFC5424. See Also -------- :manpage:``hr(1)``