pdf-stamper

0.2.14-SNAPSHOT


Combine template descriptions and template PDFs with data to produce PDFs.

dependencies

  • org.clojure/clojure 1.6.0
  • org.clojure/data.xml 0.0.8
  • org.apache.pdfbox/pdfbox 1.8.7
  • potemkin 0.3.10
  • prismatic/schema 1.0.4



 

PDF creation from templates

pdf-stamper lets you build complete PDF documents without worrying about building the layout in code. Those who have tried will know that it is by no means a simple task getting the layout just right, and building a layout that can adapt to changing requirements can get frustrating in the long run.

With pdf-stamper the layout is decoupled from the code extracting and manipulating the data. This leads to a simpler process for building PDF documents from your data: Data placement is controlled by template description datastructures, and data is written to PDF pages defining the layout.

(ns pdf-stamper
  (:require
    [pdf-stamper.context :as context]
    [pdf-stamper.text :as text]
    [pdf-stamper.text.parsed :as parsed-text]
    [pdf-stamper.images :as images]
    [potemkin])
  (:import
    [org.apache.pdfbox.pdmodel PDDocument]
    [org.apache.pdfbox.pdmodel.edit PDPageContentStream]))

Templates

Template descriptions are regular Clojure maps with three keys:

  • :name
  • :overflow
  • :holes

The :overflow key is optional. It names the template description to use when a hole on this template overflows. If it is absent, overflowing text is truncated.
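As an illustration, a minimal template description with a single image hole might look like the sketch below. The names :invoice, :invoice-continued, and :logo and all numbers are made up for the example; the keys follow the hole fields and schemas described in this documentation.

```clojure
;; Hypothetical template description; names and coordinates are illustrative.
(def invoice-template
  {:name     :invoice
   :overflow :invoice-continued   ; optional: template used when a hole overflows
   :holes    [{:type     :image
               :name     :logo
               :x        36       ; PDF points from the left edge
               :y        720      ; PDF points from the bottom edge
               :width    144
               :height   72
               :priority 0
               :aspect   :preserve}]})
```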

Holes

Holes are what make a template description: They define where on the page the various pieces of data are put, and how.

There are a number of hole types built in to pdf-stamper, but new hole types can be added by implementing this multimethod.

If a hole type should be able to overflow, the return value from a call to fill-hole must be a map of the form {<hole name> {:contents ...}}.

(defmulti fill-hole
  (fn [document c-stream hole location-data context] (:type hole)))
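A sketch of how a custom hole type could plug into this multimethod is given below. The :label type and its :max-chars key are hypothetical and not part of pdf-stamper; the multimethod is redefined locally with the same dispatch so the example runs without PDFBox, and no drawing is actually performed.

```clojure
;; Stand-in with the same dispatch as pdf-stamper's fill-hole, so the sketch
;; runs without PDFBox on the classpath.
(defmulti fill-hole
  (fn [document c-stream hole location-data context] (:type hole)))

;; Hypothetical :label hole type: "stamps" at most :max-chars characters and
;; reports the remainder as overflow.
(defmethod fill-hole :label
  [document c-stream hole location-data context]
  (let [text      (get-in location-data [:contents :text])
        max-chars (:max-chars hole)
        stamped   (subs text 0 (min max-chars (count text)))
        leftover  (subs text (count stamped))]
    ;; A real implementation would draw `stamped` on c-stream here.
    (when (seq leftover)
      {(:name hole) {:contents {:text leftover}}})))

(def overflow
  (fill-hole nil nil
             {:type :label :name :title :max-chars 5}
             {:contents {:text "Hello, world"}}
             nil))
;; overflow => {:title {:contents {:text ", world"}}}
```

Returning nil when everything fits keeps the hole out of the overflow map, so no overflow page is requested on its behalf.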

All holes have these fields in common:

  • :height
  • :width
  • :x
  • :y
  • :name
  • :type
  • :priority

Coordinates and widths/heights are always in PDF points (1/72 inch).

Note: The PDF coordinate system starts from the bottom left, and increasing y-values move the cursor up. Thus, all (x,y) coordinates specified in templates should refer to the lower left corner of the hole.

:priority is effectively a layering of the contents on template pages; e.g. if you have two overlapping holes on a template the one with the lowest value in :priority will be drawn on the page first, and the other hole on top of that.
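The layering can be illustrated with plain data. The hole names and priority values below are invented for the example.

```clojure
;; Two overlapping holes; names and priorities are illustrative.
(def holes
  [{:name :caption    :priority 10}
   {:name :background :priority 1}])

;; Lower :priority values are drawn first, so they end up underneath.
(def draw-order (map :name (sort-by :priority holes)))
;; draw-order => (:background :caption)
```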

When filling the holes on a page we have to take into account that Clojure sequences are lazy by default; i.e. we cannot expect the side effects of stamping to the PDF page to have happened just by applying the map function. doall is used to force all side effects before returning the resulting map of overflowing holes (into is itself eager, so doall acts as an explicit safeguard here).

Note: Holes where the page does not contain data will be skipped.

(defn- fill-holes
  [document c-stream holes page-data context]
  (doall
    (into {}
          (map (fn [hole]
                 (when-let [location-data (get-in page-data [:locations (:name hole)])]
                   (fill-hole document c-stream hole location-data context)))
               (sort-by :priority holes)))))
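The laziness concern can be demonstrated in isolation. In this sketch an atom stands in for the PDF page being stamped; the hole names are arbitrary.

```clojure
;; An atom stands in for the PDF page that stamping would mutate.
(def stamped (atom []))

;; map is lazy: building the seq does not run the side effects yet.
(def pending (map #(swap! stamped conj %) [:logo :title :body]))
(def stamped-before (count @stamped))   ; 0, nothing has been stamped

;; doall walks the whole seq, forcing every side effect.
(doall pending)
(def stamped-after (count @stamped))    ; 3
```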

The types supported out of the box are:

  • :image
  • :text
  • :text-parsed

For specifics on the hole types supported out of the box, see the documentation for their respective namespaces.

(defmethod fill-hole :image
  [document c-stream hole location-data context]
  (let [data (merge hole location-data)]
    (images/fill-image document
                       c-stream
                       data
                       context)))

(defmethod fill-hole :text-parsed
  [document c-stream hole location-data context]
  (let [data (update-in (merge hole location-data)
                        [:contents :text]
                        #(if (string? %) (parsed-text/get-paragraph-nodes %) %))]
    (text/fill-text-parsed document
                           c-stream
                           data
                           context)))

(defmethod fill-hole :text
  [document c-stream hole location-data context]
  (let [data (merge hole location-data)]
    (text/fill-text document
                    c-stream
                    data
                    context)))

The Context

The context is the datastructure that contains additional data needed by pdf-stamper. For now that is fonts and templates (both descriptions and files). This namespace contains referrals to the three important user-facing functions from the context namespace, namely add-template, add-font, and base-context. For a detailed write-up on the context, please refer to the namespace documentation.

(potemkin/import-vars
  [pdf-stamper.context
   add-template
   add-font
   base-context])

Filling pages

pdf-stamper exists to fill data onto pages while following a pre-defined layout. This is where the magic happens.

Trying to stamp a page that requests a template not in the context is an error. This function is used to give a clear name to the precondition of fill-page.

(defn- page-template-exists?
  [page-data context]
  (get-in context [:templates (:template page-data)]))

Every single page is passed through this function, which extracts the relevant template and description for the page data, adds it to the document being built, and delegates the actual work to the hole-filling functions defined above.

The template to use is extracted from the page data. Using this, the available holes, the template PDF page, and the template to use with overflowing holes (if any) are extracted from the context.

Any overflowing holes are handled by calling recursively with the overflow. All other holes are copied as-is to the new page, to make repeating holes possible.

Future: It would probably be wise to find a better way than a direct recursive call to handle overflows. Otherwise handling large bodies of text could become a problem.

(defn- fill-page
  [document page-data context]
  (assert (page-template-exists? page-data context)
          (str "No template " (:template page-data) " for page."))
  (let [template (:template page-data)
        template-overflow (context/get-template-overflow template context)
        template-holes (context/get-template-holes template context)
        template-doc (context/get-template-document template context)
        template-page (-> template-doc (.getDocumentCatalog) (.getAllPages) (.get 0))
        template-c-stream (PDPageContentStream. document template-page true false)]
    (.addPage document template-page)
    (let [overflows (fill-holes document template-c-stream (sort-by :priority template-holes) page-data context)
          overflow-page-data {:template template-overflow
                              :locations (when (seq overflows)
                                           (merge (:locations page-data) overflows))}]
      (.close template-c-stream)
      (if (and (seq (:locations overflow-page-data))
               (:template overflow-page-data))
        (conj (fill-page document overflow-page-data context) template-doc)
        [template-doc]))))
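One possible direction for the "Future" note is to replace the direct recursion with loop/recur. The sketch below is not the library's implementation: stamp-page and overflow-of are hypothetical stand-ins for the hole filling and the overflow lookup, and the function returns template names rather than open template documents.

```clojure
(defn fill-with-overflow
  "Hypothetical iterative variant of the overflow handling.
  stamp-page takes page data and returns a map of overflowing holes (or nil);
  overflow-of maps a template name to its overflow template (or nil).
  Returns the names of the templates used, in page order."
  [stamp-page overflow-of page-data]
  (loop [page page-data
         used []]
    (let [overflows     (stamp-page page)
          next-template (overflow-of (:template page))
          used          (conj used (:template page))]
      (if (and (seq overflows) next-template)
        ;; Non-overflowing holes are carried over as-is; overflowing ones
        ;; are replaced by their remaining contents.
        (recur {:template  next-template
                :locations (merge (:locations page) overflows)}
               used)
        used))))

;; Toy stamp function: each page holds 10 characters of the :body hole.
(defn stamp-ten-chars
  [page]
  (let [text (get-in page [:locations :body :contents :text])]
    (when (> (count text) 10)
      {:body {:contents {:text (subs text 10)}}})))

(def pages-used
  (fill-with-overflow stamp-ten-chars
                      {:first :rest, :rest :rest}
                      {:template  :first
                       :locations {:body {:contents {:text (apply str (repeat 25 "x"))}}}}))
;; pages-used => [:first :rest :rest]
```

Because recur runs in constant stack space, the number of overflow pages is no longer bounded by the stack depth.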

When the context is populated with fonts and templates, this is the function to call. The data passed in as the first argument is a description of each individual page, i.e. a seq of maps containing the keys:

  • :template
  • :locations

The former is the name of a template in the context, and the latter is a map where the keys are hole names present in the template. Each value is a map with the key :contents, which itself is a map. The key in the contents map depends on the type of the hole, as defined in the template; e.g. :image for image holes, :text for text and parsed text holes. This is really an implementation detail of the individual functions for filling the holes.

The completed document is written to the resulting java.io.ByteArrayOutputStream, ready to be sent over the network or written to a file using a java.io.FileOutputStream.

(defn fill-pages
  [pages context]
  (let [output (java.io.ByteArrayOutputStream.)]
    (with-open [document (PDDocument.)]
      (let [context-with-embedded-fonts (reduce (fn [context [font style]]
                                                  (context/embed-font document font style context))
                                                context
                                                (:fonts-to-embed context))
            open-documents (doall (map #(fill-page document % context-with-embedded-fonts) pages))]
        (.save document output)
        (doseq [doc (flatten open-documents)]
          (.close doc))))
    output))
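To make the expected input concrete, here is a sketch of page data for a single page. The template name :invoice and the hole names :title and :footer are assumptions; both holes are taken to be plain :text holes.

```clojure
;; Hypothetical page data for a template named :invoice with two text holes.
(def pages
  [{:template  :invoice
    :locations {:title  {:contents {:text "Invoice 42"}}
                :footer {:contents {:text "Thank you for your order."}}}}])

(comment
  ;; With a populated context, writing the result to a file could look like
  ;; this ("invoice.pdf" is an arbitrary file name):
  (with-open [out (java.io.FileOutputStream. "invoice.pdf")]
    (.writeTo (fill-pages pages context) out)))
```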

This concludes the discussion of the primary interface to pdf-stamper. Following are the namespace documentations for the functionality that is not directly user-facing.

 

The state of pdf-stamper is encapsulated in a datastructure called the context. This structure contains fonts and templates, and is partly constructed by users of pdf-stamper by adding to a base context. This base context contains the fonts included in PDFBox standard, and nothing else.

The functions in this namespace all exist to modify or query the context datastructure. Functions relevant to client code are exported in the pdf-stamper namespace, so this namespace is intended only for internal use. However, the documentation may still be relevant to clients of pdf-stamper.

(ns pdf-stamper.context
  (:require
    [clojure.edn :as edn]
    [clojure.string :as string]
    [clojure.java.io :as io]
    [pdf-stamper.schemas :as schemas])
  (:import
    [org.apache.pdfbox.pdmodel PDDocument]
    [org.apache.pdfbox.pdmodel.font PDFont PDType1Font PDTrueTypeFont]))

Templates

There are no standard templates in pdf-stamper.

(def base-templates
  {})

Adding templates to the context is achieved using this function. When adding a template two things are needed: The template description, i.e. what goes where, and a locator for the PDF page to use with the template description. The template locator can be either a URL or a string.

(defn add-template
  ^{:pre [(some? template-uri)]}
  [template-def template-uri context]
  (when-let [schema-check (schemas/validation-errors template-def)]
    (throw (ex-info (str schema-check " | IN: " template-def) schema-check)))
  (-> context
      (assoc-in [:templates (:name template-def)] template-def)
      (assoc-in [:templates (:name template-def) :uri] template-uri)))

The template file is loaded lazily, i.e. it is not read into memory until a page actually requests to be written using the added template.

(defn get-template-document
  [template context]
  (let [file-uri (get-in context [:templates template :uri])]
    (assert file-uri (str "file-uri is nil for template " template))
    (PDDocument/load file-uri)))

Any template consists of a number of holes specifying the size and shape of data when stamped onto the template PDF page.

(defn get-template-holes
  [template context]
  (get-in context [:templates template :holes]))

Templates can specify an overflow template, a template that will be used for any data that did not fit in the holes on the original template's page.

(defn get-template-overflow
  [template context]
  (get-in context [:templates template :overflow]))

Fonts

Fonts in PDF follow the typographical conventions. Important font concepts for this project are:

  • baseline, the line that the cursor follows when writing
  • ascent, the maximum ascent of any glyph above the baseline
  • descent, the maximum descent of any glyph below the baseline

[Figure: font explanation, illustrating baseline, ascent, and descent]

When writing text the cursor origin is placed on the baseline.

pdf-stamper uses PDFBox under the hood, and because of that includes all the standard PDF fonts defined by PDFBox.

(def base-fonts
  {:times {#{:regular} PDType1Font/TIMES_ROMAN
           #{:bold} PDType1Font/TIMES_BOLD
           #{:italic} PDType1Font/TIMES_ITALIC
           #{:bold :italic} PDType1Font/TIMES_BOLD_ITALIC}
   :helvetica {#{:regular} PDType1Font/HELVETICA
               #{:bold} PDType1Font/HELVETICA_BOLD
               #{:italic} PDType1Font/HELVETICA_OBLIQUE
               #{:bold :italic} PDType1Font/HELVETICA_BOLD_OBLIQUE}
   :courier {#{:regular} PDType1Font/COURIER
             #{:bold} PDType1Font/COURIER_BOLD
             #{:italic} PDType1Font/COURIER_OBLIQUE
             #{:bold :italic} PDType1Font/COURIER_BOLD_OBLIQUE}
   :symbol {#{:regular} PDType1Font/SYMBOL}
   :zapf-dingbats {#{:regular} PDType1Font/ZAPF_DINGBATS}})

If any templates need fonts that are not part of the standard PDF font library, they can be added by providing a font descriptor, the font name, the font style, and the context to add to. As an example, had the Times New Roman bold font not been present already, here is how one would add it: (add-font "times_bold.ttf" :times #{:bold} context).

Notice how the style is a set of keywords. This is to support the combined font styles like bold AND italic, without requiring an arbitrary ordering on the individual parts of the style.

In the example above the font descriptor was provided as a string representing a file name, but it could just as well have been a java.net.URL, java.net.URI or java.io.File.

In PDF, non-standard fonts should be embedded in the document that uses them. Adding a font like above does not automatically embed it to a document, since the context does not have knowledge of documents. Instead, the context is updated with a seq of [font style] pairs that need to be embedded when a new document is created.

Note: Only TTF fonts are supported.

(defn add-font
  [desc font style context]
  (-> context
      (assoc-in [:fonts (keyword font) style :desc] desc)
      (update-in [:fonts-to-embed] #((fnil conj []) % [(keyword font) style]))))
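The effect on the context can be inspected with plain maps. In the sketch below add-font is copied from above so the example is self-contained; the file name and the :custom font name are invented.

```clojure
;; Copy of add-font from above, repeated so the sketch is self-contained.
(defn add-font
  [desc font style context]
  (-> context
      (assoc-in [:fonts (keyword font) style :desc] desc)
      (update-in [:fonts-to-embed] #((fnil conj []) % [(keyword font) style]))))

;; "fonts/custom-bold.ttf" and :custom are made-up names for illustration.
(def context-with-font
  (add-font "fonts/custom-bold.ttf" :custom #{:bold} {:fonts {} :templates {}}))

;; context-with-font =>
;; {:fonts          {:custom {#{:bold} {:desc "fonts/custom-bold.ttf"}}}
;;  :templates      {}
;;  :fonts-to-embed [[:custom #{:bold}]]}
```

The fnil wrapper is what lets the first call create the :fonts-to-embed vector when the key is still missing.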

On creation of a new document all fonts in the seq of fonts to embed should be embedded. If for some reason a font is found in the seq of fonts to embed but does not contain a descriptor, nothing happens and the context is returned unmodified. In practice this situation is highly unlikely, and the check is primarily in place to prevent unanticipated crashes (in case code external to pdf-stamper modified the context).

The font descriptor is coerced to an input stream and loaded into the document, after which it is automatically closed.

(defn embed-font
  [doc font style context]
  (if-let [font-desc (get-in context [:fonts font style :desc])]
    (assoc-in context [:fonts font style] (with-open [font (io/input-stream font-desc)]
                                            (PDTrueTypeFont/loadTTF doc font)))
    context))

When a font has been added to the context and embedded in a document, it can be queried by providing the font name and style.

It is guaranteed that a font is always found. Thus, if no font with the given name is registered the default font (Times New Roman) is used with the supplied style. If again no font is found, the default font and style are used (Times New Roman Regular).

(defn get-font
  [font-name style context]
  {:post [(instance? PDFont %)]}
  (get-in context [:fonts font-name style]
          (get-in context [:fonts :times style]
                  (get-in context [:fonts :times #{:regular}]))))
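The fallback chain can be traced with a toy context. In this sketch keywords stand in for PDFont instances and the postcondition is dropped; lookup-font and the :comic font name are illustrative only.

```clojure
;; Keywords stand in for the PDFont objects a real context would hold.
(def fonts
  {:times     {#{:regular} :times-regular
               #{:bold}    :times-bold}
   :helvetica {#{:regular} :helvetica-regular}})

(defn lookup-font
  "Same fallback chain as get-font, minus the PDFont postcondition."
  [font-name style context]
  (get-in context [:fonts font-name style]
          (get-in context [:fonts :times style]
                  (get-in context [:fonts :times #{:regular}]))))

(def ctx {:fonts fonts})

(lookup-font :helvetica #{:regular} ctx) ; => :helvetica-regular (exact match)
(lookup-font :helvetica #{:bold} ctx)    ; => :times-bold (Times, same style)
(lookup-font :comic #{:italic} ctx)      ; => :times-regular (Times Regular)
```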

Font utilities

The following utility functions rely on PDFBox's built-in font inspection methods. PDFBox returns font widths and heights in glyph-space units, which for these fonts are thousandths of the font size; this explains the otherwise seemingly arbitrary divisions by 1000.

Computing line lengths of unknown strings requires knowledge of the average width of a font, given style and size.

(defn get-average-font-width
  [font-name style size context]
  (let [font (get-font font-name style context)]
    (* (/ (.. font (getAverageFontWidth)) 1000) size)))
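The conversion is plain arithmetic and can be tried without PDFBox. The function name and the metric value of 500 units below are assumptions for illustration, not values read from a real font.

```clojure
(defn glyph-units->points
  "Convert a metric in thousandths of the font size to PDF points."
  [units size]
  (* (/ units 1000.0) size))

;; Hypothetical average glyph width of 500 units, set at 12 pt.
(def avg-width (glyph-units->points 500 12))
;; avg-width => 6.0

;; Rough estimate of how many characters fit on a 300 pt wide line.
(def chars-per-line (long (/ 300 avg-width)))
;; chars-per-line => 50
```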

With complete knowledge of the string it is possible to get the exact width of the string.

(defn get-font-string-width
  [font-name style size string context]
  (let [font (get-font font-name style context)]
    (* (/ (.. font (getStringWidth string)) 1000) size)))

(defn get-font-descent
  [font-name style size context]
  (let [font (get-font font-name style context)
        font-descriptor (.. font (getFontDescriptor))
        descent (.. font-descriptor (getDescent))]
    (* (/ (Math/abs descent) 1000) size)))

(defn get-font-ascent
  [font-name style size context]
  (let [font (get-font font-name style context)
        font-descriptor (.. font (getFontDescriptor))
        ascent (.. font-descriptor (getAscent))]
    (* (/ ascent 1000) size)))

By adding the absolute value of the font's descent to the font's ascent, we get the actual height of the font. We have to use the absolute value of the descent since it might be a negative value (it probably is, at least for FreeType fonts).

(defn get-font-height
  [font-name style size context]
  (let [font (get-font font-name style context)
        font-descriptor (.. font (getFontDescriptor))
        ascent (.. font-descriptor (getAscent))
        descent (.. font-descriptor (getDescent))]
    (* (/ (+ ascent (Math/abs descent)) 1000) size)))
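As a worked example of the height computation, the sketch below uses made-up metrics (ascent 750, descent -250) rather than values from an actual font descriptor.

```clojure
(defn font-height
  "Height in points, given ascent and descent in glyph units (1/1000 of the size)."
  [ascent descent size]
  (* (/ (+ ascent (Math/abs descent)) 1000.0) size))

;; Hypothetical metrics: ascent 750, descent -250, at 10 pt.
(def height (font-height 750 -250 10))
;; height => 10.0
```

Without the absolute value the negative descent would be subtracted, giving 5.0 instead of the full 10.0.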

The leading is the extra spacing from baseline to baseline, used for multi-line text.

(defn get-font-leading
  [font-name style size context]
  (let [font (get-font font-name style context)
        font-descriptor (.. font (getFontDescriptor))
        leading (.. font-descriptor (getLeading))]
    (* (/ leading 1000) size)))

Base context

The base context is a combination of the base fonts with the base templates, and simply provides a good starting point for adding custom fonts and your own templates.

(def base-context
  {:templates base-templates
   :fonts base-fonts})
 

Image holes

Holes where :type is :image. In addition to the common hole keys, image holes accept the optional keys :aspect and :quality.

(ns pdf-stamper.images
  (:import
    [org.apache.pdfbox.pdmodel PDDocument]
    [org.apache.pdfbox.pdmodel.edit PDPageContentStream]
    [org.apache.pdfbox.pdmodel.graphics.xobject PDJpeg PDXObjectImage PDPixelMap]))

To calculate the new dimensions for scaled images, the image is first scaled such that the height fits into the available bounds. If the resulting width is still larger than the bounds allow, the width is the tighter constraint, so the width is clamped to the bounds and the height is scaled down by the same ratio.

The arguments are passed as a map to provide some context to the four numbers, as it is otherwise too easy to mix up the parameters when applying this function.

Future: Going by the description above it should be possible to refactor this to compute both scaling factors (bounds divided by image size) up front, and simply use the smaller one.

(defn- scale-dimensions
  [{:keys [b-width b-height i-width i-height]}]
  (let [height-factor (/ b-height i-height)
        new-width (* i-width height-factor)]
    (if (> new-width b-width)
      (let [width-factor (/ b-width new-width)
            new-height (* b-height width-factor)]
        [b-width new-height])
      [new-width b-height])))
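A sketch of the refactoring hinted at in the "Future" note is shown below. scale-to-fit is a hypothetical name; the idea is to compute both factors as bounds divided by image size and apply the smaller one to both dimensions. With integer inputs Clojure returns exact ratios or big integers, which compare equal to the plain numbers shown.

```clojure
(defn scale-to-fit
  "Sketch of the refactoring: pick the smaller of the two scaling factors."
  [{:keys [b-width b-height i-width i-height]}]
  (let [factor (min (/ b-width i-width)
                    (/ b-height i-height))]
    [(* i-width factor) (* i-height factor)]))

;; A 400x100 image in a 100x50 box is limited by its width (factor 1/4).
(def wide (scale-to-fit {:b-width 100 :b-height 50 :i-width 400 :i-height 100}))
;; (= wide [100 25]) => true

;; A 100x200 image in the same box is limited by its height (factor 1/4).
(def tall (scale-to-fit {:b-width 100 :b-height 50 :i-width 100 :i-height 200}))
;; (= tall [25 50]) => true
```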

Stamping an image onto the PDF's content stream while still preserving aspect ratio potentially requires moving the image's origin. new-x and new-y move the image origin by half the scaled image's delta width and height, which centers the image in the hole.

(defn- draw-image-preserve-aspect
  [c-stream image data]
  (let [{:keys [x y width height]} data
        awt-image (.. image (getRGBImage))
        img-height (.. awt-image (getHeight))
        img-width (.. awt-image (getWidth))
        [scaled-width scaled-height] (scale-dimensions {:b-width width
                                                        :b-height height
                                                        :i-width img-width
                                                        :i-height img-height})
        new-x (+ x (Math/abs (/ (- width scaled-width) 2)))
        new-y (+ y (Math/abs (/ (- height scaled-height) 2)))]
    (.. c-stream (drawXObject image new-x new-y scaled-width scaled-height))))

Stamping an image onto the PDF's content stream without preserving aspect ratio is much simpler: PDFBox resizes the image to fill the entire box specified by width and height, potentially skewing the image.

(defn- draw-image
  [c-stream image data]
  (let [{:keys [x y width height]} data]
    (.. c-stream (drawXObject image x y width height))))

When stamping an image, the image is always shrunk to fit the dimensions of the hole. The value of the :aspect key in data defines whether the image is shrunk to fit, or aspect ratio is preserved. Possible values are :fit or :preserve, with :preserve being the default.

It is possible to specify the quality of the stamped image by setting the :quality key to a value between 0.0 and 1.0. The default quality if not specified is 0.75.

Note: Using PDJpeg does not cancel out support for PNGs. It seems that the PNGs are internally converted to JPEGs (**TO BE CONFIRMED**).

(defn fill-image
  [document c-stream data context]
  (let [aspect-ratio (get data :aspect :preserve)
        image-quality (get data :quality 0.75)
        contents (get-in data [:contents :image])]
    ;; Check the contents before constructing the PDJpeg: a constructor call
    ;; never returns nil, so asserting on its result cannot catch a missing image.
    (assert contents "Image must be present in hole contents.")
    (let [image (PDJpeg. document contents image-quality)]
      (condp = aspect-ratio
        :preserve (draw-image-preserve-aspect c-stream image data)
        :fit (draw-image c-stream image data)))))
 

User input to pdf-stamper is validated using the schema library from Prismatic.

(ns pdf-stamper.schemas
  (:require
    [schema.core :as s]))

(def BaseHole
  {:height s/Num
   :width s/Num
   :x s/Num
   :y s/Num
   :name s/Keyword
   :priority s/Int})

(def ImageHole
  (merge BaseHole
         {:type (s/enum :image)
          (s/optional-key :quality) s/Num
          (s/optional-key :aspect) (s/enum :preserve :fit)}))

(def ParagraphFormat
  {:font s/Keyword
   :style #{s/Keyword}
   :size s/Int
   :color [(s/one s/Int "R") (s/one s/Int "G") (s/one s/Int "B")]
   :spacing {:paragraph {:above s/Num
                         :below s/Num}
             :line {:above s/Num
                    :below s/Num}}
   :indent {:all s/Num}})

(def BulletParagraphFormat
  (merge ParagraphFormat
         {(s/optional-key :bullet-char) s/Str}))

(def TextHole
  (merge BaseHole
         {:format ParagraphFormat
          :type (s/enum :text)
          :align {:horizontal (s/enum :center :left :right)
                  :vertical (s/enum :center :top :bottom)}}))

(def TextParsedHole
  (merge BaseHole
         {:type (s/enum :text-parsed)
          :format {:paragraph ParagraphFormat
                   :head-1 ParagraphFormat
                   :head-2 ParagraphFormat
                   :head-3 ParagraphFormat
                   :bullet BulletParagraphFormat
                   :number BulletParagraphFormat}}))

(def Hole
  (s/conditional
    #(= :image (:type %)) ImageHole
    #(= :text (:type %)) TextHole
    #(= :text-parsed (:type %)) TextParsedHole
    'has-valid-type-key))

(def hole-checker (s/checker Hole))

Returns true if v is a valid hole, false otherwise.

If error-fn is supplied, calls that function with the error message. The return value of error-fn is discarded.

(defn valid-hole?
  ([v]
   (not (hole-checker v)))
  ([v error-fn]
   {:pre [(fn? error-fn)]}
   (if-let [err (hole-checker v)]
     (do
       (error-fn (merge v (if (map? err)
                            err
                            {:error err})))
       false)
     true)))

(def Template
  {:name s/Keyword
   (s/optional-key :overflow) s/Keyword
   :holes [Hole]})

(defn validation-errors
  [template]
  (s/check Template template))
 

In some situations, templates in pdf-stamper can become difficult to maintain. One such situation can occur when you have a number of template parts that combine with each other to form the final templates. If the template parts form semantic "layers", and each part of a layer needs to be combined with all parts of the following layer, we get an exponential explosion in the number of templates. Since there is a direction in the way parts are combined the final templates can be described by a number of trees, where each path from a leaf to the root describes one template.

To avoid having to write an exponential number of template descriptions by hand, this namespace provides utilities that allow you to specify how the semantic layers relate to each other.

(ns pdf-stamper.template-utils
  (:require
    [pdf-stamper.schemas :as schemas]
    ;; replace-holes calls clojure.string/replace, so load the namespace explicitly.
    [clojure.string]
    [clojure.zip :as zip]))

The zipper

Since we are looking at trees, we use clojure.zip as an efficient way to manipulate the trees. We need some helpers to make the zippers easier to work with in this context.

First, we define what a zipper of template parts is.

Create a zipper of parts. This is basically a tree where nodes carry values. Every node is potentially a branch, i.e. leaf nodes are just nodes without children.

(defn- parts-zip
  [root]
  (zip/zipper
    (constantly true)
    ::children
    (fn [node children]
      (assoc node ::children children))
    root))

Subtrees

Since we cannot rely on a regular depth-first traversal of the trees when inserting new parts, we define ways to travel around subtrees.

We always add to leaves, and the final templates are constructed from the leaf paths, so an easy way to access leaves is needed.

Go to the left-most leaf in a given subtree.

(defn- to-leaf
  [tree]
  (loop [loc tree]
    (if (zip/down loc)
      (recur (zip/down loc))
      loc)))
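The zipper helpers can be tried on a small tree with only clojure.zip. The sketch below re-creates parts-zip and to-leaf with plain :value and :children keys instead of the namespaced ones; the tree contents are invented.

```clojure
(require '[clojure.zip :as zip])

;; Same construction as parts-zip, using plain :value/:children keys.
(defn parts-zip
  [root]
  (zip/zipper (constantly true)
              :children
              (fn [node children] (assoc node :children children))
              root))

(defn to-leaf
  [tree]
  (loop [loc tree]
    (if (zip/down loc)
      (recur (zip/down loc))
      loc)))

;; A small tree: root -> a -> (a1, a2), root -> b
(def tree
  (parts-zip {:value :root
              :children [{:value :a
                          :children [{:value :a1} {:value :a2}]}
                         {:value :b}]}))

(def left-most (to-leaf tree))
(:value (zip/node left-most))              ; => :a1
(map :value (zip/path left-most))          ; => (:root :a)
(:value (zip/node (zip/right left-most)))  ; => :a2
```

zip/right only reaches :a2 from :a1; getting from :a2 to :b requires walking up first, which is the gap the subtree navigation fills.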

Since leaves can be spread over several subtrees, and clojure.zip's left and right operations only travel to siblings, some way to find the next subtree that contains leaves is needed.

Returns the loc of the next subtree to insert parts in, or nil if there is none.

(defn- next-subtree
  [loc]
  (loop [parent loc]
    (if (zip/right parent)
      (zip/right parent)
      (when (zip/prev parent)
        (recur (zip/up parent))))))

Add the parts to all leaves of tree.

(defn- add-to-all-leaves
  [tree part & parts]
  ((comp parts-zip zip/root)
    (loop [leaf (to-leaf tree)]
      (let [new-node (reduce (fn [node p]
                               (zip/append-child node p))
                             (zip/append-child leaf part)
                             parts)]
        (if (zip/right new-node)
          (recur (zip/right new-node))
          (if-let [next-subtree (next-subtree new-node)]
            (recur (to-leaf next-subtree))
            new-node))))))

Parts

Parts are separated into variadic and non-variadic parts. Non-variadic parts are just regular maps and will be merged into the final template as-is.

(defn- add-non-variadic-part
  [trees part]
  (if (seq trees)
    (map (fn [tree]
           (add-to-all-leaves tree {::value part}))
         trees)
    (conj trees (parts-zip {::value part}))))

Variadic parts are parts that in the final template can be one of several values. The structure of a variadic part is as follows:

```clojure
{::name "part"
 ::optional? truthy
 ::variants [{::variant-name "flower" ::variant-part {...}}
             {::variant-name "roots" ::variant-part {...}}]}
```

Before a variadic part is inserted into the trees, some metadata is added to it. This metadata allows the construction of the final template name from the leaf-root path.

(defn- variadic-part?
  [part]
  (contains? part ::name))

Construct the final variadic parts, adding metadata fields for template construction.

(defn- variadic-parts
  [part-name variants optional?]
  (let [parts (map (fn [variant]
                     {::value (with-meta
                                (::variant-part variant)
                                {::name part-name
                                 ::part-name (::variant-name variant)})})
                   variants)]
    (if optional?
      (conj parts {::value (with-meta {} {::name part-name ::part-name ""})})
      parts)))
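The role of the metadata becomes clearer when inspecting the result. This sketch repeats variadic-parts with plain keywords in place of the namespaced keys; the variant contents ({:petals 5}, {:depth 2}) are placeholders.

```clojure
;; Plain keywords replace the namespaced ::keys to keep the sketch short.
(defn variadic-parts
  [part-name variants optional?]
  (let [parts (map (fn [variant]
                     {:value (with-meta (:variant-part variant)
                               {:name part-name
                                :part-name (:variant-name variant)})})
                   variants)]
    (if optional?
      (conj parts {:value (with-meta {} {:name part-name :part-name ""})})
      parts)))

(def parts
  (variadic-parts "part"
                  [{:variant-name "flower" :variant-part {:petals 5}}
                   {:variant-name "roots"  :variant-part {:depth 2}}]
                  true))

;; The optional (empty) variant comes first because conj prepends to a seq.
(map (comp :part-name meta :value) parts) ; => ("" "flower" "roots")
(map :value parts)                        ; => ({} {:petals 5} {:depth 2})
```

Keeping the names in metadata leaves the part values themselves untouched, so they can be merged into the final template without stripping bookkeeping keys.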

Make a variadic part a child to all leaves in all trees. Creates a new tree if there are none.

(defn- add-variadic-part
  [trees part]
  (if (seq trees)
    (map (fn [tree]
           (apply add-to-all-leaves tree (variadic-parts
                                           (::name part)
                                           (::variants part)
                                           (::optional? part))))
         trees)
    (apply conj trees (map parts-zip (variadic-parts
                                       (::name part)
                                       (::variants part)
                                       (::optional? part))))))

(comment
  (add-variadic-part [] {::name "foo"
                         ::optional? false
                         ::variants [{::variant-name "a" ::variant-part {:a 1}}
                                     {::variant-name "b" ::variant-part {:b 1}}]})
  
  (add-variadic-part (add-non-variadic-part [] {:nv 1})
                     {::name "foo"
                      ::optional? false
                      ::variants [{::variant-name "a" ::variant-part {:a 1}}
                                  {::variant-name "b" ::variant-part {:b 1}}]})
  
  (add-variadic-part (add-non-variadic-part [] {:nv 1})
                     {::name "foo"
                      ::optional? true
                      ::variants [{::variant-name "a" ::variant-part {:a 1}}
                                  {::variant-name "b" ::variant-part {:b 1}}]})
  
  (add-variadic-part (add-variadic-part
                       (add-non-variadic-part [] {:nv 1})
                       {::name "foo"
                        ::optional? true
                        ::variants [{::variant-name "a" ::variant-part {:a 1}}
                                    {::variant-name "b" ::variant-part {:b 1}}]})
                     {::name "bar"
                      ::optional? false
                      ::variants [{::variant-name "d" ::variant-part {:d 1}}
                                  {::variant-name "e" ::variant-part {:e 1}}]}))

(defn- add-part
  [trees part]
  (if (variadic-part? part)
    (add-variadic-part trees part)
    (add-non-variadic-part trees part)))

Paths

The paths from leaf to root describe the final templates by merging the value at each node. Values closer to the leaves overwrite values closer to the root in case of conflicts.

Construct a vector of root-leaf paths for tree. The paths contain only the node values.

(defn tree-paths
  [tree]
  (let [root->leafs (loop [paths []
                           leaf (to-leaf tree)]
                      (let [leaf-path (mapv ::value (zip/path leaf))
                            leaf-value (::value (zip/node leaf))
                            full-path (conj leaf-path leaf-value)]
                        (if (zip/right leaf)
                          (recur (conj paths full-path)
                                 (zip/right leaf))
                          (if-let [next-subtree (next-subtree leaf)]
                            (recur (conj paths full-path)
                                   (to-leaf next-subtree))
                            (conj paths full-path)))))]
    root->leafs))

Construct a seq of all root-leaf paths from all trees.

(defn- all-paths
  [trees]
  (mapcat tree-paths trees))

Building templates

The templates have a naming scheme with holes, which lets variadic parts influence the final template name.

(defn- replace-holes
  [name-with-holes hole-name part-name]
  (if (and hole-name part-name)
    (clojure.string/replace
      name-with-holes
      (re-pattern (str "\\$" hole-name "\\$"))
      part-name)
    name-with-holes))
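The substitution can be exercised on its own. The variant below additionally quotes the hole name with java.util.regex.Pattern/quote so that regex metacharacters in it are matched literally; that quoting is an addition for illustration and not part of the original function.

```clojure
(require '[clojure.string :as string])

(defn replace-holes
  "Like replace-holes above, but quotes hole-name so regex metacharacters
  in it are matched literally (an addition not present in the original)."
  [name-with-holes hole-name part-name]
  (if (and hole-name part-name)
    (string/replace name-with-holes
                    (re-pattern (str "\\$" (java.util.regex.Pattern/quote hole-name) "\\$"))
                    part-name)
    name-with-holes))

(replace-holes "rhubarb$part$" "part" "flower") ; => "rhubarbflower"
(replace-holes "rhubarb$part$" "part" "")       ; => "rhubarb"
(replace-holes "rhubarb$part$" nil nil)         ; => "rhubarb$part$"
```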

Merges a seq of hole bases into a single seq of holes. A [hole base] is a vector of maps. The result is a single vector of maps.

(defn- merge-hole-bases
  [hole-bases & {:keys [?merge-fn ?validation-error-fn]}]
  (let [valid-hole? (if ?validation-error-fn
                      #(schemas/valid-hole? % ?validation-error-fn)
                      schemas/valid-hole?)]
    (into []
          (filter valid-hole?
                  (map (partial apply (or ?merge-fn merge))
                       (vals
                         (group-by :name
                                   (flatten hole-bases))))))))
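The core of the merge, without validation, amounts to grouping by :name and merging each group in order. The hole bases below are invented and deliberately incomplete, so they would not pass the schema check performed by the real function.

```clojure
;; Two hypothetical hole bases: a shared base and a variant that overrides it.
(def hole-bases
  [[{:name :title :x 50 :y 700 :width 400}
    {:name :body  :x 50 :y 100 :width 400}]
   [{:name :title :y 650 :height 40}]])

(def merged
  (into []
        (map (partial apply merge)
             (vals (group-by :name (flatten hole-bases))))))

;; (set merged) =>
;; #{{:name :title :x 50 :y 650 :width 400 :height 40}
;;   {:name :body  :x 50 :y 100 :width 400}}
```

Later bases win on conflicting keys (:y is 650, not 700), which matches the rule that values closer to the leaves overwrite values closer to the root.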

Build a template from a naming scheme and a leaf-root path. Takes an optional merge function used when merging two templates.

(defn path-to-template
  [naming-scheme path & {:keys [?merge-fn ?validation-error-fn]}]
  (let [unmerged-holes (reduce (fn [template template-part]
                                 (let [metadata (meta template-part)
                                       part-name (::part-name metadata)
                                       scheme-value (::name metadata)]
                                   (-> template
                                       (update-in [:holes] conj template-part)
                                       (update-in [:name] replace-holes scheme-value part-name))))
                               {:holes []
                                :name naming-scheme}
                               path)
        validation-error-fn (when ?validation-error-fn
                              (partial ?validation-error-fn (:name unmerged-holes)))]
    (-> unmerged-holes
        (update-in [:holes] #(merge-hole-bases % :?merge-fn ?merge-fn :?validation-error-fn validation-error-fn))
        (update-in [:name] keyword))))

(defn parts->trees
  [parts]
  (reduce (fn [trees part]
            (add-part trees part))
          []
          parts))

Naming scheme is a keyword with "holes" defined by $hole-name$. Example naming scheme:

:rhubarb$part$

Values between $'s are matched to the :name of individual parts and replaced as needed. Example with the above naming scheme:

parts = [{:pdf-stamper.template-utils/name "part"
          :pdf-stamper.template-utils/optional? true
          :pdf-stamper.template-utils/variants
          [{:pdf-stamper.template-utils/variant-name "flower"
            :pdf-stamper.template-utils/variant-part ...}
           {:pdf-stamper.template-utils/variant-name "roots"
            :pdf-stamper.template-utils/variant-part ...}]}]

would yield templates with the names:

[:rhubarbflower :rhubarbroots]

The appropriate template parts are merged together in the order they are specified in the parts vector. Each variant part is a vector of hole parts.

Returns a vector of