Reproducible, Archivable, and Accessible PDFs with Modern LaTeX
With modern LaTeX we can create documents that are well suited for long-term archiving and include accessibility features. When combined with Nix dev environments, we can create fully reproducible builds—two builds of the same sources produce byte-identical PDFs.
Reproducibility has two parts: (1) pin the toolchain so the same LaTeX stack is used every time, and (2) remove the remaining nondeterminism inside the PDF output (timestamps, random IDs, and metadata).
Below is a minimal template that passes veraPDF’s PDF/A-4f and PDF/UA-2 checks with a reproducible build using a Nix flake featuring
- LuaLaTeX as engine
- A recent TeXLive distribution (last: 2025-03-09)
- veraPDF for validation
Creating PDF/A-4f and PDF/UA-2 compliant documents
We target two standards: PDF/A-4f for long-term archival (the “f” variant allows embedded files), and PDF/UA-2 for accessibility (universal accessibility, enabling screen readers and assistive technologies).
The key command is \DocumentMetadata provided by the LaTeX kernel.
The new PDF management engine is in active development, so modifications and changes are expected.
The current documentation is available here: The documentmetadata-support code.
For a PDF 2.0 conforming to the standards A-4f and UA-2, we use
\DocumentMetadata{
lang = en-US,
pdfversion = 2.0,
pdfstandard = A-4f,
pdfstandard = UA-2,
tagging-setup = { role/user-NS = latex }
}
Note: pdfstandard=… declares conformance in metadata; it doesn’t automatically enforce/validate all requirements—especially for PDF/UA-2, where tagging and correct structure are essential.
To create a tagged PDF, we also need the tagpdf package.
\usepackage{tagpdf}
\tagpdfsetup{activate, tabsorder=structure}
Note: tagging support is still evolving in LaTeX, as part of the “Tagged PDF” project, and only a few document classes are compatible. You can find the current status here: Tagging Status of LaTeX Packages and Classes.
Making the build reproducible
Our goal is to make the build reproducible so that the produced document is a function of the input and any two builds produce byte-identical output.
To this end, we set up the toolchain with Nix, then remove remaining per-build nondeterminism inside LuaLaTeX, hyperref, and tagpdf.
The Nix flake
Compiling the same document with a different TeXLive release or even a different package version will most likely lead to a slightly different document.
With Nix flakes we can easily create a fixed build environment that bundles a specific version of TeXLive. This also allows us to only include the packages we actually need.
{
description = "Minimal LaTeX environment";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
flake-utils.url = "github:numtide/flake-utils";
};
outputs = { self, nixpkgs, flake-utils }:
flake-utils.lib.eachDefaultSystem (system:
let
pkgs = import nixpkgs { inherit system; };
tex = pkgs.texlive.combine {
inherit (pkgs.texlive)
scheme-small
lualatex-math
luamml
tagpdf
pdfmanagement-testphase;
};
src = pkgs.lib.cleanSourceWith {
src = ./.;
filter = path: type:
let
rel = pkgs.lib.removePrefix (toString ./. + "/") (toString path);
ignored = [ ".cache" ".direnv" "result" ];
in
!builtins.elem rel ignored
&& !builtins.any (d: pkgs.lib.hasPrefix "${d}/" rel) ignored;
};
version = if self ? shortRev then self.shortRev else "dirty";
in
{
devShells.default = pkgs.mkShell {
packages = [
tex
pkgs.verapdf
];
# Match the nix build environment for reproducibility
SOURCE_DATE_EPOCH = 0;
# TZ as mkShell attribute doesn't work on macOS, use shellHook
shellHook = ''
export TZ=UTC
'';
};
packages.default = pkgs.stdenvNoCC.mkDerivation {
pname = "main";
inherit version src;
nativeBuildInputs = [
tex
];
TZ = "UTC";
# LuaLaTeX's luaotfload needs writable cache directories
TEXMFHOME = "./texmf";
TEXMFVAR = "./texmf-var";
# stdenv auto-detects SOURCE_DATE_EPOCH from source file timestamps,
# override it in preBuild to ensure reproducibility with epoch 0
preBuild = ''
export SOURCE_DATE_EPOCH=0
'';
buildPhase = ''
runHook preBuild
lualatex main.tex
runHook postBuild
'';
installPhase = ''
runHook preInstall
mkdir -p $out
cp main.pdf $out/main.pdf
runHook postInstall
'';
};
});
}
You can enter the dev environment via nix develop.
Making LaTeX deterministic
Modern LaTeX still introduces a few nondeterministic bits by default, which we fix one-by-one.
Fix the SOURCE_DATE_EPOCH and timezone
By default, LaTeX includes the creation date into the file.
We therefore need to set SOURCE_DATE_EPOCH to a constant value and TZ=UTC to ensure consistent timezone handling in timestamps.
TZ=UTC SOURCE_DATE_EPOCH=0
Fix the seed of the RNG
The tagpdf package used to create a tagged PDF generates random IDs.
Hence, to be deterministic, we need to fix the seed of the RNG.
\UseName{sys_gset_rand_seed:n}{20260130}
Suppress LuaTeX dates
LuaTeX adds dynamic timestamps for the creation and modification dates to the PDF info dict. For reproducibility, we need to suppress these.
% 32=CreationDate, 64=ModDate
\pdfvariable suppressoptionalinfo \numexpr32+64\relax
Fix the trailer ID
LuaTeX generates unique trailer IDs per build. For reproducibility, we set the ID to a constant value.
\pdfvariable trailerid{[<00112233445566778899aabbccddeeff><ffeeddccbbaa99887766554433221100>]}
Fix hyperref’s PDF metadata
Hyperref also includes dynamic dates and IDs that we need to set to constant values for reproducibility.
\hypersetup{
pdfcreationdate={D:20260130000000Z},
pdfmoddate={D:20260130000000Z},
pdfmetadate={D:20260130000000Z},
pdfdocumentid={uuid:00112233-4455-6677-8899-aabbccddeeff},
pdfinstanceid={uuid:ffeeddcc-bbaa-9988-7766-554433221100}
}
The result
Combining everything together, we obtain a minimal template for creating reproducible and standard-compliant documents.
You can change \Taggingtrue to \Taggingfalse to create an untagged PDF.
% Tagging toggle (set true to enable PDF/UA tagging)
\newif\ifTagging
\Taggingtrue
\ifTagging
% Reproducible RNG seed (tagpdf user namespace and any other random use)
\UseName{sys_gset_rand_seed:n}{20260130}
% Create a tagged PDF/A-4 and UA-2 compliant document.
\DocumentMetadata{
lang = en-US,
pdfversion = 2.0,
pdfstandard = A-4f,
pdfstandard = UA-2,
tagging-setup = { role/user-NS = latex }
}
\else
% Create an untagged PDF/A-4f compliant document.
\DocumentMetadata{
lang = en-US,
pdfversion = 2.0,
pdfstandard = A-4f
}
\fi
\documentclass{article}
\usepackage{hyperref}
\ifTagging
\usepackage{tagpdf}
\tagpdfsetup{activate, tabsorder=structure}
\fi
% Required for enabling MathML generation.
\usepackage{unicode-math}
% Suppress timestamps for reproducible builds (LuaTeX only).
% 32=CreationDate, 64=ModDate
\pdfvariable suppressoptionalinfo \numexpr32+64\relax
% Hardcoded PDF trailer ID for reproducible builds (required for PDF/A).
\pdfvariable trailerid{[<00112233445566778899aabbccddeeff><ffeeddccbbaa99887766554433221100>]}
% Static dates for reproducibility.
\hypersetup{
pdfcreationdate={D:20260130000000Z},
pdfmoddate={D:20260130000000Z},
pdfmetadate={D:20260130000000Z},
pdfdocumentid={uuid:00112233-4455-6677-8899-aabbccddeeff},
pdfinstanceid={uuid:ffeeddcc-bbaa-9988-7766-554433221100}
}
\begin{document}
This is a reproducible and standard-compliant document.
\end{document}
You can build the document either via nix build producing the output in the result directory or directly with
TZ=UTC SOURCE_DATE_EPOCH=0 lualatex main.tex
In either case, you should have an identical PDF file.
$ shasum -a 256 main.pdf result/main.pdf
32e4e384a97816dd0798d2320c98992472df0eef394a5e45425d65e335181325 main.pdf
32e4e384a97816dd0798d2320c98992472df0eef394a5e45425d65e335181325 result/main.pdf
Note that the digest will differ if you use a different version of nixpkgs.
Verifying standard compliance
We can verify that our document passes the validation check with veraPDF.
$ verapdf --format text main.pdf
PASS /path/to/main.pdf 4f
PASS /path/to/main.pdf ua2
PASS /path/to/main.pdf wt1a
PASS /path/to/main.pdf wt1r
For more information and possible configuration options, see the veraPDF CLI Validation page.
Alternatively, Adobe Acrobat Pro’s Preflight tool can also validate documents.
With this setup, you have a solid foundation for creating long-term archivable, accessible, and fully reproducible PDF documents.