Mr.Docs as a Library

A C++ program can also link the mrdocs-core library, drive corpus extraction itself, and pass the resulting Corpus to its own analysis. Use this approach when the scripting extensions are not powerful enough: for example, when processing multiple codebases at once, embedding the Mr.Docs extractor in another C++ host (an IDE plugin or a static analyzer), or writing a custom Generator with non-trivial C++ logic that depends on special system libraries.

Build integration

mrdocs-core is exported through a CMake package config:

find_package(mrdocs REQUIRED CONFIG)
add_executable(my_tool main.cpp)
target_link_libraries(my_tool PRIVATE mrdocs::mrdocs-core)
target_compile_features(my_tool PRIVATE cxx_std_23)

Building a corpus

The Library Reference for mrdocs-core covers the public API for extracting and processing a corpus. A consumer typically drives extraction in two steps:

  1. Read an mrdocs.yml into a fully validated Config.

  2. Call Corpus::build with the resulting config.

The compilation database is resolved from the configuration with the same rules Mr.Docs uses. LLVM and Clang are never exposed to the caller.

The breaking-changes example packages those steps in one helper:

examples/library/breaking-changes/src/Corpus.cpp
Expected<std::unique_ptr<Corpus>>
loadCorpusFromConfig(
    std::string const& configPath,
    ReferenceDirectories const& dirs,
    ThreadPool& threadPool)
{
    Config::Settings settings;
    ReferenceDirectories localDirs = dirs;
    localDirs.cwd = files::getParentDir(configPath);
    MRDOCS_TRY(Config::Settings::load_file(settings, configPath, localDirs));
    MRDOCS_TRY(settings.normalize(localDirs));

    MRDOCS_TRY(auto config, Config::load(settings, localDirs, threadPool));
    return Corpus::build(config);
}
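
The helper relies on early-return error propagation: each step either yields a value or hands its error back to the caller. The sketch below illustrates that control flow with stand-in types only. `Result`, `ok`, `fail`, `Settings`, `loadSettings`, `buildCorpus`, and `loadCorpus` are hypothetical names, not part of mrdocs-core, and the assumption that MRDOCS_TRY returns early on failure is inferred from how it is used above.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for an Expected-like type:
// either a value or an error message.
template <class T>
struct Result
{
    std::optional<T> value;
    std::string error;

    explicit operator bool() const { return value.has_value(); }
};

template <class T>
Result<T> ok(T v) { return {std::move(v), {}}; }

template <class T>
Result<T> fail(std::string msg) { return {std::nullopt, std::move(msg)}; }

// Hypothetical step 1: turn a config path into validated settings.
struct Settings { std::string path; };

Result<Settings> loadSettings(std::string const& path)
{
    if (path.empty())
        return fail<Settings>("empty config path");
    return ok(Settings{path});
}

// Hypothetical step 2: build a corpus from the settings.
struct Corpus { std::string source; };

Result<Corpus> buildCorpus(Settings const& s)
{
    return ok(Corpus{s.path});
}

// The two steps chained: a failure in step 1 is returned
// immediately, and step 2 never runs.
Result<Corpus> loadCorpus(std::string const& path)
{
    auto settings = loadSettings(path);
    if (!settings)
        return fail<Corpus>(std::move(settings.error));
    return buildCorpus(*settings.value);
}
```

With this shape, a caller checks one result at the end instead of checking after every step; the first failing step decides the error that is reported.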

Reading a corpus

A Corpus is iterable and looks up symbols by name. The breaking-changes example walks the baseline corpus once, asks the candidate corpus for the matching symbol with Corpus::lookup, and classifies each pair:

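  • Symbols in v1 that are missing from v2 are removed;

  • symbols in v2 that are missing from v1 are added;

  • pairs with the same name but a different kind are reported as removed and added;

  • pairs with the same name and the same kind go to a kind-specific comparator.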

examples/library/breaking-changes/src/Diff.cpp
DiffResult
diff(Corpus const& v1, Corpus const& v2)
{
    DiffResult result;

    // For every public symbol in v1, see what happened in v2.
    for (Symbol const& a : v1)
    {
        if (!isPublicTopLevel(a))
        {
            continue;
        }
        std::string name = v1.qualifiedName(a);
        auto found = v2.lookup(name);
        if (!found)
        {
            result.removed.emplace_back(std::move(name), &a);
            continue;
        }
        Symbol const& b = *found;

        if (a.Kind != b.Kind)
        {
            // A function turned into a record, or similar.
            // Treat it as removed + added rather than changed.
            result.added.emplace_back(name, &b);
            result.removed.emplace_back(std::move(name), &a);
            continue;
        }

        if (a.isFunction())
        {
            auto reasons = compareFunctions(a.asFunction(), b.asFunction());
            if (!reasons.empty())
            {
                result.changed.push_back(
                    {std::move(name), &a, &b, std::move(reasons)});
            }
        }
    }

    // Anything public in v2 that wasn't in v1 is a new addition.
    for (Symbol const& b : v2)
    {
        if (!isPublicTopLevel(b))
        {
            continue;
        }
        std::string name = v2.qualifiedName(b);
        if (!v1.lookup(name))
        {
            result.added.emplace_back(std::move(name), &b);
        }
    }

    if (!result.removed.empty() || !result.changed.empty())
    {
        result.impact = SemverImpact::Major;
    }
    else if (!result.added.empty())
    {
        result.impact = SemverImpact::Minor;
    }
    return result;
}
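
The kind-specific comparator is not shown in this section. As a rough illustration of what such a comparator might check, the sketch below compares two simplified function signatures and returns human-readable reasons. `FunctionSig`, its fields, and `compareSignatures` are stand-ins invented here, not the mrdocs-core function symbol type or the example's `compareFunctions`. Apart from "noexcept differs", which appears in the example's expected report, the reason strings are assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified view of a function declaration.
struct FunctionSig
{
    std::string returnType;
    std::vector<std::string> paramTypes;
    bool isNoexcept = false;
    bool isConst = false;
};

// Returns one reason per observable difference between the
// baseline (a) and the candidate (b). An empty vector means the
// two signatures are considered compatible by this sketch.
std::vector<std::string>
compareSignatures(FunctionSig const& a, FunctionSig const& b)
{
    std::vector<std::string> reasons;
    if (a.returnType != b.returnType)
        reasons.push_back("return type differs");
    if (a.paramTypes != b.paramTypes)
        reasons.push_back("parameters differ");
    if (a.isNoexcept != b.isNoexcept)
        reasons.push_back("noexcept differs");
    if (a.isConst != b.isConst)
        reasons.push_back("const qualification differs");
    return reasons;
}
```

Returning a list of reasons rather than a single boolean lets the caller decide which differences count as breaking and lets the report explain each change.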

Custom Generators

You can subclass Generator to create new output formats that need to do something script generators cannot express, such as comparing multiple corpora, running multi-pass analyses, emitting binary formats, or depending on system libraries. Generator::build writes the output to disk for the multi-file case, and Generator::buildOne is used for single-page output.

The example’s BreakingChangesGenerator holds a reference to the baseline corpus and runs the diff against whatever candidate corpus Generator::buildOne is invoked with:

examples/library/breaking-changes/src/BreakingChangesGenerator.hpp
// A Generator subclass that emits a breaking-change report when its
// `buildOne(stream, current)` is invoked. The baseline corpus is
// captured at construction; the candidate corpus is supplied per
// call. Registering this generator into the process-global registry
// under the id "breaking-changes" lets a caller look it up the same
// way they would any built-in generator.
class BreakingChangesGenerator final : public Generator
{
public:
    explicit BreakingChangesGenerator(Corpus const& baseline)
        : baseline_(&baseline)
    {}

    std::string_view id()           const noexcept override;
    std::string_view displayName()  const noexcept override;
    std::string_view fileExtension()const noexcept override;

    Expected<void>
    buildOne(std::ostream& os, Corpus const& current) const override;

private:
    Corpus const* baseline_;
};

The body of BreakingChangesGenerator::buildOne is short; the heavy lifting is done by the diff and the formatter:

examples/library/breaking-changes/src/BreakingChangesGenerator.cpp
Expected<void>
BreakingChangesGenerator::
buildOne(std::ostream& os, Corpus const& current) const
{
    writeReport(os, diff(*baseline_, current));
    return {};
}
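
Because buildOne writes to a std::ostream, a caller can direct the report to a file, to standard output, or to an in-memory buffer. The sketch below shows that shape using stand-in types only: `MiniCorpus`, `MiniGenerator`, `CountDiffGenerator`, and `renderToString` are hypothetical and far simpler than the real Corpus and Generator. It also makes one lifetime point explicit: the generator stores a pointer to the baseline, so the baseline must outlive the generator.

```cpp
#include <cstddef>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for a corpus: just a set of symbol names.
struct MiniCorpus { std::set<std::string> names; };

// Hypothetical stand-in for the generator interface.
class MiniGenerator
{
public:
    virtual ~MiniGenerator() = default;
    virtual bool buildOne(std::ostream& os, MiniCorpus const& current) const = 0;
};

// Captures the baseline at construction; the candidate is
// supplied per call. The baseline must outlive this object.
class CountDiffGenerator final : public MiniGenerator
{
public:
    explicit CountDiffGenerator(MiniCorpus const& baseline)
        : baseline_(&baseline)
    {}

    bool buildOne(std::ostream& os, MiniCorpus const& current) const override
    {
        std::size_t removed = 0;
        for (auto const& n : baseline_->names)
            if (current.names.count(n) == 0)
                ++removed;
        std::size_t added = 0;
        for (auto const& n : current.names)
            if (baseline_->names.count(n) == 0)
                ++added;
        os << "added=" << added << " removed=" << removed << '\n';
        return static_cast<bool>(os); // false if the stream failed
    }

private:
    MiniCorpus const* baseline_;
};

// Renders through the base interface into an in-memory buffer.
std::string renderToString(MiniGenerator const& gen, MiniCorpus const& current)
{
    std::ostringstream out;
    gen.buildOne(out, current);
    return out.str();
}
```

Writing to a stream instead of a path keeps the generator easy to test: the same object can be run against several candidate corpora and its output compared as plain strings.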

The example ships two test fixtures, each its own directory with an mrdocs.yml and a single lib.cpp that stands in for a small library:

  • v1

  • v2

examples/library/breaking-changes/test/v1/lib.cpp
/// A point on the plane.
struct Point
{
    /// X coordinate.
    int x;

    /// Y coordinate.
    int y;

    /** Distance from the origin.

        @return The Euclidean length of the vector.
    */
    double length() const;
};

/// Compute the area of a circle.
double area_of_circle(double radius);

/// Helper that should be removed in v2.
int legacy_helper();

examples/library/breaking-changes/test/v2/lib.cpp
/// A point on the plane.
struct Point
{
    /// X coordinate.
    int x;

    /// Y coordinate.
    int y;

    /** Distance from the origin.

        @return The Euclidean length of the vector.
    */
    double length() const noexcept;
};

/// Compute the area of a circle.
double area_of_circle(double radius);

/// New helper added in v2.
double area_of_square(double side);

Running the tool against those two fixtures produces the following report through BreakingChangesGenerator::buildOne:

examples/library/breaking-changes/test/expected.txt
== Breaking-change report ==

Added (1):
  + [function] area_of_square

Removed (1):
  - [function] legacy_helper

Changed (1):
  * [function] Point::length
      - noexcept differs

Suggested version bump: major