Resolved: How to read a graph given in a DOT file containing subgraphs with boost::read_graphviz?

Question:

I am trying to figure out how to use boost::read_graphviz in order to access subgraph structures in a given dot file. I can already use boost::read_graphviz in order to read some simple properties by using a property map (boost::dynamic_properties). However, i do not know how to use the property map in order to access the subgraphs. How could one access the subgraph clusters for the graph given below?
digraph G {

subgraph cluster_0 {
    a0 -> a1;
    a1 -> a2;
    a2 -> a3;
}

subgraph cluster_1 {
    b0 -> b1;
    b1 -> b2;
    b2 -> b3;
}
c1 -> a0;
c1 -> b0;
a1 -> b3;
b2 -> a3;
a3 -> a0;
a3 -> c2;
b3 -> c2;

}

Answer:

Okay. It does seem that the read_graphviz support has been dropped somewhere in history:
#if 0
  // This interface has not worked for a long time
  ...

  // These four require linking the BGL-Graphviz library: libbgl-viz.a
  // from the /src directory.
  // Library has not existed for a while
  extern void read_graphviz(const std::string& file, GraphvizDigraph& g);
  extern void read_graphviz(FILE* file, GraphvizDigraph& g);

  extern void read_graphviz(const std::string& file, GraphvizGraph& g);
  extern void read_graphviz(FILE* file, GraphvizGraph& g);
#endif
That’s sad.

A Plan


As I commented, I did once write a pretty comprehensive graphviz parser: How to use boost spirit list operator with mandatory minimum amount of elements?. I did /some/ work to translate the resulting AST into a graph model (not BGL, but it could be).
However, that seems like an overcomplicated approach for the sample given.

Simple Graphviz Parser


Let’s roll a very simple X3 grammar for the given subset:
namespace x3 = boost::spirit::x3;

using Id = std::string;

namespace Parser {
    using namespace x3;

    auto kw = [](auto... p) {
        return lexeme[(x3::as_parser(p) | ...) >> !(alnum | '_')];
    };

    x3::rule<struct _> graph{"graph"};

    auto subgraph  = kw("subgraph") >> graph;
    auto semi      = &('}'|subgraph) | ';';

    auto id        = lexeme[                            //
        !kw("edge", "node", "graph", "subgraph") //
        >> +char_("A-Za-z0-9_")];
    auto edge      = id >> *(kw("->") >> id) >> semi;
    auto node      = id >> semi;
    auto element   = subgraph | edge | node;
    auto graph_def = -id >> '{' >> *element >> '}' >> eps;

    auto simple_graphviz = skip(space)[kw("digraph") >> graph];

    BOOST_SPIRIT_DEFINE(graph)

} // namespace Parser
That is enough (and not even minimal). Note that
  • it does have the smarts to special case some keywords and
  • and treat ';' optional in some cases, but
  • it also introduces an empty root graph – I leave that as an exercise for the reader

See Live On Wandbox, printing "Parse success"

Wiring Events


In order to build /anything/ we should be able to inject Parser Event handlers. E.g.:
struct Events{};
struct Handlers {
    void node(Id id) { last = id; }
    void edge(Id id) { std::cerr << stack.back() << ": " << last << " -> " << id << "\n"; }
    void enter(auto optional_id) { stack.push_back(optional_id.value_or("(unnamed)")); }
    void leave(unused_type)      { stack.pop_back(); } 
  private:
    std::deque<std::string> stack;
    Id last{};
};
Now, Events is going to be used as context key:
#define ON(event) ([](auto& ctx) { get<Events>(ctx).event(_attr(ctx)); })
And then we can wire up events in semantic actions:
auto edge      = id[ON(node)] >> *(kw("->") >> id[ON(edge)]) >> semi;
auto node      = id[ON(node)] >> semi;
// ...
auto graph_def = (-id >> '{')[ON(enter)] >> *element >> '}' >> eps[ON(leave)];
In the actual invocation we should bind an instance of Handler:
auto const bound_grammar = //
    x3::with<Parser::Events>(Parser::Handlers{})[Parser::simple_graphviz];

try {
    if (/*bool ok =*/parse(f, l, bound_grammar))
And now the output is (Live On Wandbox):
cluster_0: a0 -> a1
cluster_0: a1 -> a2
cluster_0: a2 -> a3
cluster_1: b0 -> b1
cluster_1: b1 -> b2
cluster_1: b2 -> b3
G: c1 -> a0
G: c1 -> b0
G: a1 -> b3
G: b2 -> a3
G: a3 -> a0
G: a3 -> c2
G: b3 -> c2
Parse success

Putting it together: Building a graph


We don’t need a lot for our graph model, but as we found out in the older answers, the subgraph overload of write_graphviz does mandate all the attribute maps:
using Attrs = std::map<std::string, std::string>;
using VProp = property<vertex_name_t, std::string,
                       property<vertex_attribute_t, Attrs>>;
using EProp = property<edge_attribute_t, Attrs, //
                       property<edge_index_t, int>>;

using GProp = property<graph_graph_attribute_t, Attrs,
             property<graph_vertex_attribute_t, Attrs,
                      property<graph_edge_attribute_t, Attrs,
                               property<graph_name_t, std::string>>>>;

using Graph = subgraph< //
    adjacency_list<vecS, vecS, undirectedS, VProp, EProp, GProp>>;

using Digraph = subgraph< //
    adjacency_list<vecS, vecS, directedS, VProp, EProp, GProp>>;
Now we can implement the Handlers interface to build one of those:
template <typename G> struct Builder {
    using V = typename G::vertex_descriptor;
    using E = typename G::edge_descriptor;

    Builder(G& g) : g(g) {}

    V node(Id id) // return global descriptor to optionally locally added
    {
        if (auto it = vertices.find(id); it != vertices.end()) {
            return last = it->second;
        } else {
            auto& sub = current();
            V vd = add_vertex(sub);
            put(boost::vertex_name, sub, vd, id);

            // translate to global descriptors for later use
            vd = sub.local_to_global(vd);
            [[maybe_unused]] auto [_, ok] = vertices.emplace(id, vd);
            assert(ok);
            return last = vd;
        }
    }
    void edge(Id id) {
        V s = last;
        V t = node(id); // updates last, sequence point is important!
        add_edge(s, t, g);
    }
    void enter(auto optional_id)
    {
        auto id = optional_id.value_or("");

        if (!hasChild(current(), id))
            get_property(current().create_subgraph(), boost::graph_name) = id;

        stack.push_back(std::move(id));
    }
    void leave(x3::unused_type) { stack.pop_back(); }

  private:
    using Stack = std::deque<std::string>;

    G&              g;
    Stack           stack;
    std::map<Id, V> vertices;
    V               last;

    static G* hasChild(G& g, Id childId) {
        for (auto& child : make_iterator_range(g.children()))
            if (childId == get_property(child, boost::graph_name))
                return &child;
        return nullptr;
    }

    using F = Stack::const_iterator;
    static G& descend(G& g, F f, F l) {
        if (f == l)
            return g;
        if (G* c = hasChild(g, *f))
            return descend(*c, ++f, l);
        else
            throw std::range_error("descend");
    }

    G& current() { 
        return descend(g, stack.cbegin(), stack.cend());
    }
};
Note that it could be optimized and I was sloppy with the semantics of multiple unnamed subgraphs. However, the parser remains unmodified, just bind with the Builder bound to the Events:
BGL::Digraph g;
auto const   bound_grammar = //
    x3::with<Parser::Events>(BGL::Builder{g})[Parser::simple_graphviz];

if (/*bool ok =*/parse(f, l, bound_grammar))
    std::cerr << "Parse success\n";
else
    std::cerr << "Parse failed\n";

auto name  = get(boost::vertex_name, g);

print_graph(g, name);
write_graphviz(std::cout << "\n ---- GRAPHVIZ:\n", g);
This results in the output (Live On Wandbox):
Parse success
a0 --> a1
a1 --> a2 b3
a2 --> a3
a3 --> a0 c2
b0 --> b1
b1 --> b2
b2 --> b3 a3
b3 --> c2
c1 --> a0 b0
c2 -->

 ---- GRAPHVIZ:
digraph "" {
subgraph G {
subgraph cluster_0 {
0;
1;
2;
3;
0 -> 1;
1 -> 2;
2 -> 3;
3 -> 0;
}
subgraph cluster_1 {
4;
5;
6;
7;
4 -> 5;
5 -> 6;
6 -> 7;
}
8;
9;
1 -> 7;
3 -> 9;
6 -> 3;
7 -> 9;
8 -> 0;
8 -> 4;
}
}

Icing On The Cake


So now we have structure-preserving, not id-preserving. We can at least make sure that the id is used as node label then:
    auto name  = get(boost::vertex_name, g);
    auto attrs = get(boost::vertex_attribute, g);

    for (auto v : make_iterator_range(vertices(g))) {
        attrs[v]["label"] = name[v];
    }

    write_graphviz(std::cout, g);
Now prints:
digraph "" {
subgraph G {
subgraph cluster_0 {
0[label=a0];
1[label=a1];
2[label=a2];
3[label=a3];
0 -> 1;
1 -> 2;
2 -> 3;
3 -> 0;
}
subgraph cluster_1 {
4[label=b0];
5[label=b1];
6[label=b2];
7[label=b3];
4 -> 5;
5 -> 6;
6 -> 7;
}
8[label=c1];
9[label=c2];
1 -> 7;
3 -> 9;
6 -> 3;
7 -> 9;
8 -> 0;
8 -> 4;
}
}

Full Listing


Live On Wandbox
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graphviz.hpp>
#include <boost/graph/subgraph.hpp>
#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;
using boost::make_iterator_range;

using Id = std::string;

namespace BGL {
    using namespace boost;
    using Attrs = std::map<std::string, std::string>;
    using VProp = property<vertex_name_t, std::string,
                           property<vertex_attribute_t, Attrs>>;
    using EProp = property<edge_attribute_t, Attrs, //
                           property<edge_index_t, int>>;

    using GProp = property<graph_graph_attribute_t, Attrs,
                 property<graph_vertex_attribute_t, Attrs,
                          property<graph_edge_attribute_t, Attrs,
                                   property<graph_name_t, std::string>>>>;

    using Graph = subgraph< //
        adjacency_list<vecS, vecS, undirectedS, VProp, EProp, GProp>>;

    using Digraph = subgraph< //
        adjacency_list<vecS, vecS, directedS, VProp, EProp, GProp>>;

    template <typename G> struct Builder {
        using V = typename G::vertex_descriptor;
        using E = typename G::edge_descriptor;

        Builder(G& g) : g(g) {}

        V node(Id id) // return global descriptor to optionally locally added
        {
            if (auto it = vertices.find(id); it != vertices.end()) {
                return last = it->second;
            } else {
                auto& sub = current();
                V vd = add_vertex(sub);
                put(boost::vertex_name, sub, vd, id);

                // translate to global descriptors for later use
                vd = sub.local_to_global(vd);
                [[maybe_unused]] auto [_, ok] = vertices.emplace(id, vd);
                assert(ok);
                return last = vd;
            }
        }
        void edge(Id id) {
            V s = last;
            V t = node(id); // updates last, sequence point is important!
            add_edge(s, t, g);
        }
        void enter(auto optional_id)
        {
            auto id = optional_id.value_or("");

            if (!hasChild(current(), id))
                get_property(current().create_subgraph(), boost::graph_name) = id;

            stack.push_back(std::move(id));
        }
        void leave(x3::unused_type) { stack.pop_back(); }

      private:
        using Stack = std::deque<std::string>;

        G&              g;
        Stack           stack;
        std::map<Id, V> vertices;
        V               last;

        static G* hasChild(G& g, Id childId) {
            for (auto& child : make_iterator_range(g.children()))
                if (childId == get_property(child, boost::graph_name))
                    return &child;
            return nullptr;
        }

        using F = Stack::const_iterator;
        static G& descend(G& g, F f, F l) {
            if (f == l)
                return g;
            if (G* c = hasChild(g, *f))
                return descend(*c, ++f, l);
            else
                throw std::range_error("descend");
        }

        G& current() { 
            return descend(g, stack.cbegin(), stack.cend());
        }
    };
} // namespace BGL

namespace Parser {
    using namespace x3;

    auto kw = [](auto... p) {
        return lexeme[(x3::as_parser(p) | ...) >> !(alnum | '_')];
    };

    struct Events{};

#define ON(event) ([](auto& ctx) { get<Events>(ctx).event(_attr(ctx)); })

    x3::rule<struct _> graph{"graph"};

    auto subgraph  = kw("subgraph") >> graph;
    auto semi      = &('}'|subgraph) | ';';

    auto id        = lexeme[                            //
        !kw("edge", "node", "graph", "subgraph") //
        >> +char_("A-Za-z0-9_")];
    auto edge      = id[ON(node)] >> *(kw("->") >> id[ON(edge)]) >> semi;

    auto node      = id[ON(node)] >> semi;
    auto element   = subgraph | edge | node;
    auto graph_def = (-id >> '{')[ON(enter)] >> *element >> '}' >> eps[ON(leave)];

    auto simple_graphviz = skip(space)[kw("digraph") >> graph];

    BOOST_SPIRIT_DEFINE(graph)

#undef ON

} // namespace Parser

int main() {
    std::ifstream ifs("input.txt");
    std::string const s(std::istreambuf_iterator<char>(ifs >> std::noskipws), {});
    auto              f = s.begin(), l = s.end();

    BGL::Digraph g;
    auto const   bound_grammar = //
        x3::with<Parser::Events>(BGL::Builder{g})[Parser::simple_graphviz];

    try {
        if (/*bool ok =*/parse(f, l, bound_grammar))
            std::cerr << "Parse success\n";
        else
            std::cerr << "Parse failed\n";

        auto name  = get(boost::vertex_name, g);
        auto attrs = get(boost::vertex_attribute, g);

        for (auto v : make_iterator_range(vertices(g))) {
            attrs[v]["label"] = name[v];
        }

        write_graphviz(std::cout, g);

        if (f != l)
            std::cerr << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
    } catch (x3::expectation_failure<std::string::const_iterator> const& e) {
        std::cerr << e.what() << ": " << e.which() << " at "
                  << std::string(e.where(), l) << "\n";
    }
}
Resulting graph renders:
enter image description here

If you have better answer, please add a comment about this, thank you!

Source: Stackoverflow.com

If you like this answer, you can give me a coffee by click here (view Ads)