Bug when using graph-convert to convert symmetric matrix market file to galois graph file #424

@xiaozxiong

Hi!
I am trying to run your BFS program on the hollywood-2009 dataset. I converted it with the following command:

 ./tools/graph-convert/graph-convert --mtx2gr --edgeType=int32 hollywood-2009.mtx  hollywood-2009.gr

Then I ran bfs-cpu from the lonestar/analytics/cpu/bfs directory, but the result was wrong:

# ./bfs-cpu --symmetricGraph  hollywood-2009.gr 
Running SyncTile algorithm with PARALLEL execution
Node 1 has distance 2147483646
INFO: # visited nodes is 1
INFO: Max distance is 0
INFO: Sum of visited distances is 0
1139904 unvisited nodes; this is an error if the graph is strongly connected
max dist: 0
Verification successful.

Finally, I found that the problem is in tools/graph-convert/graph-convert.cpp, where the conversion routine is:

struct Mtx2Gr : public HasNoVoidSpecialization {
  template <typename EdgeTy>
  void convert(const std::string& infilename, const std::string& outfilename) {
    typedef galois::graphs::FileGraphWriter Writer;

    Writer p;
    uint32_t nnodes;
    size_t nedges;

    for (int phase = 0; phase < 2; ++phase) {
      std::ifstream infile(infilename.c_str());
      if (!infile) {
        GALOIS_DIE("failed to open input file");
      }

      // Skip comments
      while (infile) {
        if (infile.peek() != '%') {
          break;
        }
        skipLine(infile);
      }

      // Read header
      char header[256];
      infile.getline(header, 256);
      std::istringstream line(header, std::istringstream::in);
      std::vector<std::string> tokens;
      while (line) {
        std::string tmp;
        line >> tmp;
        if (line) {
          tokens.push_back(tmp);
        }
      }
      if (tokens.size() != 3) {
        GALOIS_DIE("unknown problem specification line: ", line.str());
      }
      // Prefer C functions for maximum compatibility
      // nnodes = std::stoull(tokens[0]);
      // nedges = std::stoull(tokens[2]);
      nnodes = strtoull(tokens[0].c_str(), NULL, 0);
      nedges = strtoull(tokens[2].c_str(), NULL, 0);

      // Parse edges
      if (phase == 0) {
        p.setNumNodes(nnodes);
        p.setNumEdges<EdgeTy>(nedges);
        p.phase1();
      } else {
        p.phase2();
      }

      for (size_t edge_num = 0; edge_num < nedges; ++edge_num) {
        // if ((nedges / 500 > 0) && (edge_num % (nedges / 500)) == 0) {
        //   printf("Phase %d: current edge progress %lf%%\n", phase,
        //          ((double)edge_num / nedges) * 100);
        // }
        uint32_t cur_id, neighbor_id;
        double weight = 1;

        infile >> cur_id >> neighbor_id >> weight;
        if (cur_id == 0 || cur_id > nnodes) {
          GALOIS_DIE("node id out of range: ", cur_id);
        }
        if (neighbor_id == 0 || neighbor_id > nnodes) {
          GALOIS_DIE("neighbor id out of range: ", neighbor_id);
        }

        // 1 indexed
        if (phase == 0) {
          p.incrementDegree(cur_id - 1);
        } else {
          if constexpr (std::is_void<EdgeTy>::value) {
            p.addNeighbor(cur_id - 1, neighbor_id - 1);
          } else {
            p.addNeighbor<EdgeTy>(cur_id - 1, neighbor_id - 1,
                                  static_cast<EdgeTy>(weight));
          }
          // Debug print: list the out-edges recorded for node 0
          if (cur_id - 1 == 0) {
            printf("%u - %u\n", cur_id - 1, neighbor_id - 1);
          }
        }

        skipLine(infile);
      }

      infile.peek();
      if (!infile.eof()) {
        GALOIS_DIE("additional lines in file");
      }
    }
    // this is for the progress print

    p.finish();

    p.toFile(outfilename);
    printStatus(p.size(), p.sizeEdges());
  }
};

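This code treats every line as a single directed edge from cur_id to neighbor_id and never adds the reverse edge. It also skips the banner line together with the comments, so it cannot tell that the matrix is declared symmetric. A symmetric Matrix Market file stores only one triangle of the matrix, so the converted graph is missing the reverse direction of every edge. This is fine for directed graphs but produces wrong results for undirected ones.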
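
To illustrate the missing step, here is a minimal standalone sketch that reads the banner and mirrors each off-diagonal entry when the file is declared symmetric. It is not a patch for graph-convert: `EdgeList`, `readMtxEdges`, and `outDegrees` are hypothetical names made up for this example, edge values are ignored, and it does not use `FileGraphWriter`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container (not a Galois type): directed edges with 0-based ids.
struct EdgeList {
  uint64_t numNodes = 0;
  std::vector<std::pair<uint64_t, uint64_t>> edges;
};

// Reads a Matrix Market coordinate file. If the banner contains "symmetric"
// and mirrorSymmetric is true, each off-diagonal entry (r, c) also yields the
// reverse edge (c, r). A "skew-symmetric" banner matches too, which is
// acceptable here because only the edge structure is kept.
EdgeList readMtxEdges(std::istream& in, bool mirrorSymmetric) {
  EdgeList out;
  std::string line;
  bool symmetric = false;
  bool haveSize = false;
  bool firstLine = true;

  while (std::getline(in, line)) {
    const bool isBanner = firstLine;
    firstLine = false;
    if (line.empty()) {
      continue;
    }
    if (line[0] == '%') {
      if (isBanner) {
        // The banner is the first line; lowercase it before searching.
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char ch) {
                         return static_cast<char>(std::tolower(ch));
                       });
        symmetric = line.find("symmetric") != std::string::npos;
      }
      continue; // banner and comment lines carry no edges
    }

    std::istringstream fields(line);
    if (!haveSize) {
      // First non-comment line: rows, columns, number of stored entries.
      uint64_t rows = 0, cols = 0, entries = 0;
      if (!(fields >> rows >> cols >> entries)) {
        throw std::runtime_error("bad size line");
      }
      out.numNodes = std::max(rows, cols);
      haveSize = true;
      continue;
    }

    uint64_t r = 0, c = 0;
    if (!(fields >> r >> c)) {
      throw std::runtime_error("bad entry line");
    }
    if (r == 0 || r > out.numNodes || c == 0 || c > out.numNodes) {
      throw std::out_of_range("node id out of range");
    }
    out.edges.emplace_back(r - 1, c - 1); // file ids are 1-indexed
    if (symmetric && mirrorSymmetric && r != c) {
      out.edges.emplace_back(c - 1, r - 1); // the edge the file leaves implicit
    }
  }
  if (!haveSize) {
    throw std::runtime_error("missing size line");
  }
  return out;
}

// Counts out-edges per node, analogous to the degree-counting phase.
std::vector<uint64_t> outDegrees(const EdgeList& g) {
  std::vector<uint64_t> deg(g.numNodes, 0);
  for (const auto& e : g.edges) {
    ++deg[e.first];
  }
  return deg;
}
```

On a 3-node file that lists only the lower-triangle entries (2,1), (3,1), (3,2), node 0 ends up with out-degree 0 when nothing is mirrored, which would be consistent with BFS visiting a single node. In `Mtx2Gr` itself, the equivalent change would presumably need `incrementDegree` and `addNeighbor` to be called for both endpoints, and the count passed to `setNumEdges` to include the mirrored edges.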

BTW, this may also be the cause of #407.
