#StackBounty: #c++ #c++11 #g++ #zeromq Possible bug in gcc when using member initializer

Bounty: 200

I am working on a project that uses 0MQ, hence the zeromq tag.

I am experiencing a weird problem in my code, and I am not sure whether it is a bug in g++ or in my wrapper around the 0MQ library. I hope that I can get some help from you. I am testing against the following compilers:

~> g++ --version
g++ (GCC) 7.3.1 20180312
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

~> clang++ --version
clang version 6.0.0 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

You can find the zmq.hpp file in my GitHub account; I did not want to paste it here because of its length. My minimal working example based on that header reads:

#include <iostream>
#include <string>

#include "zmq.hpp"

int main(int argc, char *argv[]) {
  auto version = zmq::version();
  std::cout << "0MQ version: v" << std::get<0>(version) << '.'
            << std::get<1>(version) << '.' << std::get<2>(version) << '\n';

  zmq::message msg1, msg2;
  std::string p1{"part 1"};
  uint16_t p2{5};
  msg1.addpart(std::begin(p1), std::end(p1));
  msg1.addpart(p2);

  std::cout << "msg1 is a " << msg1.numparts() << "-part message.\n";
  std::cout << "msg1[0]: " << static_cast<char *>(msg1.data(0)) << '\n';
  std::cout << "msg1[1]: " << *static_cast<uint16_t *>(msg1.data(1)) << '\n';

  msg2 = {{msg1[0], msg1[1]}};

  std::cout << "msg2 is a " << msg2.numparts() << "-part message.\n";
  std::cout << "msg2[0]: " << static_cast<char *>(msg2.data(0)) << '\n';
  std::cout << "msg2[1]: " << *static_cast<uint16_t *>(msg2.data(1)) << '\n';

  return 0;
}

When I compile the code with

~> clang++ -Wall -std=c++11 -O3 mwe.cpp -o mwe.out -lzmq
~> ./mwe.out

I see the following output:

0MQ version: v4.2.5
msg1 is a 2-part message.
msg1[0]: part 1
msg1[1]: 5
msg2 is a 2-part message.
msg2[0]: part 1
msg2[1]: 5

However, when I compile the code with

~> g++ -Wall -std=c++11 -O3 mwe.cpp -o mwe.out -lzmq
~> ./mwe.out

I get the following:

0MQ version: v4.2.5
msg1 is a 2-part message.
msg1[0]: part 1
msg1[1]: 5
msg2 is a 1-part message.
msg2[0]: <some garbage here>
fish: “./mwe.out” terminated by signal SIGSEGV (Address boundary error)

Obviously, I am getting SIGSEGV because I am reading a memory location that I do not own. The interesting part is that when I change line 764 of the zmq.hpp file to read:

// message::message(std::vector<part> parts) noexcept : parts_{std::move(parts)} {}
message::message(std::vector<part> parts) noexcept {
  parts_ = std::move(parts);
}

the code works as intended when compiled with either compiler.

In short, I would like to know whether I am doing something fishy that breaks the g++-compiled code, or whether g++ might have a bug. I cannot reproduce the behavior under g++ with simple dummy structs, which is why I could not write an MWE with simpler structs and why I suspect my wrappers. The same behavior is also observed with the -O0 -g switches.

Thank you in advance for your time.

EDIT. I have changed the MWE to read as below (as per @Peter’s comment):

#include <iostream>
#include <string>

#include "zmq.hpp"

int main(int argc, char *argv[]) {
  auto version = zmq::version();
  std::cout << "0MQ version: v" << std::get<0>(version) << '.'
            << std::get<1>(version) << '.' << std::get<2>(version) << '\n';

  zmq::message msg1, msg2;
  std::string data1{"part 1"};
  uint16_t data2{5};
  msg1.addpart(std::begin(data1), std::end(data1));
  msg1.addpart(data2);

  std::cout << "msg1 is a " << msg1.numparts() << "-part message.\n";
  // std::cout << "msg1[0]: " << static_cast<char *>(msg1.data(0)) << '\n';
  // std::cout << "msg1[1]: " << *static_cast<uint16_t *>(msg1.data(1)) << '\n';

  msg2 = {{msg1[0], msg1[1]}};

  std::cout << "msg2 is a " << msg2.numparts() << "-part message.\n";
  // std::cout << "msg2[0]: " << static_cast<char *>(msg2.data(0)) << '\n';
  // std::cout << "msg2[1]: " << *static_cast<uint16_t *>(msg2.data(1)) << '\n';

  zmq::message::part p1 = 5.0; // double
  std::cout << "[Before]: p1 has size " << p1.size() << '\n';
  zmq::message::part p2{std::move(p1)};
  std::cout << "[After]: p1 has size " << p1.size() << '\n';
  std::cout << "[After]: p2 has size " << p2.size() << '\n';

  zmq::message::part p3;
  std::cout << "[Before]: p3 has size " << p3.size() << '\n';
  p3 = std::move(p2);
  std::cout << "[After]: p2 has size " << p2.size() << '\n';
  std::cout << "[After]: p3 has size " << p3.size() << '\n';

  return 0;
}

With g++ and the original zmq.hpp file that I provide in the GitHub gist (except that message::part is public this time), I get the following:

0MQ version: v4.2.5
msg1 is a 2-part message.
msg2 is a 1-part message.
[Before]: p1 has size 8
[After]: p1 has size 0
[After]: p2 has size 8
[Before]: p3 has size 0
[After]: p2 has size 0
[After]: p3 has size 8

However, when I use clang++, I get the following:

0MQ version: v4.2.5
msg1 is a 2-part message.
msg2 is a 2-part message.
[Before]: p1 has size 8
[After]: p1 has size 0
[After]: p2 has size 8
[Before]: p3 has size 0
[After]: p2 has size 0
[After]: p3 has size 8

Both move construction and move assignment seem to work for message::part objects. Finally, valgrind ./mwe.out reports no leaks or errors.

EDIT. I have debugged the code over the weekend. It appears that g++ is calling

template <class T> message::part::part(const T &value) : part(sizeof(T)) {
  std::memcpy(zmq_msg_data(&msg_), &value, sizeof(T));
}

after std::move in

message::message(std::vector<part> parts) noexcept : parts_{std::move(parts)} {}

For this reason, it creates a vector having only 1 message::part, which is (incorrectly) constructed with value = {msg1[0], msg1[1]}. However, clang++ does the correct thing and does not call the templated constructor.

Is there a way to fix this problem?
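
One change that may be worth trying is to initialize parts_ with parentheses instead of braces. As far as I understand it, the braces let the compiler consider std::vector's std::initializer_list constructor, and the templated part constructor makes a whole std::vector<part> convertible to a single part, which would explain the 1-part result. The sketch below uses a hypothetical stand-in for message::part (not the real zmq.hpp types), and demo_numparts is a name made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for zmq::message::part. Like the real class, it has
// a templated constructor that accepts any T, including std::vector<part>.
struct part {
  part() = default;
  template <class T> part(const T &) : size(sizeof(T)) {}
  std::size_t size = 0;
};

// Hypothetical stand-in for zmq::message.
class message {
public:
  // Parentheses request direct initialization, so std::vector's move
  // constructor is selected and no initializer_list constructor is considered.
  message(std::vector<part> parts) noexcept : parts_(std::move(parts)) {}

  std::size_t numparts() const { return parts_.size(); }

private:
  std::vector<part> parts_;
};

// Builds a two-part message and reports how many parts it ends up with.
std::size_t demo_numparts() {
  std::vector<part> parts;
  parts.push_back(part(5));   // T = int
  parts.push_back(part(5.0)); // T = double
  assert(parts.size() == 2);
  message msg(std::move(parts));
  return msg.numparts();
}
```

With parentheses, demo_numparts() returns 2 regardless of the compiler, because the moved vector is handed to the vector's move constructor as a whole. Marking the templated constructor explicit, or constraining it so that it rejects containers, might be another option, but I have not verified either against the real header.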

