
receive SIGABRT

Posted: Tue Nov 04, 2014 3:43 am
by mou
The call stack is as follows.
I have not found any pattern to this error so far.

*** SIGABRT (@0x2bd000045ec) received by PID 17900 (TID 0x7f992d3ff700) from PID 17900; stack trace: ***
@ 0x315780f500 (unknown)
@ 0x31574328a5 (unknown)
@ 0x3157434085 (unknown)
@ 0x315742ba1e (unknown)
@ 0x315742bae0 (unknown)
@ 0x7f999e79cfc8 /usr/lib64/libRCFProto.so RCF::AssertFunctor::~AssertFunctor()
@ 0x7f999e808e81 /usr/lib64/libRCFProto.so RCF::ConnectionOrientedClientTransport::send()
@ 0x7f999e7fbfef /usr/lib64/libRCFProto.so RCF::ClientStub::beginSend()
@ 0x7f999e7fc2a0 /usr/lib64/libRCFProto.so RCF::ClientStub::onRequestTransportFiltersCompleted()
@ 0x7f999e832b08 /usr/lib64/libRCFProto.so RCF::ClientStub::onConnectCompleted()
@ 0x7f999e80434d /usr/lib64/libRCFProto.so RCF::TcpClientTransport::implConnect()
@ 0x7f999e7e3004 /usr/lib64/libRCFProto.so RCF::ClientStub::connect()
@ 0x7f999e7fd7d0 /usr/lib64/libRCFProto.so RCF::ClientStub::beginCall()
@ 0x7f999e7fe5ce /usr/lib64/libRCFProto.so RCF::FutureImplBase::callSync()
@ 0x7f999e79b404 /usr/lib64/libRCFProto.so RCF::RcfProtoChannel::CallMethodInternal()
@ 0x7f999e79c059 /usr/lib64/libRCFProto.so RCF::RcfProtoChannel::CallMethod()

Re: receive SIGABRT

Posted: Wed Nov 05, 2014 1:31 am
by jarl
There should be an assert message in your application's output window when this happens. Can you send that to us?

Is there a way of reproducing this?

Re: receive SIGABRT

Posted: Thu Nov 06, 2014 2:23 am
by mou
jarl wrote:There should be an assert message in your application's output window when this happens. Can you send that to us?

Is there a way of reproducing this?
It appears to be random, as far as I have observed. I will keep watching.

Re: receive SIGABRT

Posted: Thu Jun 25, 2015 10:56 am
by mou
I found a cause. When the server side is processing a high QPS, the client side runs into this problem.
But how should I deal with this?

Re: receive SIGABRT

Posted: Wed Jul 01, 2015 12:40 am
by jarl
Hi,

I suspect this may be due to multi-threading issues in the client side code. Can you send us some code of how you are establishing and maintaining client connections? It's important that only one thread at a time is accessing a RcfClient<> object.
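
To illustrate the one-thread-at-a-time rule, here is a minimal sketch of a pool that leases each client object to a single thread and takes it back when the lease ends. `FakeClient`, `ClientPool` and `Lease` are hypothetical names invented for this sketch and are not RCF APIs; a real pool would hold RcfClient<> or RcfProtoChannel objects instead.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-connection client object such as
// RcfClient<> or RcfProtoChannel. It is NOT part of RCF.
struct FakeClient {
  int calls = 0;
  void Call() { ++calls; }  // Not thread-safe on its own.
};

// Hands each client to at most one thread at a time.
template <typename Client>
class ClientPool {
 public:
  // RAII lease: returns the client to the pool when it goes out of scope.
  class Lease {
   public:
    Lease(ClientPool* pool, std::unique_ptr<Client> client)
        : pool_(pool), client_(std::move(client)) {}
    Lease(Lease&&) = default;
    Lease(const Lease&) = delete;
    Lease& operator=(const Lease&) = delete;
    ~Lease() {
      if (client_) pool_->Release(std::move(client_));
    }

    Client* operator->() {
      assert(client_ && "lease was discarded or moved from");
      return client_.get();
    }

    // Drop a client whose connection is in an unknown state (for example
    // after an exception) instead of returning it to the pool.
    void Discard() { client_.reset(); }

   private:
    ClientPool* pool_;
    std::unique_ptr<Client> client_;
  };

  // Reuses an idle client if there is one, otherwise creates a new one.
  Lease Acquire() {
    std::lock_guard<std::mutex> guard(mutex_);
    if (idle_.empty()) {
      return Lease(this, std::unique_ptr<Client>(new Client()));
    }
    std::unique_ptr<Client> client = std::move(idle_.back());
    idle_.pop_back();
    return Lease(this, std::move(client));
  }

  std::size_t IdleCount() {
    std::lock_guard<std::mutex> guard(mutex_);
    return idle_.size();
  }

 private:
  void Release(std::unique_ptr<Client> client) {
    std::lock_guard<std::mutex> guard(mutex_);
    idle_.push_back(std::move(client));
  }

  std::mutex mutex_;
  std::vector<std::unique_ptr<Client>> idle_;
};
```

A caller would write something like `auto lease = pool.Acquire(); lease->Call();` and let the lease fall out of scope. Only the pool's bookkeeping is locked; the client itself is only ever touched by the thread holding the lease, and the pool must outlive every lease it hands out.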

Re: receive SIGABRT

Posted: Thu Jul 02, 2015 5:22 am
by mou

Code:

class RpcClient {
 public:
  // Construct a RPC client for the RPC service designated by
  // "service_descriptor". To get a service descriptor, you can just use static
  // member function "descriptor()" of the corresponding service class.
  RpcClient(const google::protobuf::ServiceDescriptor& service_descriptor,
            const std::string& server_ip_list,
            const int port,
            const int timeout_ms);

  // Similar to above, but use servers listed in "FLAGS_rpc_server_ip_list",
  // "FLAGS_rpc_port" and "FLAGS_rpc_call_timeout_ms".
  RpcClient(const google::protobuf::ServiceDescriptor& service_descriptor);

  ~RpcClient();

  // Call the given method designated by "method_index" of the RPC service.
  // The method index is zero based. It is determined by the order that the
  // method was defined in proto file.
  // Return true if response is received successfully, otherwise return false.
  bool Call(const int method_index,
            const google::protobuf::Message& request,
            google::protobuf::Message* response);

  // Similar to above, just a simplified version for most scenarios.
  bool Call(const google::protobuf::Message& request,
            google::protobuf::Message* response) {
    return Call(0, request, response);
  }

 private:
  google::protobuf::RpcChannel* AllocateChannel();
  void ReclaimChannel(google::protobuf::RpcChannel* channel);

  const RCF::TcpEndpoint& GetServer();

  const google::protobuf::ServiceDescriptor& service_descriptor_;

  const int timeout_ms_;

  std::mutex mutex_;
  size_t next_server_index_ = 0;
  std::vector<RCF::TcpEndpoint> servers_;
  std::queue<google::protobuf::RpcChannel*> channels_;
};
============= cpp =================

Code:

namespace {

class InternalRpcChannel : public RCF::RcfProtoChannel {
 public:
  InternalRpcChannel(const RCF::TcpEndpoint& endpoint)
      : RCF::RcfProtoChannel(endpoint), endpoint_(endpoint) {
  }

  const RCF::TcpEndpoint& endpoint() const {
    return endpoint_;
  }

 private:
  // We use a reference here because we know it is not a dangling one.
  const RCF::TcpEndpoint& endpoint_;
};

} // namespace

RpcClient::RpcClient(const ServiceDescriptor& service_descriptor,
                     const std::string& server_ip_list,
                     const int port,
                     const int timeout_ms)
    : service_descriptor_(service_descriptor), timeout_ms_(timeout_ms) {
  RCF::init();

  std::vector<std::string> ips;
  SplitStringUsing(server_ip_list, ",", &ips);
  for (const std::string& ip : ips) {
    servers_.emplace_back(RCF::TcpEndpoint(ip, port));
  }

  next_server_index_ = std::rand() % servers_.size();
}

RpcClient::RpcClient(const ServiceDescriptor& service_descriptor)
  : RpcClient(service_descriptor,
              FLAGS_rpc_server_ip_list,
              FLAGS_rpc_port,
              FLAGS_rpc_call_timeout_ms) {
}

RpcClient::~RpcClient() {
  LOG(INFO) << "Destructing RpcClient.";
  while (!channels_.empty()) {
    delete channels_.front();
    channels_.pop();
  }
}

bool RpcClient::Call(const int method_index,
                     const google::protobuf::Message& request,
                     google::protobuf::Message* response) {
  google::protobuf::RpcChannel* channel = AllocateChannel();
  try {
    channel->CallMethod(service_descriptor_.method(method_index),
                        nullptr,
                        &request,
                        response,
                        nullptr);
  } catch (const RCF::Exception& exception) {
    LOG(ERROR) << service_descriptor_.full_name() << " RPC exception: " <<
        exception.getErrorString() << " Server: " <<
        static_cast<InternalRpcChannel*>(channel)->endpoint().getIp();
    delete channel;
    return false;
  }

  ReclaimChannel(channel);
  return true;
}

google::protobuf::RpcChannel* RpcClient::AllocateChannel() {
  std::lock_guard<std::mutex> guard(mutex_);
  if (channels_.empty()) {
    const RCF::TcpEndpoint& endpoint = GetServer();
    LOG(INFO) << "New " << service_descriptor_.full_name() <<
        " RpcChannel instance to " << endpoint.asString();
    RCF::RcfProtoChannel* channel = new InternalRpcChannel(endpoint);
    channel->setRemoteCallTimeoutMs(timeout_ms_);
    return channel;
  }

  google::protobuf::RpcChannel* channel = channels_.front();
  channels_.pop();

  return channel;
}

inline void RpcClient::ReclaimChannel(google::protobuf::RpcChannel* channel) {
  std::lock_guard<std::mutex> guard(mutex_);
  channels_.push(channel);
}

const RCF::TcpEndpoint& RpcClient::GetServer() {
  // Round-robin over servers_. Callers must hold mutex_.
  const RCF::TcpEndpoint& endpoint = servers_[next_server_index_];
  next_server_index_ = (next_server_index_ + 1) % servers_.size();
  return endpoint;
}

Re: receive SIGABRT

Posted: Tue Jan 12, 2016 1:18 am
by mou
:idea:

Re: receive SIGABRT

Posted: Wed Jan 27, 2016 8:04 am
by mou
jarl wrote:Hi,

I suspect this may be due to multi-threading issues in the client side code. Can you send us some code of how you are establishing and maintaining client connections? It's important that only one thread at a time is accessing a RcfClient<> object.
Hi, jarl. I posted the source code I'm using above. Any suggestions?

Re: receive SIGABRT

Posted: Thu Jan 28, 2016 3:06 am
by mou
Finally, I got the following information output by RCFProto:

../../src/RCF/src/RCF/ConnectionOrientedClientTransport.cpp:366: Assertion failed. mAsync . Values:

But I don't understand what this means. Can somebody help?

Re: receive SIGABRT

Posted: Fri Jan 29, 2016 12:08 am
by jarl
Hi,

Thanks for obtaining that assert message. I suspect that the assert on that line is actually incorrect, and may be the cause of the problem. Can you comment it out and rerun your tests?
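
For readers wondering how a failed assert ends up as SIGABRT: the stack trace above shows RCF::AssertFunctor::~AssertFunctor() directly above the unresolved libc frames, which is consistent with an assert helper that prints its message and then calls abort(). The following is only a sketch of that general pattern. `CheckFailure`, `failure_handler` and `SKETCH_ASSERT` are hypothetical names made up here, and this is not RCF's actual implementation.

```cpp
#include <cstdlib>
#include <iostream>
#include <sstream>

// Hypothetical illustration only; this is not RCF's AssertFunctor.
using FailureHandler = void (*)();

// Action taken after the message is printed. The default is std::abort(),
// which raises SIGABRT. It is replaceable so the pattern can be exercised
// without killing the process.
inline FailureHandler& failure_handler() {
  static FailureHandler handler = &std::abort;
  return handler;
}

// Collects a message while alive and reports the failure when destroyed,
// i.e. at the end of the statement that used the macro below.
class CheckFailure {
 public:
  CheckFailure(const char* file, int line, const char* expr) {
    message_ << file << ":" << line << ": Assertion failed. " << expr
             << " . Values: ";
  }

  template <typename T>
  CheckFailure& operator<<(const T& value) {
    message_ << value;
    return *this;
  }

  ~CheckFailure() {
    std::cerr << message_.str() << std::endl;
    failure_handler()();  // By default this never returns.
  }

 private:
  std::ostringstream message_;
};

// The temporary is only constructed when the condition is false.
#define SKETCH_ASSERT(cond) \
  if (cond) {               \
  } else                    \
    CheckFailure(__FILE__, __LINE__, #cond)
```

With this pattern, `SKETCH_ASSERT(mAsync);` does nothing when `mAsync` is true, and prints the message and aborts when it is false. Commenting out the failing assert line, as suggested above, removes the abort at that point without changing the rest of the send path.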