    Generic Netlink unicast from kernel to user fails (-111)

    (Linux 4.4)

    I am trying to get a kernel module to send information to a user process over Generic Netlink. The message never seems to reach the user process, and the nlmsg_unicast function returns -111.

    Here is what I know:

    • The kernel module successfully registers a Generic Netlink family - it prints a message in the syslog indicating the (autogenerated) family ID (which is always 26).
    • The user process successfully discovers the family ID (26).
    • The user process sends a sort of "I'm alive" command to the kernel module, which successfully logs the (auto-selected) port ID of the user process - I know from messages printed by both the user process and the kernel module that the correct portID is resolved.
    • Subsequently, the kernel, upon an event, tries to send a message to the resolved portID over the Generic Netlink family that had been set up.
    • The user process never receives the message (it never enters the callback function; in fact, I don't think mnl_socket_recvfrom ever returns). In the kernel module, the nlmsg_unicast function returns with -111.
    • I am using libmnl in the user process (as you might have guessed from my allusion to mnl_socket_recvfrom).
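
    For reference, the port ID step above can be sketched in plain C without libmnl. This is a minimal illustration, not the code from the question: the helper name open_genl_socket is hypothetical. It binds a NETLINK_GENERIC socket with nl_pid set to 0 so that the kernel picks the port ID, then reads the chosen value back with getsockname. With libmnl, the same value should be available from mnl_socket_get_portid.

```c
#include <linux/netlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open and bind a NETLINK_GENERIC socket, then
 * report the port ID the kernel auto-assigned to it.
 * Returns the socket descriptor, or -1 on failure. */
int open_genl_socket(unsigned int *portid_out)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;                /* 0 asks the kernel to choose a port ID */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *portid_out = addr.nl_pid;      /* this is the value the kernel must unicast to */
    return fd;
}
```

    Printing the returned port ID next to the value the kernel module logs is one way to confirm that both sides agree on the destination.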

    uname -a
    Linux yaron-VirtualBox 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

    Here is, essentially, my send code in the kernel:

        struct sk_buff *msg;
        struct sock *socket;
        struct netlink_kernel_cfg nlCfg = {
            .groups = 1,
            .flags = 0,
            .input = NULL,
            .cb_mutex = NULL,
            .bind = NULL,
            .unbind = NULL,
            .compare = NULL,
        };
        void *msg_head;
        int retval;
        struct net init_net;
                /* Open a socket */
                socket = netlink_kernel_create(&init_net, NETLINK_GENERIC, &nlCfg);
                if (socket == NULL) goto CmdFail;
                /* Allocate space */
                msg = genlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
                if (msg == NULL) goto CmdFail;
                /* Generate message header
                 * arguments of genlmsg_put:
                 *    struct sk_buff *,
                 *    u32 portID,
                 *    u32 netlinkSeqNum,
                 *    struct genl_family *,
                 *    int flags,
                 *    u8 command_idx         */
                msg_head = genlmsg_put(msg, userNetlinkPortID, ++netlinkSeqNum, &genlFamily, 0, MYFAMILY_CMD_MYMSG);
                if (msg_head == NULL) goto CmdFail;
                /* Add a MYFAMILY_ATTR_MYMSG attribute (message to be sent) */
                retval = nla_put_string(msg, MYFAMILY_ATTR_MYMSG, "Temporary message");
                if (retval != 0) goto CmdFail;
                /* Finalize the message */
                genlmsg_end(msg, msg_head);  /* void inline function - no return value */
                /* Send the message */
                retval = nlmsg_unicast(socket, msg, userNetlinkPortID);
                printk("nlmsg_unicast returned %d\n", retval);
                if (retval != 0) goto CmdFail;

        CmdFail:
        printk(KERN_ALERT "*** Failed to send command !\n");

    Here is, essentially, my receive code in the user process:

        char bufferHdr[getpagesize()];
        struct nlmsghdr *nlHeader;
        struct genlmsghdr *nlHeaderExtraHdr;
        int numBytes, seq, ret_val;

        // Set up the header.
        // Function mnl_nlmsg_put_header will zero out a length of bufferHdr sufficient to hold a Netlink header,
        // and initialize the nlmsg_len field in that space to the size of a header.
        // It returns a pointer to bufferHdr.
        if ( (nlHeader = mnl_nlmsg_put_header(bufferHdr)) != (struct nlmsghdr *) bufferHdr ) {
            perror("mnl_nlmsg_put_header failed");
        }
        nlHeader->nlmsg_type = genetlinkFamilyID;

        // Function mnl_nlmsg_put_extra_header extends the header, to allow for these extra fields.
        if ( (nlHeaderExtraHdr = (struct genlmsghdr *) mnl_nlmsg_put_extra_header(nlHeader, sizeof(struct genlmsghdr))) != (struct genlmsghdr *) (bufferHdr + sizeof(struct nlmsghdr)) ) {
            perror("mnl_nlmsg_put_extra_header failed");
        }

        // No command to set
        // No attributes to set

        // Wait for a message, and process it
        while (1) {
            numBytes = mnl_socket_recvfrom(nlSocket, bufferHdr, sizeof(bufferHdr));
            if (numBytes == -1) {
                perror("mnl_socket_recvfrom returned error");
                break;
            }

            // Callback run queue handler - use it to call getMsgCallback
            std::cout << "received a msg, handling it" << std::endl;
            ret_val = mnl_cb_run(bufferHdr, numBytes, seq, portid, getMsgCallback, NULL);
            if (ret_val == -1) {
                //perror("mnl_cb_run failed");
                break;
            } else if (ret_val == 0) {
                break;
            }
        }
        return ret_val;

    111 - connection refused.

    Yes, but why? Where has my code gone wrong?

    ADDENDUM: Having scoured the kernel source code some more, I'm guessing that my message never even gets to the user process.
    nlmsg_unicast calls netlink_unicast, which in turn calls netlink_getsockbyportid, which looks like this:

        static struct sock *netlink_getsockbyportid(struct sock *ssk, u32 portid)
        {
            struct sock *sock;
            struct netlink_sock *nlk;

            sock = netlink_lookup(sock_net(ssk), ssk->sk_protocol, portid);
            if (!sock)
                return ERR_PTR(-ECONNREFUSED);

            /* Don't bother queuing skb if kernel socket has no input function */
            nlk = nlk_sk(sock);
            if (sock->sk_state == NETLINK_CONNECTED &&
                nlk->dst_portid != nlk_sk(ssk)->portid) {
                return ERR_PTR(-ECONNREFUSED);
            }
            return sock;
        }

    I'm guessing one of the two conditions here for punting and returning -ECONNREFUSED is triggered.
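
    To spell out the two conditions, here is a simplified user-space model of the function quoted above. It is a sketch under stated assumptions, not kernel code: struct model_sock and model_getsockbyportid are hypothetical names, an integer net_id stands in for sock_net(), and the lookup is assumed to match on network namespace, protocol, and port ID, as the arguments to netlink_lookup suggest.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the fields the quoted kernel function reads. */
struct model_sock {
    int net_id;              /* stands in for sock_net(sk) */
    int protocol;            /* sk_protocol, e.g. 16 for NETLINK_GENERIC */
    unsigned int portid;     /* nlk->portid */
    int connected;           /* nonzero when sk_state == NETLINK_CONNECTED */
    unsigned int dst_portid; /* nlk->dst_portid */
};

/* Returns 0 if delivery would proceed, -ECONNREFUSED otherwise.
 * "table" models the set of bound netlink sockets, "ssk" the sender. */
int model_getsockbyportid(const struct model_sock *table, size_t n,
                          const struct model_sock *ssk, unsigned int portid)
{
    const struct model_sock *found = NULL;
    size_t i;

    /* Condition 1: no socket matches the sender's namespace and
     * protocol together with the destination port ID. */
    for (i = 0; i < n; i++) {
        if (table[i].net_id == ssk->net_id &&
            table[i].protocol == ssk->protocol &&
            table[i].portid == portid) {
            found = &table[i];
            break;
        }
    }
    if (found == NULL)
        return -ECONNREFUSED;

    /* Condition 2: the receiver is connected to a different peer. */
    if (found->connected && found->dst_portid != ssk->portid)
        return -ECONNREFUSED;

    return 0;
}
```

    In this model the first condition depends on the sending socket as much as on the receiver: a sender whose namespace or protocol differs from the receiver's fails the lookup even when the port ID is correct.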

    Any suggestions for how I can check which of these conditions is true? It doesn't look like I can call netlink_lookup or nlk_sk, or the functions they rely on, directly from my module code. Many of the relevant symbols are buried in af_netlink.h and af_netlink.c, and I guess they are not available when building an external module, at least the normal way. (It doesn't look like af_netlink.h is shipped as part of the distro.)
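
    One possible check that needs no kernel symbols is to look at /proc/net/netlink from user space, which lists bound netlink sockets with their protocol (the Eth column) and port ID (the Pid column). The sketch below assumes that column order (sk, Eth, Pid, ...). The helper name netlink_listing_has_port is hypothetical, and it takes a FILE * so it can be tried on any text in that format.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if a listing in /proc/net/netlink format
 * contains a socket with the given protocol and port ID, 0 otherwise. */
int netlink_listing_has_port(FILE *listing, int protocol, unsigned int portid)
{
    char line[256];
    int eth;
    unsigned int pid;

    while (fgets(line, sizeof(line), listing) != NULL) {
        /* Skip the "sk" pointer column, then read the Eth and Pid columns.
         * The header line does not parse as numbers and is ignored. */
        if (sscanf(line, "%*s %d %u", &eth, &pid) == 2 &&
            eth == protocol && pid == portid)
            return 1;
    }
    return 0;
}
```

    To use it, open /proc/net/netlink with fopen while the user process is running and pass 16 (NETLINK_GENERIC) and the logged port ID. If the socket is listed, the port ID itself is probably not the problem, which would point at the namespace or protocol of the sending socket instead. The listing is expected to show only sockets in the reading process's network namespace.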

