Unable to connect a bot to the websocket


  • I've been playing around with setting up a bot to connect to NodeBB. I'm able to get it to log in successfully by sending a POST request to the /login page, but when I attempt to connect to the websocket, it fails.

    If I simply point a socket.io client at the socket, I get messages for sid, checkSession, and setHostname, but then it just sits there. I receive new post notifications, but no messages if I send a DM from my main account to the bot or do other things that the bot's user should be personally notified of.

    If I send one or more GET requests to the socket.io URL first, mimicking what the webpage does, and then try to connect the websocket using the retrieved sid, it's even worse: the connect attempt sits around waiting for approximately 8 seconds, then immediately disconnects. Fiddler shows it was sent a Close from the server. No Socket.io events are received from the server.
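
    For reference, a minimal sketch of how the body of that first polling GET can be decoded. It assumes the server speaks Engine.IO protocol v3 (the EIO=3 in the request URLs), where each packet in a polling body is written as <length>:<packet> and the first character of a packet is its type. The function name and the sample body are hypothetical, not taken from a real capture.

```javascript
// Decode an Engine.IO v3 polling body such as '5:hello2:40' into packets.
// Assumes string (not binary) payloads; lengths count JavaScript string characters.
function parsePollingPayload(body) {
  const packets = [];
  let pos = 0;
  while (pos < body.length) {
    const colon = body.indexOf(':', pos);
    if (colon === -1) throw new Error('missing length prefix');
    const length = Number(body.slice(pos, colon));
    if (!Number.isInteger(length) || length < 1) throw new Error('bad length prefix');
    const packet = body.slice(colon + 1, colon + 1 + length);
    if (packet.length !== length) throw new Error('truncated packet');
    // First character is the Engine.IO packet type (0 = open, 4 = message, ...).
    packets.push({ type: packet[0], data: packet.slice(1) });
    pos = colon + 1 + length;
  }
  return packets;
}

// Hypothetical handshake body: an "open" packet (type 0) followed by a
// Socket.IO "connect" message (Engine.IO type 4, Socket.IO type 0).
const open = '0' + JSON.stringify({
  sid: 'abc123',
  upgrades: ['websocket'],
  pingInterval: 25000,
  pingTimeout: 5000,
});
const sampleBody = `${open.length}:${open}2:40`;
const [handshake, connect] = parsePollingPayload(sampleBody);
const sid = JSON.parse(handshake.data).sid;
```

    Reading the length prefix, rather than searching for the first '{', keeps any packets that follow the handshake from being mixed into the JSON.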

    Looking at the relevant source code, it looks like the only thing that should matter when connecting the websocket is the encrypted session cookie, which is used by NodeBB to identify the user. I've verified that my code is sending this cookie correctly. From everything I can tell, looking at it in Fiddler, my requests ought to be indistinguishable to the server from the requests being made by the webpage, and yet the websocket connection does not function as expected.

    Relevant source code, in C#:

    Bot:

    using System;
    using System.Collections.Generic;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    
    using Newtonsoft.Json.Linq;
    using H.Socket.IO;
    using H.WebSockets.Args;
    
    namespace NodeBot
    {
        class Bot
        {
            private const string HOST = "host here";
            private const string USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36";
            private const string USERNAME = "username here";
            private const string PASSWORD = "pass here";
    
            private readonly HttpClient _client;
            private readonly string _url;
            private readonly string _wsUrl;
            private JObject _config;
            private SocketIoClient _ws;
            private readonly CookieContainer _cookieJar = new();
    
            public Bot()
            {
                _url = $"https://{HOST}";
                _wsUrl = $"wss://{HOST}/?sid=";
                var handler = new HttpClientHandler { UseCookies = true, CookieContainer = _cookieJar };
                _client = new HttpClient(handler);
            }
    
            private async Task GetConfig()
            {
                var data = await _client.GetStringAsync($"{_url}/api/config");
                _config = JObject.Parse(data);
                _client.DefaultRequestHeaders.Add("User-Agent", USER_AGENT);
            }
    
            public async Task Connect()
            {
                await GetConfig();
                Console.WriteLine("begin post login data");
                var connectResult = await _client.PostAsync($"{_url}/login", GetConnectContent());
                var body = await connectResult.Content.ReadAsStringAsync();
                
                if (!connectResult.IsSuccessStatusCode)
                {
                    Console.WriteLine($"Login failed: {body}");
                    throw new Exception($"Login failed: {body}");
                }
            }
    
            private HttpContent GetConnectContent()
            {
                var result = new FormUrlEncodedContent(new Dictionary<string, string>
                    {
                        { "username", USERNAME },
                        { "password", PASSWORD },
                        { "noscript", "false"}
                    }
                );
                result.Headers.Add("x-csrf-token", (string)_config["csrf_token"]);
                return result;
            }
    
            public async Task ConnectWS()
            {
                if (_ws != null)
                {
                    return;
                }
                
                var socketInfo = await _client.GetStringAsync($"{_url}/socket.io/?EIO=3&transport=polling&t=NZcTp-d");
                var socketJson = socketInfo[socketInfo.IndexOf('{')..];
                var sid = (string)JObject.Parse(socketJson)["sid"];
                /*
                var nextInfo = await _client.GetStringAsync($"{_url}/socket.io/?EIO=3&transport=polling&t=NZcn2H7&sid={sid}");
                var infos = nextInfo.Split('[', ']');
                var roomUrl = $"{_url}/socket.io/?EIO=3&transport=polling&t=NZcn2Ia&sid={sid}";
                var roomPost = await _client.PostAsync(roomUrl, new StringContent("49:420[\"meta.rooms.enter\",{\"enter\":\"unread_topics\"}]"));
                var roomPostResponse = await roomPost.Content.ReadAsStringAsync();
                var roomGet = await _client.GetStringAsync(roomUrl);
                */
    
                var client = new SocketIoClient();
                var options = client.EngineIoClient.WebSocketClient.Socket.Options;
                options.Cookies = _cookieJar;
                options.SetRequestHeader("User-Agent", USER_AGENT);
                options.SetRequestHeader("Origin", _url);
                options.SetRequestHeader("Cache-Control", "no-cache");
                options.SetRequestHeader("Pragma", "no-cache");
    
                client.Connected += (sender, args) => Console.WriteLine($"Connected: {args.Namespace}");
                client.Disconnected += OnDisconnect;
                client.EventReceived += (sender, args) => Console.WriteLine($"EventReceived: Namespace: {args.Namespace}, Value: {args.Value}, IsHandled: {args.IsHandled}");
                client.HandledEventReceived += (sender, args) => Console.WriteLine($"HandledEventReceived: Namespace: {args.Namespace}, Value: {args.Value}");
                client.UnhandledEventReceived += (sender, args) => Console.WriteLine($"UnhandledEventReceived: Namespace: {args.Namespace}, Value: {args.Value}");
                client.ErrorReceived += (sender, args) => Console.WriteLine($"ErrorReceived: Namespace: {args.Namespace}, Value: {args.Value}");
                client.ExceptionOccurred += (sender, args) => Console.WriteLine($"ExceptionOccurred: {args.Value}");
                client.On<ChatMessage>("event:chats.receive", (message) => Console.WriteLine(message));
                await client.ConnectAsync(new Uri(_wsUrl + sid));
                _ws = client;
            }
    
            private void OnDisconnect(object sender, WebSocketCloseEventArgs args)
            {
                Console.WriteLine($"Disconnected. Reason: {args.Reason}, Status: {args.Status:G}");
            }
        }
    
        class ChatMessage
        {
            public int fromUid;
            public ChatMessageData message;
            public string roomId;
            public int self;
            public string[] uids;
    
            public override string ToString()
            {
                return $"Chat message: {message}";
            }
        }
    
        public class ChatMessageData
        {
            public string content;
            public string cleanedContent;
            public ChatFromUser fromUser;
            public override string ToString()
            {
                return $"from {fromUser.username}: {content}";
            }
        }
    
        public class ChatFromUser
        {
            public int uid;
            public string username;
        }
    }
    

    Program:

    using System;
    using System.Threading.Tasks;
    
    namespace NodeBot
    {
        class Program
        {
            static async Task Main(string[] args)
            {
                var bot = new Bot();
                await bot.Connect();
                await bot.ConnectWS();
                while (true)
                {
                    await Task.Delay(10000);
                }
            }
        }
    }
    

    NuGet dependencies:
    Newtonsoft.Json
    H.Socket.IO

    Built in .NET 5.

    Any idea what's going wrong and how to fix it?

  • Global Moderator Plugin & Theme Dev

    @masonwheeler I think your client socket needs to enter rooms for the chat and stuff


  • @pitaj OK, how do I do that?

    You see the commented out requests about halfway down the Bot code? Those are parroting what the webpage does when loading the page. If I include that, it doesn't change anything; I just get a websocket that disconnects immediately.

  • Global Moderator Plugin & Theme Dev


  • @pitaj Yes, I mentioned and linked to that file in my original post. I've been over it painstakingly.

    Near as I can tell, this function only even runs at all if I connect the socket without first retrieving a sid via HTTP, and when that happens, I think I'm being joined as a guest rather than as my logged-in user, because I receive new post notifications but nothing that's relevant to the user. (I assume that the point of line 91 is to subscribe this socket to user-specific notifications, such as upvotes, mentions, chat, etc.)

    When I retrieve a sid first and try to use it, as the web client does, connecting to the websocket results in immediate disconnection.

  • Global Moderator Plugin & Theme Dev

    My bad, I thought the one I linked was a client-side file (distinct from the one you linked before). I'm on my phone.

    What have you tried on the NodeBB side to debug why you're being disconnected?


  • @pitaj said in Unable to connect a bot to the websocket:

    My bad, I thought the one I linked was a client-side file (distinct from the one you linked before). I'm on my phone.

    What have you tried on the NodeBB side to debug why you're being disconnected?

    Nothing; I'm not running a NodeBB instance and (no offense intended) I'd really prefer for my dev machine to stay Node-free. I've got enough stuff taking up my hard drive space as it is. This is why I need help from people who actually know NodeBB from the server side.


  • ...anybody?

  • Global Moderator Plugin & Theme Dev

    One thing you might need to check is the Origin header. We have socket.io configured to reject connections that don't have an origin equivalent to the forum URL.
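
    As an illustration of what "equivalent to the forum URL" could mean, here is a small sketch that compares two URLs by scheme, host, and port. It is an assumption about the general idea, not NodeBB's or socket.io's actual check, and the function name is hypothetical.

```javascript
// Compare a request's Origin header with the forum URL by scheme, host, and port.
// This only illustrates the idea; it is not NodeBB's actual check.
function originMatches(originHeader, forumUrl) {
  try {
    return new URL(originHeader).origin === new URL(forumUrl).origin;
  } catch (err) {
    return false; // missing or malformed Origin header
  }
}
```

    Under this comparison a trailing slash or an explicit default port makes no difference, but http versus https does.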

    Ah, never mind, you're already doing that.

    Given that you're using a socket.io library, I doubt you need to send the /socket.io requests like you are. You probably just need to emulate what we do here:

    https://github.com/NodeBB/NodeBB/blob/master/public/src/sockets.js


  • @pitaj Those calls are just to get it to do the same thing the webpage is doing. The nature of HTTP being what it is, sending the same requests should give the same results. But somehow it isn't, which means I'm missing something somewhere, and I have no idea what!

    Is there any command I can send to have socket.io tell me a list of all rooms I'm subscribed to? That way I could at least debug and test my theory of whether I'm getting connected as a user or as a guest.

  • Global Moderator Plugin & Theme Dev

    I think you're mistaken. This is the relevant portion of our client socket connection code:

    var ioParams = {
    	reconnectionAttempts: config.maxReconnectionAttempts,
    	reconnectionDelay: config.reconnectionDelay,
    	transports: config.socketioTransports,
    	path: config.relative_path + '/socket.io',
    };
    
    socket = io(config.websocketAddress, ioParams);
    

    Here are the values for those config variables for a logged in user on this forum:

    config.maxReconnectionAttempts = 5
    config.reconnectionDelay = 200
    config.socketioTransports = [ "polling", "websocket" ]
    config.relative_path = ""
    config.websocketAddress = ""
    

    So with everything written out, and converting relative to absolute paths, this is the socket setup to connect

    socket = io("https://community.nodebb.org", {
      reconnectionAttempts: 5,
      reconnectionDelay: 200,
      transports: [ "polling", "websocket" ],
      path: "/socket.io",
    });
    socket.connect();
    

    We don't do any requests or anything like that, the socket.io client does all of that.

    Given you're also using a client library, you shouldn't have to do much more than provide it the same options. I think you will need to provide the same cookie jar as you use to log in, but I think your socket.io client should do the rest.


  • Yes, that's how it looks from JavaScript. I'm using a C# client that has a completely different set of configuration options. But configuration options aren't particularly relevant to this issue; they're just an implementation detail.

    Socket.io isn't magic; it's just a protocol implemented on top of HTTP and WS. The nature of network communication being what it is, the server has no way of knowing that there's "a real socket.io client, configured properly" on the other end of the line. All it knows is that the messages it receives either do or don't conform to the protocol it knows how to speak.

    To the best of my knowledge, the messages I am sending are identical to the ones the webpage is sending. The server should be responding to them in an identical manner. The server is not responding to them in an identical manner; this means that there is some detail I'm missing.

    This is the issue I am asking for help with. Once I can get that worked out at the messaging level, then I can look at the higher-level abstractions and see why the client library I'm using isn't automatically doing what you think it should be doing. But until I get the messaging level worked out, messing around at higher levels of abstraction is useless.

  • Global Moderator Plugin & Theme Dev

    I think I see the disconnect. I'm essentially telling you that doing this:

    If I send one or more GET requests to the socket.io URL first, mimicking what the webpage does, and then try to connect the websocket using the retrieved sid, it's even worse: the connect attempt sits around waiting for approximately 8 seconds, then immediately disconnects. Fiddler shows it was sent a Close from the server. No Socket.io events are received from the server.

    Is counter-productive because your socket.io client library (H.Socket.IO) should do this for you. Essentially, the requests are not identical because you are sending extra requests.

    Have you inspected your outgoing requests and compared them to the browser requests?

    Given you have something that works at least partially:

    If I simply point a socket.io client at the socket, I get messages for sid, checkSession, and setHostname, but then it just sits there. I receive new post notifications, but no messages if I send a DM from my main account to the bot or do other things that the bot's user should be personally notified of.

    I'd suggest not continuing down the line of sending requests manually. Instead let me look for other socket things you need to send. Most of it is from the client source file.

    When you get the checkSession message, what is the data associated with it? The uid of the user you're logged in as should be sent as the data payload. If you're logged in as @masonwheeler that should be 21756.
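
    A minimal sketch of pulling that payload out of a raw websocket text frame, assuming Socket.IO v2 framing (a leading 4 for an Engine.IO message and 2 for a Socket.IO event, followed by a JSON array). The function names are hypothetical, and treating a uid of 0 as "guest" is an assumption.

```javascript
// Parse a Socket.IO v2 text frame carrying an event, e.g. '42["checkSession",21756]'.
// Returns null for anything else (pings, handshakes, binary events).
function parseEventFrame(frame) {
  // 4 = Engine.IO "message", 2 = Socket.IO "event"; then optional namespace and ack id.
  const match = /^42(?:\/[^,]*,)?(\d*)(\[[\s\S]*\])$/.exec(frame);
  if (!match) return null;
  const [event, ...args] = JSON.parse(match[2]);
  return { event, args, ackId: match[1] === '' ? null : Number(match[1]) };
}

// Assumption: a socket is tied to a logged-in user when checkSession
// carries a positive uid, and guests are reported as 0.
function isLoggedInSession(frame) {
  const parsed = parseEventFrame(frame);
  return parsed !== null && parsed.event === 'checkSession' && Number(parsed.args[0]) > 0;
}
```

    Logging the parsed uid next to the bot's expected uid is a quick way to confirm which user the server associated with the socket.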

    On socket connection we have

    reJoinCurrentRoom();
    

    which calls

    app.enterRoom(room);
    

    which results in

    socket.emit('meta.rooms.enter', {
      enter: room,
    });
    

    If your socket is associated with a user, it will trigger this code which will add your user to that room. If room = "topic_15574" for instance you will be subscribed to edits and new posts in this topic.
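
    As a sketch of what that emit looks like on the wire, assuming Socket.IO v2 text framing in the default namespace: the helper names are hypothetical. With an ack id of 0 and the polling length prefix, the result is the same 49:420[...] string that appears in the commented-out polling request in the original post.

```javascript
// Build the text frame for socket.emit(name, ...args) in the default namespace.
// ackId is the optional callback id the client appends when it expects a reply.
function encodeEventFrame(name, args = [], ackId = null) {
  const id = ackId === null ? '' : String(ackId);
  return '42' + id + JSON.stringify([name, ...args]);
}

// Over HTTP long-polling (Engine.IO v3), each packet is prefixed with its length.
function toPollingBody(frame) {
  return `${frame.length}:${frame}`;
}

const enterRoom = encodeEventFrame('meta.rooms.enter', [{ enter: 'unread_topics' }], 0);
const pollingBody = toPollingBody(enterRoom);
```

    Over a websocket the frame is sent as-is, without the length prefix.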

    We also call

    socket.emit('meta.reconnected');
    

    If your socket is associated with a user, it will trigger this code and send a couple of messages including the unread topics count and unread notifications count.


  • @pitaj said in Unable to connect a bot to the websocket:

    I think I see the disconnect. I'm essentially telling you that doing this:

    If I send one or more GET requests to the socket.io URL first, mimicking what the webpage does, and then try to connect the websocket using the retrieved sid, it's even worse: the connect attempt sits around waiting for approximately 8 seconds, then immediately disconnects. Fiddler shows it was sent a Close from the server. No Socket.io events are received from the server.

    Is counter-productive because your socket.io client library (H.Socket.IO) should do this for you. Essentially, the requests are not identical because you are sending extra requests.

    Yes, the applicable word there is "should." The client library I'm using isn't doing that stuff for me. I've tried a handful of different C# libraries for socket.io, and none of them do that stuff.

    Have you inspected your outgoing requests and compared them to the browser requests?

    Yes, as evidenced by my multiple references to using Fiddler to set this up and ensuring that the message traffic is identical on the HTTP level. (Are you familiar with Fiddler?)

    Given you have something that works at least partially:

    If I simply point a socket.io client at the socket, I get messages for sid, checkSession, and setHostname, but then it just sits there. I receive new post notifications, but no messages if I send a DM from my main account to the bot or do other things that the bot's user should be personally notified of.

    I'd suggest not continuing down the line of sending requests manually. Instead let me look for other socket things you need to send. Most of it is from the client source file.

    Yeah, I looked at that file. It says nothing useful whatsoever on the subject of anything that happens before successfully obtaining a WS connection.

    When you get the checkSession message, what is the data associated with it? The uid of the user you're logged in as should be sent as the data payload. If you're logged in as @masonwheeler that should be 21756.

    OK, just checked, and I'm receiving a checkSession message with a nonzero value. (It's not 21756 because it's not this forum I'm trying to connect to. But that's progress at least, right?)

    On socket connection we have

    reJoinCurrentRoom();
    

    which calls

    app.enterRoom(room);
    

    which results in

    socket.emit('meta.rooms.enter', {
      enter: room,
    });
    

    Right. At the moment I'm not trying to enter any URL-based rooms. I'm just trying to get connected to the bot's user notifications, which I think happens on the server side on this line automatically as part of the connection process. (Is that correct? If not, where do user notifications come from?) It's in a conditional branch, though, which makes me suspect that I'm somehow ending up in the wrong branch and getting logged in on the server side as a guest rather than a user.

    If your socket is associated with a user, it will trigger this code which will add your user to that room. If room = "topic_15574" for instance you will be subscribed to edits and new posts in this topic.

    We also call

    socket.emit('meta.reconnected');
    

    If your socket is associated with a user, it will trigger this code and send a couple of messages including the unread topics count and unread notifications count.

  • Global Moderator Plugin & Theme Dev

    The client library I'm using isn't doing that stuff for me

    I don't understand how you can say this when, as evidenced by your own words, you are being connected as expected when you don't do stuff yourself, but attempting to do it yourself messes things up.

    Are you familiar with Fiddler?

    Never heard of it.

    It says nothing useful whatsoever on the subject of anything that happens before successfully obtaining a WS connection.

    Because the client library handles all of that, just as yours does apparently. I think you may not be seeing an http request like from the browser because your client doesn't support http polling, only websockets, so it has no need to use the polling endpoint.

    Right. At the moment I'm not trying to enter any URL-based rooms.

    I know. I was just walking through exactly what we do on connection. Ignore the room entry for now.

    OK, just checked, and I'm receiving a checkSession message with a nonzero value.

    It's in a conditional branch, though, which makes me suspect that I'm somehow ending up in the wrong branch and getting logged in on the server side as a guest rather than a user.

    The condition socket.uid is exactly what is sent in the checkSession message payload. So I'm pretty confident your socket has a user associated with it and you're in the uid_nnn room.


  • @pitaj said in Unable to connect a bot to the websocket:

    The client library I'm using isn't doing that stuff for me

    I don't understand how you can say this when, as evidenced by your own words, you are being connected as expected when you don't do stuff yourself, but attempting to do it yourself messes things up.

    Because I can see the messages that are being passed, and the client library is not passing them. It's "successfully" connecting me to some broken mode that is not receiving user notifications, and it's never sending the handshake messages that the web client is sending.

    Are you familiar with Fiddler?

    Never heard of it.

    🤦 No wonder you don't understand the stuff I've been saying!

    Fiddler is an HTTP debugging tool that proxies your computer and lets you watch all the traffic. If you know what Wireshark is, it's like that, only about 100x more intuitive and user friendly.

    You know how Chrome and Firefox have a network debugger in their dev tools that shows you all the requests your page is making? Fiddler lets you monitor that for the entire computer. It can proxy its way into SSL connections, and read WS traffic. So just trust me on this: when I describe which messages are and are not being sent, I know exactly what I'm talking about.

    It says nothing useful whatsoever on the subject of anything that happens before successfully obtaining a WS connection.

    Because the client library handles all of that, just as yours does apparently. I think you may not be seeing an http request like from the browser because your client doesn't support http polling, only websockets, so it has no need to use the polling endpoint.

    Hmm... that's possible. A look at the engine.io protocol makes it appear that the two are more or less interchangeable.

    OK, just checked, and I'm receiving a checkSession message with a nonzero value.

    It's in a conditional branch, though, which makes me suspect that I'm somehow ending up in the wrong branch and getting logged in on the server side as a guest rather than a user.

    The condition socket.uid is exactly what is sent in the checkSession message payload. So I'm pretty confident your socket has a user associated with it and you're in the uid_nnn room.

    All right. So much for that theory. So then why am I not getting chat and other user-specific messages?

  • Global Moderator Plugin & Theme Dev

    🤦 No wonder you don't understand the stuff I've been saying!

    Fiddler is an HTTP debugging tool that proxies your computer and lets you watch all the traffic. If you know what Wireshark is, it's like that, only about 100x more intuitive and user friendly.

    You know how Chrome and Firefox have a network debugger in their dev tools that shows you all the requests your page is making? Fiddler lets you monitor that for the entire computer. It can proxy its way into SSL connections, and read WS traffic. So just trust me on this: when I describe which messages are and are not being sent, I know exactly what I'm talking about.

    Yes I know that class of tools, just never heard of Fiddler before. Never personally used them but I understand what you're doing now.

    Because I can see the messages that are being passed, and the client library is not passing them. It's "successfully" connecting me to some broken mode that is not receiving user notifications, and it's never sending the handshake messages that the web client is sending.

    Hmm... that's possible. A look at the engine.io protocol makes it appear that the two are more or less interchangeable.

    When I set config.socketioTransports = ["websocket"], I only see wss:// requests, and all of the https://...polling... requests are gone in the browser.
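
    To make the difference between the two transports concrete, here is a sketch of the endpoint URLs involved, assuming Engine.IO v3 and an empty relative_path as in the config values listed earlier. The function name and the forum.example host are hypothetical.

```javascript
// Build the Engine.IO v3 endpoint URL for a given transport.
// 'polling' stays on http(s); 'websocket' switches the scheme to ws(s).
function engineIoUrl(forumUrl, transport, sid) {
  const base = transport === 'websocket' ? forumUrl.replace(/^http/, 'ws') : forumUrl;
  const url = new URL('/socket.io/', base);
  url.searchParams.set('EIO', '3');
  url.searchParams.set('transport', transport);
  // A sid is only present when continuing or upgrading an existing session;
  // a websocket-only client omits it and receives its sid over the websocket.
  if (sid) url.searchParams.set('sid', sid);
  return url.href;
}

const pollingUrl = engineIoUrl('https://forum.example', 'polling');
const websocketUrl = engineIoUrl('https://forum.example', 'websocket');
const upgradeUrl = engineIoUrl('https://forum.example', 'websocket', 'abc123');
```

    A client that only supports the websocket transport would open something like websocketUrl directly, which is consistent with seeing no polling requests in the traffic.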

    All right. So much for that theory. So then why am I not getting chat and other user-specific messages?

    Good question. It looks like you're correct: that room is used for messaging and for notifications.

    It's going to be quite difficult to debug this without a NodeBB instance. Placing logs in a few places would be very helpful.

    One thing you could do is double-check that the uid in the checkSession payload actually equals your bot's uid. Just log in as your bot, open the browser devtools, and evaluate app.user.uid in the JS console. They should match.

  • Global Moderator Plugin & Theme Dev

    Another thing you can try is checking the online users page (https://community.nodebb.org/users?section=online). If your bot is connected as a known user, it should show up there as not a guest.


  • @pitaj OK, I have no idea what's going on, but after I verified all of those things, without actually changing the code that wasn't working before... suddenly I'm getting user messages! 😕 😕 😕

    So... yeah, it's working now. Thanks for the help!
