New Data Not Found When Connecting to Mongo With ReplicaSet and set to ReadSecondary preference.

<baris>

You are correct: if you update a value in the database manually via the CLI and that value is cached, the change won't be reflected until the cache is cleared or the value falls out of the cache.

How many NodeBB processes are running? If you have more than one, use Redis so that when a value in the database changes, the cache is cleared on all NodeBB processes.

Seek-AndyAng

Hi,

We deploy two NodeBB applications: one for the Web, which has only a single instance, and another for the API, which has up to 5 instances at most. When we performed the testing, we only tested on the Web, which is a single instance. However, since we have DB replicas, it writes only to the Primary and reads from the Secondary.

When we write to and read from the Primary only, everything works fine. But when we write to the Primary and read only from the Secondary, we notice the new data is not fetched by NodeBB, even though the data was synced successfully to the Secondary (which is similar to updating the database manually via the CLI). So we're wondering: how does NodeBB know when to clear the cache, and how can we trigger a refresh?

As of our testing yesterday, we didn't notice the cache being cleared. How can we check or configure this?

If we did not use Redis, and had only 1 database (mongo) in config.json, would there still be a cache?

PitaJ

@Seek-AndyAng NodeBB uses Redis pubsub to tell all of the other processes when to evict from the cache. Because of that, Redis is necessary when running multiple processes.

What is your multi-process setup in config.json?

<baris>

@Seek-AndyAng said in New Data Not Found When Connecting to Mongo With ReplicaSet and set to ReadSecondary preference.:

If we did not use Redis, and had only 1 database (mongo) in config.json, would there still be a cache?

If you are only using a single NodeBB instance, you don't need Redis pub/sub, because there are no other instances whose caches need clearing.

Each NodeBB process has its own local in-memory cache, so when it loads some data from the MongoDB database, it stores that data in the cache. If another NodeBB process changes that data, it uses Redis pub/sub to tell everyone else to clear that data from their in-memory caches. That's why, if you are using more than one NodeBB process, you also have to set up Redis so the cache is properly cleared on all instances.

If you manually change data outside of NodeBB, for example through the MongoDB CLI, you need to go into the NodeBB ACP and manually clear the cache.
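
The caching behaviour described above can be illustrated with a minimal sketch. This is not NodeBB's actual code: an in-process `EventEmitter` stands in for Redis pub/sub, a `Map` stands in for MongoDB, and `createProcess`, the `cache:del` channel and the `topic:1` key are hypothetical names chosen for the example.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis pub/sub: every "process" subscribes to the same bus.
// (Assumption for illustration; a real deployment would use Redis.)
const bus = new EventEmitter();

// Stand-in for the MongoDB database shared by all processes.
const database = new Map([['topic:1', 'original title']]);

function createProcess(name) {
  const localCache = new Map();

  // When any process announces a change, drop that key from the local cache.
  bus.on('cache:del', (key) => localCache.delete(key));

  return {
    name,
    get(key) {
      if (!localCache.has(key)) {
        // Cache miss: load from the database and remember the value.
        localCache.set(key, database.get(key));
      }
      return localCache.get(key);
    },
    set(key, value) {
      database.set(key, value);
      // Tell every process (including this one) to evict the key.
      bus.emit('cache:del', key);
    },
  };
}

const a = createProcess('a');
const b = createProcess('b');

b.get('topic:1');                   // b caches 'original title'
a.set('topic:1', 'edited title');   // a writes and publishes the eviction
const viaNodeBB = b.get('topic:1'); // b reloads the new value

// A manual change bypasses set(), so no eviction is published.
database.set('topic:1', 'changed via CLI');
const afterManualChange = b.get('topic:1'); // b still serves its cached value

console.log(viaNodeBB, '|', afterManualChange);
```

In this sketch, process `b` sees the edit made through process `a`, but keeps serving its cached value after the manual change, which is why the cache has to be cleared by hand in that case.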

<baris>

Another thing you can look into is the MongoDB writeConcern option: https://www.mongodb.com/docs/manual/reference/connection-string/#write-concern-options. It changes the behaviour of writes so that the acknowledgement only comes back once the write has reached the secondaries. I think the default is just to return once it is written to the primary.
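
As a rough sketch of what that could look like, the snippet below assembles a connection string with the `w` write-concern option. The host names, the `nodebb` database name, the `rs0` replica set name and the `wtimeoutMS` value are assumptions for illustration only; `w` and `wtimeoutMS` are the option names listed in the MongoDB connection string documentation linked above.

```javascript
// Hypothetical hosts, for illustration only.
const hosts = ['mongo1.example:27017', 'mongo2.example:27017'];

const options = new URLSearchParams({
  replicaSet: 'rs0',
  readPreference: 'secondaryPreferred',
  // Request acknowledgement from as many members as are listed above.
  w: String(hosts.length),
  // Assumed value: give up waiting for acknowledgement after 5 seconds.
  wtimeoutMS: '5000',
});

// Template literal calls options.toString(), producing "key=value&key=value".
const uri = `mongodb://${hosts.join(',')}/nodebb?${options}`;

console.log(uri);
```

Note that tying `w` to the number of members means the value has to be kept in sync with the replica set size, and a write waits (or times out) if one member is unavailable.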

julian

Are you dead set on sharding databases? It usually is one of the last things you do to increase throughput, but often times it is not necessary. Horizontally scaling your app servers is usually more than sufficient.

FWIW we've scaled out to four servers running two processes each, and the database has never been the bottleneck.

Seek-AndyAng

Hi @julian
Sorry, we're currently applying only replication techniques for horizontal scaling. The reason we needed replication is that we're hitting a cursor limit issue. On the code implementation side, there are definitely things we can look into and improve, but that will take time and effort; in the meantime, we're trying to implement replication as well.

Seek-AndyAng

Hi @baris ,

Thanks for the detailed explanation of the caching behaviour! That definitely helps and clears up a lot of our doubts.

Thanks also for the writeConcern suggestion. We would not have thought of it, because we had been searching high and low within NodeBB itself. It seems that it should solve the problem; if so, we'll probably need to look into the delays.

Seek-AndyAng

Hi @baris ,

After some more testing, we found that the problem only happens when we reply to a topic (so far). We have no problems flagging, upvoting, creating a new user, or even creating a topic, although creating a topic often results in the main post not showing up. When we reply to a topic, most of the time it causes an error because the post can't be found, and since that result is cached, it keeps failing when we try to access the topic. The reply was inserted into the database, but the error occurs because the post was not found (the read goes to the secondary).

Upon further tracing, we found the problem seems to be happening in one particular method, "topicsAPI.reply", under "src/api/topics.js".
In this method, there is a call to "posts.getPostSummaryByPids", and since we have a replica and set the read preference to secondary, we notice that most of the time it returns a structure that contains no data (e.g. the pid is 0, the rawContent is null). During our testing, it sometimes returns correctly and works fine, but most of the time it has this issue. We suspect the post has not yet been replicated to the secondary, which causes the issue; occasionally it works, presumably because the data had already been replicated to the secondary by the time of the read.
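
The suspected race can be illustrated with a toy model. This is only a sketch under the assumption that replication to the secondary is asynchronous; it involves no MongoDB at all, and `createReplicaSet`, `replicate` and the `post:42` key are hypothetical names.

```javascript
// Toy replica set: writes land on the primary immediately and reach the
// secondary only when replicate() runs (standing in for replication lag).
function createReplicaSet() {
  const primary = new Map();
  const secondary = new Map();
  const pending = []; // writes not yet applied on the secondary

  return {
    write(key, value) {
      primary.set(key, value);
      pending.push([key, value]);
    },
    replicate() {
      // Apply and clear all pending writes.
      for (const [key, value] of pending.splice(0)) {
        secondary.set(key, value);
      }
    },
    read(key, from) {
      const node = from === 'primary' ? primary : secondary;
      return node.has(key) ? node.get(key) : null;
    },
  };
}

const rs = createReplicaSet();
rs.write('post:42', { pid: 42, content: 'hello' });

const tooEarly = rs.read('post:42', 'secondary');  // read before replication
const fromPrimary = rs.read('post:42', 'primary'); // primary already has it
rs.replicate();
const later = rs.read('post:42', 'secondary');     // read after replication

console.log(tooEarly, fromPrimary, later);
```

A read that follows the write within the same request behaves like `tooEarly` whenever the secondary has not caught up, and like `later` when it has, which would match the intermittent failures described above.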

The writeConcern is not solving the problem, even if we set it to the maximum number of instances we have. We're only testing with 1 replica instance (1 Primary and 1 Secondary). We're now looking into specific cursor options, for example reading from the Primary in specific cases, or as a fallback when data is not found on the Secondary, but that would mean changing the shared modules (e.g. hash.js under the mongo folder).

At the moment, I'm not sure how transactions are configured or handled, but I read that operations within the same transaction should guarantee "read your own writes". Or could we set the transaction options to read from the primary?

Seek-AndyAng

Hello,

We also found problems with API Access Tokens when using the DB Replica. When we set a new token and press save, most of the time the token is not shown in the UI, but it is there in the database. This is also due to the DB Replica: when we save, the follow-up read goes to the secondary because we set the readPreference mode to secondary.

To solve this, we had to modify the "sorted.js" file under the mongo folder to add a fallback: when the find returns no data, we trigger another find with the cursor option modified to read from the Primary. We also modified "sets.js" to have the same behaviour when performing a collection find.

So far, most of the features we tested work. At the moment, we're fixing the problem selectively by modifying the files under the mongo folder as we detect issues, which doesn't seem like an appropriate solution.
Is there any take on this? How could we configure it so that, by default, reads fall back to the Primary when no data is found on the Secondary? Alternatively, could all operations within the same request use the same cursor option? In that case, whenever we perform an insert, update or delete, it would be done on the primary, and therefore any query during the same request would also use the cursor option that reads from the primary.
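
For reference, the fallback we describe looks roughly like the sketch below. It is a simplified illustration, not the actual change to "sorted.js" or "sets.js": `findWithPrimaryFallback` and `createStubCollection` are hypothetical names, the `_key` field is used only as an example filter, and the stub exists so the snippet runs without a database. With the MongoDB Node.js driver, `collection.find(filter, options)` accepts a per-operation `readPreference` option.

```javascript
// Try the read with the connection's default read preference first;
// if nothing comes back, retry the same query against the primary.
async function findWithPrimaryFallback(collection, query) {
  const docs = await collection.find(query).toArray();
  if (docs.length > 0) {
    return docs;
  }
  // Empty result: the write may not have replicated yet, so ask the primary.
  return collection.find(query, { readPreference: 'primary' }).toArray();
}

// Hypothetical stub that mimics a lagging replica set: reads with
// readPreference 'primary' see primaryDocs, all other reads see secondaryDocs.
function createStubCollection(primaryDocs, secondaryDocs) {
  return {
    find(query, options = {}) {
      const source = options.readPreference === 'primary' ? primaryDocs : secondaryDocs;
      return {
        toArray: async () => source.filter((doc) => doc._key === query._key),
      };
    },
  };
}

// The secondary has not received the document yet.
const stub = createStubCollection([{ _key: 'post:42', content: 'hello' }], []);
const result = findWithPrimaryFallback(stub, { _key: 'post:42' });

result.then((docs) => console.log(docs));
```

The limits of this approach are visible in the sketch: every query for a document that genuinely does not exist costs a second round trip to the primary, and a stale but non-empty result from the secondary is never detected.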

<baris>

How did you set the write concern? Can you share your config.json and how you use the MongoDB connection string? This sounds like it should be solved by using w=<number of mongod instances> in the connection string. Are all NodeBB app servers using a proper config.json with the same settings for MongoDB and Redis?

Seek-AndyAng

@baris said in New Data Not Found When Connecting to Mongo With ReplicaSet and set to ReadSecondary preference.:

How did you set the write concern? Can you share your config.json and how you use the MongoDB connection string? This sounds like it should be solved by using w=<number of mongod instances> in the connection string. Are all NodeBB app servers using a proper config.json with the same settings for MongoDB and Redis?

Hi @baris ,

We've tried setting w=2; since we're testing, we only set up 1 replica instance. But when we tested, it seemed to have the same problem (we tested only once; since the problem persisted, we stopped testing). For the UI, we only have 1 instance running, since it's used by only a very small number of users (we also have the API instance for consumers). In our testing, we're only testing on the UI, so Redis shouldn't be the problem, and I was told that they had noticed this problem in the early setup, before Redis was implemented. And yes, all NodeBB instances are using the same settings.

In the config.json URI, it's just very simple now:
`?replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false`
We have tried w=majority and w=2; so far it makes no difference. w=majority seems to be the default configuration. We also tried setting the read concern to majority but didn't notice any difference.

Is there any use case or example of NodeBB being used with a DB replica? We're wondering how the configuration was set up to avoid this issue. The issue happens even with a single instance, since the same request (the user presses save or replies to a topic) performs both a write and a read. If the read is set to secondary preferred, the rate at which the issue occurs seems logical, since the data might not yet have been replicated to the secondary.
In the meantime, I'll try to set up and test again locally with the specific 'w=<number of instance>'. However, this might still not be a viable solution, because if we set up auto-scaling or need to scale to additional instances, we will face the problem again unless we update the config.json.

Seek-AndyAng

Hi @baris ,

Just to provide some updates:
I tested w=<number of mongod instance> locally and it works. However, in our real environment, we're using Amazon DocumentDB, which unfortunately doesn't support a user-defined write concern and ignores the configuration; it always uses the default MAJORITY.

Based on the suggestion in the DocumentDB documentation, I guess we should use readPreference=primary in this case to ensure read-after-write results. I'm assuming DocumentDB will manage it so that reads after a write prioritize the primary, while at the same time routing other reads to the secondary to avoid overloading the primary. Just curious why we need to configure this instead of it being the default behaviour.
https://docs.aws.amazon.com/documentdb/latest/developerguide/how-it-works.html#how-it-works.replication