Upgrade "Refresh post-upload associations" in 1.9.x causes error
-
Ok, that is fine to fix it temporarily. Also it doesn't have a very high priority. This is only a test install the actual one is already on 1.9.2. I just was wondering what was going on here to get a better understanding of NodeBB and especially MongoDB.
If you search for that error on the internet you find a lot of people saying that this is usually a timeout issue. A large query was executed without batching and thus the cursor timed out while being still used. This alone feels really strange to me as this seems like questionable design by MongoDB. But watching at the upgrade code this is actually batched in units of 100. So I don't see why this would cause any problems.
I also checked MongoDB memory status and it is not running out of it. Also NodeBB still has plenty of memory available. Sure the server is under heavy load because of this process but that shouldn't cause any problems? So what would be the most probable cause for this? Do I have a misconfiguration in the MongoDB?
-
@dravere Can you try to change the upgrade script from
async.each
to useasync.eachSeries
let me know if that fixes the issue. You can also try appling this commit as well https://github.com/NodeBB/NodeBB/commit/cba5aa975ea20b509c1766f3166d048103001afeNot sure why you are getting that error the upgrade script looks OK to me.
-
@baris Sure, sadly it didn't fix the problem.
Change Results async.eachSeries
Is much slower, but threw the same error after about 2 minutes at around 4%. async.eachSeries
& patchStill slower but faster than previous one. Threw the same error after about 2 minutes at around 5%. async.each
& patchHuge performance improvement. Would probably finish in 25 instead of 50 minutes. But still threw the error after about 3 minutes at 14% Edit: Will have to move this test environment tomorrow. I'll report back if I still can reproduce the error in the new test environment.
-
I can't send you the data. I think I would break several laws if I would do that at this point
Also I don't know how important that work is. I mean my production database is already on 1.9.2 since I migrated directly to that version from the old forum (phpbb2). So this really doesn't affect me at all. I just wanted to report it, perhaps try to understand it and learn something from it. Especially since my knowledge of MongoDB is still quite limited.
This error is happening on an earlier test migration to 1.8.2 that I have still around. So in theory I could just throw it away. Especially if you have no other reports with this problem.
I'll gladly try it with smaller batch sizes. But I kind of doubt that this will make a lot of difference. I have a batch process in my modified solr plugin but with a batch size of 1000 and it is working fine there (Yes, I'll work on the pull requests for solr, I know they are still open ).
-
@dravere can you make a change for me and run the upgrade script again?
Add
console.log('uploads', pid, uploads);
hereThe error is caused by a timeout and the only place I can think of halting execution is that while loop. The timeout seems to be 10minutes and if
cursor.next()
is not called for 10minutes you get that cursor not found error. -
Commenting out the fast path (these three lines) seems to be working so far (although extremely slowly).
My instance would previously get killed at around 15% (there were a lot of posts from two forum softwares ago when pasting files into a forum wasn't a thing), but now it's at 90%. It's been about 4 hours since I started this attempt.
-
@baris @Ben-Lubar @julian Found the problem. The solution Ben describes works, but it is painfully slow. And also made no sense. So I continued searching and found it.
It is not a problem in NodeBB. It seems to be a problem in the mongodb package. Downgrading it to version 3.0.4 and then only executing
./nodebb upgrade -s
so that the mongodb package isn't upgraded back to 3.0.8 makes the migration work. And it makes it work really fast. It took like 5 minutes.Found it via this issue in mongoose: https://github.com/Automattic/mongoose/issues/6504
And the issue going along in JIRA: https://jira.mongodb.org/browse/NODE-1482If someone has an account in this JIRA tracker, we perhaps should report there, that we are able to reproduce that bug.