Popular forum exporters
-
BTW, once you've finished with the phpBB export: I found that there's a bit of a bottleneck in the importer's redirect handling, so if you don't care about writing redirects for SEO purposes, you can omit that step and speed things up.
-
I was going to bring up WoltLab BB (I guess it's only popular in Germany), but actually porting to phpBB first sounds like a smart plan.
-
Hey @psychobunny, took your advice: got MySQL installed on Digital Ocean and imported it all that way. I'm having a few issues with the final step of using the importer to actually get the data in. Following the steps, I get this error:
error: NodeBB automated setup didn't go too well.
I already have NodeBB set up. I noticed that import.config.json has a setup script built in; do I have to either a) remove it, or b) fill it out with the contents of NodeBB's config.json?
-
Oh awesome, congrats on getting that far! I think at this point you'll have to ask the creator of the importer, @bentael - I didn't run into that error myself. If the actual import worked but creating the config didn't, then yeah, maybe you could create one yourself and see what happens?
-
@psychobunny Think I got it: I removed `--flush` and it's now going through users.
Once I'm sure I know the process, I'll create a guide on the wiki page for anyone else who needs it.
-
Awesome, hope it works out. I have no idea how long the import process is going to take for something as big as yours.
-
@psychobunny Users went over within about 5-10 minutes, however it's only transferred 5 of my 25,000 posts. So not really sure why that is.
5 threads, and only the OP, no replies.
-
yeah I found that posts take forever, about 500 posts took 3 minutes... so a rough extrapolation would suggest it would take you two and a half hours
Not sure if there are any optimizations that could be made in the importer; I imagine that @bentael has some ideas in mind for the future. For your sake I really hope it works on the first try (he has some "time machine" feature in there that lets you import just the first 100 posts or something, which I really recommend you try in order to test-drive everything).
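The rough extrapolation above is plain linear scaling. As a minimal sketch, assuming import time grows in proportion to the number of posts (which may well not hold on a real run), and with `estimateImportMinutes` being a made-up helper name rather than anything in the importer:

```javascript
// Hypothetical helper: linear extrapolation from a small sample run.
// Assumption: import time scales proportionally with post count.
function estimateImportMinutes(samplePosts, sampleMinutes, totalPosts) {
  return (totalPosts / samplePosts) * sampleMinutes;
}

// Figures from this thread: ~500 posts took ~3 minutes, forum has ~25,000 posts.
const minutes = estimateImportMinutes(500, 3, 25000);
console.log(minutes + ' minutes, i.e. ' + (minutes / 60) + ' hours');
// prints "150 minutes, i.e. 2.5 hours"
```

Treat the result as a ballpark only; redirects, content conversion and server load would all shift it.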
-
@psychobunny The posts are there, but got an error:
Importing Topics ...
[import][warn] [c:1] skipping topic:_tid:"2" --> _cid:valid: true _uid:valid: false
[import][warn] [c:2] skipping topic:_tid:"3" --> _cid:valid: true _uid:valid: false
[import][debug] [c:3] saving topic:_tid: 4
[import][debug] [c:4] saving topic:_tid: 6
[import][debug] [c:5] saving topic:_tid: 7
[import][debug] [c:6] saving topic:_tid: 8
[import][warn] [c:7] skipping topic:_tid:"9" --> _cid:valid: true _uid:valid: false
[import][debug] [c:8] saving topic:_tid: 10
/home/a_5mith/nodebb/src/socket.io/index.js:199
var clients = io.sockets.clients();
^
TypeError: Cannot read property 'sockets' of undefined
Time taken I can live with, my forum is still running, this is a backup from last week, so have the time to check these things.
All the topics are there; it just seems that after 3 warnings it stops. I've even looked through 2 different topics, one with and one without the warning, and they look the same to me from a quick inspection.
-
TypeError: Cannot read property 'sockets' of undefined
hmm... my first guess is that your version of NodeBB isn't compatible with the importer. I know that it's good for 0.4.1 but not sure if anything new has broken it. Try downgrading to 0.4.1 (after the import, the upgrade command should take care of the rest)
What concerns me is the fact that an hour later you're still at tid 10. I wonder what the bottleneck is in this process?
-
@psychobunny Same issue with both 0.4.1 & 0.4.0.
After tid_10 it aborts the process. Hopefully @bentael will be along soon to assist.
-
Mo progress, mo errors. I hadn't commented out `Topics.pushUnreadCount();`.
Anyway, I got through the process and am now receiving this error:
[import][debug] restoring configs
[import][debug] undefined
[import][error] Something went wrong while restoring your nbb configs
[import][warn] here are your backed-up configs, you do it.
[import][warn] undefined
/home/a_5mith/nodebb/node_modules/redis/index.js:516
throw callback_err;
^
Error: Error: ERR wrong number of arguments for 'hmset' command
config.json looks fine, so not sure what's going on.
-
@a_5mith sorry, I wasn't home all day.
So, after reading this thread, I think I have an idea why this issue (in the comment) is happening. Removing `--flush` means the importer will not flush your DB or attempt to run the NodeBB setup for you; this is designed to "pick up where you left off". However, it assumes that you were able to run it successfully at least once in the beginning, so that it could generate `path/to/storage/import.backedConfig`. But it looks like your initial issue is whatever caused this `error: NodeBB automated setup didn't go too well.`
Can you tell me which OS you are using?
Also, say you put `--flush` back again (this will flush your DB): the logger should've printed out a much more detailed error, I hope, see this line.
Could you share that part of your logs?
As for speeding up post imports: like @psychobunny mentioned, you can disable the redirect stuff, but there are still many things that NodeBB does at each record creation. My future plan is to skip the NodeBB modules and hit the DB directly, but that would require mimicking almost everything that the core code does, minus the fancy stuff that's unnecessary for the import process.
-
@bentael Hey pal, thanks for the help thus far over on git. I'm running Ubuntu 12.04.4 on a Digital Ocean VPS. What I've put in this thread is pretty much all I've got. There is a log file inside the bin folder of your plugin, but all it contains is it skipping all the topics (as they're already in the DB) and then the error I displayed at the end. If this is the one you mean, I can drop it in a pastebin for you to look through, but I don't recall seeing any other logs.
On a separate note, the forum is up and running, minus a few posts etc. that I'm not too concerned about, as they're from sub-categories, so I can move the ones I want into the main categories before I do my next export. BUT, how would you go about translating all the code entries? It's a mixture of UTF-8, HTML and BBCode. I know @psychobunny made a plugin that should solve this, but it doesn't seem to do much for me. I noticed in the readme that you include some form of html-ml and bbcode-ml section; would this have to be done prior to import? (I'm going to set up another clean forum tomorrow and have another attempt at importing the database files etc. to make sure everything works before I attempt a proper transfer, but from what I'm looking at so far, it shouldn't be long before I'm using NodeBB long term and can go on the list of "Who's using NodeBB".)
On another separate note, with all these posts the forum seems a little unstable. It seems to have a heart attack if I try to change anything, but `top` shows it as still responding. Could just be the amount of rubbish I've installed alongside NodeBB in my "getting to know Ubuntu" phase. I'll create a new droplet when I've got importing posts down to a tee, as I'm pretty sure I've got things installed that I'm never going to use (Ajenti being one of them).
-
@a_5mith I'd like to see the failed logs if you still have them, or can reproduce the failure again. I am missing the notifications from this thread, so I am going to paste what I said on the GitHub issue.
Let's keep the conversation here, or there, but in one place please.
So... I migrated all of your files smoothly in 10.4 minutes, see the partial logs at the end here.
As for the numbers mismatch, here's what I came up with:
You have 23905 `p.*` files in storage, so that's 23905 posts; you also have 1788 `t.*` files, so that's 1788 topics. However, when you create a TOPIC, NodeBB creates a POST for that TOPIC as the main parent POST, which means adding 1 post for each topic, hence `25693 = 23905 + 1788`, which is shown as `25.7` by the NodeBB stats widget.
This behavior is normal; however, it looks like phpBB doesn't count the "parent" posts in its stats, that's all.
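The reconciliation above can be checked with a few lines. This is only a sketch: the counts come from the thread, while `abbreviate` is a hypothetical stand-in for however the stats widget rounds its number, not the widget's actual code.

```javascript
// Counts taken from the storage files mentioned above.
const postFiles = 23905;  // p.* files
const topicFiles = 1788;  // t.* files

// One "main" post is created per topic, so each topic adds one post.
const totalPosts = postFiles + topicFiles;

// Hypothetical formatter (assumption): abbreviate thousands to one decimal.
function abbreviate(n) {
  return n >= 1000 ? (n / 1000).toFixed(1) + 'k' : String(n);
}

console.log(totalPosts, abbreviate(totalPosts));
```

The total comes to 25693, which rounds to 25.7 thousand, matching the figure reported for the widget.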
As for why you have 23905 post files and not 22.1k, I have no clue; that's pre-importer, that's the exporter phase.
About converting the content:
By default, the importer does not touch the content. However, there is a config setting called 'convert'; you can set it to either `"convert": "bbcode-to-md"` OR `"convert": "html-to-md"` and it will convert accordingly, but keep in mind 2 things:
- it will impact the importer's performance and make it slower than it already is
- the conversion is not perfect; some things might slip through
In your case, since you would need `bbcode-to-md`, and since @psychobunny's plugin didn't do much for you, you can try these 2 things together:
- use `"convert": "bbcode-to-md"`, as my conversion function covers more ground than psychobunny's (no offense), but it still doesn't cover everything...
- go to http://localhost:4567/admin/plugins/markdown and disable the `Sanitize HTML` option. If that solves the HTML sanitization problem, you can use nodebb-plugin-sanitizehtml in production to stay safe. WARNING: I haven't tested this plugin for a long time and it's probably broken, but I will try to get it up to date ASAP.
-
So I used the `bb-to-markdown` config and did a test run on your data, which took 10.9 minutes, then disabled Sanitize HTML, and it looks like the content got much better, but not good enough... looks like my bbToMarkdown function is not that great... it was based off of https://github.com/feralhosting/BBCode-To-Markdown-Converter anyways.
I saw somewhere that you are using phpBB3; could the BBCode by any chance be different in format? I am not a BBCode guy, so I would need someone else's opinion on how to convert that to Markdown.
-
I saw somewhere that you are using phpBB3; could the BBCode by any chance be different in format? I am not a BBCode guy, so I would need someone else's opinion on how to convert that to Markdown.
I wrote this for phpBB2, not sure if it's different for phpBB3. I could probably write another set of regexes for phpBB3 if I knew all the possible cases.
-
@bentael Hey buddy, I will try to find the logs, but I'm not sure where to look for them. I will try to repro today if I can't find them. I remember what I did last time, more or less. (Didn't uncomment that code, for a start.)
Turning off Sanitize HTML did fix a lot of the formatting, but as you suspected, your plugin is out of date for the latest version of NodeBB. Performance isn't a huge issue for me, I can leave it running if needs be, but I do have quite a few custom BBCode strings that I've added myself, which I may tackle before I export from SMF.
I'm cracking on with a new node now, so I'll let you know how I get on with newer data. Few hundred more posts and a bit more knowledge.
I don't think the BBCode changed when exporting from SMF to phpBB3. Fortunately, most of my embeds are SoundCloud-based, and all I need to do with them is completely remove all the `[soundcloud]` BBCode in posts, which I can do from phpMyAdmin before I import it into phpBB, so I'm only left with the URL, which the NodeBB plugin already supports.
-
@a_5mith https://github.com/akhoury/nodebb-plugin-sanitizehtml now supports NodeBB 0.4.x, you're welcome.
There is an "Advanced Option" that lets you write your own JavaScript to mutate the content of each post/signature, say you want to do something custom with each. Enjoy the power.
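As one illustration of the kind of per-post mutation discussed in this thread, here is a minimal sketch of the `[soundcloud]` cleanup, unwrapping the tag so only the URL remains. It is a standalone function, not the plugin's API: the name `stripSoundcloudTags`, the tag shape `[soundcloud]URL[/soundcloud]` (no attributes), and the sample URL are all assumptions, and how the "Advanced Option" hands post content to custom code isn't shown here.

```javascript
// Hypothetical helper: unwrap [soundcloud]...[/soundcloud] BBCode and keep
// only what is inside (assumed to be the track URL).
// Assumption: tags carry no attributes and are not nested.
function stripSoundcloudTags(content) {
  return content.replace(
    /\[soundcloud\]\s*([\s\S]*?)\s*\[\/soundcloud\]/gi,
    '$1'
  );
}

// Made-up sample post content, for illustration only.
const sample =
  'Check this out: [soundcloud]https://soundcloud.com/artist/track[/soundcloud] enjoy';
console.log(stripSoundcloudTags(sample));
// prints "Check this out: https://soundcloud.com/artist/track enjoy"
```

The same pattern could be tried on a copy of the data first, since custom BBCode on a long-running forum tends to have variants a single regex won't anticipate.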