@Alex-Alex A good idea. You can open an issue here
GDPR compliance and storage of IP addresses
Hi there @frgilb -- IP addresses are considered personal data, and that is fine. We do collect them as part of administration and moderation data, so registered users who post and share this information will need to consent to having it tracked.
I am not sure it is that easy. I think the GDPR requires that the purpose of the collected data is clearly documented, and that the duration of storage is "for no longer than is necessary for the purposes for which the personal data are processed".
I am currently not able to describe either of these in my forum's data protection declaration. Can you help me out on that?
For GDPR compliance, you must clearly obtain the user's consent. You cannot preselect the answer or "hide" the question in a block of text; it must be absolutely clear.
And a user who wants all data about him removed must be able to ask for it or do it himself.
@frgilb We store and process the IP address data as part of moderation tooling, and to my knowledge there's no issue with it as long as users are aware that this data is stored.
The full list of what we do store in NodeBB (not counting plugins) is outlined in our DPA: https://nodebb.org/gdpr
Upon registration, all users will review their rights and give consent before being granted an account. Existing users can be required to provide consent as well, via a plugin to be released tomorrow.
@julian Many thanks for your answer and for pointing out the DPA. I really appreciate that you guys are addressing the GDPR professionally!
Nevertheless, I am not a data protection expert. I might also be trying to "over-fulfil" the GDPR, but as IP addresses need to be considered personal data, I feel the need to have some control over this data. Limiting the storage duration and/or removing this data on request seems to me like a requirement for being fully GDPR compliant. Is it planned to make this possible?
@frgilb To my knowledge, we are compliant on that front insofar as you are able to delete your account in order to withdraw consent. The process to remove your content in addition to your account would require an administrative step, but this is perfectly fine and in accordance with GDPR. An administrator removing your account and content will remove the IP addresses associated with your account and posts (as the posts themselves are scrubbed from the database).
@julian I still have some doubts, but I need to dig deeper into this topic. I will export the database of my forum, check what is stored there, and do some code reading in the NodeBB sources. I will come back to you if I still have concerns after that.
Hey, I would also like to disable (or anonymize) IP storage and delete the IPs already logged in my database. If you find the place in the source code/database where they get stored, it would be great if you let me know. Unfortunately, I don't have a lot of experience with MongoDB and couldn't find anything yet.
@julian I inspected my database export as well as the source code on GitHub. I found a couple of occurrences where IP addresses are stored in the database. According to many articles (e.g. https://www.ctrl.blog/entry/gdpr-web-server-logs) this is critical and should be minimized. I cannot judge whether this is compliant with the GDPR or not. Maybe only a lawyer can finally clarify that, but I would like to avoid any kind of trouble and reduce the risk for my forum.
My analysis might not be right. Please correct me if my understanding of the source code is wrong!
IP Address of visitors is recorded to calculate the total Visitor Count
IP address is stored in:
IP Address of registered users is logged on each Login
IP address is stored in:
Only on user deletion in https://github.com/NodeBB/NodeBB/blob/12337302a7e746e36cd4fb5bd0e48fbb3707fae6/src/user/delete.js#L206
IP Address is stored on events.log() call
IP address is stored in:
many code locations
Admin panel Event History
From Admin panel
Of the IP address usages above, the first seems to me the most critical regarding GDPR compliance, as the IP address of visitors is stored forever without consent. The purpose (counting unique visitors) does not, from my perspective, justify the storage. As a solution, a hash function could be applied to the IP address and the hash stored in the database. With this you can still calculate the unique visitors, while not storing the IP address of visitors at all.
The two other usages can be justified to some extent, but from my understanding the storage duration should be limited to an appropriate time, which depends on the purpose. The current policy of deleting on request only does not look compliant to me. I propose to introduce a mechanism which deletes the login IP addresses as well as the event log after a certain period of time, configurable via the ACP.
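A minimal sketch of what such a retention mechanism could look like, written in Python purely for illustration. The record layout, the `prune_expired` helper and the `retention_days` setting are all hypothetical and are not taken from the NodeBB code base:

```python
from datetime import datetime, timedelta, timezone

def prune_expired(entries, retention_days, now=None):
    """Return only the entries that are still inside the retention window.

    `entries` is a list of dicts with a timezone-aware "timestamp" key
    (a hypothetical layout, not NodeBB's actual schema).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [entry for entry in entries if entry["timestamp"] >= cutoff]

# Example: a 30-day retention period, as it might be configured by an admin.
now = datetime(2018, 5, 25, tzinfo=timezone.utc)
log = [
    {"ip": "203.0.113.7", "timestamp": now - timedelta(days=90)},
    {"ip": "198.51.100.22", "timestamp": now - timedelta(days=3)},
]
kept = prune_expired(log, retention_days=30, now=now)
```

Run periodically, a job like this would drop the 90-day-old entry and keep the 3-day-old one; the appropriate retention period still depends on the purpose of the data.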
What do you think?
As a solution, a hash function could be applied to the IP address and the hash is stored in the database. With this you can still calculate the unique visitors, while not storing the IP address for visitors at all.
While investigating this issue, I want to point out in particular that the above strategy is not secure, as it only constitutes a minor roadblock for a determined individual.
Assuming the IP addresses are hashed via md5, one could generate all of the hashes for every single IP address in short order to create a lookup table, thus invalidating the hashing altogether.
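To make the lookup-table argument concrete, here is a small sketch in Python. It assumes the stored value is the unsalted MD5 of the textual IP address, and it only enumerates a /16 network to keep the run time short; the function names are made up for this example:

```python
import hashlib
import ipaddress

def md5_ip(ip: str) -> str:
    """Unsalted MD5 of the textual IP address (the scheme being criticised)."""
    return hashlib.md5(ip.encode("ascii")).hexdigest()

def build_lookup(network: str) -> dict:
    """Map hash -> IP for every address in the given network."""
    return {md5_ip(str(ip)): str(ip) for ip in ipaddress.ip_network(network)}

# A /16 has 65,536 addresses; the whole IPv4 space has 2**32, which is
# still small enough to enumerate with modest hardware.
table = build_lookup("192.168.0.0/16")

stored_hash = md5_ip("192.168.13.37")  # what the database would contain
recovered = table[stored_hash]         # the original address, recovered
```

The same approach scales to the full IPv4 range, which is why an unkeyed hash of an IP address offers little real protection.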
We could, of course, add a salt, but as NodeBB is open-source, such a methodology is pointless as the...
As I am writing this, I realise that we could append the NodeBB secret (defined in config.json) as the salt, and that would be theoretically cryptographically secure.
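A rough sketch of that idea in Python, for illustration only. It uses HMAC-SHA256 keyed with the secret rather than literally appending the secret to the IP, which is a swapped-in (but standard) way of building a keyed hash. The secret value and the `pseudonymise_ip` name are placeholders, and how the secret would be read from config.json is not shown:

```python
import hashlib
import hmac

def pseudonymise_ip(ip: str, secret: str) -> str:
    """Keyed hash of an IP address; not reversible by enumeration without the secret."""
    return hmac.new(secret.encode("utf-8"), ip.encode("ascii"), hashlib.sha256).hexdigest()

# Placeholder secret; in practice this would be the per-installation secret.
SECRET = "example-secret-not-a-real-one"

# Unique visitors can still be counted from the keyed hashes alone.
visits = ["203.0.113.7", "203.0.113.7", "198.51.100.22"]
unique_visitors = {pseudonymise_ip(ip, SECRET) for ip in visits}
visitor_count = len(unique_visitors)
```

Because the secret differs per installation, a lookup table built for one forum is useless against another, and nobody without the secret can build one at all. Anyone who does hold the secret can still enumerate the address space, so it has to be protected accordingly.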
Juan G. last edited by Juan G.
A website operator can store certain personal data relating to visitors of the site if it helps to identify the visitor and also to protect itself against cyber attacks, the EU's top court ruled on Thursday.
(Websites can store IP Addresses, rules European Union Court, 20 Oct. 2016)
On GDPR (General Data Protection Regulation 2016/679):
It was adopted on 14 April 2016, and after a two-year transition period, became enforceable on 25 May 2018.
Here are some additional clauses (thanks @Jay-Moonah for looking into this earlier this week):
“Processing shall be lawful only if and to the extent that at least one of the following applies: […] (f) processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.”
-- Article 6, Paragraph 1, Point F
“The processing of personal data to the extent strictly necessary and proportionate for the purposes of ensuring network and information security, i.e. the ability of a network or an information system to resist, at a given level of confidence, accidental events or unlawful or malicious actions that compromise the availability, authenticity, integrity and confidentiality of stored or transmitted personal data, and the security of the related services offered by, or accessible via, those networks and systems, […] by providers of electronic communications networks and services and by providers of security technologies and services, constitutes a legitimate interest of the data controller concerned. This could, for example, include preventing unauthorised access to electronic communications networks and malicious code distribution and stopping ‘denial of service’ attacks and damage to computer and electronic communication systems.”
-- Recital 49 (excerpt)
That said, where an IP address is used in a fashion that isn't exposed to anyone of significance (regular users or admins), I see no reason to utilise the IP or keep it for any lengthy period of time; at the very least, it should be secured properly.
To that end, please see gh#6539 (attached) to see how I've addressed the first point.
I would argue that the storage of IP addresses per user (via User.logIP()) is required in order to prevent unauthorized access or cyber-attacks, although I use that term fairly loosely. I've identified the following use cases:
- Admin approval for registration (if an IP is already associated with a uid) -- useful for combating sockpuppetry
- Get similar uids during admin approval stage -- again, sockpuppetry-mitigation
- Search by IP -- used by moderators to find existing sockpuppets.
With GDPR consent required for all users, this is no longer an issue, as they would be consenting to the storage of their IP addresses for this purpose, and we do delete on user deletion, so this satisfies the "Right to be Forgotten".