@jwildeboer Re: "data hoarders will collect whatever training data"
Wait, are you considering "OS models" made with non-public data? If the data isn't either in the public domain or under a permissive license, how do you want to legally distribute a derivative work generator with traces of that training data under such a license?