Why do 80% of the projects on GitHub have no license?

Re: Why do 80% of the projects on GitHub have no license?

Postby Wuzzy » 22 Sep 2016, 14:10

Truth is, GitHub has NO CLUE how the license situation looks.

The license detection methodology is pretty poor. There is no real standard way for projects to declare a license. LICENSE and COPYING are pretty common, but so is writing it directly into the readme file. The conclusions drawn here are false.
GitHub wrote: To detect what license, if any, a project is licensed under, we used an open source Ruby gem called Licensee to compare the repository's LICENSE file to a short list of known licenses.


The real fact here is that 20% of projects have been DETECTED to have a license. From that we can only infer that AT LEAST 20% of the projects have a license, because there are always some projects that specify the license in a “non-standard” way and therefore were not detected. So the actual number is higher. Also, this detection algorithm fails to scan for the pretty common COPYING file, which is a very, very bad oversight. Readme files are, as far as I know, not scanned either, but this wouldn't be easy to implement, I fear (too hard for machines to interpret).
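
To illustrate the COPYING point: here is a rough sketch of a filename check that accepts both conventions. This is NOT how Licensee works internally; the pattern and the function name `find_license_candidates` are made up for illustration, and it only looks at file names, not at the license text itself.

```python
import re

# Hypothetical pattern (an assumption, not taken from Licensee):
# matches LICENSE, LICENCE or COPYING, with one optional extension.
LICENSE_FILE_RE = re.compile(r"^(licen[sc]e|copying)(\.[a-z0-9]+)?$", re.IGNORECASE)


def find_license_candidates(filenames):
    """Return the file names that look like they hold a license."""
    return [name for name in filenames if LICENSE_FILE_RE.match(name)]


if __name__ == "__main__":
    repo_files = ["README.md", "COPYING", "main.c", "LICENSE.txt"]
    print(find_license_candidates(repo_files))
```

A license that is only mentioned in the readme would still slip through a check like this, which is exactly the detection gap described above.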

It does NOT follow that 80% of projects have no license. Actually, we can only infer that AT MOST 80% of projects have no license, so the headline should be “0-80% of GitHub projects have no license”.
Yes, it's that bad, GitHub does not really have a clue about the license situation. :D
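
The arithmetic behind the “0-80%” headline, as a small sketch. The function name `unlicensed_bounds` is made up here, and the only input is the detected share (20%) quoted above; it assumes the detector never reports a license where there is none.

```python
def unlicensed_bounds(detected_share):
    """Bounds on the share of unlicensed projects, given the DETECTED licensed share.

    Assumption: detection has no false positives, so every detected
    license is real, but undetected projects may still be licensed.
    """
    if not 0.0 <= detected_share <= 1.0:
        raise ValueError("detected_share must be between 0 and 1")
    lower = 0.0                    # every undetected project might still have a license
    upper = 1.0 - detected_share   # or none of the undetected ones has one
    return lower, upper


if __name__ == "__main__":
    low, high = unlicensed_bounds(0.20)
    print(f"{low:.0%}-{high:.0%} of projects have no license")
```

The detected share only fixes the upper end of the range; the lower end stays at zero no matter how the detection turns out.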
