SakeTami
mrseeker

License issues

As some have noticed, I have released OPT-6B-Nerys-v2 on KoboldAI. Seeing that BLOOM also has a modified license with restrictions sparked some thoughts: why should I be doing finetunes on those models? I want to write a clear message to the community about why I am in the process of creating my models from scratch.

I consider creating finetunes a form of art. Like the Fluxus movement in the 60s, I want to expand the potential of civilized culture, not restrict it by adding licenses that limit the artist's potential.

OPT License:

[...]a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free and limited license[...] We may terminate this License, in whole or in part, at any time upon notice (including electronic) to you.[...] 

This clause means Meta can pull the plug at any time, in which case I would need to take all the OPT models (and the work put into them) offline. In essence, they retain full ownership of these models. For this reason, I am not releasing any Shinen models based on OPT. I am debating whether I should make more Pike/Nerys models at all (since the dataset does contain a "romance" genre that can get pretty NSFW).

[...]use, modify, copy, reproduce, create derivative works of, or distribute the Software Products (or any derivative works thereof, works incorporating the Software Products, or any data produced by the Software), in whole or in part, for (i) any commercial or production purposes[...]

This clause means I am not allowed to sell the model or use it for commercial activities. It also means that a company like goose.ai cannot use my models, nor can I use them commercially myself (I released one novel using AI, and I do want to create more of them).

In short: the models I produce under the "OPT" flag are in essence CC-BY-NC-SA. This license makes every model I produce usable for "personal research purposes only".

RAIL License:

This license is better in that it uses the Apache 2.0 License as its base. But then it does a total 180 and becomes a fairly restrictive license instead:

[...]User-based restrictions as referenced in paragraph 5 MUST be included as an enforceable provision by You in any type of legal agreement (e.g. a license) governing the use and/or distribution of the Model or Derivatives of the Model, and You shall give notice to subsequent users You Distribute to, that the Model or Derivatives of the Model are subject to paragraph 5.[...] To the maximum extent permitted by law, Licensor reserves the right to restrict (remotely or otherwise) usage of the Model in violation of this License, update the Model through electronic means, or modify the Output of the Model based on updates. You shall undertake reasonable efforts to use the latest version of the Model.[...]

So, what does this mean for me? It means that if you create something from the model, you need to follow a set of rules that restrict its usage in specific ways to prevent "unethical" use. Most of the uses they prohibit are already illegal, but the way the restrictions are worded makes it (again) impossible for me to build anything sensible with the model. For example, I cannot use it for a Turing Test because of the following clause:

[...]To generate or disseminate information or content, in any context (e.g. posts, articles, tweets, chatbots or other kinds of automated bots) without expressly and intelligibly disclaiming that the text is machine-generated;[...]

This clause means that I cannot generate any content without giving you an upfront warning that you are not talking to a human. The second problem is that I must "undertake reasonable efforts to use the latest version of the Model". For me, that means that if they update the model, I need to update all my finetunes. And they retain the right to restrict the use cases even further whenever they feel like it.

OpenAI:

I will keep this one short: it is impossible for me to finetune on their models.

[...]You will not use the APIs or Content or allow any user to use the Application in a way that violates applicable law, including: [...] Sexual: content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). [...]

Since my finetunes contain some sexual encounters, I would first have to get rid of the "romance" genre.

Future:

Having said all this, what does the future hold for me? I will keep building new models for KoboldAI, since their community gives me great ideas and helps me improve as an artist. However, the licenses on these models heavily restrict what I can and cannot use for finetuning. As seen with GPT-4chan, a single YouTuber using GPT-J to create a viral video is enough for big companies and researchers to start restricting model usage. I am flying too far under the radar for them to notice me.

ModronAI:

I will keep experimenting with ModronAI, hivemind and the possibility of building models in a distributed way. The training code is almost ready and I am doing some testing, but my time is unfortunately limited. If you want to help out, let me know. I am still working on the data loading, and once that works, I can prepare a small-scale test. Note that it will use a mix of public and private data. Given the model's scale, unfortunately, you won't be able to run it on a free Colab instance.

Comments

I have a tesla m40 (24GB) that I'd be willing to run code on if that would be helpful. I only use it occasionally for KoboldAI play, so during the night and while I'm at work it's mostly idle...

ebolam

