Take Control of Your Stolen Work: Actionable Steps for Business Owners

By James Wan

What you need to know (in a nutshell)

  1. Tech companies use people’s data for free to train their AI systems. This dynamic is particularly damaging with the new wave of generative AI programs, which depend on users’ data to exist. Data creators, however, have the power to change this dynamic, and the article suggests four ways they can apply that leverage: direct action, regulatory action, legal action, and market action. The most immediate step is for individuals to limit web scraping through their sites’ robots.txt files.
  2. Regulatory action involves lawmakers clarifying that “fair use” under copyright law does not permit training a model on content without the content owner’s consent.
  3. Legal action includes communities adopting new data-licensing regimes or pursuing lawsuits; market action involves demanding that large language models be trained only on data from consenting creators. Data creators hold a tremendous amount of “data leverage” that can be used to build an AI ecosystem that both generates new technologies and shares the benefits of those technologies with the people who made them possible.
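
The robots.txt step in point 1 can be sketched as follows. The crawler names shown (GPTBot, CCBot, Google-Extended) are real AI-related user agents, but which crawlers any given site should block is an assumption to verify against each operator’s documentation:

```text
# robots.txt — block common AI-training crawlers while allowing other visitors
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
```

Note that robots.txt is advisory: it only restrains crawlers that choose to honour it, which is why the article pairs this step with regulatory and legal pressure.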

Full Article

Creators can pressure courts, markets, and regulators before it is too late to stop tech companies from exploiting their labour for free. AI researchers argue that users hold considerable data leverage, which can be exercised through direct action (individuals banding together to withhold or redirect content), regulatory action (pushing for data-protection policies), legal action (adopting new licensing regimes or pursuing lawsuits), and market initiatives such as demanding that language models use only content obtained with creators’ consent.

Website owners can disrupt training pipelines by configuring their robots.txt files. At the same time, user-generated-content sites such as Wikipedia, Stack Overflow, and Reddit should take advantage of the opt-out mechanisms AI firms provide, while also speaking up when their work is used without permission. A similar social-media uproar has succeeded before, with one major generative AI player agreeing to honour removal requests collected via sites like “Have I Been Trained”. Mass protests against specific uses of artists’ work may even compel these businesses to stop activities widely perceived as theft.

Meanwhile, lawmakers must get involved quickly: clarifying that “fair use” under copyright law does not cover this kind of training; passing anti-data-laundering laws so that models trained on material without an agreement must be retrained within an acceptable time frame; building on frameworks already seen in Europe and California; and advancing “data dividend” laws that redistribute wealth fairly to the people who created the underlying content.

A combination of all this will likely drive further demand from large institutions for full-consent LLMs that pay contributors appropriately, safeguarding the healthy data ecosystem essential for continued progress in intelligent technologies.
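
As a quick sanity check of such a robots.txt policy, Python’s standard-library `urllib.robotparser` can show what a compliant crawler would conclude from the rules. The `GPTBot` name mirrors a real OpenAI crawler; `SomeSearchBot` and the example URL are hypothetical, and a real deployment would serve these rules from the site’s own `/robots.txt`:

```python
from urllib import robotparser

# A minimal policy: block the GPTBot AI-training crawler, allow everyone else.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# What would each crawler be allowed to fetch under this policy?
print(parser.can_fetch("GPTBot", "https://example.com/article"))         # → False
print(parser.can_fetch("SomeSearchBot", "https://example.com/article"))  # → True
```

This only models a crawler that respects the protocol; it cannot detect scrapers that ignore robots.txt entirely.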