Datafund blog

OpenAI’s latest drama: What are the larger currents underneath?

Source: Dall-E, Datafund

Last week was turbulent for AI, courtesy of OpenAI. On 13 May they unveiled their latest creation: GPT-4o ("o" for "omni"), a multimodal model that captured public attention with its impressive capabilities. Just three days later, the company made waves again, announcing a partnership with Reddit that would feed OpenAI real-time data from Reddit's plethora of public forums. The announcement came hot on the heels of a similar deal struck with Stack Overflow at the beginning of May.

It didn’t stop there; the very next day, news broke that OpenAI was dissolving its superalignment team, which was dedicated to the safe development of AI. This unexpected turn of events was quickly followed by the very public departure of two senior figures from the company later that day, igniting public debate.

We only had to wait a few more days for OpenAI to find itself at the centre of another controversy. This time, it involved Sky, one of the voices that narrate ChatGPT’s responses. The actress Scarlett Johansson alleged in a public statement that OpenAI had gone against her wishes by making Sky sound eerily similar to her own voice. In response, OpenAI paused the use of Sky.

Unpacking OpenAI’s latest moves: A negative shift in the AI industry

Source: Pixabay

The drama surrounding OpenAI has certainly grabbed headlines, but it matters because it reflects a concerning shift in the AI industry. The industry appears to be going down the same “move fast and break things” path that social media giants trod. And we already know that those choices had, at best, a questionable impact on society.

It’s no wonder the voices cautioning against AI’s potential dangers, which are manifold more potent than social media’s, are growing louder.

Reckless building of something potentially dangerous

One of those voices is Jan Leike, OpenAI’s former superalignment lead. In his departure statement last week he wrote: “Building smarter-than-human machines is an inherently dangerous endeavor.” Despite this inherent and deeply concerning risk, the broader trend in the industry seems to prioritise getting products to market over heeding cautionary or disagreeing voices.

The disbandment of OpenAI’s superalignment team simply mirrors this larger trend. Last year Meta and Microsoft did the same, and from a corporate perspective, it’s easy to understand why. The race to dominate what could be one of the biggest markets ever is well and truly on. Ethics can only act as dead weight in the rat race.

However, from a societal perspective, this is the least sensible path to take. Two decades after the introduction of social media, we are still grappling with its consequences, and now we find ourselves at the dawn of an even more disruptive technology, which is evolving at breakneck speed.

Without dedicated teams focusing on a comprehensive and positive societal alignment, there is a genuine risk that AI models could behave unpredictably or even harmfully. With deep fakes, we can already see glimpses of where the current could be taking us: misinformation, inadvertent propagation of biases, production of harmful content, or dangerous decision-making in critical applications.

Even when companies integrate alignment efforts into the entire development process instead of housing them in a standalone department, doing so spreads the work across multiple teams. As a consequence, critical issues may not receive the attention and resources needed to address them comprehensively.

Whether these concerning shifts will raise the eyebrows of regulatory authorities remains to be seen.

Reaching into real-time data streams

Source: Pixabay

These recent developments also point to a strategic shift among AI industry leaders regarding data acquisition. OpenAI’s deals with Stack Overflow and Reddit show that the company is actively looking for ways to tap into real-time data flows and access exclusive data that is hard to obtain on the open web.

This trend (which we previously explored in our post, “Knowing Your Data is Knowing Your AI: Why Data Provenance Matters”) raises salient questions about the ethical implications of such data harvesting. As one disgruntled Stack Overflow user pointed out: “…anything you post on any of these platforms can and will be used for profit. It’s just a matter of time until all your messages on Discord, Twitter etc. are scraped, fed into a model and sold back to you.” It is the same mentality that social media corporations used when turning unpaid labour and attention into an advertising product.

Furthermore, the unfiltered and diverse nature of social media data introduces new points of contention, such as the potential for amplified biases and increased proliferation of bot-generated content, to name just a couple. In this context, rigorous data vetting becomes imperative to mitigate these issues. However, with the dissolution of superalignment teams, it remains uncertain whether these crucial processes will receive the necessary attention and implementation within organisations.

The Scarlett Johansson controversy: A symptom of larger issues

The recent incident involving Scarlett Johansson’s voice has only added to the evidence of the AI industry’s broken relationship with consent and ethical considerations (remember the Books3 battle), demonstrating its propensity for reckless behaviour.

The issue of consent, particularly when it comes to training AI models, has been a persistent sore point in Silicon Valley for over two decades. This pattern of disregarding individual consent, unfortunately, shows no signs of abating in the foreseeable future.

The case for open-source and verified data in mitigating the risks of reckless development

All of the above shows that the world is only slowly coming to grips with just how potent AI technology is, and nobody knows in what ways it will change global society. Its transformative potential is enormous: from drug discovery and materials science to education and beyond. So is its destructive potential.

Therefore, it’s imperative that AI development remains transparent and includes as diverse a range of contributors and stakeholders as possible. In this context, the value of verified datasets with proven provenance and open-source contributions becomes evident.

Transparency and community collaboration are pivotal in addressing the pitfalls of AI development. Open-source projects enable rigorous scrutiny, rapid innovation, and more equitable access to cutting-edge tools.

They facilitate standardisation and the establishment of best practices, enhancing the reliability and safety of AI models. Additionally, open-source initiatives promote reproducible research and enable independent ethical auditing, ensuring compliance with safety protocols and ethical standards.

Provenance in datasets is equally crucial, as it ensures the accuracy and representativeness of model outputs while also addressing the critical issue of consent and compensation. By prioritising transparency, collaboration, and ethical practices, we can better harness AI’s potential while mitigating its risks and ensuring its benefits are accessible to all.

Datafund’s focus on responsible technology

At Datafund, our commitment to ethical data practices has been a core value since our inception. We firmly believe in designing solutions that respect human rights and uphold the humane use of technology. As we move forward, our dedication to these principles will remain, with a particular focus on providing ethical data for AI.

Connecting with like-minded projects and individuals globally who share our values has been, and will continue to be, an integral part of our journey. Together, we can shape a future where data is utilised responsibly and ethically, especially in the rapidly evolving realm of artificial intelligence.

For more info, keep following us on:

X

LinkedIn

or get in touch via:

info@datafund.io
