Microsoft may have finally made quantum computing useful

The dream of quantum computing has always been exciting: What if we could build a machine working at the quantum level that could tackle complex calculations exponentially faster than a computer limited by classical physics? But despite iterative quantum computing hardware announcements from IBM, Google and others, those machines still aren't being used for any practical purposes. That might change with today's announcement from Microsoft and Quantinuum, who say they've developed the most error-free quantum computing system yet.

While classical computers and electronics rely on binary bits as their basic unit of information (they can be either on or off), quantum computers work with qubits, which can exist in a superposition of two states at the same time. The trouble with qubits is that they're prone to error, which is the main reason today's quantum computers (known as Noisy Intermediate Scale Quantum [NISQ] computers) are just used for research and experimentation.

Microsoft's solution was to group physical qubits into virtual qubits, which allows it to apply error diagnostics and correction without destroying them, and run it all over Quantinuum's hardware. The result was an error rate that was 800 times better than relying on physical qubits alone. Microsoft claims it was able to run more than 14,000 experiments without any errors.
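
Microsoft hasn't published the full details of its error-correction scheme in this announcement, but the basic intuition behind grouping physical qubits into logical (virtual) ones can be sketched with a toy repetition-code simulation. This is purely illustrative and not the company's actual code:

```python
import random

# Toy illustration of why grouping physical qubits into a logical qubit helps.
# This is NOT Microsoft's or Quantinuum's actual scheme -- just a simple
# repetition code with majority voting to show the principle.

def logical_failure_rate(physical_error_rate, qubits_per_logical, trials=100_000):
    """Estimate how often a majority vote over noisy physical qubits is wrong."""
    failures = 0
    for _ in range(trials):
        flips = sum(random.random() < physical_error_rate
                    for _ in range(qubits_per_logical))
        if flips > qubits_per_logical // 2:  # majority of physical qubits flipped
            failures += 1
    return failures / trials

p = 0.01  # hypothetical physical error rate of 1 percent
for n in (1, 3, 5, 7):
    print(f"{n} physical qubits per logical qubit -> "
          f"~{logical_failure_rate(p, n):.5f} logical error rate")
```

The logical error rate falls steeply as more physical qubits back each logical one, which is the general effect, at far greater sophistication, behind the 800-fold improvement Microsoft is claiming.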

According to Jason Zander, EVP of Microsoft's Strategic Missions and Technologies division, this achievement could finally bring us to "Level 2 Resilient" quantum computing, which would be reliable enough for practical applications.

"The task at hand for the entire quantum ecosystem is to increase the fidelity of qubits and enable fault-tolerant quantum computing so that we can use a quantum machine to unlock solutions to previously intractable problems," Zander wrote in a blog post today. "In short, we need to transition to reliable logical qubits — created by combining multiple physical qubits together into logical ones to protect against noise and sustain a long (i.e., resilient) computation."

Microsoft's announcement is a "strong result," according to Aram Harrow, a professor of physics at MIT focusing on quantum information and computing. "The Quantinuum system has impressive error rates and control, so it was plausible that they could do an experiment like this, but it's encouraging to see that it worked," he said in an e-mail to Engadget. "Hopefully they'll be able to keep maintaining or even improving the error rate as they scale up."

[Image: Microsoft Quantum Computing (Microsoft)]

Researchers will be able to get a taste of Microsoft's reliable quantum computing via Azure Quantum Elements in the next few months, where it will be available as a private preview. The goal is to push even further to Level 3 quantum supercomputing, which will theoretically be able to tackle incredibly complex issues like climate change and exotic drug research. It's unclear how long it'll take to actually reach that point, but for now, at least we're moving one step closer towards practical quantum computing.

"Getting to a large-scale fault-tolerant quantum computer is still going to be a long road," Professor Harrow wrote. "This is an important step for this hardware platform. Along with the progress on neutral atoms, it means that the cold atom platforms are doing very well relative to their superconducting qubit competitors."

This article originally appeared on Engadget at https://www.engadget.com/microsoft-may-have-finally-made-quantum-computing-useful-164501302.html?src=rss

Apple Silicon has a hardware-level exploit that could leak private data

A team of university security researchers has found a chip-level exploit in Apple Silicon Macs. The group says the flaw can bypass the computer’s encryption and access its security keys, exposing the Mac’s private data to hackers. The silver lining is that the exploit requires you to circumvent Apple’s Gatekeeper protections, install a malicious app and then let the software run for as long as 10 hours (along with a host of other complex conditions), which reduces the odds you’ll have to worry about the threat in the real world.

The exploit originates in a part of Apple’s M-series chips called Data Memory-Dependent Prefetchers (DMPs). DMPs make the processors more efficient by preemptively caching data. The DMPs treat data patterns as directions, using them to guess what information they need to access next. This cuts down on wait times and helps produce the kind of “seriously fast” reactions often used to describe Apple Silicon.

The researchers discovered that attackers can use the DMP to bypass encryption. “Through new reverse engineering, we find that the DMP activates on behalf of potentially any program, and attempts to dereference any data brought into cache that resembles a pointer,” the researchers wrote. (“Pointers” are addresses or directions signaling where to find specific data.) “This behavior places a significant amount of program data at risk.”
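
To make the idea concrete, here is a purely conceptual sketch in Python of the behavior described above. It is not the real GoFetch attack and not Apple's actual DMP logic; it only models the rule "anything that looks like a pointer may get prefetched," and why that rule can leak a secret-dependent bit:

```python
# Conceptual toy model, not the real attack: if hardware treats any value that
# "looks like a pointer" as an address worth prefetching, then whether a
# secret-dependent intermediate value gets prefetched (observable through
# cache timing) reveals information about the secret, even in code that never
# branches on it.

def looks_like_pointer(value: int) -> bool:
    # Hypothetical rule: the value falls inside a plausible user-space address range.
    return 0x1_0000_0000 <= value <= 0x7FFF_FFFF_FFFF

def dmp_would_prefetch(loaded_values):
    """Return the values a pointer-chasing prefetcher would try to dereference."""
    return [v for v in loaded_values if looks_like_pointer(v)]

# An attacker crafts inputs so an intermediate value only "looks like a pointer"
# when a particular secret bit is 1.
for secret_bit in (0, 1):
    intermediate = 0x7F00_DEAD_BEEF if secret_bit else 0x42
    print(secret_bit, dmp_would_prefetch([intermediate]))  # non-empty only for 1
```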

“This paper shows that the security threat from DMPs is significantly worse than previously thought and demonstrates the first end-to-end attacks on security-critical software using the Apple m-series DMP,” the group wrote.

The researchers named the attack GoFetch, and they created an app that can access a Mac’s secure data without even requiring root access. Ars Technica Security Editor Dan Goodin explains, “M-series chips are divided into what are known as clusters. The M1, for example, has two clusters: one containing four efficiency cores and the other four performance cores. As long as the GoFetch app and the targeted cryptography app are running on the same performance cluster—even when on separate cores within that cluster — GoFetch can mine enough secrets to leak a secret key.”

The details are highly technical, but Ars Technica’s write-up is worth a read if you want to venture much further into the weeds.

But there are two key takeaways for the layperson: Apple can’t do much to fix existing chips with software updates (at least not without significantly slowing down Apple Silicon’s trademark performance), and as long as you have Apple’s Gatekeeper turned on (the default), you’re unlikely to install malicious apps in the first place. Gatekeeper only allows apps from the Mac App Store and non-App Store installations from Apple-registered developers. (You may want to be extra cautious when manually approving apps from unregistered developers in macOS security settings.) If you don’t install malicious apps outside those confines, the odds appear quite low that this will ever affect your M-series Mac.

This article originally appeared on Engadget at https://www.engadget.com/apple-silicon-has-a-hardware-level-exploit-that-could-leak-private-data-174741269.html?src=rss

The Morning After: Apple explains how third-party app stores will work in Europe

Apple is making major changes to the App Store in Europe in response to new European Union laws. Beginning in March, Apple will allow users in the EU to download apps and make purchases from outside its App Store. These changes are already being stress-tested in the iOS 17.4 beta.

Developers will be able to take payments and distribute apps from outside the App Store for the first time. Apple will still enforce a review process for apps that don’t come through its store, but it will be “focused on platform integrity and protecting users” from things like malware. The company warns it has less chance of addressing other risks like scams, abuse and harmful content.

Apple is also changing its commission structure, so developers will pay 17 percent on subscriptions and in-app purchases, reducing the fee to 10 percent for “most developers” after the first year. The company is tacking on a new three percent “payment processing” fee for transactions through its store, and there’s a new €0.50 “core technology fee” for all app downloads after the first million installations.

That’s a lot of new money numbers to process, and it could shake out differently for different developers. Apple says the new fee structure will result in most developers paying the company less, since the core technology fee will have the greatest impact on larger developers.
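
To make those percentages a little more tangible, here's a rough, hypothetical back-of-the-envelope calculation for an imaginary developer. The revenue and install figures are invented; only the fee rates come from Apple's announcement:

```python
# Hypothetical illustration of the new EU fee structure described above.
# Developer figures are made up; the rates (17% / 10% commission, 3% payment
# processing, EUR 0.50 core technology fee per download past one million)
# are the ones Apple announced.

annual_downloads = 3_000_000        # hypothetical
in_app_revenue = 5_000_000          # EUR per year, hypothetical
first_year_share = 0.4              # share of revenue still in its first year

commission = in_app_revenue * (first_year_share * 0.17 + (1 - first_year_share) * 0.10)
payment_processing = in_app_revenue * 0.03  # only for transactions Apple processes
core_technology_fee = max(annual_downloads - 1_000_000, 0) * 0.50

total = commission + payment_processing + core_technology_fee
print(f"Commission:          EUR {commission:,.0f}")
print(f"Payment processing:  EUR {payment_processing:,.0f}")
print(f"Core technology fee: EUR {core_technology_fee:,.0f}")
print(f"Total:               EUR {total:,.0f} ({total / in_app_revenue:.1%} of revenue)")
```

Run the same numbers with a few hundred thousand downloads and the core technology fee disappears, which is why Apple says most developers should end up paying less.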

This all means that yes, Fortnite is returning.

— Mat Smith

The biggest stories you might have missed

The FTC is investigating Microsoft, Amazon and Alphabet’s investments into AI startups

Budget retailer Newegg just started selling refurbished electronics

NASA’s Ingenuity Helicopter has flown on Mars for the final time

MIT researchers have developed a rapid 3D-printing technique that uses liquid metal

Microsoft launches its metaverse-styled virtual meeting platform

Mesh is a place for your avatars to float around.

Microsoft has announced the launch of Mesh, a feature that lets employees’ avatars meet in the same virtual space, even if the actual people are spread out. The virtual connection platform runs through Microsoft Teams. Currently, Microsoft’s Mesh is only available on desktop PCs and Meta Quest VR devices (if employees want a more immersive experience). Microsoft is offering a six-month free trial to anyone with a business or enterprise plan. But no legs, it seems.

The Ray-Ban Meta smart glasses’ new AI powers are impressive

And worrying.

When we first reviewed the Ray-Ban Meta smart glasses, multimodal AI wasn’t ready. The feature enables the glasses to respond to queries based on what you’re looking at. Meta has now made multimodal search available for “early access.” Multimodal search is impressive, if not entirely useful yet. But Meta AI’s grasp of real-time information is shaky at best.

We tried asking it to help pick out clothes, like Mark Zuckerberg did in a recent Instagram post, and were underwhelmed. Then again, it may work best for a guy who famously wore the exact same shirt every day for years.

Elon Musk confirms new low-cost Tesla model

Coming in 2025.

Elon Musk confirmed on an earnings call yesterday that a “next-generation low-cost” Tesla EV is in the works, saying he’s “optimistic” it’ll arrive in the second half of 2025. He also promised “a revolutionary manufacturing system” for the vehicle. Reuters reported that the new vehicle would be a small crossover called Redwood. Musk previously stated the automaker is working on two new EV models that could sell up to five million units per year, combined.

Musk said the company’s new manufacturing technique will be “very hard to copy” because “you have to copy the machine that makes the machine that makes the machine... manufacturing inception.”

I just audibly groaned reading that.

Japan’s lunar spacecraft landed upside down on the moon

It collected some data before shutting down.

[Image: JAXA]

This picture just makes me sad.

This article originally appeared on Engadget at https://www.engadget.com/the-morning-after-apple-explains-how-third-party-app-stores-will-work-in-europe-121528606.html?src=rss

Why humans can’t use natural language processing to speak with the animals

We’ve been wondering what goes on inside the minds of animals since antiquity. Doctor Dolittle’s talent was far from novel when the character debuted in 1920; Greco-Roman literature is lousy with speaking animals, writers in Zhanguo-era China routinely ascribed language to certain animal species and they’re also prevalent in Indian, Egyptian, Hebrew and Native American storytelling traditions.

Even today, popular Western culture toys with the idea of talking animals, though often through a lens of technology-empowered speech rather than supernatural force. The dolphins from both Seaquest DSV and Johnny Mnemonic communicated with their bipedal contemporaries through advanced translation devices, as did Dug the dog from Up.

We’ve already got machine-learning systems and natural language processors that can translate human speech into any number of existing languages, and adapting that process to convert animal calls into human-interpretable signals doesn’t seem that big of a stretch. However, it turns out we’ve got more work to do before we can converse with nature.

What is language?

“All living things communicate,” an interdisciplinary team of researchers argued in 2018’s On understanding the nature and evolution of social cognition: a need for the study of communication. “Communication involves an action or characteristic of one individual that influences the behavior, behavioral tendency or physiology of at least one other individual in a fashion typically adaptive to both.”

From microbes, fungi and plants on up the evolutionary ladder, science has yet to find an organism that exists in such extreme isolation as to not have a natural means of communicating with the world around it. But we should be clear that “communication” and “language” are two very different things.

“No other natural communication system is like human language,” argues the Linguistic Society of America. Language allows us to express our inner thoughts and convey information, as well as request or even demand it. “Unlike any other animal communication system, it contains an expression for negation — what is not the case … Animal communication systems, in contrast, typically have at most a few dozen distinct calls, and they are used only to communicate immediate issues such as food, danger, threat, or reconciliation.”

That’s not to say that pets don’t understand us. “We know that dogs and cats can respond accurately to a wide range of human words when they have prior experience with those words and relevant outcomes,” Dr. Monique Udell, Director of the Human-Animal Interaction Laboratory at Oregon State University, told Engadget. “In many cases these associations are learned through basic conditioning,” Dr. Udell said — like when we yell “dinner” just before setting out bowls of food.

Whether or not our dogs and cats actually understand what "dinner" means outside of the immediate Pavlovian response remains to be seen. "We know that at least some dogs have been able to learn to respond to over 1,000 human words (labels for objects) with high levels of accuracy," Dr. Udell said. "Dogs currently hold the record among non-human animal species for being able to match spoken human words to objects or actions reliably," but it’s "difficult to know for sure to what extent dogs understand the intent behind our words or actions."

Dr. Udell continued: “This is because when we measure a dog or cat’s understanding of a stimulus, like a word, we typically do so based on their behavior.” You can teach a dog to sit with both English and German commands, but “if a dog responds the same way to the word ‘sit’ in English and in German, it is likely the simplest explanation — with the fewest assumptions — is that they have learned that when they sit in the presence of either word then there is a pleasant consequence.”

[Illustration: Tea Stražičić for Engadget/Silica Magazine]

Hush, the computers are speaking

Natural Language Processing (NLP) is the branch of AI that enables computers and algorithmic models to interpret text and speech, including the speaker’s intent, the same way we meatsacks do. It combines computational linguistics, which models the syntax, grammar and structure of a language, and machine-learning models, which “automatically extract, classify, and label elements of text and voice data and then assign a statistical likelihood to each possible meaning of those elements,” according to IBM. NLP underpins the functionality of every digital assistant on the market. Basically any time you’re speaking at a “smart” device, NLP is translating your words into machine-understandable signals and vice versa.
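
As a concrete (if greatly simplified) illustration of that text-and-likelihood pipeline, here's a minimal sketch using the open-source Hugging Face Transformers library; the library and task are examples, not the code behind any particular assistant:

```python
# Minimal sketch of modern NLP tooling, not any specific digital assistant.
# Requires: pip install transformers
from transformers import pipeline

# A pretrained model maps raw text to a label plus a statistical likelihood,
# echoing the "assign a statistical likelihood to each possible meaning"
# description above.
classifier = pipeline("sentiment-analysis")
print(classifier("Hey assistant, this playlist is fantastic!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998...}]
```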

The field of NLP research has undergone a significant evolution in recent years, as its core systems have migrated from older Recurrent and Convolutional Neural Networks towards Google’s Transformer architecture, which greatly increases training efficiency.

Dr. Noah D. Goodman, Associate Professor of Psychology and Computer Science, and Linguistics at Stanford University, told Engadget that, with RNNs, “you'll have to go time-step by time-step or like word by word through the data and then do the same thing backward.” In contrast, with a transformer, “you basically take the whole string of words and push them through the network at the same time.”

“It really matters to make that training more efficient,” Dr. Goodman continued. “Transformers, they're cool … but by far the biggest thing is that they make it possible to train efficiently and therefore train much bigger models on much more data.”
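
Dr. Goodman's point is easier to see as code. The rough PyTorch sketch below (sizes are arbitrary, and it trains nothing) contrasts the step-by-step recurrent pass with the all-at-once transformer pass:

```python
# Rough sketch of the RNN-vs-transformer difference described above, in PyTorch.
import torch
import torch.nn as nn

seq_len, batch, dim = 128, 32, 512
tokens = torch.randn(seq_len, batch, dim)  # a batch of embedded word sequences

# RNN-style: the hidden state is threaded through the sequence one position at
# a time, so each step waits on the previous one (and training has to march
# backward through the sequence the same way).
rnn = nn.GRU(input_size=dim, hidden_size=dim)
hidden = None
for t in range(seq_len):
    _, hidden = rnn(tokens[t:t + 1], hidden)

# Transformer-style: the whole string of words is pushed through the network
# at once, and attention lets every position look at every other in parallel.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8), num_layers=2)
output = encoder(tokens)  # one parallel pass over all 128 positions
```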

Talkin’ jive ain’t just for turkeys

While many species’ communication systems have been studied in recent years — most notably cetaceans like whales and dolphins, but also the southern pied babbler, for its song’s potentially syntactic qualities, and vervet monkeys’ communal predator warning system — none have shown the sheer degree of complexity as the call of the avian family Paridae: the chickadees, tits and titmice.

Dr. Jeffrey Lucas, professor in the Biological Sciences department at Purdue University, told Engadget that the Paridae call “is one of the most complicated vocal systems that we know of. At the end of the day, what the [field’s voluminous number of research] papers are showing is that it's god-awfully complicated, and the problem with the papers is that they grossly under-interpret how complicated [the calls] actually are.”

These parids often live in socially complex, heterospecific flocks, mixed groupings that include multiple songbird and woodpecker species. The complexity of the birds’ social system is correlated with an increased diversity in communications systems, Dr. Lucas said. “Part of the reason why that correlation exists is because, if you have a complex social system that's multi-dimensional, then you have to convey a variety of different kinds of information across different contexts. In the bird world, they have to defend their territory, talk about food, integrate into the social system [and resolve] mating issues.”

The chickadee call consists of at least six distinct notes set in an open-ended vocal structure, which is both monumentally rare in non-human communication systems and the reason for the chickadee call’s complexity. An open-ended vocal system means that “increased recording of chick-a-dee calls will continually reveal calls with distinct note-type compositions,” explained the 2012 study, Linking social complexity and vocal complexity: a parid perspective. “This open-ended nature is one of the main features the chick-a-dee call shares with human language, and one of the main differences between the chick-a-dee call and the finite song repertoires of most songbird species.”

[Illustration: Dolphin translation by Tea Stražičić for Engadget/Silica Magazine]

Dolphins have no need for kings

Training language models isn’t simply a matter of shoving in large amounts of data. When training a model to translate an unknown language into what you’re speaking, you need to have at least a rudimentary understanding of how the two languages correlate with one another so that the translated text retains the proper intent of the speaker.

“The strongest kind of data that we could have is what's called a parallel corpus,” Dr. Goodman explained, which is basically having a Rosetta Stone for the two tongues. In that case, you’d simply have to map between specific words, symbols and phonemes in each language — figure out what means “river” or “one bushel of wheat” in each and build out from there.

Without that perfect translation artifact, so long as you have large corpuses of data for both languages, “it's still possible to learn a translation between the languages, but it hinges pretty crucially on the idea that the kind of latent conceptual structure [is shared],” Dr. Goodman continued, which assumes that both cultures’ definitions of “one bushel of wheat” are generally equivalent.

Goodman points to the word pairs ‘man and woman’ and ‘king and queen’ in English. “The structure, or geometry, of that relationship we expect [in] English; if we were translating into Hungarian, we would also expect those four concepts to stand in a similar relationship,” Dr. Goodman said. “Then effectively the way we'll learn a translation now is by learning to translate in a way that preserves the structure of that conceptual space as much as possible.”
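
That "geometry of concepts" idea is often demonstrated with word-vector arithmetic, where the vector for 'king' minus 'man' plus 'woman' lands near 'queen'. The toy example below uses hand-made three-dimensional vectors rather than real embeddings, but the arithmetic is the same:

```python
# Toy illustration of the shared "conceptual geometry" Dr. Goodman describes.
# The vectors are invented for the example; real systems learn them from data
# (word2vec, GloVe and similar models).
import numpy as np

emb = {
    "man":   np.array([1.0, 0.0, 0.2]),
    "woman": np.array([1.0, 1.0, 0.2]),
    "king":  np.array([1.0, 0.0, 0.9]),
    "queen": np.array([1.0, 1.0, 0.9]),
}

def closest(vec, vocab):
    """Return the word whose embedding is most similar (by cosine) to vec."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(vocab, key=lambda w: cos(vec, vocab[w]))

# If man/woman and king/queen stand in the same relationship, then
# king - man + woman should land closest to queen.
print(closest(emb["king"] - emb["man"] + emb["woman"], emb))  # -> queen
```

A structure-preserving translation tries to keep exactly these relationships intact when mapping one language's conceptual space onto another's.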

Having a large corpus of data to work with in this situation also enables unsupervised learning techniques to be used to “extract the latent conceptual space,” Dr. Goodman said, though that method is more resource intensive and less efficient. However, if all you have is a large corpus in only one of the languages, you’re generally out of luck.

“For most human languages we assume the [quartet concepts] are kind of, sort of similar, like, maybe they don't have ‘king and queen’ but they definitely have ‘man and woman,’” Dr. Goodman continued. ”But I think for animal communication, we can't assume that dolphins have a concept of ‘king and queen’ or whether they have ‘men and women.’ I don't know, maybe, maybe not.”

And without even that rudimentary conceptual alignment to work from, discerning the context and intent of an animal’s call — much less deciphering the syntax, grammar and semantics of the underlying communication system — becomes much more difficult. “You're in a much weaker position,” Dr. Goodman said. “If you have the utterances in the world context that they're uttered in, then you might be able to get somewhere.”

Basically, if you can obtain multimodal data that provides context for the recorded animal call — the environmental conditions, time of day or year, the presence of prey or predator species, etc. — you can “ground” the language data in the physical environment. From there you can “assume that English grounds into the physical environment in the same way as this weird new language grounds into the physical environment” and use that as a kind of bridge between the languages.
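
In practice, that grounding starts with something as unglamorous as a data schema that keeps every recording tied to its context. Here is a sketch of what one such record might look like; the field names are invented for illustration, not taken from any real project:

```python
# Sketch of a context-rich record for "grounding" an animal call in the
# physical world. Field names are invented for illustration.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class GroundedCall:
    species: str
    audio_file: str                  # path to the recorded call
    recorded_at: datetime
    location: tuple                  # (latitude, longitude)
    temperature_c: float
    predator_present: bool
    flock_size: int
    notes: str = ""

example = GroundedCall(
    species="Carolina chickadee",
    audio_file="recordings/2024-03-12_0715.wav",
    recorded_at=datetime(2024, 3, 12, 7, 15),
    location=(40.42, -86.91),
    temperature_c=-3.5,
    predator_present=True,
    flock_size=11,
    notes="mixed flock with tufted titmice; hawk overhead",
)
```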

Unfortunately, the challenge of translating bird calls into English (or any other human language) falls squarely into the hardest scenario Dr. Goodman describes: no parallel corpus and no guarantee of shared concepts. This means we’ll need more data, and a lot of different types of data, as we continue to build our basic understanding of the structures of these calls from the ground up. Some of those efforts are already underway.

The Dolphin Communication Project, for example, employs a combination “mobile video/acoustic system” to capture both the utterances of wild dolphins and their relative position in physical space at that time to give researchers added context to the calls. Biologging tags — animal-borne sensors affixed to hide, hair, or horn that track the locations and conditions of their hosts — continue to shrink in size while growing in both capacity and capability, which should help researchers gather even more data about these communities.

What if birds are just constantly screaming about the heat?

Even if we won’t be able to immediately chat with our furred and feathered neighbors, gaining a better understanding of how they at least talk to each other could prove valuable to conservation efforts. Dr. Lucas points to a recent study he participated in that found environmental changes induced by climate change can radically change how different bird species interact in mixed flocks. “What we showed was that if you look across the disturbance gradients, then everything changes,” Dr. Lucas said. “What they do with space changes, how they interact with other birds changes. Their vocal systems change.”

“The social interactions for birds in winter are extraordinarily important because you know, 10 gram bird — if it doesn't eat in a day, it's dead,” Dr. Lucas continued. “So information about their environment is extraordinarily important. And what those mixed species flocks do is to provide some of that information.”

However, that network quickly breaks down as the habitat degrades, and in order to survive “they have to really go through fairly extreme changes in behavior and social systems and vocal systems … but that impacts fertility rates, and their ability to feed their kids and that sort of thing.”

Better understanding their calls will help us better understand their levels of stress, which can serve both modern conservation efforts and agricultural ends. “The idea is that we can get an idea about the level of stress in [farm animals], then use that as an index of what's happening in the barn and whether we can maybe even mitigate that using vocalizations,” Dr. Lucas said. “AI probably is going to help us do this.”

“Scientific sources indicate that noise in farm animal environments is a detrimental factor to animal health,” Jan Brouček of the Research Institute for Animal Production Nitra observed in 2014. “Especially longer lasting sounds can affect the health of animals. Noise directly affects reproductive physiology or energy consumption.” That continuous drone is thought to also indirectly impact other behaviors, including habitat use, courtship, mating, reproduction and the care of offspring.

Conversely, 2021’s research, The effect of music on livestock: cattle, poultry and pigs, has shown that playing music helps to calm livestock and reduce stress during times of intensive production. We can measure that reduction in stress based on what sorts of happy sounds those animals make. Like listening to music in another language, we can get with the vibe, even if we can't understand the lyrics.

This article originally appeared on Engadget at https://www.engadget.com/why-humans-cant-use-natural-language-processing-to-speak-with-the-animals-143050169.html?src=rss

Hitting the Books: Why AI won’t be taking our cosmology jobs

The problem with studying the universe around us is that it is simply too big. The stars overhead remain too far away to interact with directly, so we are relegated to testing our theories on the formation of the galaxies based on observable data. 

Simulating these celestial bodies on computers has proven an immensely useful aid in wrapping our heads around the nature of reality and, as Andrew Pontzen explains in his new book, The Universe in a Box: Simulations and the Quest to Code the Cosmos, recent advances in supercomputing technology are further revolutionizing our capability to model the complexities of the cosmos (not to mention myriad Earth-based challenges) on a smaller scale. In the excerpt below, Pontzen looks at the recent emergence of astronomy-focused AI systems, what they're capable of accomplishing in the field and why he's not too worried about losing his job to one.  

[Image: The Universe in a Box book cover (Riverhead Books)]

Adapted from THE UNIVERSE IN A BOX: Simulations and the Quest to Code the Cosmos by Andrew Pontzen published on June 13, 2023 by Riverhead, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © 2023 Andrew Pontzen.


As a cosmologist, I spend a large fraction of my time working with supercomputers, generating simulations of the universe to compare with data from real telescopes. The goal is to understand the effect of mysterious substances like dark matter, but no human can digest all the data held on the universe, nor all the results from simulations. For that reason, artificial intelligence and machine learning is a key part of cosmologists’ work.

Consider the Vera Rubin Observatory, a giant telescope built atop a Chilean mountain and designed to repeatedly photograph the sky over the coming decade. It will not just build a static picture: it will particularly be searching for objects that move (asteroids and comets), or change brightness (flickering stars, quasars and supernovae), as part of our ongoing campaign to understand the ever-changing cosmos. Machine learning can be trained to spot these objects, allowing them to be studied with other, more specialized telescopes. Similar techniques can even help sift through the changing brightness of vast numbers of stars to find the telltale signs of which ones host planets, contributing to the search for life in the universe. Beyond astronomy there is no shortage of scientific applications: Google’s artificial intelligence subsidiary DeepMind, for instance, has built a network that can outperform all known techniques for predicting the shapes of proteins starting from their molecular structure, a crucial and difficult step in understanding many biological processes.

These examples illustrate why scientific excitement around machine learning has built during this century, and there have been strong claims that we are witnessing a scientific revolution. As far back as 2008, Chris Anderson wrote an article for Wired magazine that declared the scientific method, in which humans propose and test specific hypotheses, obsolete: ‘We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.’

I think this is taking things too far. Machine learning can simplify and improve certain aspects of traditional scientific approaches, especially where processing of complex information is required. Or it can digest text and answer factual questions, as illustrated by systems like ChatGPT. But it cannot entirely supplant scientific reasoning, because that is about the search for an improved understanding of the universe around us. Finding new patterns in data or restating existing facts are only narrow aspects of that search. There is a long way to go before machines can do meaningful science without any human oversight.

To understand the importance of context and understanding in science, consider the case of the OPERA experiment which in 2011 seemingly determined that neutrinos travel faster than the speed of light. The claim is close to a physics blasphemy, because relativity would have to be rewritten; the speed limit is integral to its formulation. Given the enormous weight of experimental evidence that supports relativity, casting doubt on its foundations is not a step to be taken lightly.

Knowing this, theoretical physicists queued up to dismiss the result, suspecting the neutrinos must actually be traveling slower than the measurements indicated. Yet, no problem with the measurement could be found – until, six months later, OPERA announced that a cable had been loose during their experiment, accounting for the discrepancy. Neutrinos travelled no faster than light; the data suggesting otherwise had been wrong.

Surprising data can lead to revelations under the right circumstances. The planet Neptune was discovered when astronomers noticed something awry with the orbits of the other planets. But where a claim is discrepant with existing theories, it is much more likely that there is a fault with the data; this was the gut feeling that physicists trusted when seeing the OPERA results. It is hard to formalize such a reaction into a simple rule for programming into a computer intelligence, because it is midway between the knowledge-recall and pattern-searching worlds.

The human elements of science will not be replicated by machines unless they can integrate their flexible data processing with a broader corpus of knowledge. There is an explosion of different approaches toward this goal, driven in part by the commercial need for computer intelligences to explain their decisions. In Europe, if a machine makes a decision that impacts you personally – declining your application for a mortgage, maybe, or increasing your insurance premiums, or pulling you aside at an airport – you have a legal right to ask for an explanation. That explanation must necessarily reach outside the narrow world of data in order to connect to a human sense of what is reasonable or unreasonable.

Problematically, it is often not possible to generate a full account of how machine-learning systems reach a particular decision. They use many different pieces of information, combining them in complex ways; the only truly accurate description is to write down the computer code and show the way the machine was trained. That is accurate but not very explanatory. At the other extreme, one might point to an obvious factor that dominated a machine’s decision: you are a lifelong smoker, perhaps, and other lifelong smokers died young, so you have been declined for life insurance. That is a more useful explanation, but might not be very accurate: other smokers with a different employment history and medical record have been accepted, so what precisely is the difference? Explaining decisions in a fruitful way requires a balance between accuracy and comprehensibility.

In the case of physics, using machines to create digestible, accurate explanations which are anchored in existing laws and frameworks is an approach in its infancy. It starts with the same demands as commercial artificial intelligence: the machine must not just point to its decision (that it has found a new supernova, say) but also give a small, digestible amount of information about why it has reached that decision. That way, you can start to understand what it is in the data that has prompted a particular conclusion, and see whether it agrees with your existing ideas and theories of cause and effect. This approach has started to bear fruit, producing simple but useful insights into quantum mechanics, string theory, and (from my own collaborations) cosmology.

These applications are still all framed and interpreted by humans. Could we imagine instead having the computer framing its own scientific hypotheses, balancing new data with the weight of existing theories, and going on to explain its discoveries by writing a scholarly paper without any human assistance? This is not Anderson’s vision of the theory-free future of science, but a more exciting, more disruptive and much harder goal: for machines to build and test new theories atop hundreds of years of human insight.

This article originally appeared on Engadget at https://www.engadget.com/hitting-the-books-universe-in-a-box-andrew-pontzen-riverhead-books-153005483.html?src=rss

Hitting the Books: Why a Dartmouth professor coined the term ‘artificial intelligence’

If the Wu-Tang produced it in '23 instead of '93, they'd have called it D.R.E.A.M. — because data rules everything around me. Where once our society brokered power based on the strength of our arms and purse strings, the modern world is driven by data empowering algorithms to sort, silo and sell us out. These black box oracles of imperious and imperceptible decision-making decide who gets home loans, who gets bail, who finds love and who gets their kids taken from them by the state.

In their new book, How Data Happened: A History from the Age of Reason to the Age of Algorithms, which builds off their existing curriculum, Columbia University Professors Chris Wiggins and Matthew L Jones examine how data is curated into actionable information and used to shape everything from our political views and social mores to our military responses and economic activities. In the excerpt below, Wiggins and Jones look at the work of mathematician John McCarthy, the junior Dartmouth professor who single-handedly coined the term "artificial intelligence"... as part of his ploy to secure summer research funding.

[Image: How Data Happened book cover (WW Norton)]

Excerpted from How Data Happened: A History from the Age of Reason to the Age of Algorithms by Chris Wiggins and Matthew L Jones. Published by WW Norton. Copyright © 2023 by Chris Wiggins and Matthew L Jones. All rights reserved.


Confecting “Artificial Intelligence”

A passionate advocate of symbolic approaches, the mathematician John McCarthy is often credited with inventing the term “artificial intelligence,” including by himself: “I invented the term artificial intelligence,” he explained, “when we were trying to get money for a summer study” to aim at “the long term goal of achieving human level intelligence.” The “summer study” in question was titled “The Dartmouth Summer Research Project on Artificial Intelligence,” and the funding requested was from the Rockefeller Foundation. At the time a junior professor of mathematics at Dartmouth, McCarthy was aided in his pitch to Rockefeller by his former mentor Claude Shannon. As McCarthy describes the term’s positioning, “Shannon thought that artificial intelligence was too flashy a term and might attract unfavorable notice.” However, McCarthy wanted to avoid overlap with the existing field of “automata studies” (including “nerve nets” and Turing machines) and took a stand to declare a new field. “So I decided not to fly any false flags anymore.” The ambition was enormous; the 1955 proposal claimed “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” McCarthy ended up with more brain modelers than axiomatic mathematicians of the sort he wanted at the 1956 meeting, which came to be known as the Dartmouth Workshop. The event saw the coming together of diverse, often contradictory efforts to make digital computers perform tasks considered intelligent, yet as historian of artificial intelligence Jonnie Penn argues, the absence of psychological expertise at the workshop meant that the account of intelligence was “informed primarily by a set of specialists working outside the human sciences.” Each participant saw the roots of their enterprise differently. McCarthy reminisced, “anybody who was there was pretty stubborn about pursuing the ideas that he had before he came, nor was there, as far as I could see, any real exchange of ideas.”

Like Turing’s 1950 paper, the 1955 proposal for a summer workshop in artificial intelligence seems in retrospect incredibly prescient. The seven problems that McCarthy, Shannon, and their collaborators proposed to study became major pillars of computer science and the field of artificial intelligence:

  1. “Automatic Computers” (programming languages)

  2. “How Can a Computer be Programmed to Use a Language” (natural language processing)

  3. “Neuron Nets” (neural nets and deep learning)

  4. “Theory of the Size of a Calculation” (computational complexity)

  5. “Self-​improvement” (machine learning)

  6. “Abstractions” (feature engineering)

  7. “Randomness and Creativity” (Monte Carlo methods including stochastic learning).

The term “artificial intelligence,” in 1955, was an aspiration rather than a commitment to one method. AI, in this broad sense, involved both discovering what comprises human intelligence by attempting to create machine intelligence as well as a less philosophically fraught effort simply to get computers to perform difficult activities a human might attempt.

Only a few of these aspirations fueled the efforts that, in current usage, became synonymous with artificial intelligence: the idea that machines can learn from data. Among computer scientists, learning from data would be de-​emphasized for generations.

Most of the first half century of artificial intelligence focused on combining logic with knowledge hard-​coded into machines. Data collected from everyday activities was hardly the focus; it paled in prestige next to logic. In the last five years or so, artificial intelligence and machine learning have begun to be used synonymously; it’s a powerful thought-​exercise to remember that it didn’t have to be this way. For the first several decades in the life of artificial intelligence, learning from data seemed to be the wrong approach, a nonscientific approach, used by those who weren’t willing “to just program” the knowledge into the computer. Before data reigned, rules did.

For all their enthusiasm, most participants at the Dartmouth workshop brought few concrete results with them. One group was different. A team from the RAND Corporation, led by Herbert Simon, had brought the goods, in the form of an automated theorem prover. This algorithm could produce proofs of basic arithmetical and logical theorems. But math was just a test case for them. As historian Hunter Heyck has stressed, that group started less from computing or mathematics than from the study of how to understand large bureaucratic organizations and the psychology of the people solving problems within them. For Simon and his collaborator Allen Newell, human brains and computers were problem solvers of the same genus.

Our position is that the appropriate way to describe a piece of problem-​solving behavior is in terms of a program: a specification of what the organism will do under varying environmental circumstances in terms of certain elementary information processes it is capable of performing... ​Digital computers come into the picture only because they can, by appropriate programming, be induced to execute the same sequences of information processes that humans execute when they are solving problems. Hence, as we shall see, these programs describe both human and machine problem solving at the level of information processes.

Though they provided many of the first major successes in early artificial intelligence, Simon and Newell focused on a practical investigation of the organization of humans. They were interested in human problem-​solving that mixed what Jonnie Penn calls a “composite of early twentieth century British symbolic logic and the American administrative logic of a hyper-​rationalized organization.” Before adopting the moniker of AI, they positioned their work as the study of “information processing systems” comprising humans and machines alike, that drew on the best understanding of human reasoning of the time.

Simon and his collaborators were deeply involved in debates about the nature of human beings as reasoning animals. Simon later received the Nobel Prize in Economics for his work on the limitations of human rationality. He was concerned, alongside a bevy of postwar intellectuals, with rebutting the notion that human psychology should be understood as animal-​like reaction to positive and negative stimuli. Like others, he rejected a behaviorist vision of the human as driven by reflexes, almost automatically, and that learning primarily concerned the accumulation of facts acquired through such experience. Great human capacities, like speaking a natural language or doing advanced mathematics, never could emerge only from experience—​they required far more. To focus only on data was to misunderstand human spontaneity and intelligence. This generation of intellectuals, central to the development of cognitive science, stressed abstraction and creativity over the analysis of data, sensory or otherwise. Historian Jamie Cohen-​Cole explains, “Learning was not so much a process of acquiring facts about the world as of developing a skill or acquiring proficiency with a conceptual tool that could then be deployed creatively.” This emphasis on the conceptual was central to Simon and Newell’s Logic Theorist program, which didn’t just grind through logical processes, but deployed human-​like “heuristics” to accelerate the search for the means to achieve ends. Scholars such as George Pólya investigating how mathematicians solved problems had stressed the creativity involved in using heuristics to solve math problems. So mathematics wasn’t drudgery — ​it wasn’t like doing lots and lots of long division or of reducing large amounts of data. It was creative activity — ​and, in the eyes of its makers, a bulwark against totalitarian visions of human beings, whether from the left or the right. (And so, too, was life in a bureaucratic organization — ​it need not be drudgery in this picture — ​it could be a place for creativity. Just don’t tell that to its employees.)

This article originally appeared on Engadget at https://www.engadget.com/hitting-the-books-how-data-happened-wiggins-jones-ww-norton-143036972.html?src=rss

Is DALL-E’s art borrowed or stolen?

In 1917, Marcel Duchamp submitted a sculpture to the Society of Independent Artists under a false name. Fountain was a urinal, bought from a toilet supplier, with the signature R. Mutt on its side in black paint. Duchamp wanted to see if the society would abide by its promise to accept submissions without censorship or favor. (It did not.) But Duchamp was also looking to broaden the notion of what art is, saying a ready-made object in the right context would qualify. In 1962, Andy Warhol would twist convention with Campbell’s Soup Cans, 32 paintings of soup cans, each one a different flavor. Then, as before, the debate raged about whether something mechanically produced – a urinal, or a soup can (albeit hand-painted by Warhol) – counted as art, and what that meant.

Now, the debate has been turned upon its head, as machines can mass-produce unique pieces of art on their own. Generative Artificial Intelligences (GAIs) are systems which create pieces of work that can equal the old masters in technique, if not in intent. But there is a problem, since these systems are trained on existing material, often using content pulled from the internet, from us. Is it right, then, that the AIs of the future are able to produce something magical on the backs of our labor, potentially without our consent or compensation?

The new frontier

The most famous GAI right now is DALL-E 2, OpenAI’s system for creating “realistic images and art from a description in natural language.” A user could enter the phrase “teddy bears shopping for groceries in the style of Ukiyo-e,” and the model will produce pictures in that style. Similarly, ask for the bears to be shopping in Ancient Egypt and the images will look more like dioramas from a museum depicting life under the Pharaohs. To the untrained eye, some of these pictures look like they were drawn in 17th-century Japan, or shot at a museum in the 1980s. And these results are coming despite the technology still being at a relatively early stage.

OpenAI recently announced that DALL-E 2 would be made available to up to one million users as part of a large-scale beta test. Each user will be able to make 50 generations for free during their first month of use, and then 15 for every subsequent month. (A generation is either the production of four images from a single prompt, or the creation of three more if you choose to edit or vary something that’s already been produced.) Additional 115-credit packages can be bought for $15, and the company says more detailed pricing is likely to come as the product evolves. Crucially, users are entitled to commercialize the images produced with DALL-E, letting them print, sell or otherwise license the pictures borne from their prompts.

[Image: Two images of bears in different styles, produced by generative AI system DALL-E 2 (OpenAI)]

These systems did not, however, develop an eye for a good picture in a vacuum, and each GAI has to be trained. Artificial Intelligence is, after all, a fancy term for what is essentially a way of teaching software how to recognize patterns. “You allow an algorithm to develop that can be improved through experience,” said Ben Hagag, head of research at Darrow, an AI startup looking to improve access to justice. “And by experience I mean examining and finding patterns in data.” “We say to the [system] ‘take a look at this dataset and find patterns,’” and those patterns then go on to form a coherent view of the data at hand. “The model learns as a baby learns,” he said, so if a baby looked at 1,000 pictures of a landscape, it would soon understand that the sky – normally oriented across the top of the image – would be blue while the land is green.

Hagag cited how Google built its language model by training a system on several gigabytes of text, from the dictionary to examples of the written word. “The model understood the patterns, how the language is built, the syntax and even the hidden structure that even linguists find hard to define,” Hagag said. Now that model is sophisticated enough that “once you give it a few words, it can predict the next few words you’re going to write.” In 2018, Google’s Ajit Varma told The Wall Street Journal that its smart reply feature had been trained on “billions of Gmail messages,” adding that initial tests saw options like ‘I Love You’ and ‘Sent from my iPhone’ offered up since they were so commonly seen in communications.

Developers who do not have the benefit of access to a data set as vast as Google’s need to find data via other means. “Every researcher developing a language model first downloads Wikipedia then adds more,” Hagag said. He added that they are likely to pull down any, and every, piece of available data that they can find. The sassy tweet you sent a few years ago, or that sincere Facebook post, may have been used to train someone’s language model, somewhere. Even OpenAI uses social media posts via WebText, a dataset which pulls text from outbound Reddit links that received at least three karma, albeit with Wikipedia references removed.
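
That kind of curation rule is simple enough to sketch. The snippet below mimics the WebText-style filter described above with invented sample data; the real pipeline also fetches, cleans and deduplicates the text behind each surviving link:

```python
# Simplified sketch of a WebText-style filter: keep outbound Reddit links with
# at least three karma, drop Wikipedia references. Sample data is invented.
from urllib.parse import urlparse

submissions = [
    {"url": "https://example.com/essay-on-dolphins", "karma": 57},
    {"url": "https://en.wikipedia.org/wiki/Chickadee", "karma": 112},
    {"url": "https://example.org/low-effort-post", "karma": 1},
]

def keep(link):
    """Keep links with >= 3 karma that don't point at Wikipedia."""
    domain = urlparse(link["url"]).netloc
    return link["karma"] >= 3 and not domain.endswith("wikipedia.org")

training_urls = [link["url"] for link in submissions if keep(link)]
print(training_urls)  # only the first URL survives the filter
```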

Guan Wang, CTO of Huski, says that the pulling down of data is “very common.” “Open internet data is the go-to for the majority of AI model training nowadays,” he said. And that it’s the policy of most researchers to get as much data as they can. “When we look for speech data, we will get whatever speech we can get,” he added. This policy of more data-is-more is known to produce less than ideal results, and Ben Hagag cited Riley Newman, former head of data science at Airbnb, who said “better data beats more data,” but Hagag notes that often, “it’s easier to get more data than it is to clean it.”

[Image: Grid of images created by Craiyon's generative AI featuring the King visiting the seat of the Aztec empire (Craiyon / Daniel Cooper)]

DALL-E may now be available to a million users, but it’s likely that people’s first experience of a GAI is with its less-fancy sibling. Craiyon, formerly DALL-E Mini, is the brainchild of French developer Boris Dayma, who started work on his model after reading OpenAI’s original DALL-E paper. Not long after, Google and the AI development community Hugging Face ran a hackathon for people to build quick-and-dirty machine learning models. “I suggested, ‘Hey, let’s replicate DALL-E. I have no clue how to do that, but let’s do it,’” said Dayma. The team would go on to win the competition, albeit with a rudimentary, rough-around-the-edges version of the system. “The image [it produced] was clear. It wasn’t great, but it wasn’t horrible,” he added. But unlike the full-fat DALL-E, Dayma’s team was focused on slimming the model down so that it could work on comparatively low-powered hardware.

Dayma’s original model was fairly open about which image sets it would pull from, often with problematic consequences. “In early models, still in some models, you ask for a picture – for example mountains under the snow,” he said, “and then on top of it, the Shutterstock or Alamy watermark.” It’s something many AI researchers have found, with GAIs being trained on those image libraries’ public-facing catalogs, which are covered in anti-piracy watermarks.

Dayma said that the model had erroneously learned that high-quality landscape images typically had a watermark from one of those public photo libraries, and removed them from his model. He added that some early results also output not-safe-for-work responses, forcing him to make further refinements to his initial training set. Dayma added that he had to do a lot of the sorting through the data himself, and said that “a lot of the images on the internet are bad.”

But it’s not just Dayma who has noticed the regular appearance of a Shutterstock watermark, or something a lot like it, popping up in AI-generated art. Which begs the question, are people just ripping off Shutterstock’s public-facing library to train their AI? It appears that one of the causes is Google, which has indexed a whole host of watermarked Shutterstock images as part of its Conceptual Captions framework. Delve into the data, and you’ll see a list of image URLs which can be used to train your own AI model, thousands of which are from Shutterstock. Shutterstock declined to comment on the practice for this article.

A Google spokesperson said that they don’t “believe this is an issue for the datasets we’re involved with.” They also quoted from a Creative Commons report, saying that “the use of works to train AI should be considered non-infringing by default, assuming that access to the copyright works was lawful at the point of input.” That is despite the fact that Shutterstock itself expressly forbids visitors to its site from using “any data mining, robots or similar data and/or image gathering and extraction methods in connection with the site or Shutterstock content.”

Alex Cardinelli, CEO at AI startup Article Forge, says that he sees no issue with models being trained on copyrighted texts, “so long as the material itself was lawfully acquired and the model does not plagiarize the material.” He compared the situation to a student reading the work of an established author, who may “learn the author’s styles or patterns, and later find applicable places to reuse those concepts.” He added that so long as a model isn’t “copying and pasting from their training data,” then it simply repeats a pattern that has appeared since the written word began.

Dayma says that, at present, hundreds of thousands, if not millions of people are playing with his system on a daily basis. That all incurs a cost, both for hosting and processing, which he couldn’t sustain from his own pocket for very long, especially since it remains a “hobby.” Consequently, the site runs ads at the top and bottom of its page, between which you’ll get a grid of nine surreal images. “For people who use the site commercially, we could always charge for it,” he suggested. But he admitted his knowledge of US copyright law wasn’t detailed enough to be able to discuss the impact of his own model, or others in the space. This is the situation that Open AI also perhaps finds itself dealing with given that it is now allowing users to sell pictures created by DALL-E.

The law of art

The legal situation is not a particularly clear one, especially not in the US, where there have been few cases covering Text and Data Mining, or TDM. This is the technical term for the training of an AI by plowing through a vast trove of source material looking for patterns. In the US, TDM is broadly covered by Fair Use, which permits various forms of copying and scanning for the purposes of allowing access. This isn’t, however, a settled subject, but there is one case that people believe sets enough of a precedent to enable the practice.

Authors Guild v. Google (2015) was brought by a body representing authors, which accused Google of digitizing printed works that were still held under copyright. The initial purpose of the work was, in partnership with several libraries, to catalog and database the texts to make research easier. Authors, however, were concerned that Google was violating copyright, and even if it wasn’t making the text of a still-copyrighted work available publicly, it was prohibited from scanning and storing it in the first place. Eventually, the Second Circuit ruled in favor of Google, saying that digitizing copyright-protected work did not constitute copyright infringement.

Rahul Telang is Professor of Information Systems at Carnegie Mellon University, and an expert in digitization and copyright. He says that the issue is “multi-dimensional,” and that the Google Books case offers a “sort of precedent” but not a solid one. “I wish I could tell you there was a clear answer,” he said, “but it’s a complicated issue,” especially around works that may or may not be transformative. And until there is a solid case, it’s likely that courts will apply the usual tests for copyright infringement, around if a work supplants the need for the original, and if it causes economic harm to the original rights holder. Telang believes that countries will look to loosen restrictions on TDM wherever possible in order to boost domestic AI research.

The US Copyright Office says that it will register an “original work of authorship, provided that the work was created by a human being.” This is due to the old precedent that the only thing worth copyrighting is “the fruits of intellectual labor,” produced by the “creative powers of the mind.” In 1991, this principle was affirmed by a case of purloined listings from one phone book company by another. The Supreme Court held that while effort may have gone into the compilation of a phone book, the information contained therein was not an original work, created by a human being, and so therefore couldn’t be copyrighted. It will be interesting to see if there are any challenges made to users trying to license or sell a DALL-E work for this very reason.

Rob Holmes, a private investigator who works on copyright and trademark infringement with many major tech companies and fashion brands, believes that there is a reticence across the industry to pursue a landmark case that would settle the issue around TDM and copyright. “Legal departments get very little money,” he said. “All these different brands, and everyone’s waiting for the other brand, or IP owner, to begin the lawsuit. And when they do, it’s because some senior VP or somebody at the top decided to spend the money, and once that happens, there’s a good year of planning the litigation.” That often gives smaller companies plenty of time to either get their house in order, get big enough to be worth a lawsuit or go out of business.

“Setting a precedent as a sole company costs a lot of money,” Holmes said, but brands will move fast if there’s an immediate risk to profitability. Designer brand Hermés, for instance, is suing an artist named Mason Rothschild, who is producing MetaBirkins NFTs. These are styled images on a design reminiscent of Hermés’ famous Birkin handbag, something the French fashion house says is nothing more than an old-fashioned rip-off. This, too, is likely to have ramifications for the industry as it wrestles with philosophical questions of what work is sufficiently transformational as to prevent an accusation of piracy.

Artists are also able to upload their own work to DALL-E and then generate recreations in their own style. I spoke to one artist, who asked not to be named or otherwise described for fear of being identified and suffering reprisals. They showed me examples of their work alongside recreations made by DALLl-E, which while crude, were still close enough to look like the real thing. They said that, on this evidence alone, their livelihood as a working artist is at risk, and that the creative industries writ large are “doomed.”

Article Forge CEO Alex Cardinelli says that this situation, again, has historical precedent with the industrial revolution. He says that, unlike then, society has a collective responsibility to “make sure that anyone who is displaced is adequately supported.” And that anyone in the AI space should be backing a “robust safety net,” including “universal basic income and free access to education,” which he says is the “bare minimum” a society in the midst of such a revolution should offer.

Trained on your data

AIs are already in use. Microsoft, for instance, partnered with OpenAI to harness GPT-3 as a way to build code. In 2021, the company announced that it would integrate the system into its low-code app-development platform to help people build apps and tools for Microsoft products. Duolingo uses the system to improve people’s French grammar, while apps like Flowrite employs it to help make writing blog posts and emails easier and faster. Midjourney, a DALL-E 2-esque GAI for art, which has recently opened up its beta, is capable of producing stunning illustrated art – with customers charged between $10-50 a month if they wish to produce more images or use those pictures commercially.

For now, that’s something Craiyon doesn't necessarily need to worry about, since the resolution is presently so low. “People ask me ‘why is the model bad on faces’, not realizing that the model is equally good – or bad – at everything,” Dayma said. “It’s just that, you know, when you draw a tree, if the leaves are messed up you don’t care, but when the faces or eyes are, we put more attention on it.” This will, however, take time both to improve the model, and to improve the accessibility of computing power capable of producing the work. Dayma believes that despite any notion of low quality, any GAI will need to be respectful of “the applicable laws,” and that it shouldn’t be used for “harmful purposes.”

And artificial intelligence isn’t simply a toy, or an interesting research project, but something that has already caused plenty of harm. Take Clearview AI, a company that scraped several billion images, including from social media platforms, to build what it claims is a comprehensive image recognition database. According to The New York Times, this technology was used by billionaire John Catsimatidis to identify his daughter’s boyfriend. BuzzFeed News reported that Clearview has offered access not just to law enforcement – its supposed corporate goal – but to a number of figures associated with the far right. The system has also proved less than reliable, with The Times reporting that it has led to a number of wrongful arrests.

Naturally, the ability to synthesize any image without the need for a lot of photoshopping should raise alarm. Deepfakes, a system that uses AI to replace someone’s face in a video has already been used to produce adult content featuring celebrities. As quickly as companies making AIs can put in guardrails to prevent adult-content prompts, it’s likely that loopholes will be found. And as open-source research and development becomes more prevalent, it’s likely that other platforms will be created with less scrupulous aims. Not to mention the risk of this technology being used for political ends, given the ease of creating fake imagery that could be used for propaganda purposes.

Of course, Duchamp and Warhol may have stretched the definitions of what art can be, but they did not destroy art in and of itself. It would be a mistake to suggest that automating image generation will inevitably lead to the collapse of civilization. But it’s worth being cautious about the effects on artists, who may find themselves without a living if it’s easier to commission a GAI to produce something for you. Not to mention the implication for what, and how, these systems are creating material for sale on the backs of our data. Perhaps it is time that we examined if it’s necessary to implement a way of protecting our material – something equivalent to Do Not Track – to prevent it being chewed up and crunched through the AI sausage machine.

Is DALL-E’s art borrowed or stolen?

In 1917, Marcel Duchamp submitted a sculpture to the Society of Independent Artists under a false name. Fountain was a urinal, bought from a toilet supplier, with the signature R. Mutt on its side in black paint. Duchamp wanted to see if the society would abide by its promise to accept submissions without censorship or favor. (It did not.) But Duchamp was also looking to broaden the notion of what art is, saying a ready-made object in the right context would qualify. In 1962, Andy Warhol would twist convention with Campbell’s Soup Cans, 32 paintings of soup cans, each one a different flavor. Then, as before, the debate raged over whether something mechanically produced – a urinal, or a soup can (albeit hand-painted by Warhol) – counted as art, and what that meant.

Now, the debate has been turned upon its head, as machines can mass-produce unique pieces of art on their own. Generative Artificial Intelligences (GAIs) are systems which create pieces of work that can equal the old masters in technique, if not in intent. But there is a problem, since these systems are trained on existing material, often using content pulled from the internet, from us. Is it right, then, that the AIs of the future are able to produce something magical on the backs of our labor, potentially without our consent or compensation?

The new frontier

The most famous GAI right now is DALL-E 2, OpenAI’s system for creating “realistic images and art from a description in natural language.” A user could enter the phrase “teddy bears shopping for groceries in the style of Ukiyo-e,” and the model will produce pictures in that style. Similarly, ask for the bears to be shopping in Ancient Egypt and the images will look more like dioramas from a museum depicting life under the Pharaohs. To the untrained eye, some of these pictures look like they were drawn in 17th-century Japan, or shot at a museum in the 1980s. And these results are coming despite the technology still being at a relatively early stage.

OpenAI recently announced that DALL-E 2 would be made available to up to one million users as part of a large-scale beta test. Each user will be able to make 50 generations for free during their first month of use, and then 15 for every subsequent month. (A generation is either the production of four images from a single prompt, or the creation of three more if you choose to edit or vary something that’s already been produced.) Additional 115-credit packages can be bought for $15, and the company says more detailed pricing is likely to come as the product evolves. Crucially, users are entitled to commercialize the images produced with DALL-E, letting them print, sell or otherwise license the pictures borne from their prompts.
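To make that credit arithmetic concrete, here is a minimal Python sketch using only the figures quoted above; the constant and function names are my own invention, not anything from OpenAI's API.

```python
# Illustrative sketch of DALL-E 2's credit arithmetic as described above.
# Figures come from the article; names are hypothetical.

IMAGES_PER_PROMPT = 4      # one "generation" from a fresh prompt yields four images
IMAGES_PER_VARIATION = 3   # editing or varying an existing image yields three more

def images_available(credits: int, variations: int = 0) -> int:
    """Rough upper bound on images producible from a credit balance."""
    prompts = credits - variations
    return prompts * IMAGES_PER_PROMPT + variations * IMAGES_PER_VARIATION

print(images_available(50))    # free first month: up to 200 images
print(images_available(15))    # each later month: up to 60 images
print(images_available(115))   # a $15 top-up pack, spent entirely on fresh prompts
```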

Two images of bears in different styles as produced by Generative AI system DALL-E 2.
Open AI

These systems did not, however, develop an eye for a good picture in a vacuum, and each GAI has to be trained. Artificial Intelligence is, after all, a fancy term for what is essentially a way of teaching software how to recognize patterns. “You allow an algorithm to develop that can be improved through experience,” said Ben Hagag, head of research at Darrow, an AI startup looking to improve access to justice. “And by experience I mean examining and finding patterns in data.” “We say to the [system], ‘take a look at this dataset and find patterns,’” which then goes on to form a coherent view of the data at hand. “The model learns as a baby learns,” he said, so if a baby looked at 1,000 pictures of a landscape, it would soon understand that the sky – normally oriented across the top of the image – would be blue while land is green.
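As a loose illustration of that "find patterns in data" idea, here is a toy Python sketch in which a model "learns" that landscapes tend to be blue on top and green underneath. The features and examples are made up for illustration; this is not how any real GAI is trained.

```python
# Toy "learn the pattern from labeled examples" sketch.
# Each example: (fraction of blue pixels in the top half,
#                fraction of green pixels in the bottom half, label)
examples = [
    (0.8, 0.7, "landscape"),
    (0.9, 0.6, "landscape"),
    (0.2, 0.1, "portrait"),
    (0.1, 0.2, "portrait"),
]

# "Training": compute the average feature vector (centroid) for each label.
centroids = {}
for blue, green, label in examples:
    sums = centroids.setdefault(label, [0.0, 0.0, 0])
    sums[0] += blue
    sums[1] += green
    sums[2] += 1
centroids = {k: (s[0] / s[2], s[1] / s[2]) for k, s in centroids.items()}

def classify(blue: float, green: float) -> str:
    """Predict the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda k: (centroids[k][0] - blue) ** 2
                                        + (centroids[k][1] - green) ** 2)

print(classify(0.85, 0.65))  # -> "landscape": blue sky on top, green land below
```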

Hagag cited how Google built its language model by training a system on several gigabytes of text, from the dictionary to examples of the written word. “The model understood the patterns, how the language is built, the syntax and even the hidden structure that even linguists find hard to define,” Hagag said. Now that model is sophisticated enough that “once you give it a few words, it can predict the next few words you’re going to write.” In 2018, Google’s Ajit Varma told The Wall Street Journal that its smart reply feature had been trained on “billions of Gmail messages,” adding that initial tests saw options like ‘I Love You’ and ‘Sent from my iPhone’ offered up since they were so commonly seen in communications.
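The "predict the next few words" behavior Hagag describes can be sketched with a toy bigram counter: count which word most often follows each word, then suggest it. The corpus below is invented and vastly smaller than the billions of messages Google reportedly used.

```python
# Toy next-word predictor in the spirit of smart reply.
from collections import Counter, defaultdict

corpus = "i love you . i love you . i love coffee . sent from my iphone . sent from my laptop"
tokens = corpus.split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def suggest(word: str) -> str:
    """Return the most frequent continuation seen after `word`."""
    if word not in follows:
        return "<no suggestion>"
    return follows[word].most_common(1)[0][0]

print(suggest("love"))  # -> "you" (seen twice, versus "coffee" once)
print(suggest("from"))  # -> "my"
```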

Developers who do not have the benefit of access to a data set as vast as Google’s need to find data via other means. “Every researcher developing a language model first downloads Wikipedia then adds more,” Hagag said. He added that they are likely to pull down any, and every, piece of available data that they can find. The sassy tweet you sent a few years ago, or that sincere Facebook post, may have been used to train someone’s language model, somewhere. Even OpenAI uses social media posts via WebText, a dataset that pulls text from outbound Reddit links that received at least three karma, albeit with Wikipedia references removed.
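A minimal sketch of that WebText-style filter might look like the following. The post records here are hypothetical stand-ins; the real dataset was built from Reddit data dumps rather than a hand-written list.

```python
# Sketch of the filter described above: keep outbound links from Reddit posts
# with at least three karma, and drop Wikipedia URLs (handled separately).
posts = [
    {"url": "https://example.com/essay", "score": 12},
    {"url": "https://en.wikipedia.org/wiki/Cat", "score": 40},
    {"url": "https://blog.example.org/post", "score": 2},
]

MIN_KARMA = 3

def keep(post: dict) -> bool:
    if post["score"] < MIN_KARMA:
        return False                               # too little karma to count as "endorsed"
    return "wikipedia.org" not in post["url"]      # Wikipedia references removed

kept = [p["url"] for p in posts if keep(p)]
print(kept)  # ['https://example.com/essay']
```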

Guan Wang, CTO of Huski, says that this pulling down of data is “very common.” “Open internet data is the go-to for the majority of AI model training nowadays,” he said, and it’s the policy of most researchers to get as much data as they can. “When we look for speech data, we will get whatever speech we can get,” he added. This more-is-more approach to data is known to produce less than ideal results: Ben Hagag cited Riley Newman, former head of data science at Airbnb, who said “better data beats more data,” but Hagag notes that often, “it’s easier to get more data than it is to clean it.”
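To illustrate the "better data beats more data" point, here is a tiny cleaning pass that drops exact duplicates and very short fragments. The thresholds and samples are illustrative; real data-cleaning pipelines are far more involved.

```python
# Minimal cleaning pass over scraped text: drop exact duplicates and
# fragments too short to be useful. Data and thresholds are made up.
raw_samples = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog.",   # exact duplicate
    "click here",                                      # boilerplate fragment
    "Photosynthesis converts light energy into chemical energy in plants.",
]

MIN_WORDS = 5
seen = set()
cleaned = []
for text in raw_samples:
    normalized = " ".join(text.lower().split())
    if normalized in seen or len(normalized.split()) < MIN_WORDS:
        continue
    seen.add(normalized)
    cleaned.append(text)

print(len(raw_samples), "->", len(cleaned), "samples after cleaning")
```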

Grid of images created by CRAIYON's generative AI featuring the King visiting the seat of the Aztec empire.
Craiyon / Daniel Cooper

DALL-E may now be available to a million users, but it’s likely that people’s first experience of a GAI is with its less-fancy sibling. Craiyon, formerly DALL-E Mini, is the brainchild of French developer Boris Dayma, who started work on his model after reading OpenAI’s original DALL-E paper. Not long after, Google and the AI development community HuggingFace ran a hackathon for people to build quick-and-dirty machine learning models. “I suggested, ‘Hey, let’s replicate DALL-E. I have no clue how to do that, but let’s do it,’” said Dayma. The team would go on to win the competition, albeit with a rudimentary, rough-around-the-edges version of the system. “The image [it produced] was clear. It wasn’t great, but it wasn’t horrible,” he added. But unlike the full-fat DALL-E, Dayma’s team was focused on slimming the model down so that it could work on comparatively low-powered hardware.

Dayma’s original model was fairly open about which image sets it would pull from, often with problematic consequences. “In early models, still in some models, you ask for a picture – for example mountains under the snow,” he said, “and then on top of it, the Shutterstock or Alamy watermark.” It’s something many AI researchers have found, with GAIs being trained on those image libraries’ public-facing catalogs, which are covered in anti-piracy watermarks.

Dayma said that the model had erroneously learned that high-quality landscape images typically had a watermark from one of those public photo libraries, so he removed those images from his training data. He added that some early results also included not-safe-for-work output, forcing him to make further refinements to his initial training set. Dayma had to do much of the sorting through the data himself, noting that “a lot of the images on the internet are bad.”

But it’s not just Dayma who has noticed the regular appearance of a Shutterstock watermark, or something a lot like it, popping up in AI-generated art, which raises the question: are people just ripping off Shutterstock’s public-facing library to train their AI? One of the causes appears to be Google, which has indexed a whole host of watermarked Shutterstock images as part of its Conceptual Captions dataset. Delve into the data and you’ll see a list of image URLs that can be used to train your own AI model, thousands of which are from Shutterstock. Shutterstock declined to comment on the practice for this article.
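A rough way to check the scale of that yourself is sketched below. It assumes a Conceptual Captions-style tab-separated file of captions and image URLs; the filename is hypothetical.

```python
# Sketch: count how many image URLs in a caption<TAB>url TSV point at Shutterstock.
import csv
from urllib.parse import urlparse

def count_shutterstock(tsv_path: str) -> tuple[int, int]:
    total = shutterstock = 0
    with open(tsv_path, newline="", encoding="utf-8") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) < 2:
                continue                      # skip malformed lines
            url = row[1]
            total += 1
            if "shutterstock" in urlparse(url).netloc.lower():
                shutterstock += 1
    return shutterstock, total

hits, total = count_shutterstock("conceptual_captions_train.tsv")  # hypothetical file
print(f"{hits} of {total} image URLs point at Shutterstock")
```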

A Google spokesperson said that they don’t “believe this is an issue for the datasets we’re involved with.” They also quoted from a Creative Commons report, saying that “the use of works to train AI should be considered non-infringing by default, assuming that access to the copyright works was lawful at the point of input.” That is despite the fact that Shutterstock itself expressly forbids visitors to its site from using “any data mining, robots or similar data and/or image gathering and extraction methods in connection with the site or Shutterstock content.”

Alex Cardinelli, CEO at AI startup Article Forge, says that he sees no issue with models being trained on copyrighted texts, “so long as the material itself was lawfully acquired and the model does not plagiarize the material.” He compared the situation to a student reading the work of an established author, who may “learn the author’s styles or patterns, and later find applicable places to reuse those concepts.” He added that so long as a model isn’t “copying and pasting from their training data,” then it simply repeats a pattern that has appeared since the written word began.

Dayma says that, at present, hundreds of thousands, if not millions, of people are playing with his system on a daily basis. That all incurs a cost, both for hosting and processing, which he couldn’t sustain from his own pocket for very long, especially since it remains a “hobby.” Consequently, the site runs ads at the top and bottom of its page, between which you’ll get a grid of nine surreal images. “For people who use the site commercially, we could always charge for it,” he suggested. But he admitted his knowledge of US copyright law wasn’t detailed enough to be able to discuss the impact of his own model, or others in the space. It’s a situation OpenAI may also find itself dealing with, given that it now allows users to sell pictures created by DALL-E.

The law of art

The legal situation is not a particularly clear one, especially not in the US, where there have been few cases covering Text and Data Mining, or TDM. This is the technical term for the training of an AI by plowing through a vast trove of source material looking for patterns. In the US, TDM is broadly covered by Fair Use, which permits various forms of copying and scanning for the purposes of allowing access. This isn’t, however, a settled subject, but there is one case that people believe sets enough of a precedent to enable the practice.

Authors Guild v. Google (2015) was brought by a body representing authors, which accused Google of digitizing printed works that were still held under copyright. The initial purpose of the work was, in partnership with several libraries, to catalog and index the texts to make research easier. Authors, however, argued that Google was violating copyright: even if it wasn’t making the text of a still-copyrighted work publicly available, it should be prohibited from scanning and storing it in the first place. Eventually, the Second Circuit ruled in favor of Google, holding that its digitization of copyright-protected works was a transformative fair use rather than infringement.

Rahul Telang is Professor of Information Systems at Carnegie Mellon University and an expert in digitization and copyright. He says that the issue is “multi-dimensional,” and that the Google Books case offers a “sort of precedent” but not a solid one. “I wish I could tell you there was a clear answer,” he said, “but it’s a complicated issue,” especially around works that may or may not be transformative. Until there is a solid case, it’s likely that courts will apply the usual tests for copyright infringement: whether a work supplants the need for the original, and whether it causes economic harm to the original rights holder. Telang believes that countries will look to loosen restrictions on TDM wherever possible in order to boost domestic AI research.

The US Copyright Office says that it will register an “original work of authorship, provided that the work was created by a human being.” This is due to the old precedent that the only thing worth copyrighting is “the fruits of intellectual labor,” produced by the “creative powers of the mind.” In 1991, this principle was affirmed in Feist Publications v. Rural Telephone Service, a case about listings purloined from one phone book company by another. The Supreme Court held that while effort may have gone into compiling a phone book, the information contained therein was not an original creative work and therefore couldn’t be copyrighted. It will be interesting to see if there are any challenges made to users trying to license or sell a DALL-E work for this very reason.

Rob Holmes, a private investigator who works on copyright and trademark infringement with many major tech companies and fashion brands, believes that there is a reticence across the industry to pursue a landmark case that would settle the issue around TDM and copyright. “Legal departments get very little money,” he said. “All these different brands, and everyone’s waiting for the other brand, or IP owner, to begin the lawsuit. And when they do, it’s because some senior VP or somebody at the top decided to spend the money, and once that happens, there’s a good year of planning the litigation.” That often gives smaller companies plenty of time to either get their house in order, get big enough to be worth a lawsuit or go out of business.

“Setting a precedent as a sole company costs a lot of money,” Holmes said, but brands will move fast if there’s an immediate risk to profitability. Designer brand Hermès, for instance, is suing an artist named Mason Rothschild, who is producing MetaBirkins NFTs. These are stylized images of a design reminiscent of Hermès’ famous Birkin handbag, something the French fashion house says is nothing more than an old-fashioned rip-off. This, too, is likely to have ramifications for the industry as it wrestles with philosophical questions of what work is sufficiently transformative to fend off an accusation of piracy.

Artists are also able to upload their own work to DALL-E and then generate recreations in their own style. I spoke to one artist, who asked not to be named or otherwise described for fear of being identified and suffering reprisals. They showed me examples of their work alongside recreations made by DALL-E, which, while crude, were still close enough to look like the real thing. They said that, on this evidence alone, their livelihood as a working artist is at risk, and that the creative industries writ large are “doomed.”

Article Forge CEO Alex Cardinelli says that this situation, again, has historical precedent in the Industrial Revolution. He says that, unlike then, society has a collective responsibility to “make sure that anyone who is displaced is adequately supported.” He believes anyone in the AI space should back a “robust safety net,” including “universal basic income and free access to education,” which he says is the “bare minimum” a society in the midst of such a revolution should offer.

Trained on your data

AIs are already in use. Microsoft, for instance, partnered with OpenAI to harness GPT-3 as a way to build code. In 2021, the company announced that it would integrate the system into its low-code app-development platform to help people build apps and tools for Microsoft products. Duolingo uses the system to improve people’s French grammar, while apps like Flowrite employ it to help make writing blog posts and emails easier and faster. Midjourney, a DALL-E 2-esque GAI for art that recently opened its beta, is capable of producing stunning illustrated art – with customers charged between $10 and $50 a month if they wish to produce more images or use those pictures commercially.

For now, that’s something Craiyon doesn't necessarily need to worry about, since the resolution is presently so low. “People ask me ‘why is the model bad on faces’, not realizing that the model is equally good – or bad – at everything,” Dayma said. “It’s just that, you know, when you draw a tree, if the leaves are messed up you don’t care, but when the faces or eyes are, we put more attention on it.” It will, however, take time both to improve the model and to make the computing power needed to produce the work more accessible. Dayma believes that despite any notion of low quality, any GAI will need to be respectful of “the applicable laws,” and that it shouldn’t be used for “harmful purposes.”

And artificial intelligence isn’t simply a toy, or an interesting research project, but something that has already caused plenty of harm. Take Clearview AI, a company that scraped several billion images, including from social media platforms, to build what it claims is a comprehensive image recognition database. According to The New York Times, this technology was used by billionaire John Catsimatidis to identify his daughter’s boyfriend. BuzzFeed News reported that Clearview has offered access not just to law enforcement – its supposed corporate goal – but to a number of figures associated with the far right. The system has also proved less than reliable, with The Times reporting that it has led to a number of wrongful arrests.

Naturally, the ability to synthesize any image without the need for a lot of photoshopping should raise alarm. Deepfakes, a technique that uses AI to replace someone’s face in a video, have already been used to produce adult content featuring celebrities. As quickly as companies making AIs can put in guardrails to prevent adult-content prompts, it’s likely that loopholes will be found. And as open-source research and development becomes more prevalent, it’s likely that other platforms will be created with less scrupulous aims. Not to mention the risk of this technology being used for political ends, given the ease of creating fake imagery that could be used for propaganda purposes.

Of course, Duchamp and Warhol may have stretched the definitions of what art can be, but they did not destroy art in and of itself. It would be a mistake to suggest that automating image generation will inevitably lead to the collapse of civilization. But it’s worth being cautious about the effects on artists, who may find themselves without a living if it’s easier to commission a GAI to produce something for you. Not to mention the implications of what, and how, these systems create material for sale on the back of our data. Perhaps it is time we examined whether it’s necessary to implement a way of protecting our material – something equivalent to Do Not Track – to prevent it being chewed up and crunched through the AI sausage machine.

Mozilla made a Firefox plugin for offline translation

Mozilla has created a translation plugin for Firefox that works offline. Firefox Translations will need to download some files the first time you convert text in a specific language. However, it will be able to use your system's resources to handle the translation, rather than sending the information to a data center for cloud processing.

The plugin emerged as a result of Mozilla's work with the European Union-funded Project Bergamot. Others involved include the University of Edinburgh, Charles University, University of Sheffield and University of Tartu. The goal was to develop neural machine tools to help Mozilla create an offline translation option. "The engines, language models and in-page translation algorithms would need to reside and be executed entirely in the user’s computer, so none of the data would be sent to the cloud, making it entirely private," Mozilla said.
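Mozilla's add-on relies on the Bergamot project's own engines, but the underlying pattern (download a compact translation model once, then run every translation locally) can be sketched with an off-the-shelf model from the Hugging Face transformers library. This is purely illustrative, assuming that library is installed, and is not the plugin's actual code.

```python
# Illustration of on-device translation: the model files are fetched once and
# cached locally, after which translation runs without sending text to a server.
# Uses an off-the-shelf MarianMT model, not Mozilla's Bergamot engine.
from transformers import MarianMTModel, MarianTokenizer

MODEL = "Helsinki-NLP/opus-mt-en-es"   # English -> Spanish

tokenizer = MarianTokenizer.from_pretrained(MODEL)   # downloaded once, then cached
model = MarianMTModel.from_pretrained(MODEL)

def translate(texts: list[str]) -> list[str]:
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

print(translate(["The translation happens entirely on this machine."]))
```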

One of the big limitations of the plugin as things stand is that it can only handle translations between English and 12 other languages, according to TechCrunch. For now, Firefox Translations supports Spanish, Bulgarian, Czech, Estonian, German, Icelandic, Italian, Norwegian Bokmål and Nynorsk, Persian, Portuguese and Russian.

Mozilla and its partners on the project have created a training pipeline through which volunteers can help train new models so more languages can be added. They're looking for feedback on existing models too, so Firefox Translations is very much a work in progress.

For the time being, though, the plugin can't hold a candle to the 133 languages that Google Translate supports. Apple and Google both have mobile apps that can handle offline translations as well.

On the surface, it's a little odd that a browser, which is by definition used to access the web, would need an offline translation option. But translating text on your device and avoiding the need to transfer it to and from a data center could be a boon for privacy and security.