Even a toddler can figure out the right way to put together a pizza: you roll out the dough, add some sauce, sprinkle on cheese, put the toppings on, then pop the whole thing in the oven.
It’s a much trickier task for a computer to grasp, however. How does it know what to do first? Whether cheese should go on before or after sauce? Is there a right way to arrange toppings? And what about that whole baking thing?
Researchers at MIT and the Qatar Computing Research Institute set out to answer these questions with a recent project in which they taught artificial intelligence to, well, not exactly make a pizza but, more precisely, figure out the order in which it should be constructed. Essentially, the researchers built an AI system that can look at a photo of a pizza and deduce which ingredients should go on which layer of the pie. The researchers presented a paper on their work last week at an AI conference in Long Beach, California.
It might sound silly, but there’s a bigger point than creating AI that knows whether pepperoni should be placed on top of cheese.
Computers can already learn how to identify specific objects in images, but when some of those objects are partially hidden (say, arugula laid atop prosciutto), it gets harder for them to figure out what they’re looking at. And with food, which often has many different layers (think a lattice-topped pie or a salad), it can be particularly tricky for a computer to figure out what should go where. To see a picture and say it’s a pizza is easy. To be able to break it down into its various parts and reassemble it is a bit closer to understanding.
Dimitrios Papadopoulos, a postdoctoral researcher at MIT who led the project, told CNN Business that if a computer can determine the essential ingredients of a pizza and how they should be layered, for instance, it may more easily be able to figure out the various parts of other kinds of food images, too.
“Food is a big aspect of our lives, and also cooking, so we wanted to have a model that could understand food in general,” Papadopoulos said.
Why start with pizza, though? Papadopoulos said that he and his fellow researchers knew they wanted to work on an AI project related to food. And when they started thinking about building AI that could mirror a recipe’s procedure and deconstruct an image into layers, pizza immediately sprang to mind.
Also, it’s incredibly easy to find photos of pizza online, and they tend to be pretty uniform: many show a round pie, shot from the top, with dough, sauce, and toppings.
The researchers collected thousands of pizza photos from Instagram, then had workers from Amazon’s Mechanical Turk service label ingredients such as tomatoes, olives, basil, cheese, pepperoni, peppers, and a few types of sauce. After that, they used these labeled photos to train a bunch of ingredient-specific generative adversarial networks, or GANs, which consist of two neural networks competing with each other to come up with something new based on the data set. In this case, each of these GANs can look at a photo of a pizza and generate a new image of the pizza that either adds an ingredient that wasn’t on it previously or subtracts one that was already on the pie.
For instance, there is a GAN for adding or subtracting pepperoni: show it a picture of a pepperoni pizza, and it should be able to generate a new pizza that is identical but has no pepperoni on it, and vice versa, as the researchers illustrate in their paper. Others can do things such as add or subtract arugula or make the pizza appear baked or unbaked.
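The modular idea behind those networks can be sketched symbolically: one add/remove "module" per ingredient, composed to peel a pizza apart layer by layer, with the reversed removal sequence giving the inferred assembly order. The class and function names below are hypothetical, and the real system learns each module as a GAN operating on photos rather than on symbolic lists; this is only a toy illustration of the composition.

```python
class IngredientModule:
    """Stands in for an ingredient-specific GAN pair (add / remove)."""

    def __init__(self, ingredient):
        self.ingredient = ingredient

    def add(self, layers):
        # "Generate" a pizza with this ingredient placed on top.
        return layers + [self.ingredient]

    def remove(self, layers):
        # "Generate" the pizza with this ingredient taken off the top;
        # return None if it isn't the topmost layer.
        if layers and layers[-1] == self.ingredient:
            return layers[:-1]
        return None


def infer_build_order(layers, modules):
    """Peel the pizza: at each step, apply whichever remover succeeds.
    The reversed sequence of removals is the inferred assembly order."""
    removed = []
    while layers:
        for module in modules:
            peeled = module.remove(layers)
            if peeled is not None:
                removed.append(module.ingredient)
                layers = peeled
                break
    return list(reversed(removed))


modules = [IngredientModule(i) for i in ["dough", "sauce", "cheese", "pepperoni"]]
pizza = ["dough", "sauce", "cheese", "pepperoni"]
print(infer_build_order(pizza, modules))  # ['dough', 'sauce', 'cheese', 'pepperoni']
```

In the actual paper, the "remove" step is an image-to-image GAN, so occlusion decides what counts as the top layer; here a simple list stands in for that visual reasoning.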
Papadopoulos believes this research could lead to non-food applications as well, such as a digital shopping assistant that uses AI to figure out how to put together a fashionable outfit.
“It’s exactly the same idea: you don’t try to add pepperoni; you try to add a jacket,” he said.