Oriole Networks: a paradigm shift for AI interconnect
As companies continue to train larger and larger AI models, the amount of data that needs to be moved between GPUs is growing extremely rapidly, because the workloads are highly synchronous. This places huge strain on the ‘interconnect’, the networking technology that connects thousands or even hundreds of thousands of GPUs together. As much as 90% of the time training a large AI model is spent moving data around, which is an inefficient use of hugely expensive GPUs as well as a contributor to the substantial energy footprint of AI clusters. To put this in context, the most recent xAI cluster constructed by Elon Musk is reported to have 100,000 H100 GPUs costing $6b and to require 100MW of power - enough energy for 50,000 homes, all generated by burning natural gas, with the associated CO2 emissions. The same story is playing out across the AI industry.
In the UK in 1952 a revolutionary new technology was invented - fibre optic cable. Instead of sending information using electrons along metal wire, it became possible to send it using photons along glass fibres. There were huge benefits in doing this: you can move the information at the speed of light, substantially increasing the fidelity of information transmission, increasing the bandwidth and lowering the energy required.
Fast forward to today and you see the echoes of this revolution in our domestic lives - fibre to the home is now commonplace across London and scars on roads across the city show the signs of a huge infrastructure overhaul - we dug up the ground to lay pipes filled with glass to move information at the speed of light.
The photonic revolution has continued beyond telecoms into ever smaller photonic systems and the rise of ‘photonic integrated circuits’, analogous to the rise of electronic integrated circuits.
Professor George Zervas at University College London is one of the world’s leading experts in optical networked systems. Remarkably, he has been winner or runner-up in the Fabio Neri Award for the best paper in optical switching and networking for 4 of the last 10 years. He has spent his career pushing the limits of how we connect large networks with photonics. His mission has been to find a way to make a wholesale shift from a datacentre networked via electronic packet switching to one that is fully optical. This requires a whole system solution integrating software and hardware to connect many thousands of GPUs to each other in a way that has never been done previously. He and his team members have made the key technical breakthroughs, and have shown in simulation that they deliver extraordinary gains. With Oriole’s network, George believes it will be possible to unlock up to a 10x gain in AI cluster performance - so you could train a model like GPT-4 10 times faster, or with fewer GPUs. Just as importantly, the proposed optical network consumes a fraction of the power compared to current AI networks.
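For a rough feel of how a figure like “up to 10x” relates to the earlier point that as much as 90% of training time can go on moving data, here is a back-of-the-envelope sketch using Amdahl’s law. This is my own illustration, not Oriole’s simulation or methodology: the function names are hypothetical, and it assumes communication and compute do not overlap and that only the communication share of the runtime gets faster.

```python
def amdahl_speedup(comm_fraction: float, network_speedup: float) -> float:
    """Overall speedup when only the communication share of runtime is accelerated.

    comm_fraction: share of training time spent moving data (0.0 to 1.0).
    network_speedup: how many times faster the data movement becomes (> 0).
    Assumption: compute and communication do not overlap (a simplification).
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if network_speedup <= 0:
        raise ValueError("network_speedup must be positive")
    compute_fraction = 1.0 - comm_fraction
    return 1.0 / (compute_fraction + comm_fraction / network_speedup)


def speedup_ceiling(comm_fraction: float) -> float:
    """Upper bound on speedup if data movement became effectively free."""
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if comm_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - comm_fraction)


if __name__ == "__main__":
    # 0.9 is the "as much as 90%" figure quoted above; the network speedups
    # below are illustrative values, not measurements.
    for s in (2, 10, 100):
        print(f"network {s}x faster -> cluster {amdahl_speedup(0.9, s):.2f}x faster")
    print(f"ceiling at 90% communication: {speedup_ceiling(0.9):.1f}x")
```

Under these simplified assumptions, a cluster that spends 90% of its time on communication tops out at roughly a 10x overall gain, which is at least consistent with the scale of improvement George is describing. Real clusters overlap compute and communication, so treat this as intuition rather than a prediction.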
Simulation is great, but to convince the industry to make such a paradigm shift you need to actually build it, so George formed a spinout and recruited James Regan to join him as CEO and co-founder. While George has spent his career pioneering research into optical networking, James has spent his career industrialising photonics. From Nortel to Agility Communications, and from Isca Photonics to EFFECT Photonics, James has lived the photonics revolution and deeply understands what it takes to build these products. He has seen the photonic supply chain mature to the point where a startup like Oriole can be ‘fabless’ and take advantage of foundries that produce photonic integrated circuits.
I’m deeply motivated by the potential climate impact of Oriole. If society continues to scale these AI systems, we urgently need a way to reduce their carbon footprint. This is a longstanding area of focus for us at Plural with our investments in Proxima Fusion (stellarator fusion), Isometric (carbon removal verification) and Field Energy (grid scale batteries) among others.
I’m also keen to see a UK start-up play a real leadership role in the future of the AI compute stack. Nvidia recognised the importance of better interconnect with their $6.9b acquisition of Mellanox (creators of InfiniBand) in 2019 - roughly 4% of their market cap at the time. At today’s market cap of $3.3 trillion, this would be over $130b - and if you scan a list of GPU clusters, InfiniBand’s dominance is obvious. The most important AI startup to come out of UCL, DeepMind, failed to find the support it deserved from Europe’s venture capital ecosystem and had to go to Silicon Valley and Hong Kong to find ambitious enough early investors. As I wrote in 2018, DeepMind’s early sale to Google was a hugely significant branching event for the UK.
With Oriole, it’s time for the next generation of UCL founders to get better support. The company has accelerated out of the blocks, raising almost $40m in the first year and a half since founding. They are hiring some of the top experts in optical networking and sprinting to get this network built and to put an alpha product in the hands of customers next year. They’ve also brought on strategic investors like XTX Markets, who operate one of the largest clusters in Europe, and have a longstanding commitment to tackling their climate footprint.
We are excited to support Oriole in their mission to bring a paradigm shift in AI interconnect. Thanks to Jeremy at Fortune for digging into the story more here.
UK stand up tall!