Building Procedural Manhattan in Marvel's Spider-Man
Artifacts is a new YouTube series made possible thanks to the AI and Games crowdfunding community on Patreon, YouTube and right here with paid subscriptions on Substack.
Support the show to have your name in video credits, contribute to future episode topics, watch content in early access and receive exclusive supporters-only content.
Did you know that the vast majority of the Manhattan you swing through in Insomniac's Spider-Man games is actually procedurally generated?
Yup, that's right. The buildings, the shop fronts, the hot dog stands, the NPCs shootin' hoops, the traffic, the civilians, and yeah, even the crime: it's all built using a procedural generation system.
Marvel's Spider-Man was the biggest creative challenge that developers Insomniac Games had ever faced, and to help them get the game to market back in 2018, they built a suite of procedural systems that would take on the bulk of the work in crafting the friendly and sometimes not-so-friendly neighborhoods of Manhattan.
I'm Tommy Thompson, and welcome to Artifacts here on AI and Games. In this episode, we're swinging through the streets of New York to find out just how exactly this procedural pipeline works, and critically, the requirements it needed to satisfy: it doesn't just create a pretty-looking game world, it's also responsible for ensuring Spider-Man's web-swinging and movement mechanics actually work in moment-to-moment gameplay.
WARNING: This article contains spoilers for Marvel’s Spider-Man and Marvel’s Spider-Man: Miles Morales. With references to specific characters, plot elements and story missions.
Spider-Man's Manhattan
To understand the rationale for having this procedural system in place, it's worth taking a moment to highlight the size and scale of New York City in the 2018 Spider-Man game. This fictionalised version of the city is 3km wide and 6km long, spanning nine districts of Manhattan: Chinatown, the Financial District, Greenwich Village, Hell's Kitchen, Midtown, Central Park, the Upper East Side, Harlem and the Upper West Side. Plus, on top of all of this, there's a dedicated play space for Ryker's Island, given it's home to the superhuman prison The Raft, which is critical to the storyline.
And of course, each district has its own style: be it distinct, recognisable buildings, from the likes of the Empire State Building to Madison Square Garden, or Easter eggs for fans like the Sanctum Sanctorum, the Daily Bugle or the Avengers Tower. But those aren't the bulk of the city. The final statistics of Spider-Man's Manhattan are pretty staggering, with over 8,300 buildings, 544 roads, 350 store fronts and over 1,200 alleys, not to mention all of the events and activities that can take place in those spaces. Each of these elements needs to be placed in locations that make sense and feed into the overall design goals.
While Insomniac has many a talented artist, during the four-year development of the 2018 Spider-Man game they were also spread across multiple projects, including the likes of the 2016 Ratchet & Clank remake for PS4. It would have been simply impossible to build the entire city by hand within that timeframe, even if they weren't working on other titles. Plus, they subsequently rebuilt portions of the same map for 2020's Spider-Man: Miles Morales, both to reflect changes to the city brought about by the previous game's narrative and to redecorate it for the winter weather and holiday theme. Not to mention the expansion of the map for 2023's Spider-Man 2. All of this dictated that the team needed some sort of automation to speed up the process.
Now it's worth stating that when I say New York is built procedurally, I do not mean at runtime. The Manhattan you play the first time you boot up the game is the same as it is the 100th time you play (even as the maps change slightly to accommodate each game's narrative). There is no procedural generation engine in the shipped game. Instead, Insomniac built a procedural generation workflow into their development process: the development team would focus on making a simple blocked-out version of the map that felt fun to move around in, and the procedural system would then be responsible for implementing their vision, creating the city by placing the buildings and adding all of the art assets the art teams had been crafting as modular components. After the city was generated, artists and level designers would go back in, observe the generated environment, make changes by hand, and finally approve it and lock it all down. But critically, during development the team mandated that the map had to be playable at all times, so that other parts of the development team could implement the features they were working on.
So let's explain the workflow for building the city, and how the procedural systems work around the developers' needs.
Tiles for Miles
As stated already, Spider-Man's Manhattan is a 3km x 6km play space, but in Insomniac's proprietary game engine it's actually a series of tiles. The entire map is broken up into around 700 tiles, each 128 x 128m. This allowed art and design teams to isolate and focus on segments of the map, and it also worked for the procedural system, given it could focus on generating individual tiles. In fact, each tile is essentially a JSON representation of all the prefabs, models, textures and other assets that make up that region. As we'll see in a second, the procedural generator helped determine this information, meaning it could be saved and then reloaded into the game when needed.
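To make that idea a little more concrete, here's a minimal sketch of what a tile manifest of this kind could look like, and how it might be saved and reloaded. To be clear, the field names and structure here are my own assumptions for illustration, not Insomniac's actual schema.

```python
import json

# A hypothetical 128m x 128m tile described as JSON: a list of asset references
# (prefabs, models, textures) plus the transforms needed to place them in the world.
# Every field name here is an illustrative guess, not Insomniac's real format.
tile_manifest = {
    "tile_id": "midtown_042",
    "bounds_m": {"width": 128, "depth": 128},
    "placements": [
        {"prefab": "bldg_midtown_office_12", "position": [14.0, 0.0, 37.5], "rotation_deg": 90},
        {"prefab": "prop_hotdog_stand_a", "position": [60.2, 0.0, 5.1], "rotation_deg": 180},
        {"prefab": "prop_streetlamp_b", "position": [61.0, 0.0, 2.0], "rotation_deg": 0},
    ],
}

# Save the generator's output to disk...
with open("midtown_042.json", "w") as f:
    json.dump(tile_manifest, f, indent=2)

# ...and reload it later, e.g. when the tile needs to come back into the editor or game.
with open("midtown_042.json") as f:
    tile = json.load(f)
print(f"{tile['tile_id']}: {len(tile['placements'])} placements")
```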
As stated by Josue Benavidez, a senior designer on Marvel's Spider-Man, they wanted it to look and feel like a natural city, but also be fun and intuitive for players to navigate across; hence the use of point launches, parkour and wall running. So they started out by building a rough first pass of what the map might look like, using key landmarks to help define it, but also defining metrics for buildings and regions that would help give character: ensuring streets are wide enough to be fun to swing across, and standardising building heights so you can tell that you're no longer swinging through Hell's Kitchen, you're clearly in Midtown. As we'll see in a second, all of this was then fed into the procedural systems so that they could generate everything the designers needed, and the team would run generation repeatedly to get the desired outputs and then edit them to suit.
In fact, this workflow meant that the team could still be making adjustments to the fundamental layout and structure of the city in the closing months of development, given the procedural tools helped speed things up. As noted by David Santiago, a principal technical artist at the studio, the road layout was still being adjusted in January of 2018, less than nine months prior to the game's launch. Meanwhile, the final edit to the game map was made in June, at a time when the game was being locked down for its gold master candidate.
So now let's dig a little deeper, into how the procedural workflow was designed.
Procedural Workflow
The procedural workflow, designed to interface between the popular procedural 3D software Houdini and Insomniac's game engine, is broken into three steps: layout, generation and review. The system could operate on one or more tiles at once and respect the constraints applied to it. The exact algorithm deployed for this process isn't clear, but it's heavily constraint-based: designers would define metrics for street sizes, kerb lengths, building heights, road configurations, city districts and more, all of which the system had to respect during generation.
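To give a flavour of what constraint metrics like these might look like in practice, here's a small hypothetical sketch. The numbers, district names and field names are my own assumptions for illustration, not values pulled from Insomniac's tools.

```python
# Hypothetical per-district metrics of the kind designers might hand to a
# constraint-based generator. All values here are illustrative assumptions.
DISTRICT_METRICS = {
    "hells_kitchen": {"min_street_width_m": 18, "building_height_m": (20, 60)},
    "midtown": {"min_street_width_m": 24, "building_height_m": (80, 300)},
}

def respects_constraints(district: str, street_width_m: float, building_height_m: float) -> bool:
    """Check whether a candidate building placement satisfies its district's metrics."""
    metrics = DISTRICT_METRICS[district]
    low, high = metrics["building_height_m"]
    return street_width_m >= metrics["min_street_width_m"] and low <= building_height_m <= high

# A 250m skyscraper on a 26m-wide street suits Midtown, but would be rejected in
# Hell's Kitchen, preserving the readable difference in skyline between districts.
print(respects_constraints("midtown", 26, 250))        # True
print(respects_constraints("hells_kitchen", 26, 250))  # False
```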
Phase 1: Initial Layout
Level designers carve out the shape of the map: where roads and alleyways are placed, where key buildings will be, and which areas are going to be important mission spaces. The procedural system could then draw in the roads and ensure they all fit together correctly. Buildings are placed as rough structures using simple 3D shapes, largely to indicate the size, shape and rough space each may take up.
Phase 2: The Heavy Lifting
The procedural workflow takes the simple building shapes and replaces them with a fully generated and decorated set of buildings. All the while, it also generates a lot of additional decoration, such as street lamps, subway entrances, manhole covers, trees, piles of trash and more.
Phase 3: Polish
While this system could generate tiles quickly, that didn't mean they were ready to ship. Instead, it was used as a first pass to get the bulk of the work completed, allowing design and art teams to then work on each tile, make further changes, fix things the generator got wrong, and ultimately use their skills to make these spaces as interesting as possible.
The generation process could take anywhere from 5 to 15 minutes to run on a full tile, depending on the amount of work involved (it could regenerate the entire tile, or just the areas designers asked it to focus on). By comparison, it was estimated that a human would take anywhere from 30 minutes to 4 hours to complete the same workload. Hence a huge cost saving, even if the output wasn't yet good enough to ship. If needed, the entire city could be rebuilt procedurally overnight, running on a server in the studio and ready for developers to return to the following day.

But as we'll see throughout, it was about more than just building creation: light and reflection probes would be placed into each area so that the game could calculate in-game lighting correctly, and the same goes for audio occlusion and attenuation. Typically this is done by hand by a level designer, but the procedural system would figure out strong placements for the relevant assets in each tile. Plus, as we'll see in a second, it has a huge impact on Spider-Man's web-swinging mechanics. Note that this process is not used on every single tile, given specific buildings are crafted by hand and Central Park is treated very much as a separate entity.
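As a rough illustration of how an overnight rebuild like that could be orchestrated, here's a minimal sketch that farms tiles out across a build machine. The function names, worker count and simulated timings are all hypothetical stand-ins, not Insomniac's actual pipeline.

```python
import random
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in for invoking the Houdini-backed generator on one tile;
# here we simply simulate the reported 5-15 minute generation time per tile.
def regenerate_tile(tile_id: str) -> tuple:
    return tile_id, random.uniform(5, 15) * 60  # seconds

def overnight_rebuild(tile_ids, workers: int = 32) -> None:
    """Regenerate every requested tile in parallel across the build server,
    so the rebuilt map is waiting for the team the following morning."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for tile_id, elapsed in pool.map(regenerate_tile, tile_ids):
            print(f"rebuilt {tile_id} in {elapsed / 60:.1f} minutes")

if __name__ == "__main__":
    # e.g. rebuild the whole ~700-tile map, or only the tiles flagged as dirty.
    overnight_rebuild([f"tile_{i:03d}" for i in range(700)])
```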
But while this was a technological marvel (sorry), level design was far from a trivial process. The procedural system was continually being updated and improved throughout Spider-Man's four-year development cycle, be it adding new features or improving quality. It also took a long time for any one tile in the map to receive sign-off, given missions would change, the needs of the environment would change, or art assets would change. So the team was constantly regenerating the map to try and fit their goals.
Ensuring Gameplay
Now while we've focussed on the placement of art and props, as well as other engine features such as lighting, there is one critical part of all of the Spider-Man games that the procedural system was responsible for: ensuring that Spider-Man can actually navigate the city. So let's take a little detour from procedural city generation to talk about how Spider-Man's web-swinging works...
Insomniac's titles capture perhaps the most critical element of Spider-Man games by having the web-swinging feel fluid and engaging, and behave in context of the local environment. While some games have historically avoided this given the technical complexity of it, all of Insomniac's Spider-Man games require that any webs the player swings from physically attach to nearby buildings. While it adds some authenticity to the experience, it also creates a huge technical challenge in knowing where Spidey's webs are allowed to connect, not just for regular web swinging, but for web-zips and point launches as well.
Given the complexity of the buildings, in terms of the sheer number of surfaces and edges on them, it would be too computationally expensive for the game to check for available surfaces for webs to connect to on a frame-by-frame basis. There's simply too much geometry in the region, and it's all moving around the player really quickly. Players can achieve a top speed of around 68 mph (or roughly 30 metres per second) as Spider-Man, depending on the swing, so checking for valid geometry to swing from needs some optimisation. And that's where, once again, the procedural workflow came in handy.
Each building in Manhattan is actually covered in an invisible grid of square panels. When Spider-Man initiates a web-swing, the game runs a quick series of raycasts to find the particular panel it should stick the web to. Each panel is scored using a utility function, with higher scores given to panels that would keep the resulting web in line with your current direction, require minimal input changes from the player, and sit at an ideal range and angle so that the swing is as smooth and cool-looking as possible. Once a panel is selected, the game then looks for an actual point on the building face represented by that panel where the web will stick.
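Here's a minimal sketch of the kind of utility scoring described above, assuming a y-up coordinate system. The weights, the ideal range and the individual scoring terms are my own guesses at how such a function could be shaped, not the game's actual values.

```python
def score_panel(player_dir, to_panel, distance_m, ideal_range_m=(15.0, 35.0)):
    """Score a candidate web-anchor panel; higher is better.
    player_dir and to_panel are (x, y, z) direction vectors, y pointing up.
    All weights and thresholds are illustrative assumptions."""
    # Reward panels that keep the swing moving in the player's current direction.
    alignment = max(0.0, sum(a * b for a, b in zip(player_dir, to_panel)))
    # Reward panels within a comfortable attach distance.
    low, high = ideal_range_m
    mid = (low + high) / 2
    range_score = 1.0 if low <= distance_m <= high else max(0.0, 1.0 - abs(distance_m - mid) / high)
    # Reward panels above the player, so the web arcs the swing rather than dragging it.
    height_score = max(0.0, to_panel[1])
    return 0.5 * alignment + 0.3 * range_score + 0.2 * height_score

# Pick the best-scoring panel found by the raycasts; the game would then attach
# the web to an actual point on the building face that panel represents.
player_dir = (0.8, 0.1, 0.6)
candidates = [
    {"dir": (0.7, 0.6, 0.4), "dist": 22.0},   # ahead and above, in range
    {"dir": (0.1, 0.2, -0.9), "dist": 60.0},  # behind the player and too far away
]
best = max(candidates, key=lambda c: score_panel(player_dir, c["dir"], c["dist"]))
print(best)
```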
Note that this isn't used for all web swinging, given there are a lot of unique objects, such as cranes, tunnels, bridges and other geometry, where attach points were placed manually by designers on prefabs. A really good example of this is Central Park, given the trees don't have any flat surfaces to put the invisible panels onto.
But this is just one part of the gameplay requirements of the procedural system. Spidey's movement and navigation also relies heavily on recognising edges to zip around or over, and on knowing which locations are valid to point launch from. While many props, such as water towers and the like, can have that data pre-built into their prefabs in the engine, the rest of the map is also analysed by the generation system when the buildings are constructed, so that the game knows of available launch points nearby, which are highlighted for the player to select in the heat of the moment.
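As a sketch of how that offline analysis might work, here's a hypothetical pass that tags the rooftop edges of a generated building with evenly spaced launch points. The threshold, spacing and data layout are assumptions for illustration only.

```python
# Hypothetical offline pass: when a building is generated, walk its rooftop edges
# and emit candidate point-launch / zip targets to be stored alongside the tile.
MIN_LAUNCH_HEIGHT_M = 5.0  # assumed cut-off: ignore rooftops too close to street level

def extract_launch_points(roof_corners, roof_height_m, spacing_m=4.0):
    """Return evenly spaced launch points along each rooftop edge of a building."""
    points = []
    if roof_height_m < MIN_LAUNCH_HEIGHT_M:
        return points
    corners = list(roof_corners)
    # Pair each corner with the next one (wrapping around) to form the roof's edges.
    for (x0, z0), (x1, z1) in zip(corners, corners[1:] + corners[:1]):
        edge_len = ((x1 - x0) ** 2 + (z1 - z0) ** 2) ** 0.5
        steps = max(1, int(edge_len // spacing_m))
        for i in range(steps + 1):
            t = i / steps
            points.append((x0 + t * (x1 - x0), roof_height_m, z0 + t * (z1 - z0)))
    return points

# A simple 20m x 12m rooftop at 45m yields a ring of launch points the game can
# surface to the player at runtime.
print(len(extract_launch_points([(0, 0), (20, 0), (20, 12), (0, 12)], 45.0)))
```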
The Big Apple
In addition to the web-swinging, the generative system had other responsibilities for ensuring New York operates as an interesting gameplay space.
As mentioned already, the game map is broken up into tiles. To maintain performance, the engine only allows around nine tiles to be active at any given time. And given the aforementioned top speed of 68 miles per hour, the game aggressively loads and unloads tiles and all their assets so that all of this looks smooth. This is known as a level-of-detail or LOD system, and is very common in game engines to prioritise rendering jobs based on perceived distance and relevance to the player. In fact, over in the episode of AI and Games all about Spider-Man, I talked about how the civilians use level of detail to dictate their overall visual fidelity.
The system prioritises loading in new tiles based on player trajectory, and even how you're navigating through the city influences it. If you're on the ground, the shop fronts and street props are prioritised. Meanwhile, if you're zipping high across the city, the rooftops are prioritised and the shop fronts are loaded in a second or so later. In fact, Spider-Man's top speed is capped in-game to ensure that the tile streaming system can keep up.
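A minimal sketch of that kind of prioritisation might look like the following. The weighting of distance, trajectory and altitude is an assumption on my part, not the engine's actual heuristic.

```python
def tile_priority(tile_centre, player_pos, player_vel, player_height_m):
    """Rank a tile for streaming: nearby tiles and tiles ahead of the player's
    trajectory score highest, while altitude decides whether rooftops or
    street-level assets within the tile should load first. Weights are illustrative."""
    dx, dz = tile_centre[0] - player_pos[0], tile_centre[1] - player_pos[1]
    dist = (dx * dx + dz * dz) ** 0.5 or 1e-6
    speed = (player_vel[0] ** 2 + player_vel[1] ** 2) ** 0.5 or 1e-6
    # How well the tile sits along the direction of travel (1 = dead ahead, <0 = behind).
    ahead = (dx * player_vel[0] + dz * player_vel[1]) / (dist * speed)
    score = (1.0 + max(0.0, ahead)) / dist
    detail_order = "rooftops_first" if player_height_m > 30.0 else "street_level_first"
    return score, detail_order

# Stream in the handful of highest-scoring tiles; everything else stays unloaded.
tiles = [(-64, 64), (64, 64), (192, 64), (64, 192)]
ranked = sorted(tiles, key=lambda t: tile_priority(t, (0, 0), (30, 0), 80.0)[0], reverse=True)
print(ranked)
```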
Now you might have realised this raises an interesting question: if the game is only holding around 9 tiles in memory, how can you see the entirety of the map at any point in time? If I climb up the Empire State Building, I can see everything from Columbia University in Morningside Heights all the way down to Wall Street. So how does that work?
Each tile in the map has a lower-than-low LOD level called an 'imposter', which is an incredibly simple, low-poly representation of that tile. During generation of the map, the system generates the imposter for each tile. So when a tile isn't in memory and active in-game, the imposter is used instead. In fact, the imposters never load out of the game while you're in the open world, ensuring that at any time you can see the entirety of the city, even at a really low quality level.
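As a trivial sketch of that fallback, assuming an 'active set' of streamed tiles (my terminology, not the engine's):

```python
# Hypothetical render selection: a tile in the active streamed set uses its full
# detail, while everything else falls back to its always-resident imposter mesh.
def representation_for(tile_id: str, active_tiles: set) -> str:
    return f"{tile_id}/full_detail" if tile_id in active_tiles else f"{tile_id}/imposter"

active = {"midtown_041", "midtown_042"}
print(representation_for("midtown_042", active))  # full detail up close
print(representation_for("harlem_007", active))   # imposter on the horizon
```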
Plus, you'll notice that props shift depending on the time of day or the act of each story, most notably in the third act of Spider-Man after the Raft breakout. This is all handled by the procedural engine, given it saves different prop and lighting configurations for different phases of the game.
The City That Never Sleeps
Now there's still one last thing to cover: how this system also influences the AI characters in the game world. The procedural system is responsible for how civilians spawn, with traffic volumes for NPCs that dictate how densely a pavement can be populated. These are then connected to one another, so that civilians know how to cross the road, but also go to specific points to get on the subway or try to hail a cab. On top of that, there are dozens of fixed activities you'll find civilians doing. These activities are known as vignettes: there are around 25 different types in the original game, with over 3,000 locations for them to appear, all placed by the procedural system. The same approach is applied to the random crimes that pop up on the map, with, again, over 3,000 possible locations for 29 different types of crime. Both vignettes and crimes have a number of constraints that dictate whether they spawn, such as the story act, time of day, level geometry and much more.
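Here's a minimal sketch of the kind of constraint check that might gate those spawns. The condition names, types and values are hypothetical, invented purely to illustrate the idea.

```python
# Hypothetical spawn table: each vignette or crime location carries the conditions
# under which it is allowed to appear. All field names and values are illustrative.
LOCATIONS = [
    {"type": "hoops_vignette", "acts": {1, 2, 3}, "hours": range(8, 22), "district": "harlem"},
    {"type": "car_chase_crime", "acts": {1, 2, 3}, "hours": range(0, 24), "district": "midtown"},
    {"type": "armed_robbery_crime", "acts": {2, 3}, "hours": range(0, 24), "district": "harlem"},
]

def eligible_spawns(act: int, hour: int, district: str):
    """Return only the locations whose constraints are satisfied right now."""
    return [loc for loc in LOCATIONS
            if act in loc["acts"] and hour in loc["hours"] and loc["district"] == district]

# During act 1, at midday in Harlem, only the basketball vignette qualifies.
print([loc["type"] for loc in eligible_spawns(act=1, hour=12, district="harlem")])
```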
Lastly, when roads are spawned, they come with built-in templates that dictate how vehicles can move along them and what traffic can appear: be it the types of vehicles, the traffic density, and whether there are lanes dedicated to parked cars.
It's worth saying the navigable space for both traffic and AI continues to change throughout each game, as the events of the story can influence the layout of a given location, be it due to weather, trash, crimes or Sable security checkpoints. So each tile stores numerous versions of its navigation data, such that traffic, civilians and enemies know what space is safe for them to move around in during each act of the story.
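A small sketch of how those per-act navigation variants could be stored and looked up per tile, again with made-up identifiers and data:

```python
# Hypothetical per-tile store of navigation variants keyed by story act, so that
# traffic, civilians and enemies query the layout matching the current state of
# the world (e.g. a checkpoint or debris blocking part of a street).
TILE_NAV_VARIANTS = {
    "midtown_042": {
        1: {"blocked_lanes": [], "closed_pavements": []},
        3: {"blocked_lanes": ["5th_ave_north"], "closed_pavements": ["w34th_south"]},
    },
}

def nav_data_for(tile_id: str, act: int):
    """Fetch the navigation variant for the current act, falling back to the most
    recent earlier act if this one has no dedicated version."""
    variants = TILE_NAV_VARIANTS[tile_id]
    best_act = max(a for a in variants if a <= act)
    return variants[best_act]

print(nav_data_for("midtown_042", act=2))  # act 2 reuses the act 1 layout
print(nav_data_for("midtown_042", act=3))  # act 3 reflects the checkpoint
```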
Closing
Making games is hard, we all know this. But sometimes we can find ways to automate parts of the process to help handle the sheer volume of content that needs to be created. It's worth saying that Insomniac's games are not an outlier: many other developers use automated processes to speed up content creation, and versions of these tools have been put to work in a myriad of other projects. But when you consider the sheer scale of Spider-Man's New York maps, and the overall quality of the playable experience, it speaks to how important it was not just to build the tools to support developers, but to craft an iterative process for using and improving those tools to help achieve the overall design goals.
Thanks for watching this episode of Artifacts here on AI and Games. If you enjoyed this article, then be sure to catch the accompanying episode of the main series all about Marvel's Spider-Man if you haven't already.
References
The following talks were used as part of my Spider-Man episodes, you might enjoy digging into these to find out more information:
"Marvel's Spider-Man: AI Postmortem",
Adam Noonchester, GDC 2019"Building Traversal in Marvel's Spider-Man",
Doug Sheahan, GDC 2019"Conquering the Creative Challenges in Marvel's Spider-Man",
Bryan Intihar, GDC 2019"Marvel's Spider-Man: A Technical Postmortem",
Elan Ruskin, GDC 2019"Marvel's Spider-Man, meet Houdini",
David Santiago, GDC 2019"Procedurally Crafting Manhattan for Marvel's Spider-Man",
David Santiago, GDC 2019"Building New York in Marvel's Spider-Man: It's Still Just Level Design",
Josue Benavidez, GDC 2019"Marvel's Spider-Man: Procedural Lighting Tools" Xray Halperin, GDC 2019
"Marvel's Spider-Man: A Deep Dive Into the Look Creation of Manhattan", Matt McAuliffe, Brian Mullen and Ryan Benn, GDC 2019