TEK2049

Major Challenges For A Metaverse In 2030

A challenging journey building a virtual environment by the end of this decade

The Oxford definition of the Metaverse is “a virtual-reality space in which users can interact with a computer-generated environment and other users”. It's set to be a unified virtual universe that will connect communities, products, commerce, workspaces, and entertainment, and will allow creators to program this virtual environment.

A few attempts to create such a platform have been made in the past. A well-known example is Second Life, a project with an open-source viewer that launched 19 years ago, back in 2003, and is still active to this day, with the latest version released just a month ago.

Over the years we have seen a few other projects, mostly games, that implement some of the Metaverse concepts. World of Warcraft allowed thousands of players to inhabit a virtual fantasy world, and let groups of 10-25 players band together for adventures called raids. It also implemented a Battlegrounds feature that allowed up to 80 players to fight each other in the same environment.


Wrong Market Predictions, Ignoring Challenges

Estimates say the Metaverse industry will reach $8 to $13 trillion by 2030. But I call bullshit on this. This market size can perhaps be reached at some point, but certainly not by 2030. Just take the following into consideration: the video gaming market is expected to reach $340 billion by 2027, and today it is worth around $200 billion. So do you really expect the Metaverse to reach x40-x65 of today's gaming market in 8 years? It's possible, but with a very low probability. Nevertheless, even reaching a market of $1-$2 trillion by then would be impressive, and might provide income for many people around the world.
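The multiple quoted above can be checked with a few lines of arithmetic. This sketch uses only the figures from the paragraph (a gaming market of roughly $200 billion today and a forecast of $8 to $13 trillion). The yearly growth rate is my own derived illustration and assumes steady compounding over 8 years.

```python
# Figures taken from the paragraph above, in US dollars.
GAMING_MARKET_TODAY = 200e9
METAVERSE_LOW, METAVERSE_HIGH = 8e12, 13e12
YEARS = 8

def growth_multiple(target, base):
    """How many times larger the target market is than the base market."""
    return target / base

def implied_annual_growth(multiple, years):
    """Compound annual growth rate needed to reach `multiple` within `years`."""
    return multiple ** (1 / years) - 1

for target in (METAVERSE_LOW, METAVERSE_HIGH):
    multiple = growth_multiple(target, GAMING_MARKET_TODAY)
    rate = implied_annual_growth(multiple, YEARS)
    print(f"x{multiple:.0f} overall, about {rate:.0%} growth per year")
```

Under that assumption the forecast implies sustained growth of somewhere around 60 to 70 percent every year for eight years, which is the part that looks implausible.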

So, how is it going so far?

Games like Minecraft focused more on the creativity aspect, allowing multiple players to build structures in a cooperative manner. A great example of such a creation is The Uncensored Library, a map containing a large library structure. It acts as a platform for censored journalists to publish content in a safe way, avoiding government censorship.

Fortnite, meanwhile, focused more on community, social, and live events. And the new Microsoft Flight Simulator is all about real-world inputs. It has live airplane traffic and live weather, and you can fly with everyone else in the world. It relies on a lot of outside inputs, and as you will find out later, that is a crucial aspect of a successful Metaverse. Of course it also has amazing graphics, but what is really important is the seamlessness of the world.


Technological Challenges of The Metaverse

File Formats

There are actually a few file standards that are valid candidates for powering the Metaverse. There is Pixar USD (Universal Scene Description), which does a pretty good job of describing scene graphs. There are MDL (Material Definition Language) and MaterialX, shading graph languages that describe the operation of shaders for physically based, realistic materials. Finally there are formats like glTF by The Khronos Group, a royalty-free specification for the efficient transmission and loading of 3D scenes and models by engines and applications.

However, we are still lacking standards in other areas like: character animation, faces, particle systems, description for object physics and their interactions with other objects. We are also missing standards for describing how live servers and clients interact together using real-time network streams. You don’t want to send everybody entire scene graphs every time something changes in the world, and you need to have highly optimized protocols for efficient synchronization of this information.
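The idea of sending only what changed, rather than the whole scene graph, can be sketched as follows. This is a minimal illustration, assuming the world is a flat dictionary of object ids to properties. The function names `diff_state` and `apply_delta` are hypothetical and not part of any existing standard.

```python
def diff_state(old, new):
    """Return only what changed between two {object_id: properties} snapshots."""
    delta = {"set": {}, "removed": []}
    for obj_id, props in new.items():
        if old.get(obj_id) != props:
            delta["set"][obj_id] = props      # new or modified object
    for obj_id in old:
        if obj_id not in new:
            delta["removed"].append(obj_id)   # object left the world
    return delta

def apply_delta(state, delta):
    """Apply a delta produced by diff_state to a client's copy of the world."""
    updated = dict(state)
    updated.update(delta["set"])
    for obj_id in delta["removed"]:
        updated.pop(obj_id, None)
    return updated

old = {"tree": {"x": 1}, "car": {"x": 5}, "rock": {"x": 9}}
new = {"tree": {"x": 1}, "car": {"x": 6}, "bird": {"x": 0}}
delta = diff_state(old, new)
print(delta)  # only the car, the bird and the removed rock are transmitted
```

A real protocol would also need ordering, acknowledgements, and recovery from lost packets, which is exactly the part that is not standardized yet.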

How does code interact with the scene graph to make interesting things happen in a dynamic world? How does audio work in this environment? It's not just a sound file, it’s also 3D geospatial location and physical interactions and user interface components on top of that, because this future 3D medium will also have forms of 2D interaction in it. – Tim Sweeney

So we do have a few formats that pave the way for the Metaverse, but there are still lots of small gaps that need to be filled. File formats and standards move slowly, and it might actually take a few decades until all of this is standardized. Of course, here we are referring to an “Open Metaverse”. We could see a few companies develop their own proprietary standards that would close most of the gaps. Don’t forget it took a web standard like HTML five iterations to reach its current state.

Moreover, text formats like XML and JSON do not solve the problem of conveying data about any topic losslessly. For this to work efficiently inside a streaming environment we’ll need binary formats. Yes, I’m looking at you, Apache Avro. Eventually we will need a unified binary format that allows USD, MDL, glTF, and other data to be transmitted over streams.
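To get a feel for the difference between text and binary encodings, here is a small comparison. It is not Avro, only a stand-in that uses Python's standard `struct` module, and the record layout (an object id plus a 3D position) is a hypothetical example of a world update.

```python
import json
import struct

# Hypothetical update record: object id plus a 3D position.
# "<I3f" = little-endian, one unsigned 32-bit int and three 32-bit floats.
UPDATE_FORMAT = struct.Struct("<I3f")

def encode_json(obj_id, x, y, z):
    """Text encoding: field names travel with every single message."""
    return json.dumps({"id": obj_id, "pos": [x, y, z]}).encode("utf-8")

def encode_binary(obj_id, x, y, z):
    """Binary encoding: both sides agree on the layout ahead of time."""
    return UPDATE_FORMAT.pack(obj_id, x, y, z)

def decode_binary(payload):
    obj_id, x, y, z = UPDATE_FORMAT.unpack(payload)
    return obj_id, (x, y, z)

as_json = encode_json(42, 1.5, 2.25, -3.0)
as_binary = encode_binary(42, 1.5, 2.25, -3.0)
print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")
```

The binary form is a fixed 16 bytes per update, but it only works because sender and receiver share a schema, which is the role a format like Avro would play.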

In the Metaverse you would want objects to be able to move around between servers, and be conveyable losslessly even between software products written using different code bases and different engines. We are quite far from having such unified file formats and software infrastructure, but some relevant technologies are out there.

Identity

We’ll definitely need some standards for representing entities in the world. We need to identify people in order to allow secure voice, text, and video communications. We also need an additional layer to represent social interactions and relationships between people. With this data you can create social graphs.

Next, you need this identity layer to be secure, while also acting as a session layer. How then can your social graph be secure? There are some promising standards like OAuth 2.0 (RFC 6749), but the question of how to implement such a mechanism on the blockchain is still an open one.

Tim Sweeney, CEO of Epic Games, mentioned they already have quite a lot of experience with the topic, gained while ... implementing Fortnite cross-platform play and cross-platform item ownership. This makes Fortnite the first game that runs on all seven platforms and enables cross-platform play and item ownership across all of them.

So while we seem to have the basic ingredients and infrastructure needed for federated identity across many different ecosystems, that is still just the starting point, and more standards are required. There is no workable solution for social graphs, and the technologies that did try to provide one were crushed by Facebook and other companies that have no interest in seeing such technology become an open standard. The OpenSocial technology was initially developed by Google and MySpace, but development of the standard ended with version 2.5.1 in 2013. The initiative later continued as a foundation and a W3C Working Group, but this group stopped operating in 2018 as well.

The economy of the Metaverse needs standards for ownership of digital objects. You might think: well, we have ERC-721 contracts (NFTs) for that. And indeed platforms like OpenSea are great as a marketplace. You really need all of this to happen in the Metaverse, which means the ability for creators to create things, for them to sell the things to users, and for users to be able to prove and establish that they own things, wherever they go, whatever server they’re on, and whoever they communicate with. – Tim Sweeney

Then you need some standards for character appearance. You need the ability to create unique avatars. I would imagine something like the character creator of Fallout or Skyrim, but in a form that can be transmitted between different services and servers. These types of standards are not around yet. However, some companies are working on blockchain-based item ownership solutions for the gaming industry.

And finally, when it comes to the items themselves, we should not forget that 3D objects in a virtual world are not 2D images of monkeys. They are made of complex 3D geometry that is vector-based by nature, meaning you can scale it, and that geometry is then covered with a material. Sometimes the material is a texture, and that doesn’t scale well. It perhaps makes sense for materials to be procedurally generated instead of stored as texture images, and that of course complicates things even more.

Networking

For networking to work properly in the Metaverse, work needs to be done in several areas. We are talking here about a system that might have 1 billion or more concurrent users, yet still needs to feel at least as responsive as a Minecraft session. Today's server technologies support somewhere between 10 and 120 concurrent players, tops. You want more than that at low latency? There is simply no solution at the moment, although we do have companies like RP1 that are trying to solve exactly this issue. They don’t have anything you can try yourself, but their closed demo seemed to allow 4000 entities to be in the same environment, with audio and everything.

Okay, it's time to make some calculations. In theory a single server instance with the right hardware should be able to support 65,535 connections (ports, actually). But many ports are registered with IANA. You do, however, have the 10000-20000 range commonly used for VoIP, and the 49152-65535 range of private and ephemeral ports that are not assigned to anyone. So we end up with about 10,000 ports (or connections) for VoIP and another 16,384 ports for other connections. To simplify things, let’s say you created server software that can accommodate 16,000 concurrent players, 10,000 of whom can also voice chat with each other. For 1 billion players you would then need 62,500 such servers. This is a deliberately crude estimate, since real servers are usually limited by CPU, memory, and bandwidth rather than by port numbers.
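The back-of-the-envelope numbers above can be reproduced in a few lines. The sketch only restates the article's own assumptions (one connection per port, 16,000 players per server, 1 billion players); the helper names are hypothetical.

```python
import math

def port_count(first, last):
    """Number of ports in an inclusive range."""
    return last - first + 1

VOIP_PORTS = port_count(10_000, 20_000)        # range set aside for VoIP in the text
EPHEMERAL_PORTS = port_count(49_152, 65_535)   # private / ephemeral range

PLAYERS_PER_SERVER = 16_000      # the rounded-down simplification from the text
TARGET_PLAYERS = 1_000_000_000   # 1 billion concurrent players

def servers_needed(players, per_server):
    """Servers required if every server is filled to capacity."""
    return math.ceil(players / per_server)

print("VoIP ports:", VOIP_PORTS)
print("Ephemeral ports:", EPHEMERAL_PORTS)
print("Servers needed:", servers_needed(TARGET_PLAYERS, PLAYERS_PER_SERVER))
```

Note that counting both ends of a range gives 16,384 ephemeral ports, and the server count comes out at 62,500 only if every server runs at full capacity all the time.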

Here is something you can compare this to. Although Facebook doesn’t disclose how many servers it has, it does publish the kilowatt hours consumed by all its data centers. With some more quick calculations we come up with somewhere between 170k and 190k servers. So we are talking here about one third of Zuck’s infrastructure just to facilitate the users' communication capabilities. The servers won’t be in the same data center. They will be distributed around the world, and we somehow need to sync everything together with reasonable lag.

Moreover, when servers act as trusted, authoritative decision-makers that decide how interactions happen in the world, the system cannot scale anyway. In other words, we need a networking technology that would allow 1 billion users to be present in the same world environment.

This of course has to include some kind of sharding, and you might think: well, Ethereum will have sharded chains soon. Yes, but these shards are not meant for hosting live players in a virtual world at 100 ms latency. They are for transactions, and the network is expected to support up to 100,000 transactions per second. For the Metaverse we are talking about update rates of 10, 15, or 30 times a second per user; anything slower would ruin the user experience. A global rate of 100k transactions per second is not even close to the Metaverse requirements. A lot more work needs to be done in this area for blockchains to provide a good solution, and a practical one is out of sight at the moment. Blockchains are also single-threaded bottlenecks: you cannot do computing concurrently on the blockchain. On the other hand, blockchains scale out across many independent nodes and don’t have the “lot of servers in one place” issue. They would probably be the way to go in the long term, but right now the solution is probably a federated architecture.
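To see how far apart these numbers are, here is a rough calculation. It assumes, purely for illustration, that every user produces one state update per tick and that each update would count as one transaction; both are my simplifications, not figures from any blockchain specification.

```python
USERS = 1_000_000_000   # concurrent users assumed earlier in the article
CHAIN_TPS = 100_000     # transactions per second quoted in the paragraph above

def required_updates_per_second(users, updates_per_user_per_second):
    """Total world updates per second if every user sends one update per tick."""
    return users * updates_per_user_per_second

def shortfall_factor(required, available):
    """How many times more throughput is needed than is available."""
    return required / available

for rate in (10, 15, 30):
    needed = required_updates_per_second(USERS, rate)
    factor = shortfall_factor(needed, CHAIN_TPS)
    print(f"{rate} updates/s per user: {needed:,} per second, {factor:,.0f}x too many")
```

Even at the slowest rate of 10 updates per second the gap is about five orders of magnitude, which is why most updates would have to stay off-chain.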

Programming Model

In the Metaverse you need to enable different user experiences: games, entertainment, and others. You will need a lot of user-created assets, but you will also have huge amounts of user-written code. Yes, you heard that right, lots of user-written code, running in the same virtual world environment. Just imagine what would happen if you ran the whole NPM repository inside the same browser window. The horrors of that scenario.

The web is a much more controlled environment. Its scope is usually a single website, sometimes with a few official components from other websites as well. But in the Metaverse you need to deal with random user code being injected, and that code might interact with you. There is a big issue with scope and code execution. While today’s web is based on a closed programming model, the Metaverse will need to use an open programming model. When a user is participating in an experience, the experience code controls almost everything in that user's environment.

Some kind of code consensus mechanism should be created. This mechanism should be able to take all the code modules available within some radius of the user, and choose which of them can run on the user's client. For example, when a user takes part in some unique visual experience, other code that interacts with rendering should not be able to execute while the user is in the experience, though perhaps other kinds of code can.

In the Metaverse the programming model should allow everybody’s objects to interact safely and sensibly with other objects. For example, suppose two users each built a car object. First, both cars should be able to interact with the road objects in the same way. Secondly, they should be able to interact with each other the way you would expect of cars. Finally, they should interact with players in a way that is common, predictable, and makes sense.
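The selection step described above could look roughly like this. It is only a local filter, not a real consensus protocol, and the `Module` class, its capability labels, and the `locked_capabilities` parameter are hypothetical names invented for the sketch.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Module:
    name: str
    position: tuple          # (x, y) location of the object that owns the code
    capabilities: frozenset  # what the code touches, e.g. {"rendering"}

def select_modules(user_pos, modules, radius, locked_capabilities):
    """Pick nearby modules that stay away from capabilities the active experience has locked."""
    allowed = []
    for module in modules:
        if math.dist(user_pos, module.position) > radius:
            continue  # too far away to affect this user
        if module.capabilities & locked_capabilities:
            continue  # would interfere with the active experience
        allowed.append(module.name)
    return allowed

modules = [
    Module("fireworks", (1, 1), frozenset({"rendering"})),
    Module("street_music", (2, 0), frozenset({"audio"})),
    Module("far_billboard", (500, 500), frozenset({"audio"})),
]
# The user is inside a visual experience that has locked rendering.
print(select_modules((0, 0), modules, radius=10, locked_capabilities={"rendering"}))
```

In a shared world the hard part is getting every client and server to agree on the same answer, which is where an actual consensus mechanism would come in.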

Another issue is extending code and versioning code modules. When a user releases a code module, it exposes some interface: a set of classes, methods, and variables. There are a few ways the author can later extend that module, such as adding methods, properties, and optional function arguments. The new version must not break compatibility with the hundreds of other code modules that depend on it. We should have mechanisms in place for module interaction stability.
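One narrow piece of this, checking that a new version of a function only adds optional arguments, can be approximated with Python's standard `inspect` module. This is a simplified sketch that only looks at parameter names and defaults, not at types or behavior, and the function name `is_backward_compatible` is hypothetical.

```python
import inspect

EMPTY = inspect.Parameter.empty

def is_backward_compatible(old_func, new_func):
    """Approximate check that calls written for old_func still fit new_func.

    Simplification: compares parameter order, names and defaults only.
    """
    old = list(inspect.signature(old_func).parameters.values())
    new = list(inspect.signature(new_func).parameters.values())
    if len(new) < len(old):
        return False  # a parameter was removed
    for before, after in zip(old, new):
        if before.name != after.name:
            return False  # renamed or reordered parameter
        if before.default is not EMPTY and after.default is EMPTY:
            return False  # an optional parameter became required
    # Every newly added parameter must be optional.
    return all(p.default is not EMPTY for p in new[len(old):])

def drive_v1(speed): ...
def drive_v2(speed, gear=1): ...   # adds an optional argument
def drive_v3(speed, gear): ...     # adds a required argument

print(is_backward_compatible(drive_v1, drive_v2))
print(is_backward_compatible(drive_v1, drive_v3))
```

A check like this catches only signature breaks; behavioral changes still need tests that exercise the dependent modules.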

Perhaps each code module should be versioned, and some kind of consensus mechanism would keep a graph of what code modules can interact with each other. This can be done if the system would take new code modules and try to integrate them into the module dependency graph. At first it can try the module’s closest children, if that works, it can try expanding further. It would do this until some kind of consensus would be reached if this module can be used, or is broken.
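The expanding check described above is essentially a breadth-first walk over the modules that depend on the new one. In this sketch the `dependents` mapping and the `works_with` callback are hypothetical stand-ins; in practice `works_with` would mean running the dependent module's tests against the new version.

```python
from collections import deque

def check_module(new_module, dependents, works_with):
    """Breadth-first compatibility check, closest dependents first.

    dependents: {module: [modules that depend on it]}
    works_with: callable(dependent, new_module) -> bool
    Returns (accepted, modules_checked_in_order).
    """
    queue = deque(dependents.get(new_module, []))
    seen = set(queue)
    checked = []
    while queue:
        module = queue.popleft()
        checked.append(module)
        if not works_with(module, new_module):
            return False, checked  # stop expanding at the first breakage
        for further in dependents.get(module, []):
            if further not in seen:
                seen.add(further)
                queue.append(further)
    return True, checked

dependents = {"physics": ["car", "ball"], "car": ["race_track"]}
print(check_module("physics", dependents, lambda dep, new: True))
print(check_module("physics", dependents, lambda dep, new: dep != "ball"))
```

The first call accepts the module after visiting every dependent, while the second stops as soon as the "ball" module reports a failure.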

Today, if you look at ecosystems like JavaScript and others, the best the industry has come up with is an audit mode that checks whether your code dependencies have any known security issues. There is nothing that checks code compatibility at large scale. Compatibility is checked by developers manually running the dependent modules with the new module version. If something breaks they open an issue on GitHub, hoping the maintainer will fix it.

There is no automatic mechanism or software for checking code compatibility in an open environment. We only have solutions for closed environments, in the form of deployment and testing pipelines created by individual companies with varying standards. Such software infrastructure has yet to be built, and it will be one of the most challenging problems to solve on our way to the real Metaverse.

We need to have a language for the task. Intuition says it would have to be dynamic, but maybe a static solution can work better in many scenarios. Perhaps it should support both. And it has to take a class-first approach. It's hard to imagine how you would describe the world, its interactions, and its environment with something else. By the way, maybe such a language already exists and just needs to be adjusted, but I wouldn’t put my bets on JavaScript.


Product and Social Challenges of The Metaverse

World Environment

The world of the Metaverse cannot be made of disconnected shards the way modern MMO games are. It has to be a single shared environment. Transitions in such an environment should be seamless, and resources should be loaded in the background in a smart way.

When thinking about how such a world can interact with the outside environment, we once again reach the limits of a federated model. The input into the world model would be enormous, and can only be handled by using multiple threads. That is true for federated models, but the real challenge is enabling this on a decentralized blockchain model. We should also take into consideration how messages will be transmitted between the different nodes of the Metaverse. A binary streaming solution and the Apache Avro format come to mind.

If two world objects are located on two different servers, we want the interactions between them to be consistent on both ends as they send each other messages. For example, when you buy something in the world, you need a guarantee that you’ll either get what you paid for or get your money back. If you paid money but the ‘buy’ message got lost in the network, that would not work.

These issues are solved pretty well in some distributed database systems, but as they stand, those systems do not fit the needs of the Metaverse yet. A transaction is an operation that modifies state in a number of different places as a single unit. It has to be atomic, consistent, isolated, and durable. That means everybody in the world can see whether a transaction occurred or not, and nobody sees a conflicting view of that transaction.

A lot of research has been done on Software Transactional Memory (STM), which means running a lot of threads that can update shared state without data races. Some Intel CPUs have something called Transactional Synchronization Extensions (TSX) that facilitates some of this in hardware. We will need to develop higher-level protocols for negotiating transactions between multiple servers at a low resource cost. Yes, that is what the blockchain does, but at a much lower frequency than what is required for the Metaverse.
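The "either you get the item or you keep your money" guarantee can be illustrated with a tiny in-memory example. This is a single-process sketch only. A real Metaverse would need a distributed protocol (two-phase commit, for example) to get the same effect across servers, and all the names here are hypothetical.

```python
class InsufficientFunds(Exception):
    pass

def purchase(balances, inventories, buyer, seller, item, price):
    """All-or-nothing purchase: stage every change on copies, then commit."""
    new_balances = dict(balances)
    new_inventories = {owner: set(items) for owner, items in inventories.items()}

    if item not in new_inventories[seller]:
        raise KeyError(f"{seller} does not own {item}")
    if new_balances[buyer] < price:
        raise InsufficientFunds(buyer)

    new_balances[buyer] -= price
    new_balances[seller] += price
    new_inventories[seller].remove(item)
    new_inventories[buyer].add(item)

    # Commit: nothing above has touched the real state yet.
    balances.update(new_balances)
    inventories.update(new_inventories)

balances = {"alice": 100, "bob": 0}
inventories = {"alice": set(), "bob": {"sword"}}
purchase(balances, inventories, "alice", "bob", "sword", 60)
print(balances, inventories)
```

If any check fails, the exception is raised before the commit step, so the buyer's money and the seller's item are both left exactly as they were.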

In other words, we need a software library and framework that implements these protocols and is used by all the software modules involved. It also needs to be programmer friendly. We will certainly need C++ or Rust for the heavy metal software modules, but at the higher level we can use a scripting language. I think Lua might be the perfect fit here: it works well with C++, and it is fast when run on a JIT implementation such as LuaJIT.

Virtual Economy

The real promise of the Metaverse, at least based on the Snow Crash novel, is to have an actual working virtual economy where people who create content in the world (assets, code, or both) can make money from their creations. You need to be able to match buyers and sellers, and have item discoverability mechanisms in place. You’re gonna have millions and probably billions (at some point) of items in the world, and you don’t want to show them all to the player in a single list. Like the real world, it probably needs to be location based. This means players would be able to see and buy only the items within some radius of their location. A lot of concepts can be taken from the game EVE Online, which has this kind of mechanism.
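Location-based discovery can be sketched as a simple radius query. This version scans every item, which is fine for a demo; a world with billions of items would need a spatial index instead. The function name and the item layout are hypothetical.

```python
import math

def nearby_items(player_pos, items, radius):
    """Return (distance, name) pairs for items within radius, nearest first.

    items: {item_name: (x, y) position in the world}
    """
    found = []
    for name, position in items.items():
        distance = math.dist(player_pos, position)
        if distance <= radius:
            found.append((distance, name))
    return sorted(found)

items = {"sword": (3, 4), "shield": (10, 0), "hat": (0, 1)}
for distance, name in nearby_items((0, 0), items, radius=5):
    print(f"{name} is {distance:.1f} units away")
```

With a radius of 5 the player at the origin sees the hat and the sword, while the shield stays hidden until the player moves closer.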

Another aspect is buildings in the Metaverse. Not every world asset should be something you can just buy and sell. If someone built a large and beautiful building and wants everybody to be able to visit it without any restrictions, yet still wants to make money from it, we should have mechanisms that allow this. For example, maybe the owner can get paid for each person who visits the building (but by whom?). Or alternatively, maybe the owner should be able to put ads on the building and get paid by the advertisers when people look at them for more than a few seconds.

Having said that, this is just the tip of the iceberg, and we will need to develop new, innovative monetization models for the Metaverse. We also need to ensure that whoever doesn't have funds can still enjoy a lot of the world they are actually contributing to just by being connected.

For the virtual economy to function, we need a high level of trust. This is maybe where Zero Knowledge Proof technology can come into play. However, full ZK implementations are only just appearing, and proof generation is computationally heavy enough that it may call for specialized hardware such as ASICs. In the actual Metaverse players will probably need some specialized hardware to process ZK proofs, like the GPU cards we have today for graphics (later, integrated into the CPU/GPU).

You would also need some kind of governing body. You need someone to recognise things like the reputations of sellers and the quality of items, or it can all go the wrong way. We don’t want to have a fake review system in the Metaverse. Again we are down to mechanisms, algorithms, and protocols, eventually implemented as software libraries.

Moderation and Empathy

Let’s not forget the lessons we learned from the Internet and social networks. Bad things will happen: we are gonna have abusive content, pornography, trolling. I think we need to be very aware of the things that have gone terribly wrong with existing social platforms as we think about what a Metaverse should be when it grows up. – Tim Sweeney

We need to realize that this does not happen because a lot of people are bad. This happens because algorithms are coded to prefer this kind of content for the sake of user engagement. Engagement can be both positive and negative, but many humans have a tendency to engage more with the negative. The social platforms of today are made to maximize engagement. This means that if a post gets more likes and comments it will get promoted more, even if the comments are negative. We are paying the social platforms for eyeballs, and the incentive is money and advertising. We shouldn’t be surprised by it, but what we should do is we should think about designing platforms that don’t fall victim to this set of problems. – Tim Sweeney

It seems that in games like Fortnite the ability to voice chat contributes a lot to the general positivity. Yeah, you still have some negative people around, but it is much easier to be negative when typing text than when voice chatting. Voice carries much much higher empathetic bandwidth than text. The tone of someone’s voice carries a lot of their intentions. We are innately trained to respect people when we’re interacting with them personally, far far more than when we’re interacting with them with text messages.

This is where the Metaverse can really shine over other social media platforms. Moreover, once commercial technology for identifying facial expressions is available, it will contribute even more to the general positivity of people in the Metaverse.

But that is a huge challenge to overcome. This means an hour on the Metaverse needs to be better than an hour on Facebook, or Instagram, or an hour on YouTube, or an hour on Netflix. It needs to be better than an hour in Fortnite, or an hour in Minecraft, or an hour in Roblox, or GTA, or any of the world’s best games. – Tim Sweeney


Conclusion

So, as demonstrated, there are a lot of complicated challenges on our way to the Metaverse. These technologies, formats, standards, software, and infrastructure are not things that can be accomplished by a single developer or company. And even if a company tries to develop most of what is still required, there is still the question of whether it actually has the talent for that. And even if it does, this would take many years to build. Oh, and you will always have competition, and the bar is high.

Just having lots of user-created assets in the world is not enough. It's just gonna be a huge mess. And aside from the technological challenges, you will have big product challenges as well. You have to have a platform that allows every brand to have its own way, in a trusted manner.

We need to develop mechanisms that escape an advertising-based business model, so that users watch and engage with commercial products only if they choose to do so. We’ll have a new type of competition among brands, and this means we need to develop new types of business models.

Big tech companies should be very careful about putting out Metaverse products. Failure is forever. Once a brand, a new platform, or an experience is rejected, it tends not to come back. On the other hand the competition is big, the big players are already in, and if you don’t come up with something, you might find out the train has left the station.

I also think we are going to see quite a lot of failures before we see a success. If a big-tech company comes out with something, I don’t recommend running to the stock market and buying its stock. We’ll need to wait and see what happens.

The industry needs to move away from proprietary binaries and file formats. This will take time, and some companies will not be in favor of it. In that sense, we also want to see more cooperation between different products today, for example common file formats and protocols between Fortnite and Minecraft. Then when players move between products they can keep their well-earned identity.

There are some areas I didn’t cover much in this article. I didn’t expand on the current market, simply because there are no actual products yet. I didn’t talk about many security aspects. There are also big technological challenges with implementing physics and rendering in such environments. Perhaps I will publish more articles on these subjects when the time comes.


I have taken a lot of great ideas from a lecture given by Tim Sweeney back in 2019. I have tried to formalize his ideas in a well-written form, with some of my own insights on the topics. I have not covered everything possible, but I did try to focus on the important things.

I will end this article with a quote by Tim, I hope you have enjoyed reading it and got this far!

It’s going to be you and your actual friends hanging out having a great time together. It’s going to be small groups. It’s going to be a largely positive experience not affected by global politics to any greater extent than you want it to be and it’s going to be based on open standards that we can all participate in and contribute to.