This is Part II of our series on humanoids: general purpose robots designed to look, move, and behave like humans.
We’re studying this domain via a deep dive on Beyond Imagination (BI). While BI is a lesser-known player in this race, it’s also one of the earliest, with contributions from some of the biggest names in tech, e.g. Ray Kurzweil (‘The Singularity’), Dean Kamen (Segway), Paul Jacobs (Qualcomm), and Dr. Robert Hariri (Celularity). They’re also one of the only companies to demonstrate real-world humanoid use cases outside of a controlled lab environment.
If you haven’t read Part I, pause… Click this link and start here. It will give you a necessary appreciation for the humanoid mission and provide important context around the ‘why’.
Here in Part II, we cover the ‘what & how’, diving deeper into Beyond Imagination’s strategy to compete against juggernauts like Tesla and veterans like Boston Dynamics.
It’s a classic case study in capturing uncontested market space via a ‘Blue Ocean Strategy’ (a must-read book for anyone building a company or product). It’s also a shining example of Spatial Computing + AI being two sides of the same coin; a way to keep humans in the loop, and to give AI eyes, ears, and agency.
Needless to say, humanoids will be a technology breakthrough unlike anything we’ve ever seen. The implications are difficult to comprehend, especially for productivity and the economy.
One thought experiment… what was the economic impact of humanity itself?
On its face, it’s a bizarre question. But if the founders of these companies have their way… every household will have a humanoid, and one day, perhaps every person (just like a smartphone).
In which case… what does society look/feel like with billions of synthetic humans strolling around?
Do they become their own ‘race’ of sorts? Or, as Raoul Pal recently described, are they recognized as an entirely new type of ‘demographic’?
This might seem like a lifetime away… but it’s not.
This tech is experiencing exponential progress at every layer of the stack: hardware, software, AI, batteries/power, etc.
Most of us will see the beginnings of this future within our lifetime. The time to get smart on this space is now.
So grab your coffee/tea, settle in, and enjoy…
The Strategy
Winning the humanoid race requires a unique and daunting set of ingredients: access to massive computing power, mountains of real-world training data, world-class AI talent, top notch engineering & design, the best neural nets and AI data engines, and of course, some serious manufacturing prowess.
Fortunately for Elon, Tesla has all of these in spades. Not to mention the ultimate backyard in which to build, learn, and train their robots: Tesla’s own factories, the ultimate first 'customer' and proving ground. Success here will boost profit margins and free up cash to re-invest back into the Optimus R&D effort; one helluva internal flywheel in the making.
Both Figure and Sanctuary are also proving formidable. Figure’s CEO, Brett Adcock, has done a remarkable job raising money, attracting world class talent, and building momentum towards an MVP. He’s also done ‘mission impossible’ before in the world of advanced hardware, with his VTOL company, Archer.
Sanctuary isn’t too far behind, with an equally talented team, and impressive progress with arguably the most important thing to get right first: robotic hands.
Some of these companies are armed to the teeth with capital and resources. But what Beyond Imagination might lack in capital, it makes up for with what Dr. Kloor and Ray Kurzweil believe is their fundamental competitive advantage: the right team and the right vision.
A vision that balances function with form. Humanoid aesthetic & behavior is going to have psychological effects on their human colleagues. How they make us ‘feel’ will be a determining factor in mass adoption.
A vision that seeks to assist humans, not replace them. ‘Hired help’ doesn’t have to remain a luxury for the elite. Dr. Kloor sees a future in which every household has a humanoid, freeing up time to do more of what they love, with the people they love.
A vision fueled principally by altruism. For Dr. Kloor, money is just a means to an end. And that end is human flourishing.
This vision serves as the wind in Beyond Imagination’s sails. But to win, Beyond must complement this vision with a unique strategy and thoughtful tack. They must zig where others zag, operating within a quintessential 'blue ocean' space (i.e. where competitive force/spilled blood is minimized and unique advantages/niches can be established).
Between Tesla, Figure, Boston Dynamics, and Sanctuary... much of the ocean appears stained red. But the lanes do exist and Beyond Imagination has them in sight.
Let's break 'em down.
The Strategy Kernel
The kernel of a good strategy stems from a sound diagnosis of the competitive landscape and the key challenges in Beyond’s path.
Here’s a simple analysis of the top competitors:
Boston Dynamics: Boston Dynamics is the OG in the robotics space. They’ve largely been an R&D shop, focused on pushing the technical limits of what robots can physically do, e.g. speed, power, agility, balance. As a result, they tend to focus on ‘dangerous’ and physically demanding tasks, e.g. inspections in small, dark, or risky locations. Or charging into war or crisis/disaster zones, such as military reconnaissance, earthquake recovery, or fighting forest fires. They are less focused on intelligence, dexterity, and human interaction, and more focused on functional power & prowess.
Tesla Optimus: Whereas Boston Dynamics is striving for ‘robot athleticism’, Optimus is focused on the basics of human agency, from legs, to hand dexterity, to semantic understanding/reasoning about the world & objects. All important things, of course, but in the fullness of time, Beyond Imagination believes these things will become off-the-shelf commodities. As for use cases, Tesla will likely focus on industrial manufacturing. They’re currently in the R&D phase, figuring out how to leverage Tesla’s existing tech stack and computer vision IP.
Figure: Figure is also in the prototyping phase, and appears to be going head-to-head with Tesla Optimus with a similar product & general strategy. ‘A bold strategy, Cotton. Let’s see if it pays off for ’em.’
That said, it does appear they’ll ‘zag’ a bit on the use cases and GTM front, focusing on less nuanced tasks like warehouse sorting/picking and moving/organizing shipping containers.
Sanctuary: Sanctuary is also striving for a full humanoid, complete with hardware & software. But first & foremost, they’re focused on robotic hands, getting deep into the nitty-gritty of dexterity with a robust sensor array and set of algorithms for touch, feel, pressure, etc. Within this context, they’re developing neural networks trained not on text and next-word prediction, aka an LLM (Large Language Model), but on ‘experiential’ data and next-action prediction, aka an LBM (Large Behavior Model).
As for the key challenges, Beyond Imagination’s strategy must take into account the following:
Resources: The top competitors have massive war chests. This can't be a head-to-head R&D investment battle. How can Beyond get to utility and revenue fast? What product tradeoffs have to get made? This goes hand in hand with...
Time: Time is always the enemy within startups, but even more so when racing against Musk. First mover advantage will matter more than ever. What tradeoffs can be made for speed?
Manufacturing & Distribution: You just can't compete head-to-head with Tesla on manufacturing and GTM. How can Beyond find alternative sources of leverage? Whose shoulders can they stand on?
Data: Powerful AI is a function of data quality & scale/volume. How can Beyond take a unique tack to data collection?
Beyond Imagination is not ignorant of these realities, and their strategy is designed to address them. It consists of three key pillars, each one predicated on speed to market and immediate utility:
An iterative approach to product development
A focus on proprietary AI Synthetic brain architecture and software, and…
An ecosystem of strategic partners (for manufacturing and distribution)
An iterative approach
Productizing humanoids is going to take time and Beyond Imagination indeed plans to play the long game. But they also know they can't take a moonshot approach with a binary, black or white outcome, i.e. where the business doesn't work until it fully works (in some holy-grail end state).
The key to success will be shaping the business and product roadmap such that wins and revenue can be incrementally achieved along the way. Their humanoid, which they call Beomni, needs to be useful out of the gate, in some sort of minimalist way, with the right mechanisms in place to improve and gain a unique advantage over time: the right data engine, the right feedback loops, robust telemetry metrics and analysis, the right customer/use case evaluation techniques, etc.
Doing so is also critical for employee and investor morale. Their credo can’t be, ‘don’t worry, we'll change the world one day’, and then hope people stick around for 10 years to find out. Everyone needs some dopamine hits along the way in the form of visible progress/impact.
Towards this end, Beyond is doing a few things to achieve immediate utility and get quick wins.
First, they're not going to wage war against physics and the plight of mobility/balance in the real world.
Sure, human-like legs are the holy grail. And one day Beyond will get there. But robotic legs will take investment and time that they don't have. Dr. Kloor also believes the tech for legs will become commoditized within five years. So why not let the giants do all the R&D?
Instead, Beyond is simplifying and opting for wheels. Yes, in the beginning, their humanoid won't be able to operate in as many environments; those with stairs, mainly, will be off limits. But that’s okay, because the other aspect of their 'iterative approach' is to target use cases and environments where legs & stairs won't be a factor. Not to mention, in the Western World, many work environments are wheel-friendly, largely due to ADA compliance (Americans with Disabilities Act).
As for initial use cases, it's likely Tesla will nail the more traditional manufacturing tasks. Especially with the advantage of being their own first customer. So Beyond is focusing elsewhere, opting for tasks that require a particular skill set and a unique type of semantic awareness. Those first two use cases are surgical assistance and eldercare.
Today, surgeons have to instruct a human to find and hand them certain tools. It's a relatively high skilled task that involves identifying, picking, and delivering, all while predicting what instrument the doctor will need next. There's also the task of organizing, cleaning, and putting tools away; all things an AI-powered humanoid could do better than a human.
As for eldercare, the Beomni humanoid has already demonstrated many of the tasks needed to take care of our elderly, from cooking and cleaning, to delivery of items, to monitoring and delivering pills, to acting as their short-term memory. And most importantly, being ever vigilant, 24/7.
Of course, the crux here is training the AI engine to do these tasks at par with the average human. To do so, Beyond is turning to virtual reality, largely for two reasons.
First, VR allows for immediate utility. Similar to how Tesla is starting with ‘humans in the loop’ for their cars, Beomni users will be able to ‘avatar in’ and become the robot, remotely controlling it from the luxury of their own home. This will create a new class of 'remote workers', allowing one person with a niche skill set to work for not just one, but potentially dozens of Beomni customers around the globe.
The second and primary reason for VR? To gather higher-quality, first-person data sets for training their ‘Expert Mind’ AI.
Like Tesla and Figure, Beyond will also train its AI via 3rd-person video, image, text, and sound data libraries. But Beyond believes the best AI training will come from the humanoid actually doing the activity, not just observing. The effectiveness of first- vs. third-person training data remains to be proven. But this does make intuitive sense; you can watch all the golf videos you want, but there’s no substitute for swinging the actual club.
In short, ‘VR avatars’ should help Beyond address the speed-to-market problem, while also placing a longer-term bet on data advantage via an eventual marketplace of remote workers contributing to ‘AI skill development’. Which is a nice segue into pillar 2 of their strategy: software and AI.
Software & AI
Competitive hardware is obviously mission critical, and the overarching goal is to build a business that looks & feels like Apple, albeit, with a twist.
Like Apple, Beyond Imagination (BI) is striving for simple, minimalist hardware tightly coupled with elegant software, minus the closed system approach. Beyond’s software strategy has four vectors:
The ‘Beomni Synthetic Brain’ (an AI architecture with ‘multilobe’ AI engines)
Openness and modularity
The VR avatar interface for remote control & skills training, and…
An application & worker marketplace
The Beomni Synthetic Brain is where most of the magic happens. It’s also how they intend to derive a competitive advantage. Whereas competitors are focused today on the humanoid form, Beomni is focused on the ‘humanoid mind’. The vision is to enable 3rd parties to build ‘expert minds’, complete with the awareness and skills to complete any human task.
These neural nets will stand in stark contrast to general purpose LLMs.
They’re more acutely focused on a single occupation and its associated nuance; including sight, sound, touch, motion, planning, and natural language. The result is an ‘LBM’: a large behavior model.
An LBM looks/feels like an LLM, but rather than relying on pure text data, it ingests ‘experience’ data. And rather than predicting the next word (which is effectively all an LLM is doing), it’s predicting a future action based upon actions the robot has experienced in the past.
These predictions, largely based upon object recognition and proprioception (the sense/control of our body parts in space), are sent in real-time to an array of motors and actuators; all quite similar to neurons in our brain firing signals to an array of muscles and tendons.
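To make the ‘next action’ idea concrete, here is a deliberately tiny sketch. It is not Beyond's or Sanctuary's model; a real LBM would be a large neural network fed by rich sensor streams. This toy simply counts which action followed each (observation, previous action) pair in some recorded ‘experience’ and predicts the most frequent one. The observation names, action names, and episodes are all made up for illustration.

```python
from collections import Counter, defaultdict

def train_next_action(episodes):
    """Count which action followed each (observation, previous action) context."""
    table = defaultdict(Counter)
    for episode in episodes:
        prev_action = "<start>"
        for observation, action in episode:
            table[(observation, prev_action)][action] += 1
            prev_action = action
    return table

def predict_next_action(table, observation, prev_action):
    """Return the most frequent action seen in this context, or None if unseen."""
    counts = table.get((observation, prev_action))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical 'experience' data: (what the robot perceived, what the operator did).
episodes = [
    [("scalpel_requested", "locate_scalpel"),
     ("scalpel_in_view", "grasp"),
     ("holding_scalpel", "hand_over")],
    [("scalpel_requested", "locate_scalpel"),
     ("scalpel_in_view", "grasp"),
     ("holding_scalpel", "hand_over")],
    [("scalpel_requested", "locate_scalpel"),
     ("scalpel_in_view", "reposition"),
     ("scalpel_in_view", "grasp")],
]

table = train_next_action(episodes)
prediction = predict_next_action(table, "scalpel_in_view", "locate_scalpel")
print(prediction)  # grasp (seen twice in this context, vs. 'reposition' once)
```

The parallel with an LLM is the shape of the problem: given what has happened so far, pick the most likely next step. The difference is that the ‘vocabulary’ is made of actions rather than words.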
Stitching all these ‘senses’ together is where the real challenge & opportunity lies. Doing so requires a unique approach to data collection & training, while also creating a ‘mind’ that looks/feels a lot like the actual human brain.
This ‘brain’ is a dynamic real-time AI stack with a similar lobe-like structure. Each lobe is designed to handle senses like touch, sight, sound, motion planning, and spatial awareness. It also includes functions like natural speech and gestures to mimic how humans communicate. At the core of this sits a ‘cognitive AI’: a command & control system designed to properly stitch this all together within the context of an associated task.
This synthetic brain will operate what Dr. Kloor calls an Expert Mind: a network of AI models working in harmony to execute tasks at an expert level.
If it sounds daunting and complex, it is. I mean, we’re talking about creating intelligence & agency from scratch.
But Dr. Kloor and team are uniquely suited to pull this off. Not only are Kloor and Kurzweil both experts in AI (Kurzweil wrote the book on how to build an AI mind), but Kloor also sits on the board of the Brain Mapping & Therapeutics society. This allows him to sanity check his team’s approach with neurologists, brain mapping experts, and cognitive psychologists; all yielding unique insight into how the brain works and how to model it.
As for modularity & openness, Beyond wants their software to run on numerous hardware platforms. They also don’t want to be overly opinionated about how developers build apps. They plan to provide a platform and SDK that is open and modular, allowing developers to integrate 3rd party tools and AI capabilities.
If a company has already developed in-house the right computer vision software for object detection, or a particular LLM with medical expertise, these will be easy to swap in and out. The vision here is to have an AI ‘brain’ with specific lobes, e.g. lobes for vision, for language, for navigation, and for the skill itself (e.g. grabbing a knife and slicing up sushi). Each lobe will come with a set of APIs that will make it easy to create ‘cross-lobe’ communication.
To provide immediate utility, Beomni is built with a UI/UX that enables anyone, from anywhere in the world, to control it via their AI-enhanced VR system. This was demonstrated earlier this year between Romania and LA, with the Government CTO of Microsoft at the controls. The system allows for natural control. As you move, it moves. You see through its eyes, and hear through its ears. Users say they feel like the robot is their own body.
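For the developers in the audience, here is a rough sketch of what ‘swappable lobes’ could look like in code. To be clear, this is not Beyond's SDK, which isn't public as far as this essay knows; the class name, the `plug_in` and `step` methods, and the lobe functions are all hypothetical. The idea is simply that each lobe reads from and writes to a shared context, so one vendor's vision module can be replaced with another's without touching the skill lobe.

```python
from typing import Callable, Dict, List

# Assumption: a "lobe" is modeled as a function from shared context to new facts.
Lobe = Callable[[dict], dict]

class SyntheticBrain:
    """Hypothetical container that runs named, replaceable lobes in sequence."""

    def __init__(self) -> None:
        self.lobes: Dict[str, Lobe] = {}

    def plug_in(self, name: str, lobe: Lobe) -> None:
        """Register a lobe, replacing any existing lobe with the same name."""
        self.lobes[name] = lobe

    def step(self, sensor_input: dict, order: List[str]) -> dict:
        """Run lobes in order; each one reads and extends the shared context."""
        context = dict(sensor_input)
        for name in order:
            context.update(self.lobes[name](context))
        return context

def in_house_vision(ctx: dict) -> dict:
    # Stand-in for a company's own object detector.
    return {"objects": ["knife", "salmon"]}

def third_party_vision(ctx: dict) -> dict:
    # Stand-in for a swapped-in detector that also spots the cutting board.
    return {"objects": ["knife", "salmon", "cutting_board"]}

def sushi_skill(ctx: dict) -> dict:
    # The skill lobe only depends on the 'objects' key, not on who produced it.
    ready = {"knife", "cutting_board"} <= set(ctx["objects"])
    return {"action": "slice" if ready else "fetch_cutting_board"}

brain = SyntheticBrain()
brain.plug_in("vision", in_house_vision)
brain.plug_in("skill", sushi_skill)
first = brain.step({"frame": "camera_frame_0"}, ["vision", "skill"])

brain.plug_in("vision", third_party_vision)  # swap the vision lobe only
second = brain.step({"frame": "camera_frame_0"}, ["vision", "skill"])

print(first["action"], second["action"])  # fetch_cutting_board slice
```

The ‘cross-lobe’ communication here is just a shared dictionary. A real system would need typed messages, timing guarantees, and a smarter coordinator, but the swap-in/swap-out property is the part worth noticing.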
The goal is to make the learning curve near zero, such that an existing workforce can jump in and immediately interact with the world. Ensuring this UI/UX is easy and delightful will be key to attracting developers to build Expert Minds for the Beomni Synthetic Brain.
These Expert Minds will be distributed via a ‘Beyond App Store’ that Dr. Kloor hopes will create an ecosystem of makers/sellers and buyers for all kinds of tasks and odd-jobs that the broader market is best suited to satisfy.
The Beyond App Store marketplace will be unique in a few ways. First, it will allow organizations to submit a ‘task’ or ‘job’ to be done. From there, users can download Beyond AI control software to their VR headset, contribute ‘skills data’ to a training set, and then get paid for the number of hours/amounts of training data contributed.
Alternatively, numerous developers can team up to build a certain skill and then recruit a large group of people to contribute training data to that skill. As data accumulates, people will earn small equity stakes in that skill/application. Once the app is commercialized and accrues value/revenue, anyone who contributed data can get a portion of that payout (an example of automation opening up new types of jobs).
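How might such a payout be divided? Beyond hasn't published a formula, so treat the following as one plausible assumption rather than their actual mechanism: a simple pro-rata split, where each contributor's share of a payout pool matches their share of the training hours. The function name, the contributor names, and the numbers are invented for illustration.

```python
def split_payout(payout_pool, hours_by_contributor):
    """Split a payout pool in proportion to training hours contributed.

    Assumption: shares are strictly pro-rata by hours. A real marketplace
    might also weight by data quality, recency, or role.
    """
    total_hours = sum(hours_by_contributor.values())
    if total_hours == 0:
        return {name: 0.0 for name in hours_by_contributor}
    return {
        name: payout_pool * hours / total_hours
        for name, hours in hours_by_contributor.items()
    }

# Hypothetical example: a $1,000 pool and three data contributors.
payouts = split_payout(1000.0, {"ana": 50, "ben": 30, "chen": 20})
print(payouts)  # {'ana': 500.0, 'ben': 300.0, 'chen': 200.0}
```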
Partnerships
The final pillar of Beyond’s strategy is strategic partnerships, largely on two fronts: manufacturing and B2B distribution.
For manufacturing, Beyond has struck a deal with its first large manufacturer: Dreamtech. They have deep experience in developing advanced hardware products, with expertise in modular / finished solutions for mobile, medical devices, and robotics. The plan is to scale up slowly, with development runs of 5-10 humanoids at a time on a per order basis. As volume increases, Dreamtech will take on mass production. This approach allows both Beyond Imagination and Dreamtech to learn and iterate on the process in lockstep on their way to scale.
For distribution (aka sales & marketing), Beyond Imagination is building out a dealer network that looks similar to retail models in the automotive and large equipment space (e.g. John Deere and Caterpillar).
The plan is to start with a handful of ‘dealers’ that have an expertise in specific verticals, such as healthcare/medical devices, eldercare, hospitality, agriculture, mining, etc. With this approach, each dealer can bring the specific domain expertise that will be needed to properly sell, market, and support the humanoids in the field. Beyond Imagination plans to incentivize these early dealers with equity stakes in the core business.
The Future
Alright, the obvious big question… what does this all lead to?
Some believe humanoids are our best shot at creating AGI (artificial general intelligence), and eventually, super intelligence, as we need real-world data and all its edge cases to fully bring AI to life.
Others believe this is how we’re going to explore and uncover secrets of the cosmos, or perhaps our inner cosmos (aka consciousness and all the mysteries of the mind).
There are also those who feel humanoids will become our friends, our lovers, and heck… even one day, our societal peers: as alluded to in the beginning, an entire race and demographic unto themselves.
In any case… one thing is certain: humans will have taken a huge leap towards playing the role of ‘God’ (or gods, The Creator, the aliens who programmed the simulation; whatever story you subscribe to). We’ll have become what author Yuval Noah Harari calls Homo Deus (another must-read book if this stuff interests you). At which point, perhaps we’ll pull back the curtains on this whole charade we call life…
Numerous technologies flirt with the notion of ‘homo deus’, like nuclear or gene editing. All with insane implications, no doubt. But to what extent do they impact what most humans do, moment to moment, day to day? To what extent do they produce the scarcest resource known to ‘mankind’: time?
This line of inquiry is a slippery slope towards a vortex of questions with no objective answers; a philosopher’s nirvana.
To me, the question most worth sitting with is this: What do we do with all of the extra time?
And how does that affect our deeper desires? Our desire to matter, to have purpose, to maximize pleasure over pain?
Or perhaps the inverse, which many believe is the true route to happiness, aka: choosing the struggles/pain that matter to us most, and coming out the other side having endured, e.g. being a parent, competing in that Ironman, writing that book, climbing that mountain, surviving that silent meditation retreat.
Paradoxically, these are the things that give us the most meaning and satisfaction. Not happiness. Happiness is a fleeting and fickle beast. It’s a mood. Moods come and go.
And when we do grasp it, it’s certainly not from ‘a job’. It can be seized by doing good ‘work’ perhaps, but not a ‘job’ (a fixed role with a fixed mandate with a fixed employer with a fixed schedule/clock to be punched).
By that definition… Heck, I don’t want a job. Do you?
I say… give humanoids the jobs.
Let’s go find good work, with good people, and better ways to facilitate the engagement between those who have work to offer and work to be done.
This quest matters because humans are and can offer so much more, with all kinds of skills that go ‘under-employed’: as leaders in a family, planners on a trip, writers for a wedding speech, tinkerers in a garage. The unique human superpower underpinning all of these things?
Intuition.
As I said in my last essay on this topic, ‘Finding Solace in the Age of AI’:
“Feeling is our superpower. It's how we produce happiness and fulfillment. At our core, everything that we do is in search of a feeling. The feeling of fun, of being in flow, of being in love, of being connected, of being in awe. Feeling is what it means to be alive.
And when you study how AI works, I think you'll find some solace. You'll quickly realize that feeling is something these machines will never be able to do. It's a neurochemical and physiological phenomenon, based upon some sort of biological intelligence that remains a mystery in many ways.”
So let’s embrace that mystery. Let’s feel more, let’s think less, and let’s let humanoids have the ‘jobs’; good riddance.
Thanks for taking the time to read this essay! One last author’s note and a key takeaway from this series: this is not about literally ‘giving’ humanoids our jobs… it’s about filling a current, and soon-to-be massive, labor shortage.
Some data to drive the point home, from this eye-opening article, “A New Time Bomb: An Explosion of Skilled Worker Shortages”:
In 2021, 44% of small businesses had job openings they could not fill, a record 22% higher than the 48-year average for this survey.
92% of businesses seeking workers reported few or no qualified applicants. The U.S. Bureau of Labor Statistics reported that there were a record 8.1 million job openings at the end of March 2021. We estimate the true number to be over 11 million.
‘Job Shock’ will have a major economic impact in the United States and globally. In 2030 estimated U.S. unfilled jobs range from 25 to 30 million.
Globally over 95 million jobs could be vacant. The financial costs for individuals, businesses, and nations will be staggering. By 2030 U.S. GDP loss could be over $2.5 trillion. Global losses might reach $18 trillion.