Layered Computing Drives Industry Consolidation
The demand for computational power has surged dramatically, driven predominantly by advances in artificial intelligence and generative content technologies, commonly referred to as AIGC (Artificial Intelligence Generated Content). The "China Computing Power Development Report (2024)" shows the market's continued expansion: total computing capacity reached 910 EFLOPS by the end of 2023, a 40% increase year-on-year. Notably, intelligent computing power grew even faster, surging 136% over the prior year.
In this evolving landscape, traditional service providers such as cloud computing companies have doubled down on their investments in the sector. The trend spurred nearly 40 publicly listed companies, including Hongbo Co., Lotus Holdings, and Hubei Broadcasting, to pivot into computing power leasing in the second half of 2023. This rapid influx of new players paints a mixed picture, however: the market cooled as swiftly as it heated up, exposing a rush to capitalize on a boom that some entrants underestimated.
Within less than a year, several enterprises sought to withdraw from their cross-sector ventures, signaling a stark transition from initial enthusiasm to a more tempered market reality. The fervor around computing power leasing is dissipating, primarily because entrants miscalculated the financial investment and technical barriers inherent in this field.
Examining the current dynamics, three fundamental changes characterize the landscape of computing power centered on large models. According to Pu Wei, CFO of Hongbo Co. and CEO of Yingbo Data Science, the domestic intelligent computing sector has undergone a dramatic shift, akin to a complete turnaround. From Yingbo's perspective, demand for general large-model training clusters is evolving toward higher capacities, while specialized models are transitioning from fixed capacities to more flexible implementations.
First, the training of general large models is trending toward a "smaller yet sharper" field: leading companies such as OpenAI and xAI are building clusters with tens of thousands of cards, while domestic giants like ByteDance and Alibaba are ramping up toward similar scales.
The gradual move from smaller clusters to clusters of tens of thousands of cards indicates a significant ceiling on what can be achieved at current resource levels.
There is a concerted effort to move beyond smaller data centers, which have historically operated at a few thousand cards or fewer, and build new intelligent computing centers capable of operating at these larger scales. As Pu Wei puts it, "If you don't have a large-scale general model, the iterative advancements become a bottleneck."
The concentration of computing demand has also shifted, with elite tech giants driving the development of foundational models. In contrast, vertical large models, which cater to specific sectors like research, finance, and retail, show a more fragmented pattern marked by temporary and fluctuating demand. This evolution has driven a transition from a grand-scale quantitative approach to a more scalable, elastic usage model that balances cost with user experience.
These insights reveal a trend where companies are progressively honing in on industry-specific models, adapting foundational models to suit their tailored needs.
This necessitates flexibility in computing needs, especially when only requiring power for specific tasks like model fine-tuning, outlining a fresh opportunity in the market.
Finally, the commercial rollout of large models, alongside the rise of multimodal models, is producing an explosion in inference demand. Projections indicate that from 2022 to 2027, the share of inference workloads on AI servers in China will jump from 58.4% to 72.6%, marking a pivotal shift from training to inference as the predominant workload.
As multimodal models gain traction, applications in text generation, image synthesis, and video creation are proliferating, pushing AI from generalized applications to more niche deployments, such as code generation and image-to-3D transformation. The rising demand for inference underscores the need for quicker, more adaptable computing capabilities, particularly in high-frequency commercial tasks such as customer service dialogues and marketing generation, driving the ongoing expansion of the inference market.
Addressing the stratified needs for computing power, the landscape remains considerably stable among large model businesses due to the significant financial commitments required.
Many companies are now focused on executing commercial transformations and consolidating their operations. While some giants are trimming their pre-training scope for financial reasons, there is no evidence of cooling demand for core computation. Enterprises involved in large model training are ramping up their scale rather than retracting their investments.
The success of the computing power industry hinges not on contraction but on adaptability, with companies needing to remain agile and responsive to a rapidly shifting landscape. Yingbo Cloud, for instance, is emphasizing two areas: it caters to major clients needing expansive pre-training capacity, drawing on its own experience building and managing clusters to provide bespoke solutions. Concurrently, it offers elastic Kubernetes cluster services that mix GPU and CPU resources for smaller clients with varied computation needs.
In contrast to traditional cloud service providers, Yingbo is capitalizing on the GPU-centric cloud market, delivering customized services while maintaining a rigorous focus on cost management.
“We have filled a niche within the GPU computing cloud segment,” said Song Chen, Vice President of Yingbo Data Science.
Furthermore, innovative cost evaluation frameworks, such as the "Unit Effective Computing Power Cost," are now being promoted. These metrics assess quality as a ratio of investment to efficacy, accounting for elements like equipment costs and utilization rates. Such frameworks could redefine service quality expectations within the sector.
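The article does not spell out how "Unit Effective Computing Power Cost" is calculated. One plausible reading, sketched below purely as an illustration, is total cost of ownership divided by the compute actually delivered (peak capacity times average utilization over the service period). The function name, parameters, and figures are all assumptions, not the metric's official definition:

```python
def unit_effective_compute_cost(capex, annual_opex, years, peak_eflops, utilization):
    """Illustrative 'unit effective computing power cost' (assumed formula):
    total cost of ownership divided by effective compute delivered,
    where effective compute = peak capacity x average utilization x time."""
    total_cost = capex + annual_opex * years               # total cost of ownership
    effective_compute = peak_eflops * utilization * years  # EFLOPS-years actually used
    return total_cost / effective_compute                  # cost per effective EFLOPS-year

# Hypothetical comparison: a pricier but well-utilized cluster can beat
# a cheaper cluster that sits mostly idle.
busy = unit_effective_compute_cost(capex=100.0, annual_opex=20.0, years=5,
                                   peak_eflops=1.0, utilization=0.60)
idle = unit_effective_compute_cost(capex=80.0, annual_opex=15.0, years=5,
                                   peak_eflops=1.0, utilization=0.25)
print(busy < idle)  # True: higher utilization lowers cost per effective unit
```

Whatever the exact formula, the point the metric captures is that utilization, not sticker price alone, determines the real price of delivered compute.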
The infrastructure surrounding computing centers presents its own complexities, notably the challenge of linking multiple clusters to perform at scale. CTO Li Shaopeng emphasizes the strategic need for superior physical setups that unite all GPU servers within a single operational framework, so that a thousand or more units can train in parallel effectively.
As long-tail clients emerge as primary consumers of computing services, the industry is pivoting toward elastic computing offerings and more efficient, cost-effective operational frameworks.