Benchmarking Nano Banana Pro: Alibaba and ByteDance release image generation models on the same day. Will AI-generated images usher in a large-scale application market?


On February 10th, Alibaba and ByteDance released new image generation models on the same day, both targeting Google’s Nano Banana Pro.

Alibaba’s Qwen-Image-2.0 focuses on semantic understanding and practical editing, unifying generation and editing architecture, enhancing Chinese character rendering capabilities, and better understanding long and complex practical instructions. ByteDance’s Seedream 5.0 Preview emphasizes retrieval-based image creation and fine control, significantly improving prompt understanding accuracy, supporting more detailed texture generation and controllable adjustments, and deeply integrating into the entire content creation process.

On February 11th, a computing power operator told a “Daily Economic News” reporter that AI is now applied in many e-commerce scenarios, with digital humans and AI-generated product images both consuming tokens (the basic units of text a model processes). Xiong Hantian, senior solutions architect for the Qwen large model, stated in an interview that the latest Qwen-Image-2.0 model has been optimized for e-commerce scenarios, focusing on product detail generation, text controllability, and secondary editing.

Notably, AI-generated images are no longer limited to visual creation but are moving toward large-scale enterprise applications. In 2025, image generation technology gradually penetrated the e-commerce and manga markets. As the technology continues to advance, will 2026 see a scaled-up application market?

Competing with Nano Banana Pro: Domestic AI Image Generation Models Evolve Further

On February 10th, Alibaba and ByteDance both released image generation models. Alibaba’s Qwen released the next-generation image generation and editing model Qwen-Image-2.0, while ByteDance’s platforms, including Jianying and XiaoYunQue, launched Seedream 5.0 Preview. Both models target Google’s Nano Banana Pro.

According to Alibaba’s Qwen team, Qwen-Image-2.0 supports ultra-long text input of 1K tokens and 2K high-resolution output, follows complex instructions accurately, and can readily generate professional PPTs and infographics. Additionally, Qwen-Image-2.0 has been fully upgraded from Qwen-Image and Qwen-Image-Edit, integrating image generation and editing into a single, lightweight model architecture and significantly boosting performance in both image creation and modification.

ByteDance stated that Seedream 5.0 Preview supports 2K and 4K resolution outputs, and users can currently experience 2K output for free on the Jiyun platform.

A senior R&D executive from a listed company mentioned that AI image generation is already used for PPTs and corporate product images, but problems with text detail accuracy and image consistency remain.

Wu Chenfei, head of visual generation at Qwen, explained that text detail collapse in AI-generated images has two main causes. First, most current image generation models use Variational Autoencoders (VAE), which significantly affect how text within images is handled, especially small text. Because a VAE essentially compresses the image, generating images with many textual elements becomes harder and is limited by the VAE’s processing capacity. Second, the modeling ability of the image generation model itself is limited: the VAE determines the upper limit of the model’s capacity, while modeling ability directly affects the realism and detail restoration of generated images.

How Far Is AI Image Generation from Enterprise-Scale Applications?

As AI image generation models iterate, their applications in the e-commerce and manga markets are attracting increasing attention, and the AI manga concept is gaining momentum in the capital markets.

On February 11th, Zhang Yi, CEO and chief analyst at iiMedia Research, stated that the main approach to AI manga production is generating images via AI, then converting them into videos, combined with AI voiceovers and subtitles to produce finished content. This is the industry’s current standard.

Dongwu Securities pointed out in a research report that AI can reduce manga production costs by up to 90%. Zhou Liqiang, general manager of AI animation at Chinese Online, previously said that AI simplifies the traditional 11-step manga production process into four steps: script creation, image generation, image-to-video conversion, and post-production, greatly increasing production speed.

One core issue is that AI manga relies heavily on the “drawing card” generation mode, in which creators regenerate outputs repeatedly until an acceptable one appears. The biggest problem with this mode is that the final output depends almost entirely on the AI’s understanding and reasoning ability; the “drawing card” artist can only refine prompts.

According to iiMedia Research, most users recognize AI technology’s value in reducing production costs (51.2%) and accelerating IP transformation (47.7%). Nearly half also pointed out significant shortcomings in visual style consistency (47.1%) and voice acting emotional expression (46.7%).

Zhang Yi believes that the AI manga market is experiencing explosive growth. The impact of AI technology on the manga industry presents both opportunities and challenges: it promotes efficiency and content upgrades by lowering costs and speeding up IP transformation, but also pressures the industry to improve content creativity and quality due to issues like style uniformity, voice quality, and character development.

Beyond AI manga, AI image generation is quietly transforming another major market—e-commerce.

Demand for images in e-commerce shopping scenarios is high. On February 11th, a computing power operator told reporters that, besides digital humans, AI-generated images are currently one of the largest token-consuming needs among e-commerce clients.

Xiong Hantian stated that e-commerce is one of the main deployment scenarios for the Qwen-Image model. The latest Qwen-Image-2.0 has been upgraded and optimized from previous models with e-commerce applications in mind, and is expected to promote enterprise-level deployment.

For example, in e-commerce, product detail images and model styling effects can be better generated using the new image generation model. Unlike previous models that required secondary editing for controllable product images, the new iteration merges image editing and generation into a single model, improving efficiency for e-commerce sellers.

(Source: Daily Economic News)
