      BLOG
      From the Source: How Does DeepSeek Do It?
      We asked the LLM the secret to its success and what DeepSeek portends for the future of transformer technology.
      27th January 2025

      The appearance of the generative AI chatbot DeepSeek R1 last week qualifies as the most significant event in genAI since the release of OpenAI’s ChatGPT, which brought transformer technology into widespread public view.

      DeepSeek did not quite come out of nowhere. It released its initial product—focused on code generation—in November 2023. DeepSeek threw down a gauntlet to the international tech community via a paper published in January 2024 that claimed its LLM outperformed ChatGPT 3.5. Now the release of DeepSeek R1—along with Android and iOS apps to access it—has brought DeepSeek to the attention of the international market, qualifying as a bona fide disruptive event. In one fell swoop, DeepSeek has not only countered all the restrictions the West has tried to put upon China’s development of AI; it has also reduced the cost of running LLMs by a factor of 10.

      DeepSeek also qualifies as an emerging risk, marking the emergence of competition in the advanced AI arena from China and triggering a fall in US and European technology stocks. In geopolitical terms, DeepSeek threatens to render efforts to limit China’s access to advanced GPUs meaningless by building its software on yesterday’s chips, processors that have not been targeted by export restrictions by the US or anyone else.

      DeepSeek also marks a striking leap in efficiency: the cost to train its LLMs is reportedly one-tenth that of comparable commercial products such as OpenAI’s, and its commercial pricing is a small fraction of its international competitors’ (1% or lower, according to one analysis).

      As with any emerging risk, the flip side is emerging strategic opportunity. For the tech industry, DeepSeek will usher in a new stage of development focused on maximizing efficiency in AI. This in turn will supply jet fuel to the further development of transformer technology as well as the drive to general artificial intelligence.

      Discussion and documentation around the technology and development strategies underpinning DeepSeek have become an overnight cottage industry (another sure sign of a transformative tech event). But to provide a perspective straight from the source, I asked DeepSeek for the recipe behind its remarkable efficiency gains. The answer generated by the LLM is reproduced in full below. It serves not only as a useful, if quite high-level, overview but as a fair indication of the LLM’s capabilities.

      Prompt: How do you achieve OpenAI-like results at 1/10 the cost?

      Answer:

      Achieving results comparable to OpenAI's at a fraction of the cost involves a combination of strategic decisions, efficient resource utilization, and leveraging open-source tools. Here are some key steps to achieve this:

      1. Use Open-Source Models

      Leverage Pre-Trained Models

      Use open-source models like GPT-J, GPT-NeoX, or LLaMA, which are available for free and can be fine-tuned for specific tasks.

      Fine-Tuning

      Fine-tune these models on your specific dataset to achieve performance tailored to your needs.

      2. Optimize Model Size

      Model Pruning

      Reduce the size of the model by removing less important neurons or layers.
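
      The pruning idea above, in its most common form (magnitude pruning), can be sketched in a few lines of plain Python. This is a toy illustration of the technique the answer names, not DeepSeek’s implementation:

```python
def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of weights.
    # Real systems prune tensors and then fine-tune to recover accuracy.
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k > 0 else 0.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

magnitude_prune([0.1, -0.9, 0.05, 2.0], sparsity=0.5)
# keeps the two largest-magnitude weights, zeroes the rest
```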

      Quantization

      Convert model weights to lower precision (e.g., from 32-bit to 8-bit) to reduce memory usage and computational requirements.
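
      A minimal sketch of the quantization step above, here symmetric int8 quantization in plain Python. It is illustrative only; production systems use optimized library kernels for this:

```python
def quantize(weights, bits=8):
    # Map float weights onto a signed integer grid via one scale factor.
    qmax = 2 ** (bits - 1) - 1          # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the integer codes.
    return [v * scale for v in q]

q, s = quantize([0.5, -1.27, 0.003, 1.0])
```

Storing the integer codes plus one scale cuts memory fourfold versus 32-bit floats, at the price of small rounding error (0.003 above collapses to zero).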

      Knowledge Distillation

      Train a smaller model to replicate the behavior of a larger model, retaining much of the performance at a lower cost.
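
      Distillation typically trains the student against the teacher’s softened output distribution. A minimal sketch of that loss in plain Python, with a temperature parameter T (a standard formulation, assumed here rather than taken from DeepSeek):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    exps = [math.exp(x / T) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy of the student against the teacher's soft targets;
    # minimized when the student reproduces the teacher's distribution.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
```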

      3. Efficient Hardware Utilization

      Use Cost-Effective Hardware

      Utilize GPUs or TPUs that offer the best performance-to-cost ratio.

      Cloud Optimization

      Choose cloud providers that offer competitive pricing and optimize resource allocation to avoid over-provisioning.

      4. Data Efficiency

      Data Augmentation

      Enhance your dataset with synthetic data to improve model performance without additional data collection costs.

      Active Learning

      Focus on labeling and training on the most informative data points to reduce the amount of data needed.
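
      One common selection rule for this is least-confidence sampling: send the examples the current model is least sure about to the labelers first. A toy sketch (hypothetical helper names, not a specific library API):

```python
def uncertainty(probs):
    # Least-confidence score: 1 minus the top class probability.
    return 1.0 - max(probs)

def select_for_labeling(pool_probs, k):
    # Rank unlabeled examples by uncertainty and pick the top k.
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: uncertainty(pool_probs[i]),
                    reverse=True)
    return ranked[:k]
```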

      5. Pipeline Optimization

      Batch Processing

      Process data in batches to maximize hardware utilization.
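
      The batching idea reduces to grouping inputs so each forward pass fills the accelerator instead of running one item at a time; a minimal sketch:

```python
def batches(items, batch_size):
    # Yield fixed-size chunks of the input list; the last batch
    # may be smaller if the list does not divide evenly.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

list(batches(list(range(7)), 3))
# [[0, 1, 2], [3, 4, 5], [6]]
```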

      Asynchronous Processing

      Implement asynchronous data pipelines to keep hardware resources busy and reduce idle time.
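
      A toy asynchronous pipeline using Python’s asyncio, with a bounded queue so the producer stays ahead of the consumer. Illustrative only; real training pipelines overlap data loading with accelerator compute rather than doubling integers:

```python
import asyncio

async def producer(queue, items):
    # Feed work into the queue; a None sentinel signals completion.
    for item in items:
        await queue.put(item)
    await queue.put(None)

async def consumer(queue, results):
    # Drain the queue; item * 2 stands in for expensive model work.
    while True:
        item = await queue.get()
        if item is None:
            break
        results.append(item * 2)

async def pipeline(items):
    queue = asyncio.Queue(maxsize=4)   # bounded: producer can't run away
    results = []
    await asyncio.gather(producer(queue, items), consumer(queue, results))
    return results
```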

      6. Use Transfer Learning

      Pre-Trained Models

      Start with models pre-trained on large datasets and fine-tune them for your specific task, reducing the need for extensive computational resources.

      7. Leverage Community and Collaboration

      Open-Source Contributions

      Utilize and contribute to open-source projects to benefit from community improvements and shared knowledge.

      Collaborative Research

      Partner with academic institutions or other organizations to share resources and reduce costs.

      8. Cost Monitoring and Management

      Track Expenses

      Continuously monitor and analyze your spending on computational resources to identify and eliminate inefficiencies.

      Budget Allocation

      Allocate your budget strategically, focusing on areas that provide the highest return on investment.

      9. Experiment with Smaller Models

      Start Small

      Begin with smaller models and scale up only if necessary. Often, smaller models can achieve satisfactory results for many tasks.

      10. Use Efficient Algorithms

      Algorithm Selection

      Choose algorithms that are known for their efficiency and performance, even if they are less complex.

      By combining these strategies, you can significantly reduce the cost of achieving results comparable to OpenAI's while maintaining high performance.

      Source: DeepSeek R1

      I also asked DeepSeek its view of the risks and opportunities unleashed by R1. The LLM produced a comprehensive summary of risks, including the familiar ethical, bias, sustainability, governance, and security considerations. However, the chatbot’s rundown of the strategic opportunities (again, the flip side of the risks) provides a future-focused view of the technology’s potential that is highly relevant to innovation, with hints for the financial services industry.

      Prompt: What emerging risks, as well as opportunities, do you see arising from DeepSeek R1’s release?

      Answer:

      The release of DeepSeek-R1, an advanced AI model, presents a range of emerging opportunities and risks across various domains. Here's a structured analysis:

      Opportunities

      1. Industry Efficiency

      Automation

      Streamlining tasks in sectors like customer service, logistics, and manufacturing, reducing costs and errors.

      Healthcare

      Enhanced diagnostics, personalized treatment plans, and accelerated drug discovery through data analysis.

      Finance

      Improved fraud detection, risk assessment, and algorithmic trading.

      2. Innovation & Collaboration

      Startups

      Lower barriers to entry for AI-driven solutions, fostering entrepreneurship.

      Cross-Sector Partnerships

      Integration with education (personalized learning), climate science (resource optimization), and smart cities.

      3. Global Challenges

      Sustainability

      Energy-efficient AI operations could reduce environmental footprints, while optimizing renewable energy grids.

      Language & Culture

      Advanced translation and localization tools bridging global communication gaps.

      4. Research Acceleration

      Solving complex scientific problems (e.g., protein folding, materials science) and enabling large-scale data synthesis.

      Source: DeepSeek R1

      I couldn’t have said it better myself.

      Author
      Neil Katkov
      Director, Risk
      Details
      Geographic Focus
      Asia-Pacific, EMEA, LATAM, North America
      Horizontal Topics
      Artificial Intelligence - Generative AI e.g. ChatGPT, Risk: Banking Risk, Risk: Financial Services Risk, Risk: RegTech