DeepSeek AI - UPSC Current Affairs 2025

Context:

DeepSeek, a Chinese AI startup, has disrupted the AI market with cost-effective, open-source models such as DeepSeek-V3, which offer advanced capabilities at a fraction of the usual cost.

Founded by Liang Wenfeng in Hangzhou in 2023, it quickly rose to prominence when its chatbot overtook established rivals like ChatGPT in popularity.

About DeepSeek AI

DeepSeek stands out for its high-performing, open-source AI models, like DeepSeek-V3, which was reportedly trained for about $5.6 million, far less than the hundreds of millions invested by companies like OpenAI, Meta, and Google.
It reportedly outperformed models like GPT-4 and Claude 3.5 Sonnet on several benchmarks and uses a Mixture-of-Experts (MoE) architecture, in which a router sends each input to a few specialized sub-networks (“experts”) instead of running the whole model.
It was trained on 14.8 trillion tokens, enhancing language understanding and task-specific skills, while Multi-Head Latent Attention (MLA), a technique that compresses the data the attention mechanism has to store, boosts efficiency and lowers costs.
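
The Mixture-of-Experts idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four toy experts, the router weights, and the name `moe_forward` are hypothetical, and the real DeepSeek-V3 design is far larger and differs in its details. The sketch only shows the general pattern of scoring the experts, running the top few, and blending their outputs.

```python
import math

def softmax(values):
    """Turn raw scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend their outputs."""
    # One router score per expert: dot product of that expert's weights with x.
    scores = [sum(w * xi for w, xi in zip(weights, x)) for weights in router_weights]
    # Keep only the top_k experts; the rest are never run, which saves compute.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * y for o, y in zip(output, expert_out)]
    return output, chosen

# Hypothetical toy experts: each one is just a simple function on a vector.
experts = [
    lambda v: [2 * a for a in v],
    lambda v: [a + 1 for a in v],
    lambda v: [-a for a in v],
    lambda v: [a * a for a in v],
]
# Hypothetical router weights, one row per expert.
router_weights = [[1, 0], [0, 1], [-1, 0], [0, -1]]

output, chosen = moe_forward([3.0, 1.0], router_weights, experts)
print(chosen)  # [0, 1]: only two of the four experts were run
```

Because only `top_k` experts run for each input, the work done per input stays small even when the total number of experts is large, which is the efficiency argument usually made for this kind of architecture.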
The Chinese company recently unveiled its new model, DeepSeek-R1.

Features of DeepSeek-R1

- The new model can “think” through a problem step by step before answering, spending extra computation at inference time, a capability referred to as test-time compute.
- The R1 model uses the same Mixture-of-Experts (MoE) architecture as DeepSeek-V3, in which only a few specialized sub-networks are activated for each input.
- It reportedly matches or even surpasses the performance of OpenAI’s frontier models in areas like math, coding, and general knowledge.
- It is reportedly 90-95% more affordable than OpenAI’s o1 model.
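
To make the idea of test-time compute concrete, here is a minimal Python sketch of majority voting, one simple way of spending more computation at answer time. This is not DeepSeek-R1's own method, which is described here only as the ability to “think”. The fixed list `SAMPLED_ANSWERS` and the function `answer_with_budget` are hypothetical stand-ins for a real model.

```python
from collections import Counter

# Hypothetical stand-in for a model's sampled answers to one question.
# A real model would generate these; a fixed list keeps the sketch deterministic.
SAMPLED_ANSWERS = ["41", "42", "42", "40", "42"]

def answer_with_budget(num_samples):
    """Use the first num_samples attempts and return the most common answer."""
    attempts = SAMPLED_ANSWERS[:num_samples]
    return Counter(attempts).most_common(1)[0][0]

print(answer_with_budget(1))  # 41: a single attempt happens to be wrong
print(answer_with_budget(5))  # 42: more attempts let the majority answer win
```

The point of the sketch is the trade-off: a larger budget at answer time costs more computation but can give a more reliable result.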

Key Features of DeepSeek AI

Open-Source Technology: It stands out with its open-source AI models, including DeepSeek-V3 and DeepSeek-R1, allowing developers and researchers to freely access the source code, collaborate, and drive faster breakthroughs through shared improvements.
Cost Efficiency: It focuses on cost-effectiveness, reportedly training DeepSeek-V3 for just $5.6 million, about one-tenth of the cost of comparable OpenAI models. This shows how lower-cost strategies could disrupt the AI market and raises questions about the sustainability of high-priced models.
Performance Capabilities: It is designed for complex reasoning and performs strongly against top models on benchmarks in mathematics, programming, and natural language processing.
Strategic Development Amid Restrictions: It has adapted to U.S. export restrictions on advanced chips by using a mix of high-performance and affordable chips, allowing it to develop powerful AI solutions without relying on expensive hardware like Nvidia’s A100 series.
Market Impact: It has significantly impacted the tech sector, highlighting concerns about the dominance of American AI companies and the shifting dynamics due to competition from Chinese firms.

Difference Between DeepSeek and ChatGPT:

Feature	DeepSeek	ChatGPT
Performance	Competitive with ChatGPT, excels at technical questions and code generation	Strong overall performance, versatile across various tasks
Query Type	Text-based queries only	Text-based, with multimodal capabilities (e.g., AI image generation, voice interaction)
Custom Features	None	Custom GPTs for personalized tasks
Cost	Free, no query limits	Free version available, but advanced features require payment
API Pricing	$0.55 per million input tokens, $2.19 per million output tokens (DeepSeek-R1)	$15 per million input tokens, $60 per million output tokens (OpenAI o1)
Usage	Ideal for cost-conscious users and developers	Ideal for users seeking diverse, multimodal features
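
The API prices in the table can be turned into a quick cost comparison. The Python sketch below uses only the per-million-token prices listed above. The workload size (2 million input tokens and 1 million output tokens) and the names `api_cost` and `savings_percent` are assumptions chosen for illustration.

```python
# Prices in US dollars per million tokens, taken from the table above.
PRICES_PER_MILLION = {
    "deepseek": {"input": 0.55, "output": 2.19},
    "chatgpt": {"input": 15.00, "output": 60.00},
}

def api_cost(provider, input_tokens, output_tokens):
    """Return the dollar cost of a workload for the given provider."""
    price = PRICES_PER_MILLION[provider]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

def savings_percent(input_tokens, output_tokens):
    """Return how much cheaper DeepSeek is than ChatGPT, as a percentage."""
    cheap = api_cost("deepseek", input_tokens, output_tokens)
    base = api_cost("chatgpt", input_tokens, output_tokens)
    return 100 * (1 - cheap / base)

# Hypothetical workload: 2 million input tokens, 1 million output tokens.
print(round(api_cost("deepseek", 2_000_000, 1_000_000), 2))  # 3.29
print(round(api_cost("chatgpt", 2_000_000, 1_000_000), 2))   # 90.0
print(round(savings_percent(2_000_000, 1_000_000), 1))       # 96.3
```

For this workload the listed prices work out to a saving of roughly 96 percent, the same order as the 90-95% figure quoted for DeepSeek-R1.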
