Thursday, January 30, 2025

DeepSeek: How a Small Chinese AI Company is Shaking up US Tech Heavyweights



 January 29, 2025

Chinese artificial intelligence (AI) company DeepSeek has sent shockwaves through the tech community, with the release of extremely efficient AI models that can compete with cutting-edge products from US companies such as OpenAI and Anthropic.

Founded in 2023, DeepSeek has achieved its results with a fraction of the cash and computing power of its competitors.

DeepSeek’s “reasoning” R1 model, released last week, provoked excitement among researchers, shock among investors, and responses from AI heavyweights. The company followed up on January 28 with a model that can work with images as well as text.

So what has DeepSeek done, and how did it do it?

What DeepSeek Did

In December, DeepSeek released its V3 model. This is a very powerful “standard” large language model that performs at a similar level to OpenAI’s GPT-4o and Anthropic’s Claude 3.5.

While these models are prone to errors and sometimes make up their own facts, they can carry out tasks such as answering questions, writing essays and generating computer code. On some tests of problem-solving and mathematical reasoning, they score better than the average human.

V3 was trained at a reported cost of about US$5.58 million. This is dramatically cheaper than GPT-4, for example, which cost more than US$100 million to develop.

DeepSeek also claims to have trained V3 using around 2,000 specialised computer chips, specifically H800 GPUs made by NVIDIA. This is far fewer than other companies, some of which may have used up to 16,000 of the more powerful H100 chips.

On January 20, DeepSeek released another model, called R1. This is a so-called “reasoning” model, which tries to work through complex problems step by step. These models seem to be better at many tasks that require context and have multiple interrelated parts, such as reading comprehension and strategic planning.

The R1 model is a tweaked version of V3, modified with a technique called reinforcement learning. R1 appears to work at a similar level to OpenAI’s o1, released last year.

DeepSeek also used the same technique to make “reasoning” versions of small open-source models that can run on home computers.

This release has sparked a huge surge of interest in DeepSeek, driving up the popularity of its V3-powered chatbot app and triggering a massive price crash in tech stocks as investors re-evaluate the AI industry. At the time of writing, chipmaker NVIDIA has lost around US$600 billion in value.

How DeepSeek Did It

DeepSeek’s breakthroughs have been in achieving greater efficiency: getting good results with fewer resources. In particular, DeepSeek’s developers have pioneered two techniques that may be adopted by AI researchers more broadly.

The first has to do with a mathematical idea called “sparsity”. AI models have a lot of parameters that determine their responses to inputs (V3 has around 671 billion), but only a small fraction of these parameters is used for any given input.

However, predicting which parameters will be needed isn’t easy. DeepSeek used a new technique to do this, and then trained only those parameters. As a result, its models needed far less training than a conventional approach.
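The flavour of this idea can be sketched with a toy "mixture of experts" router, a standard way to exploit sparsity (this is an illustration of the general principle, not DeepSeek's actual architecture): a small gating network scores many blocks of parameters ("experts") and, for each input, only the top-scoring few are actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM, N_EXPERTS, TOP_K = 8, 16, 2  # toy sizes; real models have billions of parameters

# Each "expert" is a small weight matrix; together the experts hold
# most of the model's parameters.
experts = rng.normal(size=(N_EXPERTS, DIM, DIM))
# A tiny gating network decides which experts matter for a given input.
gate = rng.normal(size=(DIM, N_EXPERTS))

def forward(x):
    """Route the input through only the TOP_K highest-scoring experts."""
    scores = x @ gate                   # one relevance score per expert
    top = np.argsort(scores)[-TOP_K:]   # indices of the best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()            # normalised mixing weights
    # Only TOP_K of the N_EXPERTS weight matrices are touched here,
    # so most parameters are skipped for this particular input.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

y = forward(rng.normal(size=DIM))
print(y.shape, f"- used {TOP_K} of {N_EXPERTS} experts")
```

Because only two of the sixteen experts do any work per input, both training and inference touch a small fraction of the total parameters, which is where the efficiency gain comes from.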

The other trick has to do with how V3 stores information in computer memory. DeepSeek has found a clever way to compress the relevant data, so it is easier to store and access quickly.
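DeepSeek's reported technique (a scheme it calls multi-head latent attention) compresses the "key-value" data a model caches while generating text. The underlying idea, storing a small latent vector from which the full data can be reconstructed, can be sketched with a low-rank factorisation; the sizes and the SVD used here are illustrative, not DeepSeek's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are 1,000 cached 512-dimensional vectors the model must
# keep in memory while generating text. They have low "intrinsic rank",
# i.e. they mostly live in a much smaller subspace.
cache = rng.normal(size=(1000, 64)) @ rng.normal(size=(64, 512))

# Compress: keep only the top-RANK directions found by an SVD.
RANK = 64
u, s, vt = np.linalg.svd(cache, full_matrices=False)
stored = u[:, :RANK] * s[:RANK]   # (1000, 64): what we actually keep per vector
basis = vt[:RANK]                 # (64, 512): a shared decoder, stored once

reconstructed = stored @ basis
error = np.linalg.norm(cache - reconstructed) / np.linalg.norm(cache)
ratio = cache.size / (stored.size + basis.size)
print(f"compression ~{ratio:.1f}x, relative error {error:.1e}")
```

Here the cache shrinks by a factor of about five while remaining reconstructable to high accuracy, which is the trade-off this kind of compression aims for: less memory traffic in exchange for a small amount of extra computation.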

What It Means

DeepSeek’s models and techniques have been released under the free MIT License, which means anyone can download and modify them.

While this may be bad news for some AI companies – whose profits might be eroded by the existence of freely available, powerful models – it is great news for the broader AI research community.

At present, a lot of AI research requires access to enormous amounts of computing resources. Researchers like myself who are based at universities (or anywhere except large tech companies) have had limited ability to carry out tests and experiments.

More efficient models and techniques change the situation. Experimentation and development may now be significantly easier for us.

For consumers, access to AI may also become cheaper. More AI models may be run on users’ own devices, such as laptops or phones, rather than running “in the cloud” for a subscription fee.

For researchers who already have a lot of resources, more efficiency may have less of an effect. It is unclear whether DeepSeek’s approach will help to make models with better performance overall, or simply models that are more efficient.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Tongliang Liu is Associate Professor of Machine Learning and Director of the Sydney AI Centre at the University of Sydney

Trump’s Return to Office Order: the Opposite of DOGE?



 January 29, 2025

The logo for the Department of Government Efficiency as of November 14, 2024 – Public Domain

In a Wall Street Journal op-ed last November (“The DOGE Plan to Reform Government”), Elon Musk and Vivek Ramaswamy asserted that “[r]equiring federal employees to come to the office five days a week would result in a wave of voluntary terminations that we welcome: If federal employees don’t want to show up, American taxpayers shouldn’t pay them for the Covid-era privilege of staying home.”

With Donald Trump’s inauguration as president, that recommendation from Musk’s and Ramaswamy’s “DOGE” project — a powerless advisory mill disguised as a “Department” of Government Efficiency — actually got accepted. In a day-one executive order, Trump directed department and agency heads to “take all necessary steps to terminate remote work arrangements and require employees to return to work in-person at their respective duty stations on a full-time basis.”

So, how “efficient” is that idea, really?

I’m a fan of terminating government employment, whether through resignations or firings. So long as those employees aren’t replaced, it’s a win for America. Not on “efficiency” grounds, though. I don’t want the government doing what it does more “efficiently,” I just want it doing less of what it does.

I’m also a fan of remote work in the private sector. If the work actually gets done, it saves employers money, saves employees time, and saves everyone unnecessary inconvenience.

In the government sector, well, see above — I prefer government employment inconvenient, unpleasant, and expensive so that fewer people are willing to accept it.

But from a “government efficiency” standpoint, the “return to office” mandate is a disaster in conception and will likely prove a disaster in execution. Let us count the ways.

First of all, “efficient” employees are highly motivated to get the job done rather than mess around. The kind of person who will take on an unnecessary commute just to sit all day in an uncomfortable office is probably only motivated to collect a paycheck. In other words, the most “efficient” employees will be the ones most likely to self-terminate and return to the productive sector.  I like that outcome, but “government efficiency” fans shouldn’t.

Secondly, to the extent the departing “efficient” employees get replaced, they’ll be replaced by the same kind of inefficient holders-down of chairs who remain, lowering overall “efficiency” even more.

Thirdly, consider the costs to the taxpayer. Every government employee who works from home means less money spent on electricity, building maintenance, security screening at office building entrances, etc. Every government employee who comes to the office means more money spent on all those things. Not very “efficient.”

Finally, consider the inconvenience to everyone, government employee or not. Traffic in Washington, DC and surrounding areas has been the subject of constant complaint for as long as I can remember. It’s about to get much worse. A whole bunch of cars that came off the beltway and sat in the driveway starting in 2020 are about to start moving around again, gumming up the works and slowing everyone down.

Overall, none of that sounds very “efficient” to me.

Thomas L. Knapp is director and senior news analyst at the William Lloyd Garrison Center for Libertarian Advocacy Journalism (thegarrisoncenter.org). He lives and works in north central Florida.



Trump’s Taxes (Tariffs) on Imports and Sales Taxes on Stocks (FTT)



 January 29, 2025

A Liberia-flagged vehicle cargo ship on the Columbia River, transporting cars from South Korea to the West Coast of the US. Photo: Jeffrey St. Clair.

I have been pushing for financial transactions taxes (FTT) for more than three decades. The logic is straightforward. We have an enormous volume of transactions in the financial sector that serve no productive purpose. Hedge funds and other big actors can buy millions of dollars of stock or other financial assets and then sell them off five minutes or even five seconds later.

While these trades can make some people very rich, they serve no economic purpose. It is important that we have well-functioning financial markets where businesses can raise capital and people can invest their savings, but these short-term trades do not advance those ends. The total volume of stock trading is now more than $150 trillion a year, more than five times GDP. Trading in bonds would also run to tens of trillions, while the notional value of trading in options, futures, and other derivative instruments is in the thousands of trillions.

Given the incredible volume of trading, even a modest tax could raise an enormous amount of money, as can be seen with simple arithmetic. If we taxed $150 trillion in stock trades at a 0.1 percent rate (ten cents on one hundred dollars), it would raise $150 billion a year. If we applied scaled taxes to trades of bonds and derivatives we could get to twice this amount, or $300 billion.

However, this would hugely overstate the amount the tax would raise, since there would be a large reduction in trading volume. Most estimates of the impact of higher costs on trading volume find that the reduction in trading volume is roughly proportionate to the increase in trading costs. If the tax doubles trading costs, which this rate roughly would, then we can expect trading volume to be cut in half. That means that this sort of tax could raise roughly $150 billion a year or a bit more than 0.5 percent of GDP.
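The arithmetic in the last two paragraphs is simple enough to lay out explicitly (all figures are the rough estimates from the text):

```python
# Back-of-the-envelope FTT revenue, using the figures in the text.
stock_volume = 150e12   # roughly $150 trillion in annual stock trades
tax_rate = 0.001        # 0.1 percent, i.e. ten cents on one hundred dollars

naive_stock_revenue = stock_volume * tax_rate
print(f"stock trades alone: ${naive_stock_revenue / 1e9:.0f} billion")

# Scaled taxes on bonds and derivatives roughly double the naive total.
naive_total = 2 * naive_stock_revenue   # about $300 billion

# But if doubling trading costs cuts trading volume roughly in half,
# the actual take is about half the naive total.
adjusted = 0.5 * naive_total
print(f"adjusted for reduced volume: ${adjusted / 1e9:.0f} billion")
```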

However, the neat aspect of this tax is that the reduction in trading volume it causes is actually a good thing. If we were to tax housing or health care, and people reduced the amount of housing or health care they consumed, that would be a bad story, since people value housing and health care. But no one values trading in the same way. If we eliminated $150 billion in trading expenses, we would effectively make the financial sector more efficient, unless there were some reason to believe it would become less capable of allocating capital or keeping savings secure.

Since even a 50 percent reduction in trading volume would still leave us with very high volumes, much higher than in prior decades, it is hard to believe that the operations of the financial markets would be seriously impeded. We would just see far fewer people making big fortunes by beating the market by a few hours or seconds. That is bad news for these would-be billionaires, but not the sort of thing the rest of us need to worry about. They can look for more productive jobs elsewhere.

So why don’t we have financial transactions taxes? The main reason is that the billionaires who make big bucks on short-term trades make large campaign contributions to politicians to ensure they never get enacted. But demanding special tax treatment of stock sales, as opposed to sales of items like shoes and furniture, in order to protect billionaires’ money is not a very good political argument.

So instead, we have people jumping up and down yelling about how an FTT would be a tax on the savings of ordinary people. The Wall Street shills tell us that if we imposed a tax of 0.1 percent on stock trades, middle-income people would be nailed on their 401(k)s.

Let’s look at the arithmetic on that. The median 401(k) balance is roughly $140,000. Let’s say 15 percent of this turns over each year, or $21,000. If there were a 0.1 percent tax on these trades, it would cost this person $21 a year. Even this is an overstatement, since we would expect them to reduce their trading volume roughly in proportion to the amount of the tax.

While individuals typically aren’t trading stocks directly in their 401(k)s, we would expect their fund managers to reduce their trading roughly in proportion to the size of the tax. That would mean that their funds would reduce their trading costs by an amount roughly equal to the $21 that the typical 401(k) holder would pay in taxes. The net in this story would be close to zero, with the savings on trading costs offsetting the tax.
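The 401(k) arithmetic above can be made explicit (the turnover rate is the text's illustrative assumption):

```python
# The 401(k) arithmetic from the text, made explicit.
balance = 140_000    # median 401(k) balance
turnover = 0.15      # assume 15 percent of holdings trade each year
tax_rate = 0.001     # 0.1 percent FTT

traded = balance * turnover        # $21,000 in annual trades
tax_paid = traded * tax_rate       # $21 a year
print(f"annual FTT on the median 401(k): ${tax_paid:.0f}")

# If fund managers cut trading (and hence trading costs) roughly in
# proportion to the tax, the saved costs offset the tax almost exactly.
cost_savings = tax_paid            # the text's rough offset
net_burden = tax_paid - cost_savings
print(f"net burden after reduced trading costs: ${net_burden:.0f}")
```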

But let’s take the $21 tax bill that is supposedly a big concern for politicians who say they otherwise might be interested in an FTT. President Trump has repeatedly talked about his plans for big taxes on imports, or tariffs. While he constantly changes the amount of the taxes he wants to impose and the imports on which he would impose them, the Center for American Progress recently estimated that Trump’s import taxes would cost the typical family $3,300 a year.

There are many reasons for thinking these taxes are bad policy, but it is worth comparing the size of the tax burden that scares ostensibly progressive politicians away from supporting a financial transactions tax with the burden that Trump’s import taxes would impose: roughly $21 a year for the former, against roughly $3,300 a year for the latter.


The burden of Trump’s import taxes is thus more than 150 times as large as the burden a financial transactions tax would place on the median 401(k) holder. However, for some reason this burden does not appear to be a major obstacle to putting Trump’s import taxes into effect. Draw your own conclusions.

This first appeared on Dean Baker’s Beat the Press blog.

Dean Baker is the senior economist at the Center for Economic and Policy Research in Washington, DC.