Virulent Word of Mouse

August 28, 2014

Baking the Data Layer

The cookie turned 20 just the other day. More than a tasty morsel of technology, two decades of experimentation have created considerable value around its use.

The cookie originated with the ninth employee of Netscape, Lou Montulli. Fresh out of college in June 1994, Montulli sought to embed a user’s history in a browser’s functions. He added a simple tool, keeping track of the locations users visited. He called his tool a “cookie” to relate it to an earlier era of computing, when systems would exchange data back and forth in what programmers would call “magic cookies.” Every browser maker has included cookies ever since.
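The mechanics Montulli proposed are still visible in any HTTP exchange today: the server sends a small name-value pair, the browser stores it, and the browser echoes it back on later visits. A minimal sketch using Python's standard library (the cookie name and value here are hypothetical):

```python
from http.cookies import SimpleCookie

# The server "bakes" a cookie into its response headers
cookie = SimpleCookie()
cookie["session_id"] = "abc123"          # hypothetical identifier
cookie["session_id"]["path"] = "/"
cookie["session_id"]["max-age"] = 86400  # persist for one day

# The Set-Cookie header value the browser receives and stores
header = cookie["session_id"].OutputString()

# On the next visit, the browser echoes the pair back, and the
# server parses it out of the Cookie request header
echoed = SimpleCookie()
echoed.load("session_id=abc123")
value = echoed["session_id"].value
```

That round trip — set once, echoed on every subsequent request — is all it takes for a site to recognize a returning user.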

The cookie had an obvious virtue over many alternatives: It saved users time, and provided functionality that helped complete online transactions with greater ease. All these years later, very few users delete them (to the disappointment of many privacy experts), even in the browsers designed to make it easy to do so.

Montulli’s invention baked into the Web many questions that show up in online advertising, music, and location-based services. Generating new uses for information requires cooperation between many participants, and that should not be taken for granted.

The cookie’s evolution

Although cookies had been designed to let one firm track one user at a time, in the 1990s many different firms experimented with coordinating across websites in order to develop profiles of users. Tracking users across multiple sites held promise; it let somebody aggregate insights and achieve a survey of a user’s preferences. Knowing a user’s preferences held the promise of more effective targeting of ads and sales opportunities.

DoubleClick was among the first firms to make major headway into such targeting based on observation at multiple websites. Yet, even its efforts faced difficult challenges. For quite a few years nobody targeted users with much precision, and overpromises fueled the first half-decade of experiments.

The implementation of pay-per-click and the invention of the keyword auction—located next to an effective search engine—brought about the next great jump in precision. That, too, took a while to ripen, and, as is well known, Google largely figured out the system after the turn of the millennium.

Today we are awash in firms involved in the value chain to sell advertising against keyword auctions. Scores stir the soup at any one time, some using data from cookies and some using a lot more than just that. Firms track a user’s IP address and MAC address, and some add information from outside sources. Increasingly, the ads know about the smartphone’s longitude and latitude, as well as an enormous amount about a user’s history.

All the information goes into instantaneous statistical programs that would make any analyst at the National Security Agency salivate. The common process today calculates how alike one individual is to another, assesses whether the latest action alters the probability the user will respond to a type of ad, and makes a prediction about the next action.
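The two calculations in that process are simple to sketch. Here is a toy version, with made-up feature vectors and weights; real ad systems use far richer models, so treat the numbers as illustrative assumptions:

```python
import math

def cosine_similarity(u, v):
    """How alike are two users' behavior vectors?"""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def click_probability(features, weights, bias=0.0):
    """Logistic model: probability the user responds to an ad."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical users: counts of visits to [flower, sports, news] sites
user_a = [5.0, 0.0, 2.0]
user_b = [4.0, 1.0, 2.0]
similar = cosine_similarity(user_a, user_b)

# A fresh flower-site visit bumps the first feature, raising the
# predicted probability of responding to a flower ad
weights = [0.4, -0.1, 0.05]
before = click_probability(user_a, weights)
after = click_probability([6.0, 0.0, 2.0], weights)
```

The "latest action alters the probability" step is just the gap between `before` and `after`: one more visit in a category shifts the score, and the ad exchange reacts within the same session.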

Let’s not overstate things. Humans are not mechanical. Although it is possible to know plenty about a household’s history of surfing, such data can make general predictions about broad categories of users, at best. The most sophisticated statistical software cannot accurately predict much about a specific household’s online purchase, such as the size of expenditure, its timing, or the branding.

Online ads also are still pretty crude. Recently I went online and bought flowers for my wedding anniversary and forgot to turn off the cookies. Not an hour later, a bunch of ads for flowers turned up in every online session. Not only were those ads too late to matter, but they flashed later in the evening after my wife returned home and began to browse, ruining what was left of the romantic surprise.

Awash in metadata

Viewed at a systemic level, the cookie plays a role in a long chain of operations. Online ads are just one use in a sizable data-brokerage industry. The same data layer also shapes plenty of the marketing emails a typical user receives, as well as plenty of offline activities.

To see how unique that is, contrast today’s situation with the not-so-distant past.

Consider landline telephone systems. Metadata arises as a byproduct of executing normal business processes. Telephone companies needed the information for billing purposes—for example, the start and stop times for a call, the area codes and prefixes that indicate the originating and terminating exchanges, and so on. Such metadata has limited value outside its stated purpose to just about everyone except, perhaps, the police and the NSA.
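A landline billing record needs only a handful of fields. A minimal sketch, with hypothetical exchanges and times, shows how little the primary purpose actually requires:

```python
from dataclasses import dataclass

@dataclass
class CallRecord:
    """Minimal landline metadata: just what billing requires."""
    origin: str       # originating area code + prefix (hypothetical)
    destination: str  # terminating area code + prefix (hypothetical)
    start: float      # call start, epoch seconds
    stop: float       # call end, epoch seconds

    def billable_minutes(self) -> float:
        return (self.stop - self.start) / 60.0

# A hypothetical ten-minute call between two US exchanges
record = CallRecord("312-555", "617-555", start=0.0, stop=600.0)
minutes = record.billable_minutes()
```

Everything in the record exists to compute one number for the invoice, which is why the data had so little secondary value.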

Now contrast that with a value chain involving more than one firm, again from communications: cellular phones. Cell phone calls also generate a lot of information during operations. The first generation of cell phones had to triangulate between multiple towers to hand off a call, and that process required the towers to generate a lot of information about the caller’s location, the time of the call, and so on.

Today’s smartphones do better, providing the user’s longitude and latitude. Many users enable their smartphone’s GPS because a little moving dot on an electronic map can be very handy in an unfamiliar location (for example). That is far from the only use for GPS.

Cellular metadata has acquired many secondary values, and achieving that value involves coordinating many firms, albeit not yet at the instantaneous scale of Internet ad auctions. For example, cell phone data provides information about the flow of traffic in specific locations. Navteq, owned by the part of Nokia not purchased by Microsoft, is one of many firms that make a business of collecting that data. The data provide logistics companies with predictable traffic patterns for their planning.

Think of the modern situation this way: One purpose motivated collecting metadata, and another motivated repurposing the metadata. The open problem focuses on how to create value by using the data for something other than its primary purpose.

Metadata as a source of value

Try one more contrast. Consider a situation without a happy ending.

New technologies have created new metadata in music, and at multiple firms. Important information comes from any number of commercial participants—ratings sites, online ticket sales, Twitter feeds, social networks, YouTube plays, Spotify requests, and Pandora playlists, not to mention iTunes sales, label sales, and radio play, to name a few.

The music market faces the modern problem. This metadata has created a great opportunity. The data has enormous value to a band manager making choices in real time, for example. Yet, the entire industry has not gotten together to coordinate use of metadata, or even to coordinate on standard reporting norms.

There are several explanations for the chaos. Some observers want to blame Apple, as it has been very deliberate about which metadata from iTunes it shares, and which it does not. However, that is unfair to Apple. First, it is not entirely closed, and some iTunes data does make it into general use. Moreover, Apple does not seem far out of step with industry practices for protecting one’s own self-interest, which points to the underlying issue, I think.

There is a long history of many well-meaning efforts being derailed by narrow-minded selfishness. For decades, sampling any significant length of another performer’s song created a seemingly trivial copyright violation that should have been easy to resolve. Instead, the industry settled on a poor default solution, requiring samplers to give up a quarter of their royalties. With those types of practices, there is very little sampling. That seems suboptimal for a creative industry.

Composers and performers also have had tussles for control over royalties for decades, and some historical blowups took on bitter proportions. The system for sharing royalties in the US today is not some grand arrangement in which all parties diplomatically compromised to achieve the greater good. Rather, it was put in place by a consent decree settling an antitrust suit.

If this industry had a history of not sharing before the Internet, who thought the main participants would share metadata? Who would have expected the participants to agree on how to aggregate those distinct data flows into something useful and valuable? Only the most naive analyst would expect a well-functioning system to ever emerge out of an industry with this history of squabbling.

More generally, any situation involving more than a few participants is ripe for coordination issues, conflict, and missed opportunity. It can be breathtaking when cooperation emerges, as in the online advertising value chain. That is not a foregone conclusion. Some markets will fall into the category of “deals waiting to be done.”


The systems are complicated, but the message is simple. Twenty years after the birth of the cookie, we see models for how to generate value from metadata, as well as how not to. Value chains can emerge, but should not be taken for granted.

More to the point, many opportunities still exist to whip up a recipe for making value from the new data layer, if only the value chain gets organized. On occasion, that goal lends itself to the efforts of a well-managed firm or public efforts, but it can just as easily get neglected by a squabbling set of entrepreneurs and independently minded organizations, acting like too many cooks.

Copyright held by IEEE. To view the original, see here.


May 26, 2014

Did the Internet Prevent all Invention from Moving to one Place?

The diffusion of the Internet has had varying effects on the location of economic activity, leading to both increases and decreases in geographic concentration. In an invited column at VoxEU, Chris Forman, Avi Goldfarb, and I present evidence that the Internet worked against increasing concentration in invention. This relationship is particularly strong for inventions with more than one inventor, and when inventors live in different cities. Check out the post here.



March 7, 2014

The Irony of Public Funding

Misunderstandings and misstatements perennially pervade any debate about public funding of research and development. That must be so for any topic involving public money, almost by definition, but arguments about funding for scientific research and development contain a unique and special irony.

Well-functioning government funding is, by definition, difficult to assess because of two criteria common to R&D subsidies in virtually all Western governments: governments seek to fund activities yielding large benefits, and these activities should be ones the private sector would not otherwise undertake.

The first criterion leads government funders to avoid scientific research with low rates of return. That sounds good because it avoids wasting money. However, combining it with the second criterion does some funny things. If private firms fund only the scientific R&D whose rate of return can be measured precisely, then government funding tends toward activities whose returns are imprecisely measured.

That is the irony of government funding of science. Governments tend to fund scientific research in precisely the areas where the returns are believed to be high, but where there is little data to confirm or refute the belief.

This month’s column will illustrate the point with a little example, the server software Apache. As explained in a prior column (“How Much Apache?”), Apache was born of government funding. Today, it is rather large and taken for granted. But how valuable is it? What was the rate of return on this publicly funded invention? It has been difficult to measure.


September 27, 2013

Digital Public Goods

Precisely how does the online world provide public goods? That is the question for this column.

Public goods in the digital world contain some of the same features as those in the offline world. Yet, there are some key differences in the boundaries between public and private, and that shapes what arises and what does not.

That will need an explanation.

February 15, 2013

Gaming Structure

For several years, commentators have forecast that the rise of smartphones and tablets, as well as Facebook, would upend the structure of the gaming market. A variety of novel adroit aliens and irascible animals symbolically represent the new order, while new companies from new genres alter the identities of suppliers.

Methinks that all the talk of restructuring is exaggerated. The names have changed, but the same factors still matter for market leadership. The old structure had a number of economic determinants that haven’t gone away. For example, ongoing product development by independent firms continues apace, and all parties must manage the unknowable. Today, as in the past, independent firms cooperate with established publishers when it suits both parties.

If you ask me, we’re transitioning to the same structure with (at most) a new set of players. That’s because two factors used to matter most in gaming—uncertainty and market frictions—and they still do.


April 15, 2012

A Big Payoff

Google and Apple are two of the most profitable companies on the globe today. They seem to have little in common except that achievement. They took very different paths to the stratosphere.

Google, after all, is less than a decade and a half old, a child of the web with a successful approach to advertising, built around a search engine and many services to enhance the user’s experience. Apple is more than twice as old. Its original product, personal computers, makes up a fraction of its sales today, while its future profitability lies with a mix of software in iTunes and new hardware introduced in the last decade—namely, phones, tablets, and portable music devices.

What economic insight emerges from setting these two firms next to one another? A brief discussion of both of their businesses will reveal something trite and something deep. The trite part is this: Some settings produce lots of market value, and some firms capture large parts of that value, but those rarely happen together. The deep part forms the key insight today: these examples are fabulously profitable because they are unique.


July 4, 2011

The grocery scanner and barcode economy

Think about the world of bar codes and scanners. What was life like before their invention? This post offers an appreciation for this staple of modern retail life.

Give the barcode its due. The widespread deployment of barcodes and scanners reduces the costs of keeping accurate and timely inventories. It happened quietly in the last few decades and had numerous consequences.

Think about it. The number of products on the shelf of a typical retail store has increased by tens of thousands. The accuracy of cashiers has increased tremendously because the cashiers do not have to pause to read the price tag. Firms keep better inventory so the frequency of stock-outs — missing items — also has declined.

More to the point, all of that happened because somebody took the time to develop the bar code. Somebody made the effort to get everyone in the industry to invent the equipment to take advantage of barcodes.
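That industry-wide agreement rested on small shared conventions. One of them, the UPC-A check digit, lets any scanner catch a misread symbol without consulting a database; a short sketch of the standard computation:

```python
def upc_check_digit(digits11):
    """Compute the 12th (check) digit of a UPC-A code from its first 11.

    Digits in odd positions (1st, 3rd, ..., 11th) are weighted by 3,
    digits in even positions by 1; the check digit brings the weighted
    sum up to a multiple of ten.
    """
    odd = sum(digits11[0::2])   # 1st, 3rd, ..., 11th digits
    even = sum(digits11[1::2])  # 2nd, 4th, ..., 10th digits
    return (10 - (3 * odd + even) % 10) % 10

# A real UPC-A code: 0 36000 29145 2 (final digit 2 is the check digit)
code = [0, 3, 6, 0, 0, 0, 2, 9, 1, 4, 5]
check = upc_check_digit(code)
```

Every scanner and every printer in the value chain implements this same rule, which is precisely the kind of interoperability Haberman's committee work made possible.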

Among the influential people in that effort was a fellow named Alan Haberman. He passed away last week.

I never knew the man, so I cannot wax eloquent about his life. But I know something about bar codes, as well as the economics of value built around such symbols. Modern life could not exist without them. That is why this post is not a eulogy. It is an appreciation.

It would be an exaggeration to say that barcodes set me on my life’s intellectual path, but they were an influential example when I was a fledgling and impressionable scholar. The bar code was one of the three canonical examples of the new era unfolding before us in the 1980s, a world of new standardization and increased interoperability. (VCRs and PCs were the other two.) Those three examples, as well as a few others, did motivate my interest in the economics of this phenomenon. As readers of this space know, I have stayed here because new examples arise all the time, and in such diverse areas as WiFi, travel intermediaries, the MP3 player, the smartphone, and many places online.

Alright, maybe I am (a little) nuts, but read on.

In appreciation of Haberman’s life’s work, this is an opportunity to wax on a bit about the joys of the scanner economy. Once you begin to recognize the economics of bar codes, you realize that these economics are everywhere. I hope you find this interesting, illuminating, and a little amusing.

May 2, 2011

The Direction of Broadband Spillovers

Revenue for US Internet access more than doubled during the first decade of the millennium owing to some simple arithmetic: the number of households using the Internet increased, and prices for broadband access averaged twice those of dial-up. More concretely, in the summer of 2000, while 37.1 percent of US households connected with dial-up, only 4.4 percent had broadband. By October 2009, 63.5 percent of US households connected with broadband.
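The arithmetic behind "more than doubled" is easy to illustrate with the adoption figures above and hypothetical prices (say, $20 per month for dial-up and $40 for broadband, reflecting the two-to-one ratio; the dollar amounts are assumptions, not data from the column):

```python
# Household shares from the column (US, summer 2000 vs. October 2009)
dialup_2000, broadband_2000 = 0.371, 0.044
broadband_2009 = 0.635

# Hypothetical monthly prices: broadband averaged twice dial-up
p_dialup, p_broadband = 20.0, 40.0

# Monthly access revenue per 100 households
# (2009 residual dial-up is ignored for simplicity)
rev_2000 = 100 * (dialup_2000 * p_dialup + broadband_2000 * p_broadband)
rev_2009 = 100 * (broadband_2009 * p_broadband)
growth = rev_2009 / rev_2000
```

More households online, each paying roughly twice as much: under these assumed prices, revenue grows by a factor of well over two, matching the column's claim.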

The upgrade to broadband initially led most US households to spend more time online. At first, much of that new time went into the same activity found in dial-up (for example, checking e-mail, reading news, and shopping). Only gradually did users add activities that dial-up couldn’t handle (such as watching YouTube video, downloading music, or reading many blogs). By now, the transformation is rather apparent: broadband has created more online users and, moreover, these users are more valuable users of electronic commerce and advertising-supported media.

The relationship between broadband’s growth and other online markets is what economists call a growth spillover—that is, growth in one market spilled into another. Spillovers can be negative or positive. For example, broadband’s diffusion produced negative spillovers for the printed magazine and newspaper business, and it produced a positive spillover for online video sharing, such as YouTube.

Spillovers don’t need to be confined to a geographically local area, so they’re often challenging to observe and trace. This column focuses on understanding the geographic direction of the positive spillovers from broadband to online retailers and advertising-supported media, about which we know very little. To whom did the positive gains flow, and where?


March 14, 2011

The Internet and Wage Inequality

What has the Internet Done for the Economy?

The puzzling spread of the commercial Internet could explain wage inequalities

It is hard to overstate how much the business world relies on the Internet. Powerhouse retailers like Target and Wal-Mart can simultaneously manage their changing inventories, warehouses, distribution routes, and sales. FedEx and UPS can code every shipment online so that customers can find out exactly where their packages are and what time they will arrive at their doors. Buying a wedding gift? Just pull up the couple’s online registry and browse the items that have not been purchased yet. Shopping for insurance? You can get quotes quickly via secure online chats with company representatives.

None of that was possible before 1995, when the large, government-controlled networks somewhat begrudgingly opened their lines for commercial use. Advanced Internet technologies spread rapidly in businesses across the country—in small cities, sprawling suburbs, and dense urban hubs. Although this sparked wage and employment spurts everywhere, the gains were far more striking in regions that were already well off, according to a study to appear in the American Economic Review….To read more, click here.

Kellogg Insight provides summaries of research articles. This is a summary of “The Internet and Local Wages: A Puzzle,” by Avi Goldfarb, Chris Forman, and Shane Greenstein. For more about this topic on this blog, see “Will the iPad Flatten us all?”

March 2, 2011

Digital Dark Matter

Astrophysicists draw on the term “dark matter” to describe the unseen parts of the universe. Many symptoms, such as the rotational speed of galaxies and gravitational effects, indicate the presence of dark matter. Yet, our present science lacks the appropriate concepts and tools for measuring directly what we only see indirectly today.

Economists need a similar label for some important building blocks of the digital economy that we do not measure using standard tools. Many indirect symptoms indicate their growth and importance. Many labels have been proposed—invisible infrastructure and private provision of public goods, for example. These labels capture a grain of truth, and, yet, miss something, too.

Let’s just call it “digital dark matter” and review what we know.
