July 3, 2014

As part of the various hearings on potential revisions of the Copyright Act, the issue of public databases of copyrightable works has arisen again. A similar issue came up in Pandora’s lawsuit against songwriters in the ASCAP rate court, where Pandora claimed that it would be unable to exclude directly licensed works from blanket licensed works unless publishers in direct licenses with the company gave Pandora a list of licensed works, presumably at no cost to Pandora. (And since I can’t imagine the ASCAP rate court ever ruling on anything that would benefit a songwriter, I can’t imagine the judge restricting Pandora’s ability to resell this valuable data for Pandora’s benefit.)

Sounds reasonable, right? Not so much if what Pandora gets for free through the front door it sells out the back door, either directly or indirectly. For example, royalty reporting is costly to a company like Pandora. If Pandora wanted to outsource its royalty reporting, publisher rights information (often called generically “metadata”) that publishers were forced to give to Pandora (but not publicly disclose) might be a valuable gift from the court to Pandora, particularly if Pandora found a company in the business of reselling publisher metadata. That company might find a free copy of publisher ownership information to be a lucrative benefit of working for Pandora, one that could be resold to other music users. That company might be willing to lower the cost to Pandora of providing royalty reporting in exchange for that metadata, which would have more value to them the longer it was not available (at all or easily) to competitors. And that’s just one of the “nondisplay uses” that you could think of.

According to Ed Christman in Billboard, Universal Music Publishing is contemplating making the ownership information on their catalog available to the public (including Pandora, presumably) on a voluntary basis:

In a move to address potential changes in music licensing, Universal Music Publishing Group will make its entire song database more transparent and easily accessible to music licensees. The enhancement to the company’s data repository will even allow licensees to get a list of all the songs controlled by UMPG.

Details of UMPG’s move are forthcoming, but I hope that when the details do emerge, they will show this to be a very good move. First of all, by voluntarily making the data available, a publisher (or any rights holder) can also dictate the terms of use for the valuable data. Those terms can draw a dividing line between what is and is not confidential information as well as establish a geographical boundary on where the data can be used. (Some countries may have stronger data protection laws than others.)

A rights owner could, for example, voluntarily provide song name, writer names, names of artists performing the song, and contact information in their own database. By comparison, if the copyright owner (either U.S. or ex-U.S.) were required by law to provide anything more than the basic information for copyright registration, it’s likely that such a requirement could violate the foreign treaty obligations of the U.S. under TRIPS and certain Free Trade Agreements as an unenforceable formality prohibited by the Berne Convention, and could be subject to arbitration at the World Trade Organization.

By voluntarily disclosing limited confidential information about its catalog, a rights holder could establish a legal basis to preempt any nondisplay uses of its valuable property rights in that data that it has not authorized.

“Nondisplay uses” is a category of rights exploitation that we have beaten the drum about on MTP for years. What is a “nondisplay use”? For example, when Google scans millions of books without permission of the authors and then uses its billions to fund lawsuits against the authors whose books it scanned, did anyone ask why this issue would be just so important to the company that they’d throw a couple hundred million in legal fees at it?

If the courts could get past gushing over the misperception of Google’s motive as being anything other than good old basic Silicon Valley greed, one might realize that teaching machines to translate text from one language to another is ever so much easier when you can teach your machines by referring to phrases that already have been carefully translated by humans paid by book publishers. That makes scraping keywords from your Gmail much simpler, and it would be just as handy if you had a client who wanted someone to translate voice intercepts from, say, oh, I don’t know, Pashto into English. A client such as the National Security Agency.

So in my view, the books in the Google Books case were just the bright and shiny object. The nondisplay uses were the value.

See, e.g., Google Shuts Down GOOG-411. The always loquacious Marissa Mayer (then at Google) summed it up after giving her famous chat about how Google favors its own products in search:

“The speech recognition experts that we have say: If you want us to build a really robust speech model [who is “you”?] we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up or we’re trying to get the voice out of video, we can do it with high accuracy.”

Using those phonemes to recognize and monetize speech from voice intercepts (from Google Voice or the NSA) would be a nondisplay use of GOOG-411. Another less ominous example of a nondisplay use would be forcing a person (individual or company) to give up its confidential information for one seemingly public purpose, but then having no prohibition on using that same confidential information for a “nondisplay use,” meaning you get the information free through the front door, whether by force of law or market domination, then take the information out the back door for your own profit.

Before we hear about permissionless innovation and all the wonderful things that a Google could do with that data if it were just liberated, realize that whenever a digital retailer sets up shop they either get a delivery of the relevant metadata that they need to account directly from the rights holder or comply with statutory reporting requirements.

Or at least they are supposed to. Getting retailers to provide accurate books and royalty payments has a lot more to do with the retailer’s desire to render those books and payments accurately than with the availability of public information about the songs, recordings, books or images being exploited. This is certainly what I have seen after 15 years of observing this space and being on the team at SNOCAP that built a significant database of recordings and song data, including audio fingerprints.

The availability of accurate information does not necessarily invoke honesty, but honesty may invoke a desire to utilize accurate information properly.

I’m glad to see rights holders making catalog information available publicly. Nondisplay uses of the information can be conditioned by terms of use, and it keeps out mandatory formalities that would probably land the U.S. back in the WTO (see the Fairness in Music Licensing Act, which essentially socialized the cost of some music in restaurants through fines in the WTO, fines paid by the American taxpayer for the benefit of restaurateurs).

For the honest people, this will probably not be a huge change, as they don’t use unlicensed works and they get the metadata with the license or have statutory guidelines. But for the shady people, there are no more excuses for an “unmatched” account for these catalogs. If that results in some services getting caught using the “unmatched” songwriter’s money for their overhead, somebody may go out of business.

Or as we say in Texas, “Oops….”

