Professor Nunberg provides some excellent slides illustrating his complaints about the abysmal quality of Google Book search and its metadata. Recall that Jean-Noël Jeanneney, former president of the Bibliothèque Nationale of France, warned of the same kind of problems and gave similar examples of botched scans of French cultural works in his 2007 work “Google and the Myth of Universal Knowledge”.
Jean-Noël Jeanneney will be particularly galled (no pun intended) by yet another variety of metadata screwup in Google Books–authorship attributed to the writer of a foreword–Madame Bovary by HENRY JAMES for example.
Now remember–this is The Google we are talking about. The Google who only hires the best and the brightest. The Google who only hires from the best schools. The Google who would have you believe that they are the second coming. The Google who seems to employ people who don’t know who wrote Madame Bovary and who don’t know that Tom Wolfe wasn’t born in 1888.
The librarians who trusted The Google to scan their works thought they would get something back that was going to further their mission. I feel very, very certain that the metadata that was delivered by the best libraries in the world along with the books to be scanned was correct. What these librarians got back was gibberish.
Given the right machine, you could train a reasonably intelligent pet to scan books. In fact, I have dogs that could do a bang-up job with a little training. If–all they had to do was hit the “scan” button.
I would not expect a dog to know who wrote Madame Bovary.
What is valuable about a registry of intellectual property is not the digitized assets. Any fool with some money and time can digitize books. What is valuable is the name, rank and serial number that are connected to the books. Not to mention how they are organized, which has all kinds of cultural overtones.
But if you can’t even know who the authors are with any reliability or if you can’t know when a book was published (that is–who to pay and how much), then you’ll never be able to associate the payee information (such as W-9) with the titles.
That’s if you actually ever intended to pay anyone anything.
I wonder how these librarians feel now. Looks like Marian got googled.
The Copyright Alliance has launched a letter writing campaign that I would encourage you to sign up for if you believe this excerpt:
“[W]e [artists] are under assault. Our rights to control the distribution, use, and reproduction of our works in our vibrant digital age are dismissed by many who do not understand the value we bring to society. They tell us to work harder, create better, and give our works away. Some think that they should control our works and that they should be able to appropriate, perform, and copy them how they please, without our consent, benefit, or participation.”
If that’s you, click here.
Google’s Book Search: A Disaster for Scholars. Now that title caught my eye, not the least because it appeared in the Chronicle of Higher Education. The article is extraordinarily honest and well written, with solid research and supporting evidence.
We’ve become accustomed to librarians and academics uncritically fawning over the disaster that is Google Books (especially those privileged librarians among the sovereignly immune), but give this one by Professor Geoffrey Nunberg a read, particularly regarding the “metadata,” the fields in the Google Book Search that are supposed to contain information like year of publication, title, author, etc.
“Start with publication dates. To take Google’s word for it, 1899 was a literary annus mirabilis, which saw the publication of Raymond Chandler’s Killer in the Rain, The Portable Dorothy Parker, André Malraux’s La Condition Humaine, Stephen King’s Christine, The Complete Shorter Fiction of Virginia Woolf, Raymond Williams’s Culture and Society 1780-1950, and Robert Shelton’s biography of Bob Dylan, to name just a few. And while there may be particular reasons why 1899 comes up so often, such misdatings are spread out across the centuries. A book on Peter F. Drucker is dated 1905, four years before the management consultant was even born; a book of Virginia Woolf’s letters is dated 1900, when she would have been 8 years old. Tom Wolfe’s Bonfire of the Vanities is dated 1888, and an edition of Henry James’s What Maisie Knew is dated 1848….Google acknowledges the incorrect dates but says they came from the providers.”
Of course, Google claims that these mistakes are the copyright owners’ fault per usual–but what is interesting about this catalog of boneheaded errors is that the mistakes always seem to make the works OLDER than they actually are. Therefore more likely to be out of copyright and non-infringing, as opposed to NEWER than they actually are, therefore more likely to be in copyright and infringing. In fact–“to take Google’s word for it”–it seems a safe guess that all the listed works would be in the public domain according to the incorrect dates that Google has placed in their metadata. Which benefits Google. Of course, not even Google can make a book into a public domain work when it isn’t, but it does suggest that Google could say that an unlicensed work in copyright got into the claws of Google Books “by mistake”–an “innocent” infringement because the metadata said the work was in the public domain.
As one commenter to Professor Nunberg’s article notes: “Here’s another one: a recurring problem with date of publication is that all volumes of a journal are assigned the date of voulme 1.” That is–the oldest possible date. God knows what they did with the sheet music.
The applications running the Google Books registry will need to make a distinction between works in copyright and works out of copyright. That is a very important date in the settlement agreement. Where do you think that date is going to come from? It is starting to look like it will come from incorrect data–data that makes the publication dates MUCH OLDER than they actually are.
Who’s going to check to see that the millions of copyright dates are correct? Nobody. And it’s yet one more thing for the overburdened copyright owner to sort out as Google continues its “cultural rape.”
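For what it’s worth, the worst of the howlers Professor Nunberg lists are the kind a very crude automated check could flag. What follows is only a minimal sketch under stated assumptions, not anything Google or the registry is known to run: the `Record` type, its field names and the `is_implausible` helper are hypothetical, Stephen King’s birth year (1947) is my addition, and Drucker’s (1909) is inferred from the “four years before” remark in the quote above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical catalog record; the field names are illustrative only."""
    title: str
    listed_year: int                       # publication year shown in the metadata
    person_birth_year: Optional[int] = None  # birth year of the author or subject, if known


def is_implausible(record: Record) -> bool:
    """Return True when the listed year precedes the author's or subject's birth."""
    if record.person_birth_year is None:
        # Nothing to compare against, so this crude check cannot flag it.
        return False
    return record.listed_year < record.person_birth_year


records = [
    Record("Christine (Stephen King)", 1899, 1947),        # birth year: my assumption
    Record("Book on Peter F. Drucker", 1905, 1909),        # birth year: inferred from the quote
    Record("Hypothetical correctly dated title", 1983, 1947),
    Record("Hypothetical title, birth year unknown", 1899),
]

flagged = [r.title for r in records if is_implausible(r)]
print(flagged)
```

A check like this only catches dates that are impossible on their face. It says nothing about a date that is wrong but plausible, which is exactly the kind that decides whether a work lands on the public domain side of the line.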
“Innocent” infringement versus “intentional” infringement creates a rather large difference in how the punishment for the infringement is treated on judgement day–which would be on the later of the date that the plaintiff gets a final non-appealable judgement against Google for copyright infringement–or the author dies penniless. Also likely to foreclose criminal prosecution.
Perhaps this all has something to do with the mysteries of advertising placement? Professor Nunberg says he was told that “[t]he ad placement on Google’s book search right now is often comical, as when a search for Leaves of Grass brings up ads for plant and sod retailers.”
Hmmm. I think we noted that possibility two years ago in my review of “Google and the Myth of Universal Knowledge” where the absurdity of selling advertising in books was well argued by the prescient Jean-Noël Jeanneney, former president of the Bibliothèque Nationale of France:
“Recall Google CEO Eric Schmidt’s statements to the Wall Street Journal on the eve of the Viacom lawsuit: When asked to respond to the idea that “content” has intrinsic value, he said “prove it”. Which has to be one of the dumber, but yet illuminating, remarks to come from a Silicon Valley CEO on the subject of art and culture.
No wonder M. Jeanneney tells us that ‘[t]he visit I received from several Google executives after the beginning of my campaign [against Google Books] didn’t do much to reassure me.’
These statements echo and confirm one of the most important points raised in Google the Myth: M. Jeanneney writes, ‘What pays for the digitization of materials are linked advertisements from companies that have an interest in associating their image with old or recent works likely to promote that image. As a result, books will necessarily be hierarchized in favor of those best suited to satisfy the demands of advertisers, again, chosen according to the principle of the highest bidder [as is Google AdWords]. I wouldn’t want to see—although I’m amused by the thought—the text of Saint-Exupéry’s Le Petit prince accompanied by an ad for a sheep merchant.’”
Right on cue, Professor Nunberg tells us:
“Google’s fine algorithmic hand is also evident in a lot of classifications of recent works….[Google assigns a "]Religion["] tag…to a 2001 biography of Mae West that’s subtitled An Icon in Black and White [and] the Health & Fitness label on a 1962 number of the medievalist journal Speculum….
But even when it gets the [bookseller's standard] categories roughly right, the more important question is why Google would want to use those headings in the first place. People from Google have told me they weren’t included at the publishers’ request, and it may be that someone thought they’d be helpful for ad placement.”
So before you write off “cultural rape” as mere French “yankee go home” hyperbole, think again.
Professor Nunberg sums it up: “[Y]ou need reliable metadata about dates and categories, which is why it’s so disappointing that the book search’s metadata are a train wreck: a mishmash wrapped in a muddle wrapped in a mess.”
Maybe it’s not hyperbole, and maybe it is cultural rape for real. All those statements about what a great idea Google Books is, how it will make “millions” of titles available to the public–maybe that is yet more evidence of Google’s charm offensive to mask what an unmitigated disaster this project is from a cultural, copyright, antitrust and now scholarly perspective.
It’s nice to find an academic being honest about Google’s screw ups. Since Professor Nunberg is at Berkeley, Google might actually listen to him.
But it’s unlikely. The do-over to fix the metadata and cataloging system will take a very long time at vast cost. “Fixing” Google Books is not in Google’s interest and who can make them? As the scanning keeps going every minute of every day, it is becoming increasingly clear that Google thinks of Google Books as Google’s books–the entire intellectual capital of the world.
It shows again what happens when you put the sole retailer in charge of the metadata–the monopsonist buyer has no incentive to act to the benefit of its sellers, particularly when the sellers’ only enforcement mechanism is costly litigation against the monopsonist, particularly when the monopsonist has access to the public financial markets to raise its defense funds. (Which defense costs evidently are deemed not “material” by its public accounting firms and thus the true litigation exposure is not reflected in its public financial filings.)
Swiss they ain’t.
Good story on the Veoh case at CNET (and since it quotes MTP it’s a great story!)
Billboard has the story:
“‘We the undersigned wish to express our support for Lily Allen in her campaign to alert music lovers to the threat that illegal downloading presents to our industry and to condemn the vitriol that has been directed at her in recent days,’ said the statement, which went on to call for a three-strikes law that would result in restrictions to persistent offenders’ bandwidth levels to prevent P2P activity….[Radiohead's] Ed O’Brien told BBC News the meeting was “quite emotional” and said Allen was “extremely brave” to turn up. “She’s taken a lot of flak for what she’s said. What she’s done has been brilliant because she started the process where artists have stood up and said, you know what, there is a consequence to illegal file-sharing,” he said. “In the meeting, we didn’t always agree but we came to an agreement that we thought was good for everyone.”
“Extremely brave” is right. It’s funny how one brave woman can help others find their courage. Or something else they lost. What once was lost now is found by whatever path, and the Brits have passed an important threshold standing together. This development arguably unites the country behind Lord Mandelson.
The statement of members of the Featured Artists Coalition (although not formally issued on behalf of the organization–weird, but I’ll take it) sensibly suggests what will likely be an economic sanction for the third strike (which MTP also favors–see “Return of the Hadopi”) and is a 180 from the capitulationists who seem to have evaporated.
“The current settlement agreement raises significant issues as demonstrated not only by the number of objections, but also by the fact that the objectors include countries, states, non-profit organizations, and prominent authors and law professors.”
Now why was this result so difficult to anticipate? Or maybe it wasn’t–Google is still scanning.
Somebody asked me if I knew The Bald Guy from the music business. Apparently there is someone out there sending out newsletters acting like he is/was/might have been in the music business who said some very nasty and misogynistic things about Lily Allen’s recent public statements against file “sharing” or what we call “bartering” around here. Not to mention gratuitous and homophobic things about Elton John.
I have never run into The Bald Guy on a deal, at a company, never known anyone by that name who actually sold a record, humped a trap case, nurtured an artist’s career, worked a hit record, or worked a stiff record for that matter. No one ever said—wait! I have to check with The Bald Guy. In short, I have never run into anyone by that name in the music business. Neither on the tech side of the house. Or indeed—anywhere.
Now I have heard of someone by that name who is in the email business. That’s not a business I’m very familiar with, so it’s entirely possible that the guy is an email rock star. I actually sat next to a bald guy at the music awards during Canadian Music Week this year. He seemed to be getting bad vibes from Gene Simmons. Of course—that’s kind of like my brother’s in the army, maybe you know him. I got the impression that The Bald Guy is kind of like the Glenn Beck of the email business or something.
The thrust of the email that I heard about from The Bald Guy is that Lily Allen isn’t pretty enough for his standards (???) and that she’s not a good enough singer (given his superstar A&R track record) and that artists don’t know anything about the record business. And the most damning fact in the email business—she hasn’t made it in New York. And then there were some things said about Sir Elton that just aren’t worth repeating.
Now—is there a connection between The Bald Guy and Sir Elton’s letter to Lord Mandelson against file bartering? Maybe, maybe not. But the timing is curious. Which do you think will get more weight from Lord Mandelson? The Bald Guy’s email or the views of one of the greatest songwriters of all time?
Would Sir Elton have written his letter were it not for Lily Allen? Maybe, maybe not. But the timing is curious.
I have learned that the very best person to ask about what to do with a record is the artist. They may not know all the answers, but they usually have some pretty good ideas. And it is, after all—their record.
I wouldn’t ask someone in the email business what they think about selling records, and I wouldn’t expect them to ask me about the email business. I’d be more likely to ask them what they think about giving email away for free, and they’d probably tell me.
The email business must be a tough business. It sounds like it must be like the music business was 30 or 50 years ago, a bunch of wannabe Svengalis (speaking of wannabes) telling girls that they aren’t pretty enough to get a ride in the big black car, or their record wasn’t good enough to be worth paying off 100 jocks to play it instead of something they wanted to play. And then of course, there was that world outside of New York—New Jersey. If you ain’t made it in Jersey, baby….
That’s the problem with the Internet—everyone’s a critic. Even people in the email business. And it is very important to some that they tear down anyone who stands up. Particularly women. We have a name for that.
Us Baldry alums have to stick together.
“French publishers say Google has never negotiated with them about its digitisation scheme even though the millions of works held in US libraries already scanned by the company include thousands of French titles.”
Excellent reporting by Financial Times on the new twist in the Google Books fiasco. Welcome to France, Mr. Schmidt. Here’s a tip: N’entendez aucun mal, ne voyez aucun mal, ne parlez aucun mal (hear no evil, see no evil, speak no evil).
Update: Choruss on the march–or at least on the panel again.
Jim Griffin was on another panel recently expounding on deus ex machina, the vulgar latinate for the Greek (e.g., Euripides) “god in the machine” or the perfectly good English secular version, usually translated as the “ghost in the machine”.
For purveyors of vaporware, using the phrase “ghost in the machine” is ill-advised. Apparently our suspicions were correct–what Choruss has done is get one or more student government organizations to agree to give up part of the student activities fees they sort-of control (known as the “bursar’s bill,” “Bevo bucks,” etc., depending on which school you’re at) and pay these fees to Choruss. According to Griffin, Choruss is then going to give the money back to the school to support music education.
Lest that transaction just slip by–think about that for a second. Let’s see, it comes out of the student’s pocket and goes into the school’s pocket. Or said more accurately, it comes out of the students’ parents’ pockets, goes into the artist/producer/songwriter/publisher/record company pocket, and then Choruss decides to pay it to the school. The problem with that–Choruss is essentially a retailer, and once someone who is essentially a retailer “sells” the tracks to the students they have to pay a wholesale price (usually about 70% of the proceeds) to the rightsholders–songwriters, music publishers, artists, unions and record companies.
If Best Buy were to sell some CDs and decide not to pay the record company and give the money to the “Minneapolis Youth Orchestra” I think we would all find that odd and probably illegal. Unless, of course, all the rightsholders consented to it. So that means Choruss would have to identify the rightsholders to get permission (which they can’t, at least not yet).
When asked if Choruss would take a fee, Griffin said it would not. When asked what fee structure Choruss uses for its financial projections, Griffin said that Choruss doesn’t use financial projections.
I’m sure A&R executives everywhere will be thrilled to know that someone somewhere someplace in the music business doesn’t have to run a P&L every time they want to buy a paper clip. Of course, if you are the ghost in the machine, maybe you don’t have to live under the same rules that everyone else does. But–what about the ghost in the machine? The ghost in the machine is supposed to be able to account to artists/producers/songwriters/publishers/record companies for their share of the student activities fees that Choruss is to collect. Two problems–the ghost doesn’t have a license and the ghost hasn’t figured out how to account to anyone yet.
Let me offer another explanation for this phantasmagoria. Griffin also said that Choruss would be “going independent” soon. We used to call that running out of money, but let’s say it is “going independent” and will be taking in new money for whatever in the world this ghost is supposed to be. It is highly unlikely, particularly in this environment, that a VC would allow a strategic partner (such as Griffin’s putative investor, Warner Music Group) to set the valuation for the company. That means that in order to survive, Griffin must have some kind of validating commercial deal, preferably one that throws off what is lovingly referred to as “top line revenue” in the trade. Hence his deals with these six campuses (is that different than six universities? Time will tell.) The new money VC will then use these benchmarks to set a new valuation in what is sure to be an absolute knockdown dragout with WMG over liquidation preference, dilution, etc.
But of course, Griffin doesn’t do financial projections. Oh, no.
So–that means that the ghost is going to (A) have to get the deals to survive and (B) figure out what to do with the revenue. So in a flash of ghostly brilliance–let’s give the money away!! SMART!! The only problems with that are that the money is either (1) being paid for something other than rights and indemnity which is probably not what the students think they are getting, or (2) is being collected in return for rights, but Choruss can’t distinguish one song or recording from another so they don’t know how to divide it up, or (3) Choruss is collecting for a covenant not to sue (see Bennett Lincoff) and so doesn’t have to share the money with anyone but does not want to deal with the political fallout from artists/producers/songwriters/publishers/record companies who figure that out, which is unpopular among rights holders, not to mention Internet savants.
But wait–there’s more! Given Griffin’s public statement to The Register, it sounds like someone is going to be collecting millions of dollars. That’s definitely worth suing over.
I remember asking a guy from BMG why the company was still in the Napster lawsuit after Thomas Middelhoff’s investment. He told me, make no mistake. That investment came from Bertelsmann, not BMG. Griffin may not be the only one looking for a deus ex machina in the third act.
To my knowledge–Griffin has no deals with any rights holders aside (possibly) from some elements within WMG and even then that will probably not hold if I had to guess. So what is he collecting money for, exactly?
Or to ask the musical question: Who you gonna call?
Now that it appears that Google has been dealt a major setback in the Google Books case, questions are raised about how effectively Google has been handling its—what would you call it? “massive”?–exposure to copyright infringement litigation from a financial accounting and Sarbanes-Oxley perspective.
It would not be unusual to see personal liability for CEOs and CFOs in SEC actions relating to failing to adequately reserve for other liabilities, such as environmental liabilities. I was discussing this issue with a friend of mine who raised some very interesting similarities between these situations from a securities law and financial accounting perspective.
Google’s 10K handles it this way:
“We have also had copyright claims filed against us by companies alleging that features of certain of our products and services, including Google Web Search, Google News, Google Video, Google Image Search, Google Book Search, and YouTube, infringe their rights. In the U.S. we announced a settlement with the Authors Guild and the Association of American Publishers; however, this class action settlement is subject to approval by the U.S. District Court for the Southern District of New York, and we are subject to additional claims with respect to Google Book Search in other parts of the world. Adverse results in these lawsuits may include awards of substantial monetary damages, costly royalty or licensing agreements, or orders preventing us from offering certain functionalities, and may also result in a change in our business practices, which could result in a loss of revenue for us or otherwise harm our business. In addition, any time one of our products or services links to or hosts material in which others allegedly own copyrights, we face the risk of being sued for copyright infringement or related claims. Because these products and services comprise the majority of our products and services, the risk of harm from such lawsuits could be substantial.
We have also had patent lawsuits filed against us alleging that certain of our products and services, including Google Web Search, Google AdWords, Google AdSense, and Google Chrome, infringe patents held by others. In addition, the number of demands for license fees and the dollar amounts associated with each request continue to increase. Adverse results in these lawsuits, or our decision to license patents based upon these demands, may result in substantial costs and, in the case of adverse litigation rulings, could prevent us from offering certain features, functionalities, products, or services, which could result in a loss of revenue for us or otherwise harm our business.
Although the results of litigation and claims cannot be predicted with certainty, we believe that the final outcome of the matters discussed above will not have a material adverse effect on our business, consolidated financial position, results of operations or cash flows.”
So let’s say one or the other of the Viacom case or the class action against Google for YouTube resulted in a $1 billion judgement. Just blow that off as nonmaterial? Google discloses that the company maintains an allowance for doubtful accounts, tax liabilities, a couple other things. But no express statement that they reserve for the infringement cases.
If you can find any place in Google’s financials that Google actually mentions a liability account accruing for potential damages arising out of the numerous copyright cases against the company, I’d love to see it. This would be something in line with FASB Statement 5.
And so it is well to say again in light of three governments objecting to Google Books for what are obvious copyright issues–where is the board?
Where is the Securities and Exchange Commission?