Tuesday, November 30, 2010

Some Amazon self-service publishers sell Project Gutenberg's free books

SEE UPDATE at the end.
Washington Post Technology writer Rob Pegoraro reports a complaint that "Amazon charges Kindle users for free Project Gutenberg e-books" -- meaning that Amazon is allowing suppliers to sell, through Amazon's self-service publishing, versions of public domain books that are apparently derived from Project Gutenberg editions and nearly identical to them.

Note that it's considered okay to spiff up basic public domain books (anyone may reuse them, which is why they're called that) and then sell them for the value of what you've done to make them comfortably readable on an e-reader and, we hope, typo-free.

  Public domain books make up the bulk of the 20,000+ free Kindle books that Amazon makes downloadable to Kindles at any time, but apparently some suppliers are uploading to the self-service publishing area of Amazon (for the 70% revenue offered self-publishers of Kindle-formatted books) copies of e-books that Project Gutenberg had earlier created from scanned images.  Even this would be legal, as, again, the basic text, once there are no Project title pages, is in the public domain.

  Pegoraro explains:
' The titles in question aren't just public-domain books that have long been freely available at such sites as Project Gutenberg.  They appear to be the exact Gutenberg files, save only for minor formatting adjustments and the removal of that volunteer-run site's license information. '

 You'd think that the work another organization put into converting image-based pages into proofed text, to make them more easily readable on an e-reader, should not simply be taken for re-selling while the original organization is offering that work for free.

Pegoraro expands on this:
' Gutenberg contributor Linda M. Everhart complained in an e-mail in late October that Amazon was selling a title she'd contributed to Gutenberg, Arthur Robert Harding's 1906 opus "Fox Trapping," for $4.

  "They took the text version, stripped off the headers and footer containing the license, re-wrapped the sentences, and made the chapter titles bold," wrote Everhart, a Blairstown, Mo., trapper. She added that "their version had all my caption lines, in exactly the same place where I had put them." '

  Everhart identified other "instances of Kindle cloning" and Pegoraro writes "These titles appear to be sold with Amazon's standard digital-rights-management restrictions, a limit absent from Gutenberg downloads."  (However, most of the public domain books I have, for free, from Amazon, don't have DRM restrictions on how many Kindles can share the book.)

  Everhart describes the kind of work she and others do to make the free Project Gutenberg books available, which involves downloading a scan of the book's pages from the Internet Archive's collection, running it through optical-character-recognition software and then correcting mistakes and stripping out extraneous data "before formatting the text to Gutenberg's strict guidelines.  Next comes converting that text file into an HTML version with linked images that can finally be uploaded to Gutenberg."

  Pegoraro adds, "Apparently it's less work to convert that output to a Kindle Store download..."

  Some don't think so, but there are definite indications that too many Amazon digital-text-publishing uploaders do not consider proofreading important.

  But, again, all this is permitted under the Gutenberg license.

  Project Gutenberg Literary Archive Foundation chief executive Greg Newby wrote in an email, "Is this legal? Yes. Is it ethical? I don't think it is."

  Newby added that although many other booksellers do this, Amazon is "the worst offender" because of the number of suppliers for Amazon.

See the article for Amazon's response so far.

Newby has suggested to Amazon that it could directly offer Gutenberg titles as no-charge, DRM-free downloads -- something Apple did in its iBooks store.  He doesn't mention what the terms were though, on Gutenberg's part.

Pegoraro sees a simple solution:
' Search the Gutenberg site for a title you're interested in buying for your Kindle and download it from there if it's available.
  Not only does that site usually offer books in Kindle formats, you can even download them directly to a Kindle. '

The link he gives there is to this blog's article on how to browse/search a Project Gutenberg "Magic Catalog" on your Kindle for one of the 30,000+ books available there and then click to download the book direct to the Kindle.

  At this blog's Free Kindle Books page, I also include MobileRead Forum books, another non-Amazon e-book source, which displays its public-domain books sorted by Amazon-readable 'PRC' (Mobi) format and by most recent first.  These tend to be even better formatted, with linked table-of-contents page when applicable, and mostly free.

 And then there are the 2 million+ free books at Internet Archive, downloadable directly to the Kindle also.
  I explain that at the Internet Archive article.

  One thing needs to be mentioned about getting or not getting public domain books from Amazon (there are over 20,000 free e-books in the Kindle store -- see the footer of any blog article in this blog to find them and the temporarily free contemporary Kindle books as well).
You can "sync" the Kindle editions with any reading you do on apps for iPhone, iPad, iPod touch, Blackberry, Android smartphones, PC, Mac, etc., and resume from where you were on the other device.

  That syncing can't be done for ebooks that Amazon doesn't have on its servers.
  Also, highlighting and notes made on non-Amazon books aren't backed up to the Amazon servers (as Amazon doesn't have the books) and therefore aren't viewable and copyable at the customer's private, password-protected website at Amazon.  This might be important to those who have a student's approach to books.

  Amazon probably should have a rule against self-service publishing (in Kindle format) of a public domain book that has been carefully converted to text by another entity, proofread, formatted in HTML and already released in Kindle-readable format at no charge -- but then Amazon would have to spend time checking all the uploaded public domain books against other editions of those books.

  Still, they could probably stipulate that there is no publisher payment for books just taken from Project Gutenberg and the Project's identifier statements stripped.

UPDATE - If you're reading this at the web edition of the blog, please read the first two comments to see the licensing language that Project Gutenberg has, which says that if anyone modifies a Project Gutenberg book and does charge for it, the Project doesn't claim a right to prevent someone from "copying, distributing, performing, displaying, or creating derivative works based on the work as long as all references to Project Gutenberg are removed."


Check often: Temporarily-free late-listed non-classics or recently published ones
  Guide to finding Free Kindle books and Sources.  Top 100 free bestsellers.
UK-Only: recently published non-classics, bestsellers, or highest-rated ones
    Also, UK customers should see the UK store's Top 100 free bestsellers.



  1. I have been considering this issue a bit recently, and appreciate your mention of it. There are a few PG books that I am considering doing some conversion of, as the auto-generated version on gutenberg.org doesn't have the images which I feel are rather important to understanding the content. I would also love to be able to sync the books with my ipod. I'm considering submitting to the kindle store, but I'm not sure of the best way to go about it. I know that I would not remove PG's information, and was thinking about putting a link to the PG page for the book in the product description (along with an actual description, and an explanation of what improvements I've made). I personally think that 50 cents would be a fair price for the improvements I'm making (linked ToC, images), but the lowest price I *can* set is 99 cents.

    I'm leaning toward a position that it's okay to make improvements to a PG file and then release it on kindle (for a low price, 99 cents is the most I'd consider for that), assuming that you also provide a link to the original and explain which part you did. However, I also rationalize that it's okay because I'm also a proofer on pgdp.net...

  2. Erin,
    What you say makes sense. But the PG License says specifically that you can do anything you want with it, modify it, etc., use it for derivative works -- only that if you use the PG trademark (or leave it in) you can't charge anything for copies of it.

    However, there are two loopholes I see.

    1. "If an individual work is in the public domain in the United States and you are located in the United States, we do not claim a right to prevent you from copying, distributing, performing, displaying or creating derivative works based on the work AS LONG AS ALL REFERENCES TO PROJECT GUTENBERG ARE REMOVED."

    2. "Gutenberg is a registered trademark, and may not be used if you charge for the eBooks, UNLESS YOU RECEIVE SPECIFIC PERMISSION."

    As a proofer on pgdp.net, you probably can get permission to add the illustrations.

    But of added interest to me is that wording, given the Washington Post quote that it's "not ethical" to remove the Project Gutenberg references and then charge for it.

    That would seem to be covered as something they can't prevent. Maybe they should add language saying that they feel it would be unethical to do this.

    Good luck on this interesting-sounding project. :-)

  3. Erin is right. Some books absolutely need the images that came with the original printed versions. A few months ago I read Jules Verne's Mysterious Island on my iPod touch without the maps and sorely felt the lack. Verne didn't intend his island to be that mysterious. The Riddle of the Sands is the same. Without the maps, readers miss much of the mystery that underlies the sea-faring story.

    Also, unless the rules have changed recently, PG's policies actually discourage giving them credit. Give no credit, and you can do what you want with their files. Give credit, and you have to give credit their way, which can be a nuisance. That may be a factor with some of these Kindle books.

    There's a range of approaches to using public domain texts. Free copies are a must. I've downloaded about a dozen for my Kindle. The key there is to search Amazon (by title or author) with the sort by price from low to high, which puts free titles at the top. And getting the books from them means their bookmarking, synching and note taking work between Kindle apps.

    Erin's suggestion for 99 cent versions also makes sense, but there's the difficulty of separating versions where someone has actually added value by correcting typos, adding images, and improving the layout, from the quick buck artist who simply slaps their label on a PG file. I'm not sure how we can deal with that. Perhaps certain brand names will develop that mean quality. Amazon could help, but I've never gotten the impression they care about quality. They're too obsessed with bigness.

    And finally, there are the much improved editions at higher prices. That was much of what my little publishing company (Inkling Books) did for print editions up until recently. We (meaning me) worked hard to add as much value as possible. There is a need for what on Amazon would have to be $2.99-9.99 Kindle editions.

    Old books often include mysterious references that modern readers don't understand. In one G. K. Chesterton book I published (probably Eugenics and Other Evils), there was a reference to two men with rather ordinary names. Only after much work did I discover that one man killed the other in one of the most celebrated murders of early 20th century America. The man who was murdered designed Grand Central Station and had drugged and raped the other's wife when she was a teen. Without knowing that once well-known event, Chesterton's remarks make little sense.

    I also added additional material to these books. There are numerous print editions of an 1894 book, Across Asia on a Bicycle, of varying quality, but mine sells better than the rest because I included pictures and sketches from the original, cleaned up with Photoshop. I added notes about people and places. And I added two additional chapters the book's authors published elsewhere. It sells well because it offers more at a price that's less than most of its competitors. But it's still not paid back the time it took.

    It's the difficulty of adding value to an ebook and still getting a living wage that has me worried. I'm planning to create Kindle and iBookstore versions of my existing titles, but I'm not sure if I will do more than that. My Eugenics and Other Evils sells well because Michael Crichton specifically recommended it in his last book. But it isn't that easy to get the recommendation of a celebrated writer, so it's hard to make a book on which I've spent many hours pay. It's also hard to escape the noise generated by free and 99 cent versions that took almost no time to create. They can create 100 editions in the time it takes me to create one.

    Amazon could help by creating some "seal of approval" or ranking scale that would be attached to the better of these books. But, like I said, I'm not sure Amazon cares about that sort of thing. They're more interested in being the "biggest bookstore in the world" than in being the best.

    --Michael W. Perry, Seattle

  4. Michael,
    Thanks for all that. I've asked people to check the comments too, so I hope more will read it.

    > "Verne didn't intend his island to be that
    > mysterious."


    Yes, in my response to Erin, I agreed that what she wants to do makes sense and I also quoted the PG language that shows they sort of demand removal of the references to Project Gutenberg if you have reason to charge for the modifications.

    I'd think they should accept a credit when building ON the free copy w/o indicting PG for there being a charge on it when modified by someone else.

    > "The key there is to search Amazon (by title
    > or author) with the sort by price from low to
    > high, which puts free titles at the top."

    Please note the footer links to only free books by varying sorts. In the ongoing Free Kindle Books page noted there, I also link to the Free Classics.

    But if looking for only one author, that is then very nicely restricted.

    > And getting the books from them means their
    > bookmarking, synching and note taking work
    > between Kindle apps.

    Yes, I put that into a colored blockquote.
    That part is important to me.

    The only way to separate by quality is to get the sample. That is usually quite a good indicator.

    Amazon's design of how the Kindle functions is to me an indicator they do care about quality. And they care about the quality of Customer Support because it pays for them to do so, which too many companies aren't smart enough to know.

    Doing review work of the book quality in a flood of Kindle books ?? Staffing reality ?

    Industry news analysts are themselves into bigness and eager to find the next killer of the last hot item. Size can mean survival in that world. But again I'm not convinced that Amazon doesn't care at all about quality.

    However, there is the cost of that to be considered. There is no quality if there is no company.

    Re your improvements, *I* feel you should charge at least $4.99 for those and you would probably do well if you describe the additions.
    People's star ratings will also help.

    Fascinating re the old, hair-raising dramas meaning nothing to most of us today...

    I may get that 'Across Asia on a Bicycle'.
    With all that care put into it, DON'T sell it for less than the competitor editions. It's that simple. Have more faith.

    I empathize with having to compete with $0-$.99 ebooks. Again, charge more for the enhanced versions. People do recognize that kind of work.

  5. There are other, potentially more serious, problems than just ethics with taking stuff from places like Gutenberg and charging for it.
    The works held by PG are in the public domain, but not always in all countries (and the site states so explicitly).
    However quite a few works end up being resold on Amazon by people who aren't careful about copyright regulations, and thus the potential crops up for works to be sold by parties not licensed to sell them in at least some domains.
    For example a work on PG might be out of copyright in Australia but not the EU. Someone from Canada pulls it and puts it up for sale globally through Amazon US. There's now an international judicial problem involving the laws of 4 countries as well as international treaties.
    For that reason alone Amazon would probably do best to remove from sale all content pulled from PG (and like sites) and clearly put in their TOS that they won't allow selling works to which you don't own the copyright which includes any public domain works.

  6. It's not clear why Gutenberg, Feedbooks, etc. could not publish most of their offerings on amazon themselves, and even make some money doing so. Because of synching, etc., I'd prefer to get PD from amazon, but the quality is just so often bad, I rarely bother even checking samples and get it from feedbooks if I can.

    I wish there were some way to rate format quality (as opposed to content). Even a subjective rating could be useful.

  7. They need to have traffic to their own sites to justify their own expenses and they would also give support there and be aware of any problems, directly.

    It could be that Amazon, as a for-profit company, just doesn't offer storage and support (including backups for annotations and all the syncing that people would want done as part of the included deal) for something that has to be free.

    They do link people, on the Amazon site, to Project Gutenberg and other free sites.

    I like your idea that the customer reviews on e-books should include ratings for formatting, layout, and accuracy. Have you written to their feedback areas about that? I think that's a pretty important one.

  8. j,
    That's the same scenario as for 1984 but that was with MobileReference.

    Since then, they're supposed to have double checks on where there are geo rights problems. Don't know how effective it is but ...

    (Sorry your posting was stuck in one of the pending queues w/o my realizing it...)


