Wednesday, September 9, 2009

Amazon Winnowing Public Domain Duplicates

Amazon announced today at its Digital Text Platform forums:
' In an earlier posting, we said that we are working on improving the customer experience in the Kindle store for public domain titles like “Pride and Prejudice.”  Kindle customers often find it difficult to choose from the many different versions of the most popular titles.  As a first step, we stopped accepting additional public domain titles.  Later this week, we will be removing many of the duplicate copies of the best selling public domain titles.

 Some of you may disagree with the choice of titles we will be removing.  If you feel that your version of a public domain title has significantly more value to customers compared to the ones we have chosen to keep for sale, please contact us at and we will review your title.  Thanks for your patience as we continue to improve the Kindle Store.

The Kindle Team '

Scott Douglas drew our attention to this. He also worried about 'big brother' possibilities and interprets the message as saying only one edition will be allowed, although Amazon says that if you prefer your version "compared to the oneS we have chosen to keep..." they will review the title if the file-uploader contacts them. They are talking here to those who want to upload their own versions of public domain books to Amazon.

I wrote to Scott on his post because I had first seen, courtesy of Kindlezen on Twitter, that Librarian And Information Science News was reporting on Scott's post. A commenter there (a UK observer) has a concern that Amazon might consider the public domain books from other sources on our Kindles (many of us have lots of them) to be possibly bit-torrent material.  But, really, Amazon has shown no interest in all the things we can put on our Kindles, much of which we learned from their own sanctioned Amazon forums, thanks to customers helping one another.

To Scott, I repeated what I wrote to the Librarian news site, thinking I was responding to him there.  Here is the text of what I wrote, which Scott and the Librarian news site allowed to be posted right away.  I'm reposting it here so that Kindleworld blog readers can read the info too, with links to what I mentioned.
' Let's not get carried away.
The Project Gutenberg books are directly downloadable to the Kindle and have been for some time. That's 30,000 books.  The steps for doing this are on my site and others'.

Those who Kindle can get books direct to the Kindle from, (use ( to actually download the books to Kindle) and even now owned by Barnes & Noble.

Amazon will not - after '1984gate' - be deleting any books you've purchased which were uploaded to Amazon... mainly because they'd lose their entire Kindle crowd if they did that again and they know it.
  And they certainly won't be deleting material you got elsewhere.

On the Amazon forums there is a popular and humongous thread of about 1200+ posts from which many learn about how to get books from everywhere else, and how to quickly convert them, as needed, for the Kindle.

And I've written a piece on how to quickly convert any of the million free Google books so you can read them on the Kindle.

As for the public domain books, we can get them from just about anywhere.  What customers have complained about is the never-ending proliferation of public domain books on Amazon, some of which have no table of contents, are badly formatted, or have all kinds of errors, because Amazon had let everything submitted through the digital-publishing upload area go up within a day.

They are now, from what I read on Amazon forums, doing 5-day reviews of uploaded material.  Harry Potter books were uploaded almost daily - but the author refuses to make them available for the Kindle and those are then obviously illegal uploads.  Amazon customers reported lots of occurrences of such things.

 If Amazon will have only one version of a public domain book, maybe they'll choose only those with working Table of Contents hyperlinks, correct formatting, etc.  My guess is they'd have two or three Amazon-chosen ones for the free-option.  IF it were only one* and the best in their minds, for free, then that's their prerogative and we have less work to do when trying to get a book [from Amazon].  I had to download and check out samples for about 12 versions of the Devil's Dictionary and most were missing essential things like a working Table of Contents.

As it turns out, the best one I found came from an individual posting at Mobileread forums and was free. So that's what I'm using.

Remember, we can read MOBI or PRC files on the Kindle, and rights-unprotected documents will be converted by Amazon (for free if you send it to [you] and then move the converted copy to the Kindle yourself).  Many of us just run it through a free converter ourselves.

- Andrys

* As you can see -- in Amazon Kindle Team's note to those who want to upload, or have uploaded in the past, a new version of a public domain book, they mention "ones" to which the uploader's version is felt, by the uploader, to be superior.  So, I don't believe they're thinking of limiting public domain offerings of a book to only one edition but are choosing just a few from the many duplicates they already have and are now finally reviewing for quality of layout, etc. as requested by Amazon customers for some time.

(The above applies to the Kindle DX as well as Kindles 1 and 2.)


  1. I recently downloaded two Amazon free Sherlock Holmes story collections. Neither had a linked Table of Contents, which is particularly necessary for a collection (as compared to a novel). So, there is no question that even Amazon's own PD material needs much work.

    Having said that, there are far too many versions of the same PD books (and Pride and Prejudice is as good an example as any). However, I do hope that they will have several DIFFERENT versions of translated works (remember that not all PD material was written in English).

    I would also like to see Amazon do a MUCH better job of proofing (at least preliminarily) Kindle books, as so many have MANY typos and significant formatting errors. I just finished 2 James Clavell novels that were replete with typos, all of which could have been fixed by simply spellchecking the OCR'd work before uploading it.

  2. I sure agree, Richard, though I don't think they have the resources to proof-read except to just not accept the most egregiously typo-ridden ones.

  3. The point of Amazon's announced undertaking happens also to relate to what lies at the heart of librarianship: "to every reader his [or her] book; to every book its reader." A necessary condition for *that* to happen is the ability to *uniquely* identify every edition of every work. *That*, in its turn, is the problem with the public domain books on Amazon. Not only are all of the links to every free p.d. title on Amazon virtually identical; one also has no way of knowing whether each links to a different edition of a work, or whether they comprise multiple links to a single electronic file. Which is clearly really really bad. If a title is free, all one is wasting is time. But in some cases, the only distinguishing feature *is* the price. One copy of 'Pride & Prejudice' is free, one is 99 cents, one is 2.99. In such a case, then, one wonders about quality control -- is *that* what one is paying for with the 2.99 copy? Or is one just giving a couple bucks to some guy in a garage?

    The 'unique identification' problem is in fact endemic to Amazon's site, whether one is looking for a copy of 'Greyfax Grimwald', or a griddle. It's just less evident in those other areas, because there aren't as many griddles, or editions of 'Greyfax Grimwald'. The bibliographic control work always gets serious, though, when one gets to 'The Bible', 'Moby Dick', anything by Shakespeare, etc.

    I am in favor of Amazon fixing the 'multiple superficially indistinguishable public-domain editions' problem, but I'm worried about it, too. I suspect that Amazon may ultimately need a librarian experienced with electronic resources and bibliographic control to help them -- and that Amazon is not enormously likely to recognize how this would streamline the process, because they're already spending money to fix something that they can't directly connect to the bottom line.

    Except that any time one spends messing around with free titles on Amazon's site is time one spends on Amazon's site *not* spending any money...

  4. Anonymous,
    You make a LOT of good points!

  5. I just bought my first Kindle and my biggest frustration with the device is what you describe in this nearly year-old post: Public Domain chaos on the Kindle Store. I've quickly discovered MobileRead and Feedbooks thanks to this wonderful site and now exclusively use the guides to download books of much higher quality than the Kindle store.

    Since you seem to follow all things Kindle, I have some questions for you:

    1) Did Amazon give up on weeding out books with errors, no TOC, etc.? Or are they committed to improving the Kindle Store experience for public domain books?

    2) Let's say you get a book from Feedbooks. If you want to sync through Whispernet, is there a free way to do that?

    3) If there is no free way - does e-mailing a book to your Kindle address give you essentially the same experience with the book as if you had purchased it through the Kindle store, in terms of the seamless sync that Whispernet offers?

    4) Is there any way to filter out books containing no table of contents when doing a search in the Kindle store? I have no interest in getting story collections that have no table of contents (e.g. Sherlock Holmes).

    After some amount of Googling I'm not yet sure I've got exact answers to these questions.


    Joe G.

  6. Joe G.,

    Glad you found the other places here.

    Re public domain books and chaos:
    I don't know that Amazon gave up on it, but it's probably a lower priority -- they already took a lot of flak for no longer carrying ones which were even worse problems.

    You know what their market is right now and I don't think heavy emphasis is put on free books, but they showed some concern over it.

    We can get free samples, so that's what I have done. If I don't like the formatting or the lack of a TOC, I don't go for the full book. I've found some excellent things though.

    Feedbooks is totally accessible by your kindle's web browser.
    And you can download from it.

    Check out the right-hand column of this website to see a lot of questions answered.

    But I think I give links to Feedbooks and " " (the latter is the direct one to download books to your Kindle from ) in the Free Kindle Books and Low Cost page at:

    I also link you to their help pages for Kindle, so take some time to go through it -- there is a lot.

    The book files are in MOBI or KINDLE or PRC format which you can choose and you can just click on them to download them as they are Kindle-compatible and usually not large. There's no cost for the web browser.

    As I said, each of them has a help page for Kindle users and I tend to link you to those pages.

    In Guides, Tips, and Tutorials section at the right column, you'll see
    . "How to use the 24/7 web browser" and

    . "Websites for Mobile Access"

    For a quick way to get to the 2nd item, which leads you to How to use the web browser, go to

    But the reason I have the right hand column is so I don't have to type the help stuff a lot, so although it's confusing and takes too much time, give it a look when you have time. You just straightaway download the books from those two places with the Kindle web browser.

    For other books you download to your computer from other stores:
    Emailing a non-Amazon but Kindle-compatible book to your Kindle device costs 15c per megabyte of a file but it's seamless, yes.
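
For anyone who wants to estimate that fee before emailing a file, here is a minimal sketch in Python. It assumes the fee is simply the 15c-per-megabyte rate mentioned above multiplied by the file size, with a megabyte taken as 1024 x 1024 bytes. The function name is made up for illustration, and Amazon's actual rounding and billing rules are not covered here, so treat the result as a rough estimate only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate quoted in the comment above: 15 cents per megabyte.
FEE_PER_MB = Decimal("0.15")
# Assumption: one megabyte = 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def estimate_email_fee(file_size_bytes: int) -> Decimal:
    """Rough estimate, in dollars, of the fee for emailing one file.

    Hypothetical helper: multiplies the size in megabytes by the
    per-megabyte rate and rounds to the nearest cent.
    """
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    size_mb = Decimal(file_size_bytes) / BYTES_PER_MB
    return (size_mb * FEE_PER_MB).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 2 MB book file comes to 0.30 under these assumptions.
print(estimate_email_fee(2 * 1024 * 1024))
```

Most plain-text MOBI or PRC books are well under a megabyte, so the estimate for a typical novel would be only a few cents.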

    Filtering out books with no TOC. Just get the sample with your Kindle and look at it. I'm sorry but there's no other way.

    If you take your time, some of them tell you when they have a TOC, in the store descriptions, and that helps.

    I'm with you when there are collections (especially of poems or short stories and there is no TOC!)

    I would tend to choose Feedbooks and manybooks more, but when you get a public domain book from Amazon they keep your annotations for you (unless you disable annotations backup) and those go on a password-protected private website.

    Search on Clippings or on Annotations
    at top right.

    They stay with your book and if you delete it, Amazon has a copy in your archive since you bought it from them and you can download it again at any time and your notes and highlighting will still be with the book.

    Not so if you get the book elsewhere.

    Whatever, go slowly -- that's maybe what getting back to books is best for, to relax. I know that's easier said than done.

  7. Andrys - Thanks for your replies. You answered two of my questions, but I see my other two questions were not given enough context or worded carefully enough.

    Let me start by saying I have already read through a large chunk of your material (it's great, thanks!). I am already using the web on Kindle and downloading/reading books from Feedbooks - my preference is to do so through their guide (which I downloaded to my Kindle) or through a link at the end of a book. I have even found what I think is a novel method for getting web clippings onto my Kindle, involving Google Reader (I may post about that soon on my blog).

    But here's the scenario:

    I bought a Kindle 2 and a Kindle DX, and I also have a Blackberry. I have already sometimes read a book on more than one device, and one book I actually read parts of on all three.

    I've given up on getting free or low cost books from the Kindle Store because the ones I tried to buy all had over 10 versions with incomplete descriptions and reading through sample after sample soon proved pretty tedious.

    But one nice thing about the Kindle is the annotation features. I used highlights initially to put in my own TOC as I was reading through a book, and then that TOC got synced with my other devices, as well as bookmarks and where I was last reading.

    I still would like to use annotations, but as you point out elsewhere in the site these annotations won't get synced unless you purchase from the Kindle Store. So while FeedBooks does a terrific job of quality control, TOC, etc., I don't get the benefits of Whispersync if I get something from them.

    So here again are the two questions:

    2) Let's say you get a book from Feedbooks. If you want to sync through Whispernet, is there a free way to do that?

    3) If there is no free way - does e-mailing a book to your Kindle address give you essentially the same experience with the book as if you had purchased it through the Kindle store, in terms of the seamless sync that Whispernet offers?

    I ask these questions not for the ability to get the books - which is easy. But for syncing annotations, bookmarks, reading place, etc.

    Thanks again,

    Joe G.

  8. Joe G.,

    Oh, I see now. SYNC'g. Sorry I missed that.
    That was my fault.

    As far as I know, Amazon doesn't sync books that were not purchased from Amazon and are therefore not kept on their servers along with the secondary log file that goes with each book (on your Kindle and on their servers) to record your last page read, etc.

    In other words, they take care of keeping all logs for books purchased from them (or received for free), as they always have the full book file and the personal secondary files placed in your server library.

    Web clippings - there are so many ways to get them on your Kindle. Instapaper is one. Readability is another one.

    But you could use a method like the one I mention at,
    recommended to me by someone I credit, for moving anything to the Kindle without needing to use a USB and it involves Google docs.

    Will be interested to read what you're doing with Google Reader (which I find oddly unreliable for picking up updated-blog entries relative to other readers which I'm currently trying out which acknowledge later update of posts and will put them in the right updated order).

    I don't think there is any way to sync a book -- as far as where you last were when you read that book on another device -- unless it comes from the Amazon servers, which keep the data up to date on all your devices.

    I don't even know of a priced way to do that, though you ask if there's a free way.

    There are, however, free ways to just get those on all devices, and I've seen advice for getting .mobi or .prc documents onto the iPhone, but since I don't have an iPhone for testing I've never paid much attention to it.

    But google:
    mobi iphone kindle

    and you might come across it...

    Emailing a book to your Kindle address merely plops it on the Kindle for reading.

    But it'll be ignored during check & sync's. The menu shows that capability disabled.

    I can see why they wouldn't want to help keep sync'd logs of several hundred books not received through them since they'd have to duplicate all that on their own servers and they wouldn't have the location numbers needed for that as they don't have that particular e-book.

    I'm afraid sync'g between non-Amazon books is out.
    BUT if there's a way, someone will know at - the Kindle forums.

    There are some very aggressive experimenters there who enjoy sharing what they find or helping in solving an unusual problem.

    Good luck on it. Let us know what you find.

  9. Ok - so to recap the answers for handy reference:

    1) Did Amazon give up on weeding out books with errors, no TOC, etc.? Or are they committed to improving the Kindle Store experience for public domain books? IN BETWEEN. Amazon continues to try to weed out the worst books but is devoting few resources to the effort.

    2) Let's say you get a book from Feedbooks. If you want to sync through Whispernet, is there a free way to do that? NO.

    3) If there is no free way - does e-mailing a book to your Kindle address give you essentially the same experience with the book as if you had purchased it through the Kindle store, in terms of the seamless sync that Whispernet offers? NO. E-mailing just gets the book onto a single Kindle.

    4) Is there any way to filter out books containing no table of contents when doing a search in the Kindle store? I have no interest in getting story collections that have no table of contents (e.g. Sherlock Holmes). NO.

    To summarize: If you want sync from Whispersync, you must obtain books through the Amazon Kindle Store. However, if you want public domain books that have been carefully screened for quality, you'll need to obtain them from places other than the Kindle Store, and lose the sync benefits of Whispersync.

  10. Joe G.
    1. SEEMS to be devoting fewer resources. I cannot know that they are.

    2. and 3.
    I also said to ask at to see if others have found a way.

    I rarely give definitive NO's because I don't know all. But from what I do know, it's not possible in the normal way to sync books that are not even kept on the Amazon servers since they were not from Amazon and would have different paging mechanisms to even tell one where one is in the book.

    4. I did actually write that some who are offering these books do give info about a book having active Table of Contents in the description, since it is a much-sought feature and a bragging right.

    Getting samples is no problem for me but I realize it is, for you. If you blog this from your own perspective, the brief definitive answers you like are no problem if they're not attributed to me :-)

    I just don't really want blanket NOs attributed to me after I write in detail. But I can see you are organized in your pursuit of info for others :-)


