Sunday, November 7, 2010

Tips: PDF Scissors tool - and reminder re non-Amazon free books - Update

I couldn't be here the last day and a half, and I hope any tips from this blog's comment areas and from my visits around the Kindle world will help.  I'll be doing more of these with one or two tips at a time so people can focus on those and have time to work with them, rather than tossing a bunch of them in one blog article which might be put aside until there's time and then forgotten.

TIP:  PDF SCISSORS
This is a new free tool to help enlarge the essential portion of a PDF when the words are too small to read and to divide the pages to ease navigation when viewing the results in Landscape mode.

PDFs with content too small to read
I JUST saw a comment in a Teleread.com comment area offering a new free tool (on a good site) to help with PDFs that are made up mostly of image pages and therefore cannot be converted to normal Kindle text format, since there are no text fonts to enlarge while keeping within the screen frame.

This would be for the most adventurous among you, as it is new and its author feels people may find bugs, but he wrote it so he could get rid of margins in his PDFs and see the words on the pages better as they'd be larger, whether using a text-based or image-based page (images of a book's pages).  There are other tools that do this kind of thing, but using them requires comfort with Perl, Python or other scripts and files.

  Remember that mostly-text PDFs can be converted by Amazon (or by yourself using a free tool I've written about earlier) to text using standard-size fonts, which are of course larger, with the text lines re-flowed to fit a small e-reader screen.

Image-based PDF pages
  With PDFs whose pages are merely images of a physical book's pages, however, rather than actual text, that's not possible.

  Images enlarged would just be larger than the screen and while we have zoom-in tools on the Kindle  (UK: K3), those are always awkward to use even on a computer with a large monitor but rewarding when you just need to look at an occasional table, figure, or diagram with tiny words inside them. Using Zoom-in and scrolling for each page is a no-go.

  The basic thought is that if the words of an image'd page are too small to read, we can put the Kindle into Landscape mode with the Aa key, which might help enlarge the image to fill the wider space across while keeping everything visible, requiring no scrolling to read it.   But until very recently, what we got instead was usually the same sized image with much wider margins on each side.

  The latest Amazon software for the Kindle 2, DX's and Kindle 3 (July 2010) included enhancements that can crop margins when viewing in Landscape mode so that the image or text can expand to fill the wide-screen mode.  That can make quite a difference.

  However, when the images have words that are really tiny, sometimes they're just not helped by rotating the page to Landscape mode unless there are margins we can crop, AND some images are of pages which themselves have humongous margins and a bit of text in the middle.

  With image-based pages there's no way to have something meant to be on 8-1/2" x 11" paper be highly readable on a 6" screen.  If you just enlarge it in vertical/Portrait mode, the lines go off the page.

  BUT, if there are wide margins, then if the margins ARE cropped, the image can enlarge to fill that new added space and the resolution will usually still be good because they were usually made for larger physical pages.
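To put rough numbers on that, here is a minimal sketch of the arithmetic. The page size is the 8-1/2" wide paper mentioned above; the 1.25" side margins and the 600-pixel screen width are assumptions chosen only for illustration, and the function name is made up for this example.

```python
def fit_width_scale(page_width_in, left_margin_in, right_margin_in, screen_width_px):
    """Screen pixels available per inch of page content when the
    cropped content is stretched to fill the screen width."""
    content_width_in = page_width_in - left_margin_in - right_margin_in
    if content_width_in <= 0:
        raise ValueError("margins leave no content to display")
    return screen_width_px / content_width_in


# Assumed example: letter-width page, 600-pixel-wide screen.
uncropped = fit_width_scale(8.5, 0.0, 0.0, 600)    # whole page, margins included
cropped = fit_width_scale(8.5, 1.25, 1.25, 600)    # hypothetical 1.25" side margins removed
gain = cropped / uncropped

print(f"uncropped: {uncropped:.1f} px per inch of page")
print(f"cropped:   {cropped:.1f} px per inch of page")
print(f"text appears about {gain:.2f}x larger after cropping")
```

With these assumed margins the words come out a bit over 40% larger, and the gain grows quickly as the margins get wider.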

  It'd be nice if we could do this for Portrait mode too, the way Amazon does this for Landscape mode.  The free utility offered may help with that, although the video tutorial shows a person doing this for Landscape mode.

PDFS USING MULTIPLE COLUMNS
  You may also find yourself with a PDF with multiple columns that continue vertically for so long that you have to press Next Page to get the rest of the first column and then press Previous Page to go back to see the top of the 2nd column, etc. Very awkward and confusing too.

How PDF Scissors might help
  So, the new tool offers some help if it works well.  PDF Scissors by Gagan Mazed at SourceForge.net offers the following, in the author's words:
' What:

In short, It's a tool to crop pdfs.
Objective to create this, was to read pdf files (specially the scanned ones) easily in ebook readers, like kindle.
And by the way, it's a free tool! '
That's followed by a short, fast videoclip showing how it's done.  I used the Pause button to see the video'd procedure better, after clicking on it to take me to the larger version at YouTube since, ironically, it's too small on the PDF Scissors page to see what's happening.

The idea seems to be to crop out HALF of any given page (in Portrait mode), enlarge the width of the page-image by omitting or drastically minimizing the margins and then having the software batch all of these half pages together based on your cropping, ultimately putting each half-page onto a Landscape mode page, which will be especially helpful for multi-column pages as you won't have to scroll down and then back up with the awkward Next/Previous page button procedure.
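As a rough illustration of that half-page idea (this is NOT PDF Scissors' own code, just a sketch of the geometry as I understand it from the video), the two crop areas for a page could be worked out like this. Coordinates are in PDF points (72 per inch) with the origin at the bottom left; the function name, the equal margins on each side, and the small overlap between the halves are all assumptions made for the example.

```python
from typing import List, Tuple

# (left, bottom, right, top) in PDF points, origin at the bottom-left corner
Box = Tuple[float, float, float, float]


def half_page_boxes(page_w: float, page_h: float,
                    margin_x: float, margin_y: float,
                    overlap: float = 0.0) -> List[Box]:
    """Return two crop boxes (top half first, then bottom half) that
    drop the margins and split the remaining content area in two.
    A small overlap keeps a line of text from being cut at the seam."""
    left, right = margin_x, page_w - margin_x
    bottom, top = margin_y, page_h - margin_y
    if left >= right or bottom >= top:
        raise ValueError("margins leave no content to crop")
    mid = (bottom + top) / 2
    top_half = (left, mid - overlap, right, top)
    bottom_half = (left, bottom, right, mid + overlap)
    return [top_half, bottom_half]  # reading order


# Assumed example: a letter-size page (612 x 792 points) with 1" margins.
boxes = half_page_boxes(612, 792, 72, 72, overlap=10)
for name, (l, b, r, t) in zip(("top half", "bottom half"), boxes):
    print(f"{name}: {r - l:.0f} wide x {t - b:.0f} tall, box = {(l, b, r, t)}")
```

Each half comes out wider than it is tall, which is why it sits comfortably on a Landscape-mode screen. For a two-column page the same approach would be applied to each column, giving four boxes per page instead of two.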

I don't have time to try it myself tonight, but be sure to keep a copy of your PDF of course and try the utility on another copy.  Mazed goes on to say:
' How
  • Create crop areas to drop the white margins or crop columns.
  • Show all pages together in a stack.
    • Pages will be see-through with transparency.
    • This will help you a lot to decide how much to crop .
  • Create crop areas easily
    • Draw, resize, move crop areas
    • Copy / paste crop areas using usual Ctrl +C / Ctrl + V
Why:

I myself faced a lot of difficulties to read pdf in kindle (and mobile phones / internet tablets), specially the image based pdfs (scanned image pdfs). Got tired of zooming and scrolling while reading a nice book. So created this to help me 'dive into the reading'. I hope it helps you too.
'

This will be more work than many will care to put in, but if you have a PDF that's very important for you to read on an e-reader, it could be worth it.  And some will have assistants who can do the cropping for them.
  I think, though, that if Amazon ever lowers the price on its 9.7" Kindle DX Graphite (which is a beautiful reader), there'd be a run on them by people just needing a good-sized, extremely easy to read e-Ink based PDF reader.  See reactions by some hard-nosed Mobileread forum members who had long felt the Sony PRS-505 text contrast was the one to beat.  It's a very entertaining read.

UPDATE - See the FOLLOW-UP info from PDF Scissors author Gagan Mazed and feedback from those who have tried the new tool so far.


TIP: REMINDER RE HOW TO FIND NON-AMAZON FREE BOOKS
This was written today in answer to someone who asked a question in the Comments area and is mildly modified for the blog.

Q:   I am in the UK and trying to find all the free books everyone is talking about.  So far I only know how to get books directly from Amazon.  Are there other Kindle-friendly sites?  A friend told me I have to download free stuff from my computer and then use my wires to switch it from my computer to my kindle.  Is this correct?
  Thanks for your help.

A:   Anonymous in the UK,
At the bottom of every post for the last few months I include a link to free book sources and how to find them everywhere.
  The shortcut for that is http://bit.ly/kfreelow3 .

  You'll see a lot of non-Amazon sites linked there (as well as the usual Amazon ones).
  (BUT I haven't added the Amazon UK free book links there yet though they are seen at the bottom of each recent blog post).

  Remember that when you get a free book from Amazon, it can be sync'd with your other Kindle-compatible devices because Amazon has the book on its servers.

  There are the usual good sources, such as feedbooks.com, mnybks.net, etc., but you can read about them at that free-books source/guide page.

  Also see (mentioned on that linked page):
  Project Gutenberg books downloadable from Project Gutenberg direct to your Kindle (no charge) using 'the Magic Catalog':
  http://bit.ly/kgutenb

  and
  How to convert any of 1 million or so free Google books to Kindle-readable books:
  http://bit.ly/milkbooks

  and
  The Internet Archive's 2 million+ free texts, most of them made readable on Kindle - see the article here at
  http://bit.ly/kwarchiveorg

That should get you started. There is also a link on that http://bit.ly/kfreelow3 page to the Amazon Kindle Community message thread in which other Kindle users give you information on how to get Kindle-readable books from other sites.


Kindle 3's   (UK: Kindle 3's),   DX Graphite

Check often: Temporarily-free late-listed non-classics or recently published ones
  Guide to finding Free Kindle books and Sources.  Top 100 free bestsellers.
UK-Only: recently published non-classics, bestsellers, or highest-rated ones
    Also, UK customers should see the UK store's Top 100 free bestsellers.

22 comments:

  1. Hmm, Did you try the tool? Let me know if it was of any use.

  2. Anonymous,
    I mentioned I don't have time right now -- too many projects ahead of it but when I need to I will and in the meantime anyone else who IS concerned (I've seen some on the forums this week) can give it a whirl. Maybe they'll report back.

  3. Thanks much for the pdfscissors complete explanation.

    I know it's hard work doing such a fine job on your blog, but the phrase "... using a free tool I've written about earlier" just screams for a hyperlink or at least a term that will find that tool in a search of this blog.

  4. It 'stacks' the pages and "crops" all the pages similarly in the manner you specify. It'd work well if all the pages are laid out in exactly the same way but won't work if each page is different.

  5. Joseph,
    True, I should have at least mentioned the free tool MobiPocket Creator if we want to do it ourselves (and then people could have looked that up) AND the fact that Amazon does this for you automatically if you just send the PDF to your own Kindle address with "Convert" in the subject line. But then I thought I would have to explain again that it would cost 15c per megabyte in the U.S. and 99c per megabyte outside the U.S. to send it via 3G Whispernet to [you]@kindle.com, though it would be free if you sent it to [you]@free.kindle.com. With that address you'd get a free converted copy linked so you can download it for moving to your Kindle via USB AND, if you have the WiFi type of Kindle and a WiFi router enabled, a copy sent direct to the Kindle once you are near your WiFi network of choice.

    And then I would have to link to just how you get the [you]@[free].kindle.com set up through the Amazon manageyourkindle page and I just did not want to get into it because this PDF tool blog entry was so long already :-)

    BUT, I should do a blog article focusing ONLY on the PDF conversion feature by Amazon AND the various modes used to get it UNconverted or Converted onto the Kindle if we want (when that's possible), as well as (1) how to get your @kindle.com address, (2) where manageyourkindle page is, and (3) what @free.kindle.com does for you, depending on which Kindle you have.

    I have usually explained these things within the context of certain blog articles in the past but I need just one reference-article I can point to each time. So thanks for the reminder.

    In a way, I've been waiting for the latest foot to drop pre-Christmas, re any software updates or anything Amazon might be planning and I may wait until Nov 20 after B&N launches its color Nook and Amazon will probably have some news-splashy things to tell us, such as any software updates affecting current models.

    Thanks for pointing that out though. I might do a blog entry just on this.

    In the meantime, use the search box at top right to search for:

    convert pdf

    to see all that's been written about that in the past.

  6. Preparing PDFs for viewing on Kindle's 6" screen efficiently is a topic I'm keenly interested in. I find that K3's screen, new PDF features and my increasing desire to have technical documentation that is highly portable and accessible have converged to the point where I'm doing a lot more PDF than I ever did with K2.

    I do most of my PDF prep using Acrobat, which I'm fortunate to have a legal copy of. However, while it does almost everything I want to do, there are a few things that could be optimized and or simplified for the specific task of preparing Kindle content. There are 3 tasks I typically do when preparing:
    1. crop headers/footers/margins
    2. fix up metadata so it is meaningful
    3. fix up page labels so they match the page labels that I've cropped off (so that TOC/index page references match what Kindle displays).

    I am still looking for the 'perfect' tool for this, and so was interested enough to take PDF Scissors for a test drive. On the plus side, it has a nice clean interface and seems to perform as advertised. However it is arguably not even 'beta':
    0 - it won't save the cropped PDF, at least on my Mac. Brings up an 'Open' dialog instead of a Save dialog, and will not let me specify a file name or continue. So it is a complete non starter for me. But before I discovered this I found some other things that need improvement:

    1 - it needs ability to establish different cropping for left and right pages. Many PDFs are formatted with a larger margin on the 'binding edge' of the page. One-size-fits-all cropping is not ideal.
    2 - along those same lines, no way to restrict the range of pages to be cropped. Many PDFs have different sections with different layouts and margin setups. Ideally you could apply different croppings as appropriate.
    3 - I don't like the fact that it is essentially a web application (uses 'Java webstart' to load Java code from a server that runs locally in the JRE) - I'd much prefer a standalone tool for this kind of thing. This may be a good deployment option for corporate environments, but I think it is not appropriate for the wild internet. It is also not signed (security), and is rather slow to launch, and doesn't save window position.
    4 - OpenOffice (Java) gave an error when launched while I was running PDF Scissors. Said the JRE was no good. But it launches fine without PDF Scissors running. Another indication Java webstart tech may be half baked...


    I'll send this feedback to the author.

  7. I tried PDF Scissors on a 400 page PDF ebook today. It seems to be a handy little program. It took about 15 min or so to do the 'stacking' of the book. Unfortunately the last three lines of every page were missing but I just estimated where to put the second box and everything was ok in the final cropped PDF. When you go to save, the button is mis-labeled 'Open' though it does act like 'Save'. I find reading an ebook in landscape normally gives three screens per page, with the third screen mostly blank. Only two screens per page in portrait with PDF Scissors is a nice improvement. Thanks for the heads up on this.

  8. Brad,
    Thanks for the feedback on actually trying the 'Open' option, which seems to act as a 'Save.' Thanks for the information it basically works, if without the refinements Tom needs.

    Tom,
    You said that, seeing 'Open' instead of 'Save', it was a "non starter" for you. Could you try the 'Open' if you didn't earlier?

    Thanks for giving him the feedback. I would like to get him here as well, but he replied instead in another blog article comment area in error. Will let him know there.

  9. Anonymous,
    Thanks for that feedback re the caveat re the automated batch process.

  10. @Tom

    Good feedback! I think you are kinda right, it can't really be called a feature-complete beta. PDFScissors started with the frustration that I could not read my favorite scanned book; I googled, but could not find a proper tool.

    Then I checked some pdf open source libraries and it appeared to me changing the crop is not really that difficult. I thought if I do it, at least one thing is for sure, I can modify it to meet my need :) So I fired up my Eclipse and started coding. What you see now is a product of < one week (nighttime only, I have a regular daytime engineering job).

    I managed to crop my pdfs and then create the site, wishing someone else could benefit out of it. Another goal was that people should be able to run the app without hassle, that's why I chose the Java webstart.

    I guess now I can take feedbacks or requests and make updates to the software.

    1. About left/right pages with different margins. Noted, very true. Will keep that in roadmap. I don't see any technical difficulties, I just need to find some nights to write the codes :)

    2. no way to restrict the range of pages to be cropped. Noted.

    3. Good point that someone may feel more comfortable with offline jar. With webstart the benefit is that, user will be able to use the latest version without hassle. And the current one is just the first version, surely it will go through a looot of changes, frequently. However, I will add this 'download' option as soon as software is 'mature' enough.

    NOTE: The app does NOT upload any file / any data from user's computer, the pdfs are cropped on user's machine. I have not written any code in the app that connects to internet.

    @Brad
    Good that it worked. Ups, silly open/save button problem. I'll fix it tonight (Funny, I never noticed that :D)
    Also about not showing last few lines, I found a bug already in the code, will fix it soon.

    Question for Brad:
    1. Was it awkward to wait 15 minutes for 'stacking' ? Do you think for your pdf, you could be happy without 'stack preview'? I mean you could draw the crop areas looking at the first page, rather than waiting for such a long time. I was thinking for a pdf over 100 pages, may be i should prompt user whether he really needs the stack view.

    The stack view has nothing to do with 'actual cropping'. I added that to help user see all pages together (It helped me a lot when i was cropping my 30 page pdf).

    Thanks to Andrys suggestion, I opened a forum (link in the pdfscissors.com), discussions can be done there if necessary (or here, anything that's comfortable).

  11. Gagan,
    Thanks for your response. It'd be great if you have your central Q&A place at your site but be able to respond to questions here where visitors interested in the Kindle and PDF solutions for use on a Kindle, can find your informational responses specific to the Kindle and then follow up at your site.

    I did an update of this particular article's contents in a new blog article based on responses you gave yesterday (since this one was already long).

  12. As far as the wait for stacking, if a utility is giving me a better reading experience on my Kindle I don't really mind the wait. Though just now I tried a few other PDF ebooks with varying results:

    - two of them went much quicker (100 pages in seconds, 353 pages in about 45 seconds), though the third, 317 pages, took about 8 minutes.

    - on three out of the four books I've tried the transparency thing didn't really work. I'm wondering if maybe the cover image can block out the pages underneath? Maybe you could have an option not to stack the first page or so?

    - on one book each page in PDF Scissors only displayed the bottom left quarter of the book's actual page.

    While I can appreciate the usefulness of the 'all pages stacked' view for getting the cropping right, it might be nice to have the option to skip it.

    Even with some bugs in it for now, I can definitely see this program as a welcome addition in my arsenal for dealing with PDFs on Kindle. Keep up the good work.

  13. This tool has really helped me.
    As a teacher, one of the things I really wanted a Kindle for was carrying around all of my course specs and marking sheets. When my Kindle arrived I was a little disappointed when I tried opening one of these PDF documents, as in portrait the text was tiny and in landscape I still had to scroll around a lot.

    This morning I used PDF Scissors to crop all of my course files and they became instantly more useable. I actually had my Kindle out in the classroom for the first time this afternoon, so that I could reference criteria when talking to students. Brilliant.

    The only thing I can think of for improvements would be the addition of an option to export just one page from a cropped multi-page document.

    Thank you, Gagan, for releasing this project to the public. I can see myself returning to it very regularly.

    And thanks to @KindleWorld for bringing it to my attention!

    Kind regards

    Jamie Myland (@raegar)

  14. Jamie /@raegar,
    Thanks for taking the time to do a follow-up to let us know. I look fwd to trying it out in a couple of days too.
    Thanks, Gagan, for the early work and coming here to update us.

  15. Brad,
    Thanks for the detailed good feedback and info !

  16. Andrys, thanks for the heads-up about PDF Scissors. In my world, I read a lot of 100-page, double-column, small-print regulations in PDF format. PDF Scissors allowed me to quite easily create 4 portrait-oriented pages from 1. Because there were two columns, simply viewing the original in landscape on Kindle 3 was imperfect at best. Now I can read my regulation PDFs like any other e-Book.

  17. Back from my little birthday vacation, read few nice books on kindle, after 'scissored', of course:)

    Thank you everyone, Andrys, Brad, Jamie, Tom for feedbacks, suggestions. Its your feedbacks that will inspire me to spend some spare time on coding.

    So what would be next pdfscissors features you would like? I am sure 'all' will be the best answer, but I wonder what to pick first?
    - odd/even page separation, so that they can have different crop boxes.
    - Stack view improvement - discard first / some page from previewing, cancelling whole stack view (in case of a big book).
    - Exporting some particular pages instead of all.
    - Anything else?

    You can also reply to the forum in www.pdfscissors.com

  18. this program only run ONLINE. not good option.

  19. Gagan,
    Thanks for considering the requests you've received.

    Anonymous,
    Gagan, the author, says it doesn't connect online to do anything. Have you tried it? Let us know if you find that it does.

  20. Gagan,

    Suppose I have an ebook (PDF) that looks a bit like this (format wise):

    ____________________________________
    ____________________________________
    ________________ ________________
    ________________ ________________
    ________________ ________________
    ________________ ________________

    i.e. some single column content in with mostly double column stuff. Is there a way to crop the single column material so that it is the header (or footer in some cases) for the first column of the two column crop (or footer for the second column...)?

  21. Hello again!

    I just thought I would ping here to notify that pdfscissors has just been updated to 0.0.2 with the most frequently requested features: offline support, odd-even page groups (+ a few more).

    It also supports cropping every page separately. This is not a suitable option for large document, however, may be useful for small complex document, like the one mentioned in last comment.

    Hope you will enjoy the upgrades and send me feedbacks for further improvement ideas.

  22. Gagan,
    Thanks. I'll include this update in an announcement soon. IF I somehow forget, don't hesitate to remind me. Thanks.
