Way to capture web page to searchable PDF

Discussion in 'Mac Apps and Mac App Store' started by Lastmboy, Jul 28, 2014.

  1. Lastmboy macrumors regular

    Joined:
    Jan 16, 2012
    #1
    I currently use DEVONthink Pro Office for all my document filing and also use it to clip web pages for multiple reasons. I clip all the sites I might want to read later. However, I also clip critical data pages. For example, if I just made a purchase and the receipt is on the screen, I would like to clip that to a PDF. As well, many of the sites I browse require an account and login. In these last two scenarios, my clip in DEVONthink is useless, as it will just clip the login page, and I never get the actual page I am looking at. I can easily capture a static image of the page, but that's not what I'm after, as it's not searchable, and the links on the page won't be active.

    I started playing with EagleFiler a couple of days ago. It does the same thing, but even easier, as I can clip a page to PDF by simply hitting the F1 shortcut and it's very quick. However, it also just clips the login page for any sites that require login. I can also use the Print... option and print to PDF. However, that seldom comes out formatted anything close to the original page. For sites that don't require any sign on or account, it works perfectly.

    There may not actually be a solution for this, but I thought I would ask the experts, just in case. I want to clip a web page, exactly as I am currently viewing it (or close to it) in the browser, to a searchable PDF file. Does anyone know of anything that can do this for sites where I'm logged into my account? It can be a browser plug-in, separate app, paid app... I don't care. I'll pay for it. I just haven't found anything that can do it. Thanks.
     
  2. Baklava macrumors 6502a

    Joined:
    Feb 1, 2010
    Location:
    Germany
    #2
    I would use the Print dialog box on Safari and export it as PDF.
     
  3. SandboxGeneral Moderator

    Staff Member

    Joined:
    Sep 8, 2010
    Location:
    Orbiting a G-type Main Sequence Star
    #3
    Another vote for exporting it to PDF from the Print menu.

    You could also use the fantastic app called Evernote and use the Evernote Web Clipper to capture pages into the app where you can search it and sync it to all of your computers and mobile devices, if so desired.
     
  4. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #4
    That's what I've been doing for now. It works in most cases, but often the formatting is not correct. Also, I would like something easier than several clicks. I wonder if a person could set up a shortcut key to execute this...? I'll have to experiment with Evernote and see if it runs into the same problem.

    An example of the problem is if I log into my Bigstock account and am browsing thumbnails. If I clip the page, I get the "hamster". In this case, it looks not bad if I print to PDF, but doesn't work in all cases. I could live with this if I could somehow configure a shortcut key to do it.
     

    Attached Files:

  5. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #5
    You're right! Evernote does work. I just tried it and got the correct page clipped and the layout is perfect. However, it is painfully slow. I could easily do 100 clips with EagleFiler in the time it took me to clip one page in Evernote. If I could find that exact capability in something that is not cloud based (i.e. is fast), it would be perfect! Is there any way to use the native Evernote app to do this? I used the browser plugin when I tried it just now.
     
  6. SandboxGeneral Moderator

    Staff Member

    Joined:
    Sep 8, 2010
    Location:
    Orbiting a G-type Main Sequence Star
    #6
    Yeah, I use the browser plugin to clip a page, then I dismiss the confirmation dialog, where it asks if you want to go to Evernote on the web. Then I open or switch to the native Evernote app and hit the sync button and the clip appears in my notebook.
     
  7. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #7
    A lot more expensive, but part of my Creative Cloud - I use Acrobat Pro CC.
     
  8. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #8
    How do you integrate it with your browser to clip the web page?
     
  9. onekerato macrumors regular

    Joined:
    Jun 6, 2011
    #9
    Actually, Safari is just sending the URL to DEVONthink and EagleFiler, and that is why these apps are only able to clip the login screen (i.e. the internal browser of DEVONthink and EagleFiler is not logged in as you).
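
    To make the point above concrete, here is a minimal sketch in Python. It is a toy model only: the `serve` function, the path, and the cookie value are all invented for illustration, and it does not claim to show how DEVONthink or EagleFiler are actually implemented. It just shows why fetching the same URL without the browser's session cookie returns the login page.

```python
# Toy model only: the site, path and cookie value are invented for illustration.
VALID_SESSIONS = {"abc123"}


def serve(path, cookies):
    """Pretend web server: return the real page only for a logged-in session."""
    if cookies.get("session") in VALID_SESSIONS:
        return f"Receipt page for {path}"
    return "Login page"


# Safari holds a session cookie because you logged in there.
safari_cookies = {"session": "abc123"}
# The clipping app's internal browser starts with an empty cookie jar.
clipper_cookies = {}

url_path = "/account/receipt"
seen_in_safari = serve(url_path, safari_cookies)
seen_by_clipper = serve(url_path, clipper_cookies)

print(seen_in_safari)
print(seen_by_clipper)
```

    Same URL, different cookies, different page. Anything that works from the page already loaded in the browser (such as Print to PDF) sidesteps this.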

    The only way around it is to use Print to PDF from Safari, as others have mentioned. You can add these apps to the "PDF" drop-down menu in the print dialog. I print as PDF to Evernote this way.

    You can assign a custom keyboard shortcut such as CMD P to print to Evernote (or any other favorite app), so essentially hitting CMD P in Safari, then CMD P again will send the PDF to Evernote. See this MacSparky blog post on how to configure: http://macsparky.com/blog/2008/3/19/keyboard-shortcut-for-save-as-pdf-in-os-x.html
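
    For anyone who prefers the command line, the sketch below builds (but does not run) a `defaults` command that sets a per-app menu shortcut through the `NSUserKeyEquivalents` preference. Treat it as an assumption-laden example rather than a guaranteed recipe: the helper name `build_shortcut_command` is made up, the menu title "Save PDF to Evernote" is a guess and must match the text in your own PDF drop-down exactly, and whether Safari honors a value written this way may depend on your OS X version.

```python
import shlex


def build_shortcut_command(bundle_id, menu_title, key_equivalent):
    """Build (but do not run) a `defaults` command mapping a menu title to a key."""
    return [
        "defaults", "write", bundle_id,
        "NSUserKeyEquivalents", "-dict-add",
        menu_title, key_equivalent,
    ]


# "@" stands for the Command key, so "@p" means Cmd-P.
# The menu title below is an assumption; copy the exact text from your PDF menu.
cmd = build_shortcut_command("com.apple.Safari", "Save PDF to Evernote", "@p")

# Print a shell-safe version that can be pasted into Terminal.
print(" ".join(shlex.quote(part) for part in cmd))
```

    Nothing is changed on your system by running this script; it only prints the command so you can inspect it first.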

    Use the "Reader mode" in Safari where available to get a clean page (without sidebars, frames etc.). You can also use the "Develop" menu in Safari to change the user agent and visit websites disguised as a mobile web browser, which usually leads to uncluttered pages.

    I also wrote an app (free) PDFCombo, which appends PDFs together. You can print to PDFCombo, like you print to Evernote, and it will collect everything into one large PDF. It can also add TOC entries to the first page of each input PDF. Useful when I'm researching content on the web, and want to clip lots of pages into one "packet." Makes it easier to annotate.



     
  10. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #10
    Acrobat Pro has a "Capture" command, and it's quite customizable. It's integrated into Windows browsers but not Mac browsers, likely due to sandboxing requirements (or Adobe's just too lazy?).

    One enters in a URL and what one wants to capture - one could enter in just the URL of this thread or type in CNN's top domain and tell it to capture EVERYTHING. The command can be tweaked to maintain certain properties of each web page, paper size, link integrity, etc.

    FWIW, Adobe does have a 30-day trial so you can get your fix on. I've been using Acrobat Pro like this for over 10 years, and I'm addicted to this feature. If Adobe would break out all of their apps to be "leased" like their PS/Lightroom bundle, I'd pay for Acrobat Pro alongside Illustrator and forget everything else. And, Distiller is included with Acrobat Pro too...
     
  11. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #11
    I'm not sure if you read my question properly, but I'm not having a problem capturing web pages to PDF. I just can't capture password protected sites (sites where you have to login to your account to browse). The Capture to PDF feature has been in Acrobat since version 9. I currently have Acrobat X Pro, and I just tried it now out of curiosity. However, it does NOT work in this situation. It's using the exact same approach mentioned by onekerato in the message above, whereby it makes its own connection to the URL to capture the page.

    This does not work, as it does not know your user ID and password to the site, and does not know you are already logged in, so it just says the page is not found. The error below is a screen capture of exactly what I got when I just tried it in Acrobat (using a link that is valid and displayed in my browser). Does it somehow work differently/better in the CC version?

    The one thing I have found useful about using the Acrobat approach is that it allows you to traverse and capture a complete site (assuming no login account is required), with options to specify how many levels, stay on server, etc., which is kinda cool. As for simply capturing single pages quickly, there are better/easier options (IMO).
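
    As a rough illustration of what "how many levels" and "stay on server" mean in a site capture, here is a small Python sketch of a breadth-first walk over a link graph. It is not Acrobat's implementation; the `LINKS` table, the URLs, and the function name `pages_to_capture` are all invented, and no network access happens.

```python
from collections import deque
from urllib.parse import urlparse

# Invented link graph standing in for real pages; nothing is downloaded.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://other.example.org/x"],
    "http://example.com/a": ["http://example.com/b"],
    "http://example.com/b": ["http://example.com/c"],
}


def pages_to_capture(start, max_levels, stay_on_server=True):
    """Breadth-first walk from `start`, following links at most `max_levels` deep."""
    host = urlparse(start).netloc
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        url, level = queue.popleft()
        if level == max_levels:
            continue  # depth limit reached; do not follow this page's links
        for link in LINKS.get(url, []):
            if link in seen:
                continue
            if stay_on_server and urlparse(link).netloc != host:
                continue  # skip links that leave the starting server
            seen.add(link)
            order.append(link)
            queue.append((link, level + 1))
    return order


print(pages_to_capture("http://example.com/", 1))
print(pages_to_capture("http://example.com/", 2, stay_on_server=False))
```

    With one level and "stay on server" enabled, only the start page and its same-host link are listed; raising the level or disabling the host check widens the capture.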
     

    Attached Files:

  12. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #12
    Ya, pretty much what I figured. Just wondered if there might be something like the image capture plugins/apps that grab it right out of your browser.

    I would have agreed with you, but recently tried the "Clip to Evernote" plugin mentioned earlier, and it DOES handle this correctly, even on password protected sites. It works perfectly... it's just brutally slow. I'm trying to figure out how to get it to clip straight to the native app instead of the cloud. Its formatting is much more precise (and pleasing) than the Print to PDF option, and seems to work in every situation.

    Thanks for this. I have been using the Print to PDF feature to "Print to DEVONthink". It just takes quite a few mouse movements and clicks to get to that point. The info you provided above should make that a lot simpler.
     

  13. SandboxGeneral Moderator

    Staff Member

    Joined:
    Sep 8, 2010
    Location:
    Orbiting a G-type Main Sequence Star
    #13
    Brutally slow? A few seconds isn't that bad is it? I do agree that I wish Evernote clipped directly into the native app. But as far as I can tell there isn't a way to do that.
     
  14. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #14
    Sorry, my bad - I had several of my employees around chatting about...

    The only way to capture a secure web site with Acrobat is to be directly connected to the server or remotely logged onto it, which is what I have done in the past as a consultant, using a secure tunnel (which can be a bit of a PITA as it's slow).

    I'll ask a guy I know later when he stops by. We're (not-so) civil engineers, so bear with us... :) BTW, AFAIK capturing has been around since Acrobat Pro 6, which was my first version, dating me somewhat...
     
  15. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #15
    I'd forgotten about a small company here in Portland not far from my buddy's office - Iterasi - web archiving is what they do, commercially. Check out their Page Notary solution: http://pagenotary.iterasi.com - I know that it works with "external" secure web sites, it's a web clipper on steroids.
     
  16. Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #16
    Hmmmm... looks very interesting. I will contact them. Thanks!!

    ----------

    Ok, I may have jumped the gun a bit with that statement. Sorry about that. The first two I tried took over a minute each, but there must have been an internet bottleneck or something at the time, as they seem to be clipping (no pun intended) at around 5 to 10 seconds now, which is manageable. I wish I could avoid all the popup windows and clicks, though. I just want to hit a key and know that it's clipped. I don't want any options to select or anything to confirm. EagleFiler is really slick. Hit F1, hear a little sound, done. It's instant, with no other windows or clicks required. It just doesn't work on sites where I have to login.
     
  17. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #17
    So my friend stopped by with a bottle of Merlot and I'm obligated to help him finish it. The firm I cited earlier is the go-to company for what you're looking for, I don't recall their pricing but it's cheap relative to what you're looking for - and they may be the only affordable solution other than connecting/tunneling directly to servers, which is what I have to do for my Federally-funded clients.

    I'm using Acrobat Pro with a rMBP. Iterasi would be a far simpler solution if I had an iMac or workstation!!! ;)
     
  18. Lastmboy, Jul 30, 2014
    Last edited: Jul 30, 2014

    Lastmboy thread starter macrumors regular

    Joined:
    Jan 16, 2012
    #18
    Are we talking about the same thing, here? It sounds like you're referring to VPN connections to other workstations or servers. I use TeamViewer for that. I'm not talking about needing remote access to the site or getting through firewalls. I'm referring to simple sites where you set up a free account (or paid) and get a user id and password.

    For a real simple example, go to your "User CP" page right here on MacRumors, then try to clip that, or even try to clip the page you're looking at right now with this message. All you'll get is a message saying you are not logged in, because it makes a separate call to the same URL (i.e. does not use the page loaded in your browser). The "Print... to PDF" option and Evernote's Webclip feature both get the (logged in) page right out of your browser. All other products, AFAIK, make a separate call to the web site and will either get a "page not found" or "not logged in" message.

    I wouldn't know how to VPN onto the MacRumors server, nor would I think they would want me to :eek: Maybe I wasn't understanding what you were getting at. This isn't a security problem, though. Just sites where you have any type of account set up.

    Now you've got me thinking about the Merlot. I'll run out and get a bottle. This might make more sense after a few glasses :)
     
  19. campyguy macrumors 68040

    Joined:
    Mar 21, 2014
    Location:
    Portland / Seattle
    #19
    I am referring to tunneling into workstations and servers via VPN, not sites like this - I'd rather be having fun or a glass of wine than paying to archive this forum portal. :D The basic premise of what you're wanting to achieve, however, remains similar to what we've discussed here. If I need something from a protected site I'll make a PDF and archive it while I'm logged in - but I don't need or want to do that too often. Ah, my friend just returned with a second bottle, so it's time to, uh, get back to business!
     
  20. saberahul macrumors 68040

    Joined:
    Nov 6, 2008
    Location:
    USA
    #20
    That's one cute hamster!
     
