MSFT

macrumors member
Original poster
Dec 14, 2013
63
1
I work in a law firm where right now I am going through a lot of "ediscovery", to the tune of 40GB worth of emails and other documents in PDF. Searches mostly involve typing in keywords and waiting. I'll usually have about 100 PDFs open at a given time. Right now I am using a Windows 7 PC with a Core 2 Duo, 4GB RAM, and a 5400RPM hard drive. Searches are pretty slow.

I'd like to make a recommendation that we upgrade the computer to improve efficiency, but I'm not sure where the bottleneck is. Obviously I am going to suggest an Apple product, but spec-wise, do I want to lean heavy on the RAM? CPU? Will a big SSD be beneficial for these searches?
 

Anything with an SSD and a fair amount of RAM (8GB and up) will be fantastic for this workload.
 
SSD!! SSD only!
+1 for SSD. I believe that by taking a half-day to swap the SSD in for the HDD, you would have a positive ROI within a day. You didn't say how big the typical PDF is, but let's assume that it's mostly text and easy to render. Getting the file off disk and into memory, rendering it, and then running a full-text search across 100 documents can be pretty resource intensive, and you may see benefit from more compute speed and memory. I say "may" because I don't know how well the text search app (or is it search within Acrobat?) can take advantage of all the resources, or if the Mac version is more efficient than the PC version. But if time is of the essence - and with lawyers' fees, how can it not be? - then get a faster, SSD-based system that uses a PCIe-based storage interface (like the MBP) and transfer the files over. Bam!
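To put rough numbers on the "getting the file off disk" step, here is a back-of-envelope sketch of how long one full pass over the 40GB set would take at different read speeds. The throughput figures are assumed round numbers for illustration, not measurements of any particular drive, and the function name is made up for this example. Real searches over tens of thousands of small files on a hard drive will likely be slower than this, since seek time adds up.

```python
# Rough time to read a whole document set once from disk.
# The throughput figures below are illustrative assumptions, not measurements.
DATASET_GB = 40  # size quoted by the original poster

ASSUMED_MB_PER_S = {
    "5400RPM hard drive": 100,
    "SATA SSD": 500,
    "PCIe SSD": 1500,
}


def full_read_minutes(size_gb: float, mb_per_s: float) -> float:
    """Minutes needed to stream size_gb of data at mb_per_s (1 GB = 1000 MB)."""
    return size_gb * 1000 / mb_per_s / 60


for drive, rate in ASSUMED_MB_PER_S.items():
    minutes = full_read_minutes(DATASET_GB, rate)
    print(f"{drive}: about {minutes:.1f} min per full pass")
```

With 30 or so searches a day, the gap between those per-pass times is what adds up, which is why the storage upgrade tends to pay for itself quickly if each search really rereads the files.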
 
I appreciate your responses!

The files are all text, and each one is only a few MB, but there are 60,000+ of them that have to be searched...
 
As a fellow law firm employee, I also agree that an SSD is a necessity, as is a lot of RAM (you did say 100 PDFs open simultaneously). Given the cost of RAM, I'd personally go for at least 16GB. On my computer, 500+ page PDFs take seconds to search.

Also, if you have decent eyesight I'd highly recommend a 4K monitor and running it at its native resolution (or something close). I have a 27" model (Dell P2715Q), and I can easily view 8 documents simultaneously (4 horizontal by 2 vertical).
 
You didn't say where the files are located that you are searching. If they are stored on a server, it may also be a bottleneck. If they will be on your local machine, then the larger memory and SSD should make it faster to search.
 
They are on a server but I put them on my local hard drive before searching them.
 
First, allow me to offer a "yikes!" No offense intended. o_O

Many of us will offer "buy a Mac" (I did, long ago...) but there are a few other easy steps to take with PDF files. If you're using a "free" PDF reader application - stop. You work in a law office, they can afford a license of Acrobat Pro/DC - Acrobat DC can be purchased independently of the Creative Cloud subscription, and there's a 30-day trial so you can see if it suits you. If your office is on an older version of Acrobat - upgrade, as many earlier features were deprecated as of version 11. DC is much nicer to use than previous versions. Enough of that.

First, purchase an external SSD or quality USB thumb drive that is of sufficient size to contain the PDF files you're using. My Mini Server's 5400 RPM drives manage about 1/3 the read or write speed of an older external SSD (with a USB 3.0 interface) that I use just for this purpose - to hold temporary files - no internal SSD needed, for now, and I could buy a replacement today from the stores I have a PO agreement with. Get those PDF files off your system drive - the only other bottleneck on your PC that's slower is your network connection. I use an external SSD for my Photoshop, AutoCAD, Solidworks, and document searches every day - I used to be the "grunt" in the office, and now I'm the owner...

If you're using Acrobat Reader or Acrobat Pro/DC, there are a few other tweaks to be made to really speed up your workflow.
  • In either application, turn off a few options/preferences. First, change the setting for "Page Display>Resolution" from the default of your monitor to the System Setting - that lower resolution will not tax your graphics card nearly as much as the optimum default setting. Under that same group, uncheck/disable the "Rendering" options for "Smooth Line Art", "Enhance Thin Lines" (you're performing text searches...), and "Smooth Images".
  • Also, in either application, enable the "Search" Add-in, if it is not enabled - enabling that Add-in activates the by-default-not-activated Search back-end included with Acrobat/Reader.
  • Also, in either application, disable any/all of the Add-ins not used in your workflow.
  • In the "Internet" setting, change the default "56 kbps" setting to "LAN" or whichever speed is appropriate - as silly as it sounds, file previews have (for me) displayed far more rapidly than with the really-slow-default setting.
  • Last, if all/most of the PDFs you're working with have "text" in them - you're searching for text/text strings, right? - take a bit of time and export them as RTF (Rich Text Format) files - special text files which are far easier to use/manipulate/search than PDF files, and you'd be searching through the file itself rather than the "text layer/resource" of the PDF file.
Consider an SSD only after making the above tweaks to your workflow and only after considering buying a shiny new Mac. :D
 
I'm confused as to why you'd want 100 PDF files open at the same time. Is that the only way you feel you can search? Is that because you kick off searches separately in each one?

Before upgrading (although it's still beneficial), I agree with the recommendation for Acrobat DC. There's an article here https://helpx.adobe.com/acrobat/using/searching-pdfs.html that includes an explanation of how to search in multiple PDF files, and it'll build an index for you with links to each document and the location where it found the target text. Surely having that index would be a huge time saver for you?

It seems to me that if you can fire off a search of all 60,000 documents, then even if it's slow you can leave it running overnight and come back to a fully indexed catalog in DC.
 
Good suggestions. I will try today and report back.

I tend to have that many documents open at once because one program I use, Foxit Reader, opens each document in a tab when I click on it. I switch between Foxit and Adobe based on the type of search I need to perform. Foxit is FAR faster and never crashes, unlike Adobe.

Unfortunately, certain rules prohibit me from using documents in anything but their "native format", so I have to stick with PDF (which is the case for 90% of files I deal with).

I also cannot search just one set of keywords overnight and forget about it, because it's more like a treasure hunt. One clue leads to another. I probably do 30 searches in a given day.

Thanks all again for your input.
 
Since it appears that your computer is being used to make a living and that you are heavily taxing the machine, I would suggest either a fast iMac or a previous generation Mac Pro, with the memory maxed out and a 1TB SSD. I don't do document review, but I do a lot of contract drafting and I use a maxed out 2013 iMac with a 3TB Fusion Drive and two 27-inch Apple Thunderbolt displays to provide the maximum amount of screen real estate, as I have 3 or more applications open at the same time. For document review, I would probably opt for a Mac Pro, for maximum expandability.

You did not ask about reviewing software, but you may want to look at the X1 desktop app at X1.com. It indexes over 200 different document formats, including PDF, and it has the most robust search function that you can find anywhere. It even indexes email archives. The only issue is that it is Windows only, so I run it in a virtual machine under Parallels. It can't be beat for $80, plus around $20 per year for the update service. They also have an application specifically for ediscovery, but I have not used it. I have used the desktop app or one of its predecessors for over 20 years and it is a lawyer's best friend. I hate that it is Windows only, but I find it so useful I have no other choice.

Good luck.
 
OFLawyer, great advice, thank you so much. Your setup sounds like a dream for someone who drafts contracts. Is it THREE 27" displays? You should take a picture of that and post it!

I will look into X1 desktop. I've recently been exploring Logikcull but it's quite expensive for what you get, even if you bill it down to the client.

I still have a year of law school (and a few years working in a firm after that) before I can choose my office's computers. If anything, I'm beginning to understand just how technology is an investment and not just a luxury.
 
No offense intended OP, but using free software without the appropriate enhancements is not something I'd consider in an office environment for production. Acrobat's and Foxit's Reader applications aren't intended for searching large archives of documents, and I'm reading into your posts that you haven't even considered installing the right tool for the job - like the Acrobat Search Plug-in or Foxit's PDF IFilter, both of which are indexing tools specifically designed to batch index PDF files and create a searchable database. I've been using both for years, and I'd do that before dropping $3k on a Mac (and I have a $3k Mac on my desk, with two 27" displays too). I give my employees the tools to do their job, but within reason. Good luck, over and out!
 
In support of ukchris: get the documents indexed using Adobe or other tools. A search over all documents, provided they are indexed, should not take more than 1-2 seconds on a decent computer and be instantaneous on a powerful computer. That would make following 'leads' much more productive AND more fun.
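For anyone wondering why an index makes such a difference, here is a toy sketch of the general idea (an inverted index) in Python. It is not how Acrobat or any particular product implements its catalog; the file names, sample text, and function names are all made up for illustration, and it assumes the text has already been extracted from the PDFs.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return names of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical, already-extracted text standing in for real PDFs.
docs = {
    "email_001.pdf": "Please review the merger agreement before Friday",
    "email_002.pdf": "The agreement was signed; invoice attached",
    "email_003.pdf": "Lunch on Friday?",
}
index = build_index(docs)
print(sorted(search(index, "agreement friday")))  # ['email_001.pdf']
```

The slow part (reading every document) happens once, when the index is built. After that, each keyword search is a handful of dictionary lookups and set intersections, which is why following one clue to the next stops involving a wait.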
 
I have a question about indexing. I was under the impression that indexing is for larger PDF files. Most of what I am dealing with is under 1MB (e.g., each email chain gets its own PDF). But I have, let's say, 60,000 of these files (the number is over a million but I am starting with this "smaller" batch).

I have Acrobat Pro DC, which can index PDF files, but is there a way I can make one index for a folder full of PDFs? If so, does it matter that there are other types of documents mixed in the folders (e.g., xlsx, doc, tiff)?
 
As the username suggests, I'm a litigator, and have spent more hours than I care to recall doing doc review. As others have mentioned, in terms of hardware, an SSD will make the biggest difference, followed by RAM.

But I suspect your real bottlenecks are software and workflow. Assuming you don't have the budget for dedicated discovery platforms like Relativity (but consider something like Allegory, I think they can meet lower budgets), look at searching/indexing tools like X1 (Windows) or EagleFiler or Houdah (Mac). Taking a few hours to learn how a program like that can set up structured searches, tag documents, and otherwise organize your workflow will make you not just faster, but far more effective.
 