Best way to futureproof documents?

kylera

macrumors 65816
Original poster
Dec 5, 2010
1,196
27
Seoul
I would like to archive my schoolwork and some of my work documents for years to come. What is the best way to futureproof the following:

- DOC/DOCX/Pages documents (mine do not have complex styling)
- PPT/Keynote files (some have animations and transitions, but they aren't vital to understanding the files)
- PDF papers

Should I just go all PDF?
 

MegamanX

macrumors regular
May 13, 2013
221
0
My answer would be to convert them all to PDF. PDF is a standard format, and it will preserve formatting a lot better.
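
For anyone with a lot of files, the conversion can be scripted. Below is a minimal sketch, assuming LibreOffice is installed and its `soffice` command is on the PATH. The helper names (`find_office_files`, `build_command`, `convert_all`) and the list of extensions are made up for illustration. Pages and Keynote files are left out here; exporting those from the Apple apps themselves is probably the simpler route.

```python
import subprocess
from pathlib import Path

# Extensions to convert. This list is an assumption; adjust it to your files.
OFFICE_SUFFIXES = {".doc", ".docx", ".ppt", ".pptx", ".odt", ".odp"}


def find_office_files(root):
    """Return office documents under root (recursively), in a stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in OFFICE_SUFFIXES
    )


def build_command(source, out_dir):
    """Build the LibreOffice command that writes a PDF copy into out_dir."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_all(root, out_dir):
    """Convert every office document under root; originals are left untouched."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for source in find_office_files(root):
        # check=True raises if LibreOffice reports a failure for this file.
        subprocess.run(build_command(source, out_dir), check=True)
```

Calling `convert_all("schoolwork", "schoolwork_pdf")` would put the PDFs in a separate folder, so the editable originals can be kept alongside them.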
 

snberk103

macrumors 603
Oct 22, 2007
5,503
87
An Island in the Salish Sea
Print on good-quality paper. Even bad paper will outlast just about any digital format.

If you insist on digital, my choice would be PDF first, then ODT (the OpenDocument format used by OpenOffice).
 

redAPPLE

macrumors 68030
May 7, 2002
2,615
2
2 Much Infinite Loops
My answer would be to convert them all to PDF. PDF is a standard format, and it will preserve formatting a lot better.
...but it cannot "really" be edited.

I would like to know how institutions do this. They surely have archived Microsoft Office documents. Does an Office 2000 file, for example, open today without changes to its formatting and document styles?
 

snberk103

macrumors 603
Oct 22, 2007
5,503
87
An Island in the Salish Sea
...but it cannot "really" be edited.

I would like to know how institutions do this. They surely have archived Microsoft Office documents. Does an Office 2000 file, for example, open today without changes to its formatting and document styles?
My understanding is that opening any word processor document (not just MS Word files) can cause formatting problems.

If the document used fonts no longer available on the computer, the system will substitute another font. While it may look very similar, line breaks and page breaks could shift - causing formatting issues.

Often documents are formatted for a specific printer. Unless the system has access to the same printer, the document may format differently.

If the document uses complex formatting, the creator may have used undocumented - and subsequently discontinued - features, or features whose behaviour has changed in newer releases of the application. Take tables, for instance. If it is just a plain grid, then there shouldn't be any issues. But once you start merging and splitting cells, the formatting can be difficult to maintain across software generations.

etc etc

Maintaining archives of old documents is a huge and complex field. National archives, for example, often maintain very old computers so that they can run very old applications, in order to access documents that are really not that old.

Meanwhile, in Timbuktu, they recently smuggled out over a quarter of a million manuscripts to keep them safe from the rebels. Books and parchments that are up to 900 years old. And here we are, struggling to keep a 15-year-old Word document safe. Sigh...
 

talmy

macrumors 601
Oct 26, 2009
4,705
266
Oregon
I recently decided to go "paperless" and, after some research, went 100% PDF. It has been around for 20 years, is now an openly published ISO standard (ISO 32000), anything can be printed to it, and it is a suitable format for scanning text and images as well.
 

samiwas

macrumors 68000
Aug 26, 2006
1,594
3,518
Atlanta, GA
I've run into this same issue. I used WordPerfect in the mid 90s, and getting some of those documents open recently has been a challenge. I've found apps that open a lot of them, but it's not a foolproof method.

PDF is great for documents that are finished and that you will never need to edit again.

If it's a text document, and formatting isn't an issue, you could also just save it as a plain text document. Plain text has been around practically since computers could type, and I don't think it's going anywhere. Very future proof.

As for PowerPoint/Keynote documents, I don't think there is a way to truly future-proof them other than printing them to PDF and keeping just the static slides.
 

jdechko

macrumors 601
Jul 1, 2004
4,088
216
PDF is great for static, styled content.
Plain text is also great for anything text-based, with the downside that there is no formatting. You could also use Rich Text or explore Markdown.

However, I recently read/heard a discussion arguing that the Office format might not be such a big deal in the end. Obviously it is a format developed by a single company, but the spec has been published, and it is easy to find programs that can open Office documents.

As long as you stick to simple styling (as you have done), most modern productivity software should be able to round-trip an office document without issue. What I mean by that is that if you created a document in Word with simple formatting (bold, italics, lists, tabs & justification), you should be able to save it, open it in Pages, make changes, save it and open it back in Word with no loss of fidelity. Same thing with presentation slides.
 

Scepticalscribe

macrumors Sandy Bridge
Jul 29, 2008
47,409
31,712
The Far Horizon
-..........
Maintaining archives of old documents is a huge and complex field. National archives, for example, often maintain very old computers so that they can run very old applications, in order to access documents that are really not that old.

Meanwhile, in Timbuktu, they recently smuggled out over a quarter of a million manuscripts to keep them safe from the rebels. Books and parchments that are up to 900 years old. And here we are, struggling to keep a 15-year-old Word document safe. Sigh...
Excellent post, and, as an historian, I hear you and echo what you have just written, fervently.

To me, it is incredible that we can store documents that are thousands of years old - or centuries old - and that are as readable as the day they were written, while documents created in the then cutting-edge 1990s can be as inaccessible as hieroglyphics were before Jean-François Champollion deciphered them with the help of the Rosetta Stone. Absolutely bizarre, and paradoxical.

I to store them as PDF with the caveat of knowing you are very very limited on the editing of said document.
True, alas; it is why I dislike PDF as a format.
 

SilentPanda

Moderator emeritus
Oct 8, 2002
9,993
28
The Bamboo Forest
Nothing wrong with keeping them as PDF as others have said, but you might consider keeping just a plain old text version too. Sure you lose formatting (I guess you could use LaTeX) but for the most part, a text file will always be readable and at worst a computer nerd should be able to convert it easily should the encoding change drastically over the years.
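
If the encoding ever does become a problem, the conversion is short. Here is a rough sketch in Python, assuming the old files are in Windows-1252 (a guess; substitute whatever encoding the files actually use). The function names are hypothetical.

```python
from pathlib import Path


def to_utf8(raw, source_encoding="cp1252"):
    """Decode bytes from the old encoding and re-encode them as UTF-8."""
    # The default encoding is an assumption; a wrong guess can raise
    # UnicodeDecodeError or silently produce wrong characters.
    return raw.decode(source_encoding).encode("utf-8")


def reencode_file(source, target, source_encoding="cp1252"):
    """Write a UTF-8 copy of source to target; the original is not modified."""
    raw = Path(source).read_bytes()
    # Working on bytes keeps the original line endings exactly as they were.
    Path(target).write_bytes(to_utf8(raw, source_encoding))
```

Writing to a separate target file means the original bytes are still there if the guessed encoding turns out to be wrong.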
 

stonyc

macrumors 65816
Feb 15, 2005
1,259
1
Michigan
Nothing wrong with keeping them as PDF as others have said, but you might consider keeping just a plain old text version too. Sure you lose formatting (I guess you could use LaTeX) but for the most part, a text file will always be readable and at worst a computer nerd should be able to convert it easily should the encoding change drastically over the years.
Yep, I was just about to post something like this... just keep them as txt files. Layout and font preferences may change over time, so plain text would be the most flexible and adaptable format for archiving your files.

EDIT: I'd like to add that a lot of sequencing data is kept in plain txt format (either tab-delimited or comma-separated) because txt files can be read on almost any machine. For reference, google "FASTA file format" or "FASTQ file format" to see what I mean. In addition, a lot of the data that I work with is in txt format because it is easier to load into analytical environments like R. Like Panda said, you lose some of the formatting that you get from storing it in other formats, but for archival purposes there's a lot to like about plain text.
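
To give an idea of how simple these formats are, here is a minimal sketch of a FASTA reader in Python. It assumes the usual layout (a header line starting with ">" followed by one or more sequence lines). `parse_fasta` is just an illustrative name, and real tools handle more edge cases than this.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record; drop the leading ">".
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            # Sequence lines are appended to the most recent header.
            records[name].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {header: "".join(parts) for header, parts in records.items()}
```

Nothing beyond basic string handling is needed, which is a large part of why such files stay readable across machines and decades.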
 

jafingi

macrumors 65816
Apr 3, 2009
1,468
154
Denmark
If you are not concerned about editing, use PDF.

I can recommend scanning all your documents with software like Prizmo (which I use myself). It has OCR, so it recognizes all text (even handwritten), which means you can search it (and edit it, although that doesn't work so well).

Combine it with DEVONthink to categorize, organize etc., and you've got a really really great "document manager"!