5 Questions: How do you manage source documents?
Every genealogist eventually finds themselves with too many paper documents. About 10 years ago, I realized that I had over 25 linear feet of documents and no way to find everything. To bring order to this chaos, I digitized most of my research, assigned a number to every record, and used my website to index these records.
I asked three DGS members how they manage their documents. Here’s what they said.
The participants
- Susan Rainwater is the current DGS Director of Finance and a member of the Website and Newsletter subcommittees. She uses 75% digital documents, with about 25% paper still waiting to be digitized.
- Tony Hanson is the current DGS Director of Information Technology, and a former DGS President. He uses 99% digital documents and 1% paper.
- Todd DeDecker is the current DGS Website Administrator, and a former DGS President. He uses 50% digital documents and 50% paper.
- Caroline Simpson is a past DGS Secretary. She uses 50% digital documents and 50% paper.
Question 1
Rainwater: How much of your research is in paper documents?
Hanson: Virtually none.
DeDecker: Much of my paper research is based on documents received from family and represents half of my documents. Currently I store these in large archival boxes grouped by family. Within each box is a loose attempt at chronological order by individual. This system allows me to have a place to hold documents and artifacts but needs a full digitization effort and better overall organization.
Simpson: About 50% paper and 50% digital documents. In the beginning, during the first three generations, I was able to retrieve original paper documents. But after the great-grandparents, it became a digital world. Occasionally, with the help of original land records and probate records from the county courthouse and the Texas State Archive, I can get scanned copies of original records.
Question 2
Rainwater: How much of your research is in digital documents?
Hanson: Virtually all of my research is done by accessing documents that have been digitized.
My father’s parents emigrated from Norway, and Norway has done a fabulous job of digitizing records of interest to family history researchers.
My mother’s grandparents all emigrated from Germany. The records from Germany have been provided by German researchers I have hired to go to the archives to search on my behalf. Everything they have found has been provided to me in digital format.
DeDecker: About half of my document collection is digital. Some comes from online resources and includes documents, books, and newspaper articles. One challenge I have is when I find something using Ancestry or another site, they provide a place to store that document. Unless I reference it somewhere else (research log, index) it can be forgotten until I stumble back onto it.
Additionally, I use my phone when I need to collect information quickly. The simplest method is to just take photos of the items (books, documents, maps). I have experimented with using a scanning app (both 3rd party and native apps) to build a PDF for a related set of items. Both require robust (and nearly immediate) organization into the broader system or they can become orphaned. My challenge is that I am fairly adept at capturing but a bit too lax with the organization component.
Simpson: The further back I go in time, the more digital documents I seem to have. I have a color coding system for each side of my family. I have red notebooks for my father’s side, white for mom’s side. On the computer there are documents and pictures scanned into electronic file folders.
Question 3
Rainwater: When you need one of your documents, how do you find it? What sort of filing, organization, indexing, or keyword system do you use?
Hanson: I have a three-pronged approach. 1) I have created a file directory structure that organizes images by category. 2) I also utilize a descriptive file naming convention that provides high-level information about the file contents, and 3) I associate metadata* with most files (I am not as rigorous with this as I would like to be).
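As a rough illustration of how those three pieces could fit together, the Python sketch below builds a category folder, a descriptive file name, and a small metadata record for one document. The folder layout, the surname_given_year_type naming pattern, the metadata fields, and the sample person are all assumptions made for this example; they are not Hanson's actual scheme.

```python
from pathlib import PurePosixPath


def slug(text: str) -> str:
    """Lowercase a label and join its alphanumeric runs with hyphens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else "-" for ch in text)
    return "-".join(part for part in cleaned.split("-") if part)


def build_record(category, surname, given, year, doc_type, ext="tif"):
    # 1) Directory structure: one folder per category (hypothetical layout).
    folder = PurePosixPath("genealogy") / slug(category)
    # 2) Descriptive file name: surname_given_year_type (hypothetical pattern).
    name = f"{slug(surname)}_{slug(given)}_{year}_{slug(doc_type)}.{ext}"
    # 3) Metadata: a small record that could be stored alongside the file.
    metadata = {
        "person": f"{given} {surname}",
        "year": year,
        "type": doc_type,
        "category": category,
    }
    return str(folder / name), metadata


# Placeholder person and document, for illustration only.
path, meta = build_record("Census Records", "Doe", "Jane", 1900, "US Census")
print(path)
print(meta)
```

With a scheme along these lines, the folder narrows a search to a category, the file name can be matched by any file search tool, and the metadata record holds the details that do not fit in a name.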
DeDecker: When looking for a paper document, I find the family archival box and search through its contents. This can be a fun exercise but not at all efficient. On the digital front, I have a structure of folders based on family and individual. This is mirrored on the OS file system and my camera roll. Both are backed up to multiple “clouds”.
Beyond the folder structure, the file name becomes the main search criterion. I have not yet gone deeper with this organization. I have considered adding years or decades to the folder structure as well as maintaining an external index to the documents. I have not convinced myself that either will help significantly at this point in my research. That may change as I get deeper into the process. For photos I have collected, I always have the geolocation enabled so I know where I took the photo. On phones, there is no file naming mechanism, so I try to make use of the caption feature and include relevant details about the document and who I got it from. I consider these systems good enough for now, but recognize that spending some time improving my system will pay off in the future.
Simpson: My two main indexes are my research logs and a family timeline. Most of the time I can remember a location in my head. After my great-grandparents, I know that it’s in the family file subfolder for that individual. My keywords are color codes.
Question 4
Rainwater: What’s your approach to documents that may have privacy issues – for example, a personal letter where the author is still living?
Hanson: I respect the privacy wishes of the author/creator when it comes to publishing such information.
DeDecker: I have a few items in my collection that I am not yet ready to share broadly. These items (documents or journals) are kept in my storage area. This includes some from deceased family that remain unpublished as well. As I describe this process, I realize that I keep this information in my head. I need to consider a simple classification system so I know what I want locked down and why. Possibly consider a rough time frame for release as well.
Simpson: For all of the letters that I can source, the people have passed on. The only document that I ran across was a death certificate for someone who committed suicide. Due to the graphic nature of this event I can see why the family wanted to keep it a secret. I will respect their wishes.
Question 5
Rainwater: If you have digitized your documents, describe your process.
Hanson: I use a flat-bed scanner if available and suitable. For larger documents I utilize a photo scanner such as the i2S CopiBook Scanner available at the Dallas Public Library. Otherwise I use a digital camera.
I scan at a resolution suitable for the type of original image. I use the University of North Texas “Scanning Standards by Type of Material” webpage as a reference.
I usually save images as TIFF files and make lower-quality (and smaller) JPG or PNG derivative images as required. I implement file naming, storage and metadata as described above.
DeDecker: In the past I have scanned documents using a flatbed scanner. With the advances in smartphones, all my digitization efforts have moved to that platform. I consider photo collections different and simply use a bulk photo scanner. The software allows a folder organization along with a filename that will increment with each photo scanned.
Currently I use two methods to capture documents. The first is simply using my phone’s camera. This will include the location and date/time, as well as other information about the document or where/why I captured it. From there, I move the photo or groups of photos to an album in the camera roll for that specific family. Usually. I have also used scanning apps to scan a series of documents into a single PDF. This provides an immediate level of organization but little in the way of additional context about the documents. These are stored on the phone’s file system, allowing me to choose a file name to help establish context. The file is saved in a cloud folder, making it immediately available to any other device.
Simpson: I have started the process of scanning pictures, family Bibles and original documents.
I am taking the time to label the pictures, write an index number on the back, enter the number in an Excel spreadsheet, and sort by family. I began using the scanner at the community college, then a small scanner at home. Recently, I have taken a few items to the Dallas Public Library scanners. I back everything up to the cloud.
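For readers who would rather build such an index with a script than by hand, here is a minimal Python sketch of the same idea: each item gets an index number, and the list is sorted by family. The column names and the sample rows are hypothetical placeholders, not entries from Simpson's spreadsheet. The output is plain CSV text, which Excel and other spreadsheet programs can open.

```python
import csv
import io

# Hypothetical rows: (index number written on the back, family, description).
rows = [
    (3, "Roe", "Family Bible title page"),
    (1, "Doe", "Wedding portrait"),
    (2, "Roe", "Land deed"),
]


def build_index(rows):
    """Return CSV text sorted by family name, then by index number."""
    ordered = sorted(rows, key=lambda r: (r[1].lower(), r[0]))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["index_number", "family", "description"])
    writer.writerows(ordered)
    return buffer.getvalue()


index_csv = build_index(rows)
print(index_csv)
```

Sorting by family and then by number keeps each family's items together while preserving the order in which they were numbered.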
*Note: Wikipedia describes metadata as “data about data,” breaking this down into six categories: descriptive, structural, administrative, reference, statistical, and legal. For the example of a photo, this might include where the photo was taken, who took it, what camera was used, the date it was taken, and copyright license information.
Each participant was given the five questions and responded with five answers. The conversation has been lightly edited for length.