Genealogy from the perspective of a member of The Church of Jesus Christ of Latter-day Saints (Mormon, LDS)

Friday, December 4, 2015

Is there a problem with adding multiple copies of the same document as sources?

The answer to the question posed in the title of this post is a definite no! In fact, there is no problem here at all. From time to time and more frequently lately, usually in the context of adding sources to the FamilySearch.org Family Tree, I get a question about adding the same or very similar record or document as a source usually from Ancestry.com. This situation arises because either the record hints from FamilySearch.org or the record hints from Ancestry.com suggest a record or a document that appears to be a duplicate of one that is already listed as a "source" for a particular individual. There is a segment of the genealogical community that is disturbed by this "duplication."

My response is so what? Why do you care? Now I will explain why I respond that way.

In our consumer society, we are deluged with duplicate signals. Even in this electronically saturated world we live in today, I get multiple paper copies of unsolicited advertisements delivered to my door by regular mail or simply dumped in my driveway. We have learned to have a convenient garbage container available where we process all of the mail including duplicate items we do not need right into the container, most without opening.

Is this the kind of response to adding assumed duplicate entries we are talking about? Hmm. I have no real reason (unless I were a compulsive hoarder) to keep junk mail. The items I throw away have no value to me either in the present or in the future. I will probably never wish that I had not thrown away that extra copy of a supermarket ad in my whole life. Now, if my wife needed the ad and it was the only copy, that would be another story. But now we are talking about electronic blips that have no real physical substance. I might ask those who are so worried about an extra copy of a record in the Family Tree, when was the last time you went through your email list and deleted all the unwanted email? How many of the pop-up ads on your browser do you ignore every day? Do you stop to read all the billboards as you drive down the freeway?

First of all, the idea with the Family Tree program is to add all the "sources." There is evidently a really serious confusion between a record, a document and a source going on here. This confusion is not contained solely within the user community; every one of the online programs uses the terms in different ways. They also add in the terms "collections" and "names" to make things even more confusing. This is especially true when they are trying to impress users with numbers.

What is a record?
What is a document?
What is a name?
What is a collection?
What is a source?

Not only are there no fixed definitions for these terms, some of them are used interchangeably. I suggest that the words "record" and "document" are very often used to refer to the same thing. There may be some fine distinctions, but what we are trying to say is that there is a physical item that contains information. When I filed a brief with the court, regardless of whether it contained one page or a hundred pages, it was one document or record. We commonly use the term collection to refer to a number of closely related documents. But you will find that both FamilySearch.org and Ancestry.com have collections that contain only a very few documents and, in many cases with Ancestry.com, only one document.

The confusion comes from using the term "source" as meaning the same thing as either the word "record" or "document." In my mind, a source is neither a record nor a document. A source is a statement identifying where I found a document or record. The way I write down the information where I obtained or reviewed a document or record is a citation to the source.

So, if I look at a U.S. Census record (whatever) and find my ancestor's family listed there, then the U.S. Census becomes my "source." But wait, that is not all. I also need to record where I found that U.S. Census record. Now this is where we have fallen down almost completely as genealogists. Sometimes we record the entity that created the document, but we leave no indication about where we actually found the record or document. How many citations, no matter how carefully crafted, tell me where I can go to find a copy of the item cited? Practically none. Genealogists spend an inordinate amount of time crafting and worrying about the formalities of citations while at the same time almost uniformly neglecting to tell us where they found the document. It is great that you tell me you got the item out of the U.S. Census but where did you find the U.S. Census? For example, the most commonly used formats for book citations do not indicate the library or other repository where the book was found. It is left entirely up to the reader to determine where to find the book.

Online research has evolved to the point where proper source citations contain not only a link to the place where you found the record or document but also a notation about when you viewed the document online. This is important because it helps us judge whether the document we viewed is likely to still be online.

Let's look at a citation for a "source" from the Family Tree:


If you look carefully you will see the following:

  • The title of the record
  • A link to the location of the record online
  • A reason for adding the record, if I put one in
  • The traditional citation form
  • The date I added or viewed the record

Pretty complete. But wait, there is something going on here that is being totally ignored by those fussing about duplicate records. This "source" shows me a website where I can go to view the record. In their zeal to avoid duplicate records and documents, these individuals are ignoring the fact that we are talking about sources, not records. If we found this same U.S. Census record on more than one website, shouldn't we be recording all those sources for the record? Yes, we should. What if one or the other of these sites goes down? Wouldn't it be nice to know about all the other places where I might find the same record? This is especially true if a copy of the record is not also attached for reference. But even then, I might want to consult that other source for additional documents. So by limiting the Family Tree to one "source" for every copy, I am losing all that research.

Let's stop worrying about duplicate sources and start worrying about losing the links to available copies of the record. The more links to different online locations, the greater the chance that I will be able to see a copy of the document or record and the greater opportunity I have to look for additional records.
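For readers who think in terms of data, here is a rough sketch of the idea in Python. It is only an illustration, not how the Family Tree actually stores anything: the `Source` class, its field names, the `locations_by_record` helper and the example URLs are all made up for this post. It also assumes that two entries with the same title refer to the same record. The point is simply that one record can carry several sources, and that only an exact repeat of the same link adds nothing.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    """One 'source' entry: where a copy of a record was found (hypothetical layout)."""
    title: str        # title of the record
    url: str          # link to the online location of this copy
    citation: str     # traditional citation form
    viewed: date      # date the record was added or viewed
    reason: str = ""  # optional reason for attaching the source


def locations_by_record(sources):
    """Group sources by record title, keeping every distinct online location."""
    grouped = defaultdict(list)
    for src in sources:
        # Assumption: identical titles mean the same underlying record.
        if src.url not in grouped[src.title]:
            grouped[src.title].append(src.url)
    return dict(grouped)


# Placeholder data: the same census record found on two different sites,
# plus one exact repeat of the first link.
sources = [
    Source("1920 U.S. Census", "https://example.org/site-a/record/1",
           "placeholder citation A", date(2015, 12, 4)),
    Source("1920 U.S. Census", "https://example.com/site-b/record/9",
           "placeholder citation B", date(2015, 12, 4)),
    Source("1920 U.S. Census", "https://example.org/site-a/record/1",
           "placeholder citation A", date(2015, 12, 4)),
]

links = locations_by_record(sources)
print(links)
```

Both sites survive as separate places to look for the record, and the exact repeat of the first link is harmless either way.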

What about adding multiple references to the same source? For example, if I find the 1920 U.S. Census record for my family on FamilySearch, do I need to add several links to the same document? No, but so what if there are several links? I spend a lot of time cleaning up entries but I am trying to make sure I don't throw away valuable information at the same time. If it offends your sense of order to have duplicate sources (not duplicate records, mind you) then go for it. Like I said above, who cares?

A last note. FamilySearch has cautioned us that when there are duplicate record hints suggested, we should attach all of the hints, even the duplicates. This is because we are telling the program that the search made is correct. If you want to later detach those extra sources, that is fine, but if you ignore or mark the source as "not a match" then you are running the risk of telling the computer that the search was wrong and so you will not get any more suggested hints. For the same reason, you should be attaching all the suggested record hints in all the programs, because this indicates to the programs that the search is correct and then the program can find even more hints. Let's stop being aggravated because we have more record hints than we can process. Let's think of all the advantages that condition affords us.

A last last note. Adding sources to the Family Tree has the additional benefit of slowing down and mostly stopping unwarranted changes to the entries. Adding more and more sources is a way to discourage junk.

8 comments:

  1. "Adding more and more sources is a way to discourage junk."

    This would be the case if a person adding something to a FS-FT individual read the sources. In most cases they cannot read original data referred to by a "source" because so many of the things imported to FS-FT from other websites are just indexes or purported transcripts (with, of course, many errors). Much of this index material is copied by Ancestry from FS, and the same is true of other sites. FS-FT makes it difficult to attach a record as a source, including a record that had been indexed and copied by the aggregators. Entering the myriad aggregated index copies in FS-FT can be a lot of work, and since the actual records in many cases are unavailable on any internet site, it is a bit genealogically foolish to make all those entries just to make the FS-FT program happy.

    Replies
    1. When the FamilySearch Family Tree program is happy, you are happy. :-)

    2. An internally consistent search engine is one thing. Genealogical accuracy can be quite another kettle of fish.

  2. Thanks James, for answering my questions about multiple sources. Your answer made the problem very clear and I'll stop being so obsessive about cleaning up my sources. Thanks for doing a great service to the FH community with your blog! Pat

  3. The thing I wonder about is how many of the URLs we are putting in our sources are still going to work 50 years from now. Should we be uploading a copy of the document and adding that image to the source as well?

    Replies
    1. That is a key question. Digital preservation is a real issue.

  4. Thank you for the information. I have recently been wondering if I was "overreacting" by adding all the sources or records for the same event, but I felt that it was right to do it, so I have continued to do it. A little while ago I also realized that if I add a "link" to a source in Ancestry for a record, anyone who does not have Ancestry or another subscription-based program cannot see the "source" or "record" from that site. I started to ADD all of the PERTINENT INFORMATION from that "source or record" to my 'reason or explanation' of why I was adding that source or record. In this way any important facts or information contained in that record, whether an index or an image, is available to anyone seeing my "source" attached to Family Search/ Family Tree. With the added important information in the "reason to attach statement", I am not excluding others or myself from knowing what information is in that record or source and why it is important to attach it. In this way I am also not "losing" the information I need if I or others cannot view the link.
    Thank you for all your insights and information. I appreciate them very much. They are always very helpful.
