Wednesday, September 28, 2016

Unconnected People on the FamilySearch Family Tree

One way to visualize the Family Tree is to think of it as a huge forest of pedigrees, all with interconnected branches, surrounded by a cloud of unconnected individuals and floating bits of other pedigrees. Living people form the individual trees in the forest. Their ancestors are part of the forest canopy, which is all intertwined and ultimately related, but the people in the cloud of unconnected individuals are either duplicates of people already in the forest canopy or people waiting to be connected to the forest. The most important thing to understand about this forest with its unending canopy is that there is one absolute place, or node, for each person who has ever lived or ever will live on the earth. One single place, no matter how complex the connections to other nodes.

Practically speaking, when you open the Family Tree for the first time, you find yourself either as an unconnected, newly sprouted tree or already firmly anchored to the canopy as part of a more developed tree. If you explore all the branches of what you can see of the forest, you will find them virtually unending. We arbitrarily define some of the branches as ancestors and relatives, but in reality we are all related by blood, by marriage, or both, and ultimately we can trace our trees back to common ancestors in the canopy. I like to think of the forest and the canopy as fractals, where the individual branches just keep getting more and more complicated as they branch out into the canopy.

By definition, any time more than one entity in the Family Tree, whether connected or unconnected, belongs to a single node, the extra entities are duplicates. So putting a wrong person into any node actually puts that person, usually as a duplicate, into the cloud. It only appears that the person is in a node.

From the perspective of the user, it initially appears that the Family Tree is "their" family. This is especially true if the user's family has not done a lot of family history research. Many of the family lines will simply end with no connection to the forest canopy. All is not well in the forest, however. The duplicates in the cloud are often hard to identify or even to find. Sometimes the floating pedigrees are extensive, and they interfere with the connected pedigrees that form the canopy.

OK, that's enough metaphorical analysis for one post. However, anyone working with the Family Tree has to take into account this cloud of unconnected pedigrees and individuals floating around out there waiting to be either merged as duplicates or attached to their unique node in the forest. Where did the unconnected cloud come from? Here are some of the sources:

  • Multiple submissions of the same person with the same or different detail information
  • The extraction programs that added unconnected people to the cloud
  • Private extraction programs where the users add everyone from a small community because of a lack of sufficient documentation to connect the individuals 
  • People with insufficient information to be found and attached to their node on the Family Tree
  • Some people who are in fact made up and never really existed
There are probably some other, even stranger categories as well. One ongoing challenge in adding people to the Family Tree is trying to determine whether they are already out there in the cloud. We call this finding duplicates. I have found that as I add more information to the people in my part of the Family Tree, I find more duplicates. However, the supply of duplicates is not infinite, and eventually the number found either slows down or stops until a user, unaware that the person is already in the canopy, creates another duplicate.
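
To make the idea of finding duplicates concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the record fields ("id", "name", "birth_year") and the matching rule (same normalized name and same birth year) are hypothetical and are not how the Family Tree actually searches for duplicates. It also shows why adding more information surfaces more duplicates: a record with no birth year could never be grouped with its twin.

```python
from collections import defaultdict


def candidate_duplicates(people):
    """Group records that share a normalized name and a birth year.

    `people` is a list of dicts with the hypothetical keys "id", "name",
    and "birth_year". This is an assumed layout for illustration only,
    not the FamilySearch data model.
    """
    groups = defaultdict(list)
    for person in people:
        # Normalize case and spacing so "john  smith" matches "John Smith".
        name_key = " ".join(person["name"].lower().split())
        groups[(name_key, person["birth_year"])].append(person["id"])
    # Only keys claimed by more than one record are possible duplicates.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Made-up sample records, not real people from the Family Tree.
records = [
    {"id": "A1", "name": "John Smith", "birth_year": 1850},
    {"id": "B2", "name": "john  smith", "birth_year": 1850},
    {"id": "C3", "name": "Mary Jones", "birth_year": 1852},
]
print(candidate_duplicates(records))
```

A match like this only nominates candidates. Someone still has to compare the records and decide whether they really belong to the same node before merging.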

From the perspective of a living person who comes into the Family Tree forest, it appears that there are an overwhelming number of connections. But as you work in the forest, you begin to see that many branches and nodes are empty and waiting for discovery or attachment. For all its vast number of nodes and branches, the Family Tree is still alive and growing. There are still uncountable numbers of people who can be incorporated but have yet to be discovered through research.

But what about the unconnected cloud? Right now, the search for duplicates is the only thing that gives us any idea of the number of people in the cloud and the complexity of their relationships to the existing forest canopy. A program may already exist that analyzes the Family Tree and reports on the number of unconnected individuals and pedigrees, but such information has not been shared with ordinary users. Even with an idea of the size of the unconnected cloud, the issue of connecting all those people may be unresolvable. In some cases, as I mentioned, when individuals are added without enough information to attach them to the Family Tree, there may not be a resolution absent a dramatic increase in the available source documents. But for those members of the cloud who were created by extraction programs, diligent searching for duplicates will reveal many of their connections.
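
For readers who wonder what such a report might involve, here is a rough sketch in Python. I have no knowledge of any actual FamilySearch program, so everything here is an assumption: the function name, the input format (a list of person IDs plus a list of parent-child or spouse links), and the rule that the largest connected group is "the canopy" while every smaller group is part of "the cloud."

```python
from collections import defaultdict


def split_forest_and_cloud(person_ids, relationships):
    """Return (canopy, cloud_groups) using connected components.

    `relationships` is a list of (id_a, id_b) pairs for parent-child or
    spouse links. Assumption for illustration: the largest connected
    group is the canopy and every other group floats in the cloud.
    """
    neighbors = defaultdict(set)
    for a, b in relationships:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in person_ids:
        if start in seen:
            continue
        # Walk outward from this person to collect everyone reachable.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(component)

    # Largest group first; the rest are floating pedigrees and loners.
    components.sort(key=len, reverse=True)
    canopy = components[0] if components else set()
    return canopy, components[1:]


# Made-up IDs: P1-P3 are linked, P4-P5 form a floating pedigree, P6 is alone.
people = ["P1", "P2", "P3", "P4", "P5", "P6"]
links = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
canopy, cloud = split_forest_and_cloud(people, links)
print(len(canopy), len(cloud), sum(len(group) for group in cloud))
```

Counting the groups is the easy part. The sketch says nothing about which cloud members are duplicates and which are simply waiting for a source document that connects them.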

I personally think the cloud is growing just about as rapidly as the trees. As the forest and the canopy get their nodes identified, there is always an exponentially larger number of nodes available. This is a project that has just barely begun.

1 comment:

  1. Another way is with a merge, when a person is left on the right side and not dealt with. It could be because the spouse is named "unknown" or has a first name only, but I always bring all of the people over and then do the merge; otherwise the people are just left hanging out there alone.