2006-12-23

The XML Files: XML Data Migration Case Study: GEDCOM


Download the code for this article: XMLFiles0405.exe (143KB)



XML's ubiquity and continually improving tool support have created a magnetism that attracts organizations everywhere. As organizations move to XML, they must also provide a coherent data migration strategy that allows their users to bring old files forward. This is a nontrivial problem that typically requires tedious code to implement the transformation process. The System.Xml namespace in the Microsoft® .NET Framework, however, can greatly simplify these data migration challenges through its extensible APIs and support for XSLT.

An example of data migration is what's currently happening around genealogy data formats. Genealogists have long relied on the Genealogical Data Communications (GEDCOM) 5.5 format for sharing genealogical information. GEDCOM 5.5 is text based but not XML based. A beta version of GEDCOM 6.0 is available and is completely based on XML. But what about the gigabytes of genealogical information that can still be found in GEDCOM 5.5 format? This presents an interesting data migration challenge that really should not be ignored.

In this column I'll walk you through solving this data migration problem using System.Xml. This process can serve as a blueprint for other data migration problems you may face.


GEDCOM 101

GEDCOM was developed to facilitate exchanging genealogical data across different genealogy programs and systems. A common format like GEDCOM allows users to share their work with others regardless of the program they're using.

Version 5.5 is the most widely used version of the various GEDCOM specifications (see The GEDCOM Standard Release 5.5). GEDCOM 5.5 relies on a simple text-based grammar that leverages line delimiters and level numbers to structure family tree information. Figure 1 provides a sample GEDCOM 5.5 file.

Each GEDCOM line contains a level number, a tag, and an optional value. Multiple lines constitute a GEDCOM record. A level of 0 marks the beginning of a new GEDCOM record. Every line that follows is part of the record until you reach another line with a level of 0. The tag name conveys meaning about the information on the line as defined in the specification.

In the example shown in Figure 1, HEAD is the first record and it contains six children (SOUR, DATE, GEDC, CHAR, SUBM, SUBN). SOUR contains two children (VERS and NAME) while DATE contains one child (TIME). The increasing level numbers indicate parent-to-child relationships. A line may also contain a unique identifier, as shown in the following snippet:

0 @SUB01@ SUBM
1 NAME Aaron Skonnard
1 ADDR 456 Main



XSLT for GEDCOM 6.0

The latest GEDCOM specification, version 6.0, defines a full-fledged XML format for GEDCOM information. The specification even provides a Document Type Definition (DTD) that defines the elements and attributes that make up the complete GEDCOM 6.0 vocabulary.

The GEDCOM 6.0 format differs considerably from the one my GedcomReader simulates. To migrate to GEDCOM 6.0, you must either modify the GedcomReader implementation to simulate the new format or write an XSLT transformation that performs the conversion in a subsequent step.

Because GEDCOM 6.0 is much more complex than the simple mapping GedcomReader simulates, producing it directly in the GedcomReader code would be difficult and error prone. An XSLT transformation makes this step far more tractable.

Figure 7 shows an XSLT transformation, included in the sample project, that generates a GEDCOM 6.0 file from the intermediate XML format shown in Figure 3. The transformation covers the most common GEDCOM 6.0 use cases, and it produces files that pass validation against the GEDCOM 6.0 DTD.

You can use this XSLT by taking advantage of the System.Xml.Xsl.XslTransform class, as shown here:

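// Load the GEDCOM 5.5 file into a DOM through the custom reader
GedcomReader gr = new GedcomReader(gedcomFileName);
XmlDocument doc = new XmlDocument();
doc.Load(gr);
gr.Close(); // done using GedcomReader

// Transform the intermediate XML into GEDCOM 6.0
XslTransform tx = new XslTransform();
tx.Load("gedcom6.xsl");
FileStream fs = new FileStream("skonnord6.xml", FileMode.Create);
tx.Transform(doc, null, fs, null);
fs.Close(); // flush and release the output file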

The output file, skonnord6.xml, will now be GEDCOM 6.0-compliant. At this point it's also possible to write other XSLT transformations that change the intermediate XML format into another format of your choosing. For example, you could write an XSLT transformation that produces human-readable HTML pages for viewing and navigating the GEDCOM family tree information.

As you can see, it didn't take much code to implement a complete GEDCOM migration path. The ability to programmatically move between GEDCOM 5.5 and GEDCOM 6.0 greatly simplifies the migration scenarios that are involved in building a complete genealogy system around the new GEDCOM 6.0 data model.


Where Are We?

I've walked through a real-world data migration scenario using the System.Xml classes in .NET. I started with the GEDCOM 5.5 format and moved to an intermediate XML format that can be processed in a variety of ways. This is made possible by a custom XmlReader implementation called GedcomReader.

The intermediate XML format can also be transformed into a variety of other formats. I was able to migrate to GEDCOM 6.0 by using an XSLT transformation (see Figure 8).

Figure 8 Migrating to GEDCOM 6.0

The extensibility points provided by System.Xml facilitate dealing with a variety of data migration scenarios like this one. If you have old data formats lying around that would benefit from migrating to XML, follow the approach discussed in this column to implement your own migration path.


Send your questions and comments for Aaron to xmlfiles@microsoft.com.

Aaron Skonnard teaches at Northface University in Salt Lake City. Aaron coauthored Essential XML Quick Reference (Addison-Wesley, 2001) and Essential XML (Addison-Wesley, 2000), and frequently speaks at conferences. Reach him at http://www.skonnard.com.

From the May 2004 issue of MSDN Magazine.


2 CONT Salt Lake City, Utah 84150

In this example, the SUBM record has a unique identifier of SUB01. This ID must be unique within the scope of the document. Identifiers make it possible to establish links between records. For example, the HEAD record references this SUBM record on the following line:

1 SUBM @SUB01@

These are the basics of the GEDCOM 5.5 grammar. GEDCOM 5.5 is capable of representing sophisticated tree structures through its use of level numbers, identifiers, and cross-referencing capabilities. I don't have enough space in this column to cover GEDCOM semantics in more detail. Suffice it to say that GEDCOM 5.5 makes it possible to express a wide range of genealogy-related information.
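
The line grammar described above (level number, optional identifier, tag, optional value or cross-reference) can be sketched in a few lines. This is an illustrative Python sketch, not the column's C# code; the names `GedcomLine`, `parse_line`, and `split_records` are hypothetical, and the regular expression is a simplification that does not attempt to cover the full GEDCOM 5.5 specification.

```python
import re
from typing import List, NamedTuple, Optional

# Simplified line shape: LEVEL [@ID@] TAG [VALUE]
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:@([^@]+)@\s+)?(\S+)(?: (.*))?$")
# A value of the form @ID@ is a cross-reference to another record.
POINTER_RE = re.compile(r"^@([^@]+)@$")


class GedcomLine(NamedTuple):
    level: int
    xref_id: Optional[str]   # identifier declared on this line, without the @ signs
    tag: str
    value: Optional[str]     # plain text value, if any
    pointer: Optional[str]   # identifier this line refers to, if any


def parse_line(text: str) -> GedcomLine:
    """Split one GEDCOM 5.5 line into its parts."""
    match = LINE_RE.match(text.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {text!r}")
    level, xref_id, tag, value = match.groups()
    pointer = None
    if value is not None:
        ref = POINTER_RE.match(value)
        if ref:
            pointer, value = ref.group(1), None
    return GedcomLine(int(level), xref_id, tag, value, pointer)


def split_records(lines: List[str]) -> List[List[GedcomLine]]:
    """Group lines into records; every level-0 line starts a new record."""
    records: List[List[GedcomLine]] = []
    for text in lines:
        if not text.strip():
            continue  # ignore blank lines
        line = parse_line(text)
        if line.level == 0:
            records.append([line])
        elif records:
            records[-1].append(line)
        else:
            raise ValueError("the first line must have level 0")
    return records
```

For example, `parse_line("1 SUBM @SUB01@")` yields a line whose `pointer` is `SUB01`, while `parse_line("0 @SUB01@ SUBM")` yields a line whose `xref_id` is `SUB01`.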

Since GEDCOM was primarily designed to represent tree structures (family trees) in an interoperable manner, moving GEDCOM to an XML format seems like a perfect fit.


Mapping GEDCOM to XML

The GEDCOM 5.5 grammar maps nicely to XML. One way to define a simple XML mapping is to convert each GEDCOM line into an XML element with the same name. The line's optional data, if any, can be placed in an attribute named "value". Identifiers can be placed in an attribute named "id" and references to other records can be placed in an attribute named "idref". Figure 2 may help you visualize the mapping. Applying this simple mapping to the GEDCOM 5.5 file shown in Figure 1 produces the XML document shown in Figure 3.
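
As a rough illustration of this mapping, the following Python sketch turns GEDCOM 5.5 lines into an element tree, using "value", "id", and "idref" attributes as described. It is not the column's GedcomReader (which is a C# XmlReader implementation); the function name `gedcom_to_xml` and the root element name `GED` are assumptions made for the example, and the line parsing is deliberately simplified.

```python
import re
import xml.etree.ElementTree as ET

# Simplified line shape: LEVEL [@ID@] TAG [VALUE]
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:@([^@]+)@\s+)?(\S+)(?: (.*))?$")
POINTER_RE = re.compile(r"^@([^@]+)@$")


def gedcom_to_xml(lines, root_tag="GED"):
    """Build an XML tree in which each GEDCOM line becomes one element."""
    root = ET.Element(root_tag)  # hypothetical wrapper element
    stack = [root]               # stack[n] is the parent for lines at level n
    for raw in lines:
        if not raw.strip():
            continue
        match = LINE_RE.match(raw.rstrip("\r\n"))
        if match is None:
            raise ValueError(f"not a GEDCOM line: {raw!r}")
        level = int(match.group(1))
        xref_id, tag, value = match.group(2), match.group(3), match.group(4)
        if level > len(stack) - 1:
            raise ValueError(f"level jumps too far: {raw!r}")
        del stack[level + 1:]    # close elements deeper than this level
        elem = ET.SubElement(stack[level], tag)
        if xref_id:
            elem.set("id", xref_id)
        if value is not None:
            ref = POINTER_RE.match(value)
            if ref:
                elem.set("idref", ref.group(1))  # reference to another record
            else:
                elem.set("value", value)
        stack.append(elem)
    return root


sample = [
    "0 HEAD",
    "1 SUBM @SUB01@",
    "0 @SUB01@ SUBM",
    "1 NAME Aaron Skonnard",
    "1 ADDR 456 Main",
    "2 CONT Salt Lake City, Utah 84150",
]
root = gedcom_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

The stack keeps one open element per level, so a line at level n is always attached to the most recent line at level n - 1, which is how the increasing level numbers become parent-to-child nesting.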

Figure 2 Mapping GEDCOM 5.5 to GEDCOM XML

The XML version conveys the same information, but now it can be processed by a much wider range of tools and technologies. For example, you could process the XML file with your favorite XML API (such as DOM, SAX, XmlTextReader, or XPathNavigator), a query language like XPath or XQuery, or a transformation language like XSLT. Ultimately, once you have GEDCOM data in XML format, you can do just about anything with it.
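
To give a feel for what this enables, here is a small Python sketch that follows an "idref" attribute to the record it points to, using the limited XPath support in the standard library. The document string is a hand-written assumption based on the mapping described earlier (including the hypothetical root element `GED`); it is not the article's Figure 3.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed intermediate format
XML = """<GED>
  <HEAD><SUBM idref="SUB01"/></HEAD>
  <SUBM id="SUB01"><NAME value="Aaron Skonnard"/></SUBM>
</GED>"""


def resolve(root, idref):
    """Return the element whose id attribute equals idref, or None."""
    return root.find(f".//*[@id='{idref}']")


root = ET.fromstring(XML)
ref = root.find("HEAD/SUBM").get("idref")   # the HEAD record's submitter reference
target = resolve(root, ref)                  # the SUBM record it points to
name = target.find("NAME").get("value")
print(name)
```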


2006-11-10



Early in my life I got tired of knowledge…


In the process, I’ve discovered things of real value:



To accomplish things = to do things in your own way


To understand things = to appreciate things not done your way,

to accept things beyond you



These two things do not preclude knowledge, but rather require it. They also make knowledge complete by laying bare its purpose.



In more classical terms, knowledge acquires value only if it has aided in the creation of things new and wonderful, or it brings you into communion with a larger whole you have not realized before.





In this way, I am an intellectual but not an intellectualist. I pursue knowledge not as an end unto itself. In the same way, I promote myself and my ideas not for my own glory. Everything I do, everything I know, from now on, serves a greater purpose…



2006-11-04

the Forests of Rivendell

Wallpaper Forest







2006-02-27

when Darkness prevails..

"..I have lived a very long time. My master also lived many centuries, and his master before him.




"My master's master told me about the Hundred-Year Darkness... and the Great Hyperspace War.


He taught me how the agents of the dark side do their evil works. The dark siders act in complete secrecy, until the leaders of society have fallen to their evil seductions. Then they strike with great suddenness, bringing down civilizations it took aeons to build."





Master Jedi Shayoto, 3,962 BrS



(Tales of the Jedi: Dark Lords of the Sith)




I am obviously quoting this in reference to the success so far of the Great Duendeng Itim.

2006-01-07

reason to Dream


"without a Dream there is no reason to work
...without Work there is no reason to dream."

Copyright © 1987 Lee Ayers











speaking of dreams, i have been dreaming

i have so many dreams these days, and i remember most of them,

unlike some time before

before i worked, i never dreamed, really, ..or at least i don't remember dreams when i wake up, or they are not vivid

here goes:

in my first dream i was supposedly deaf, which is not far from the truth in real life,

but this is what it says:

Deaf
 

To dream that you are deaf, indicates that you are feeling secluded from the world.
You may be closing yourself off from new experiences or shutting yourself out.
If someone else in your dream is deaf, this suggests that someone close to you
is withdrawn and not sharing their emotions.









which is also not far from the truth in real life












in my second dream, we were standing near a fireplace: mama, my sister janine, and i

it was a stone house, and there seemed to be a forest outside, clearly visible from inside the house because of the glass panels

a visitor came in and brought the news that my father was dead!

my heart dropped, it was crushed!

utterly

then i woke up,

and wondered,

i probably would not be that sad if my father died; if it were my grandmother, maybe it would be that intense

so i looked this up, too?

yes:

 Death

Dreams about death are not necessarily bad omens, but they usually represent anxious or angry feelings. To dream of your own death is actually positive - it means renewal and letting go of an old stage of life. This is also a common dream when you are getting over an illness - and it's a good sign that you are getting better. However, if you dream that you are dying slowly, you need to drastically change your routine and reenergize your life. To dream about the death of a loved one, suggests that you are lacking a certain quality that the loved one represents. Ask yourself what makes this person special or what do you like about him. It is that very quality that you are lacking in your own relationship or circumstances. To dream of a death frequently signifies news of a birth. To be aware of a dead person you cannot identify foretells an inheritance which may not be personal, but could be indirectly beneficial to you.




so, i realized it was the positive values my father represents that saddened me, not his physical death,

but the loss of the values in life that for me he represents: Loyalty, Perseverance, Hard Work,

maybe there's one word that conveys all these values, but i'm at a loss


and i also realized that things we don't deal with in "real life" always turn up somewhere, in weird circumstances, coincidences, and in dreams



















that's all, just sharing



...


2006-01-01

Numerology

Another one of those stupid things we really enjoy doing:





How is Numerology Portrait calculated?

The ancient science of numerology offers insight into the personality by assigning numeric values to names and birth dates, calculating numerological values and then interpreting the results.

To calculate the values used in numerology, all digits of a number are first added together. If the outcome is a number with more than one digit, the resulting digits are added together again until they are reduced to a single digit. For example, the number 27 is reduced by adding 2 + 7 to get 9. The number 1974 is reduced by adding 1 + 9 + 7 + 4 to get 21; then 21 is further reduced by adding 2 + 1 to get 3. All numbers are reduced to single digits between 1 and 9 except the special master number 11, which is not reduced in numerological calculations.

Letters are first converted into numbers, which are then added together until they become a single digit. The letter A = 1, B = 2, C = 3, etc.; M = 13, which becomes 1 + 3 = 4. For example, the name Amy is equal to 1 + 4 + 7 = 12. 12 is then further reduced by adding 1 + 2 to get 3.
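
The reduction rules described above are simple enough to sketch in Python. The function names here are hypothetical, and one detail is an assumption: letters are always reduced to 1 through 9 (so K = 11 counts as 2, matching the column layout of the letter table further down), while the master number 11 is only preserved when reducing totals.

```python
def reduce_number(n: int) -> int:
    """Add the digits repeatedly until one digit remains; 11 is left as is."""
    while n > 9 and n != 11:
        n = sum(int(digit) for digit in str(n))
    return n


def letter_value(ch: str) -> int:
    """A=1, B=2, ... Z=26, reduced to a single digit (M = 13 -> 4)."""
    position = ord(ch.upper()) - ord("A") + 1
    # Equivalent to repeatedly adding the digits of position (assumption:
    # no master-number exception for individual letters).
    return (position - 1) % 9 + 1


def name_number(name: str) -> int:
    """Sum the letter values of a name, then reduce the total."""
    total = sum(letter_value(ch) for ch in name if ch.isascii() and ch.isalpha())
    return reduce_number(total)


print(reduce_number(27))     # 2 + 7
print(reduce_number(1974))   # 1 + 9 + 7 + 4 = 21, then 2 + 1
print(name_number("Amy"))    # 1 + 4 + 7 = 12, then 1 + 2
```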

Your Numerology Portrait applies the results of several calculations to provide insight into the most important aspects of your personality. If you have a Y in your name, please consult the table for further information on how we treat that letter's dual nature as consonant and vowel.

Your soul number reveals your inner, private self, the underlying motivations that influence your decisions and actions, your subconscious desires and your most deeply ingrained attitudes. (It is determined by adding the values for the vowels in your full birth name.)

Your Numerology Portrait is based on the following calculations:

Is Y considered a consonant or a vowel?

- If your first name begins with Y, then Y will appear in the "first vowel" section of your portrait. All other sections of your portrait will treat that Y as a consonant.

- If a Y appears in any other position in your first, middle or last name, it will be considered a consonant if preceded by a vowel and vice versa.

Total for each letter:




1    2    3    4    5    6    7    8    9
A=3  B=0  C=2  D=1  E=5  F=0  G=1  H=0  I=1
J=1  K=0  L=2  M=1  N=1  O=1  P=0  Q=0  R=3
S=0  T=0  U=0  V=0  W=1  X=0  Y=1  Z=1


Consonant Total: 1  (73)
Vowel Total: 5  (50)
Grand Total: 6  (123)
Date Total: 1  (37)
Missing Number(s) are: 2
First letter is L
First vowel is A



Your Soul Number is FIVE.

A deep inner restlessness and discontent with the status quo makes you seek out adventure, excitement, and the unconventional. You thrive on new ideas, change, travel, experimenting with new ways of doing things. Predictability and routine make you feel lifeless and unhappy so you must find a lifestyle that is varied enough to be mentally stimulating and challenging. Independent, freedom-loving, and easily bored, you have trouble making commitments and finishing projects. You often "move on" prematurely, whether in a personal relationship or in your work. You need to develop discipline and perseverance when you have an important goal.

You have many talents and need many outlets and avenues for their expression, but try to finish one thing before attempting the next.












Your Personality Number: The image you present to others and your power of attraction





Your Destiny Number: Your future aims and life purpose





Your Career Number: Your talents and gifts


















license to use my original content

is hereby granted as long as credit is given to the original author, this license should neither be understood as a claim that any image or content found here came from me nor should it be construed that I convey any license to use such content that did not come from me; the use of third-party images and the responsibility to assess the fair use of such or to procure a separate license to use them is the sole responsibility of the licensee, any clarification should be directed to the original author, see above
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.