For some dblp authors, we store supplemental information. For example, if you visit the author page of Stephan Diehl, you will find a link to his "Home Page" directly after his name. In the huge dblp XML file, the following record contains this information:
<www key="homepages/d/StephanDiehl">
  <author>Stephan Diehl</author>
  <title>Home Page</title>
  <url>http://www.st.uni-trier.de/~diehl/</url>
</www>
These records are called person records. They were introduced to add personal information about authors to the dblp data set while keeping the XML format backward compatible, so that existing third-party software does not break. For this reason, we reused the existing www elements with the following somewhat strange convention: person records always have the key prefix "homepages/", the record-level tag is always "www", and they always contain a title element with the text content "Home Page". The author element identifies the person, and the url element contains the address of the author's home page.
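The convention above can be checked mechanically. The following is a minimal sketch, not part of dblp itself: it assumes the records are wrapped in a single root element (here called `dblp`), and the `article` record and the function names `is_person_record` and `person_records` are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Small sample; the <article> record is a made-up placeholder, not real dblp data.
SAMPLE = """<dblp>
<www key="homepages/d/StephanDiehl">
  <author>Stephan Diehl</author>
  <title>Home Page</title>
  <url>http://www.st.uni-trier.de/~diehl/</url>
</www>
<article key="journals/example/Placeholder">
  <author>Stephan Diehl</author>
  <title>Placeholder Article</title>
</article>
</dblp>"""


def is_person_record(elem):
    """Return True if elem follows the person record convention:
    tag "www", key prefix "homepages/", and title text "Home Page"."""
    title = elem.find("title")
    return (
        elem.tag == "www"
        and elem.get("key", "").startswith("homepages/")
        and title is not None
        and (title.text or "").strip() == "Home Page"
    )


def person_records(xml_text):
    """Parse xml_text and return all top-level person records."""
    root = ET.fromstring(xml_text)
    return [child for child in root if is_person_record(child)]


if __name__ == "__main__":
    for record in person_records(SAMPLE):
        print(record.get("key"), record.findtext("author"), record.findtext("url"))
```

This sketch loads the whole input into memory, which is fine for a small sample. For the full dblp XML file, a streaming approach would likely be needed instead.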
Sometimes people change their name, or are known by several names. For example, see Margaret H. Dunham, Alon Y. Halevy, C. J. van Rijsbergen, Anastasia Ailamaki, ... To represent such aliases, we simply list the name variations in the person record, each in its own author element. The first name listed is the primary name of the person. All uses of the other (secondary) name variations are mapped/redirected to this primary name.
<www key="homepages/r/CJvanRijsbergen">
  <author>C. J. van Rijsbergen</author>
  <author>Cornelis Joost van Rijsbergen</author>
  <author>Keith van Rijsbergen</author>
  <title>Home Page</title>
  <url>http://www.dcs.gla.ac.uk/~keith/</url>
</www>
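As an illustration of the alias rule, a consumer of the data could build a lookup table from every name variation to the primary name. This is only a sketch under the assumption that the first author element is the primary name, as described above; the function name `alias_map` is hypothetical.

```python
import xml.etree.ElementTree as ET

RECORD = """<www key="homepages/r/CJvanRijsbergen">
  <author>C. J. van Rijsbergen</author>
  <author>Cornelis Joost van Rijsbergen</author>
  <author>Keith van Rijsbergen</author>
  <title>Home Page</title>
  <url>http://www.dcs.gla.ac.uk/~keith/</url>
</www>"""


def alias_map(record):
    """Map every name in a person record to the record's primary name.

    The primary name is the text of the first author element; it maps
    to itself so that lookups work for any name variation.
    """
    names = [a.text.strip() for a in record.findall("author") if a.text]
    if not names:
        return {}
    primary = names[0]
    return {name: primary for name in names}


if __name__ == "__main__":
    mapping = alias_map(ET.fromstring(RECORD))
    for name, primary in mapping.items():
        print(f"{name} -> {primary}")
```

Mapping the primary name to itself keeps the lookup uniform: a caller can resolve any name found in a publication record without first checking whether it is an alias.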
To help identify people, it is also useful to store additional information such as their affiliation or their name in an alternative writing system (e.g., see Wei Wang in dblp). For this purpose, person records may contain an optional note element. The content of this element is displayed in the heading of the corresponding dblp person page. See, for example, Chen Li or Atsuyuki Morishima.
<www key="homepages/l/ChenLi">
  <author>Chen Li</author>
  <title>Home Page</title>
  <url>http://www.ics.uci.edu/~chenli/</url>
  <note>Irvine, CA, USA</note>
</www>
The note element of a person record may also carry a "type" attribute that specifies the nature of its content, and a record may contain more than one note element. See, for example, Chin-Chen Chang.
<www key="homepages/c/ChinChenChang">
  <author>Chin-Chen Chang</author>
  <author>Alan Chin-Chen Chang</author>
  <note type="unicode name">張真誠</note>
  <note type="affiliation">National Chung Cheng University, Taiwan</note>
  <title>Home Page</title>
  <url>http://www.cs.ccu.edu.tw/~ccc/english/index.html</url>
</www>
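The typed notes in such a record can be collected with a few lines of code. The sketch below groups note contents by their "type" attribute and files untyped notes (as in the Chen Li record) under the key `None`. The function name `notes_by_type` is an assumption made for this example, and only the two type values shown above are used.

```python
import xml.etree.ElementTree as ET

RECORD = """<www key="homepages/c/ChinChenChang">
  <author>Chin-Chen Chang</author>
  <author>Alan Chin-Chen Chang</author>
  <note type="unicode name">張真誠</note>
  <note type="affiliation">National Chung Cheng University, Taiwan</note>
  <title>Home Page</title>
  <url>http://www.cs.ccu.edu.tw/~ccc/english/index.html</url>
</www>"""


def notes_by_type(record):
    """Group the note texts of a person record by their "type" attribute.

    Notes without a type attribute are collected under the key None.
    Each value is a list, since a record may hold several notes of one type.
    """
    grouped = {}
    for note in record.findall("note"):
        kind = note.get("type")  # None when the attribute is absent
        grouped.setdefault(kind, []).append((note.text or "").strip())
    return grouped


if __name__ == "__main__":
    for kind, texts in notes_by_type(ET.fromstring(RECORD)).items():
        print(kind, texts)
```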
More information on the XML structure of dblp records, and on several of the design decisions behind it, can be found in the following paper:
- Michael Ley: DBLP - Some Lessons Learned. Proceedings of the VLDB Endowment, Volume 2: 1493-1500 (2009).