
VroniPlag Wiki

Source: Nm/Malin etal 2005


Source details

Author    Bradley Malin, Edoardo Airoldi, Kathleen M. Carley
Title    A Network Analysis Model for Disambiguation of Names in Lists
Journal    Computational & Mathematical Organization Theory
Publisher    Springer
Volume    11
Date    July 2005
Issue    2
Pages    119-139
URL    http://www.casos.cs.cmu.edu/publications/papers/networkanalysismodel.pdf

Bibliography    no
Footnotes    no
Fragments    1


Fragments of the source:
[1.] Nm/Fragment 215 11
Last edited: 2012-04-21 22:30:48 WiseWoman
Fragment, Reviewed, Malin etal 2005, Nm, SMWFragment, Protection level sysop, Verschleierung (disguised plagiarism)

Type
Verschleierung (disguised plagiarism)
Editor
Hindemith
Reviewed
Yes
Examined work:
Page: 215, Lines: 11-31
Source: Malin_etal_2005
Page(s): 119, 120, Lines: 30ff; 1ff
Investigative data mining is increasingly performed on networks constructed from personal name relationships extracted from text-based documents. In such networks, a node corresponds to a particular name and an edge specifies the relationship between two names. Before such a network can be analyzed for centrality, grouping, or intelligence gathering purposes, the correctness of the network must be maximized. Specifically, it must be decided when two pieces of data correspond to the same entity or not. Failure to ensure correctness can result in the inability to discover certain relationships or the cause of learning false knowledge.

Names are not unique identifiers for specific entities and, as a result, there exists many confounders to the construction of correct networks. Firstly, the data may consist of typographical error. In this case, the name “Nasrullah” may be accidentally represented as “Nasarullah” or “Nasurullah”. There exists a number of string comparator metrics to account for typographical errors, many of which are in practice.

However, even when names are free of typographical errors, there are additional confounders to data correctness. For example, there may occur name variation, where multiple names correctly reference the same entity or same name correctly references multiple entities i.e., there can exist name ambiguity.

Link analysis is increasingly performed on networks constructed from personal name relationships extracted from text-based documents

[...] [P. 120] [...]

In such networks, a vertex corresponds to a particular name and an edge specifies the relationship between two names. Before such a network can be analyzed for centrality, grouping, or intelligence gathering purposes, the correctness of the network must be maximized. Specifically, it must be decided when two pieces of data correspond to the same entity or not. Failure to ensure correctness can result in the inability to discover certain relationships or cause the learning of false knowledge.

Names are not unique identifiers for specific entities and, as a result, there exist many confounders to the construction of correct networks. Firstly, the data may consist of typographical error. In this case, the name “John” may be accidentally represented as “Jon” or “Jhon”. There exist a number of string comparator metrics (Winkler, 1995; Cohen et al., 2003;Wei, 2004) to account for typographical errors, many of which are in practice by various federal statistical agencies, such as the U.S. Census Bureau. However, even when names are devoid of typographical errors, there are additional confounders to data correctness. For instance, there can exist name variation, where multiple names correctly reference the same entity. Or, more pertinent to our research, there can exist name ambiguity, such that the same name correctly references multiple entities.
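The quoted source mentions string comparator metrics for catching typographical variants such as "John", "Jon" and "Jhon". As a minimal illustration of the idea, the sketch below uses plain Levenshtein edit distance. This is an assumption made for illustration only: it is one common comparator, not necessarily the metric used in the works the source cites, and the helper names `levenshtein` and `similarity` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to hold less state.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    for variant in ("Jon", "Jhon"):
        print(variant, levenshtein("John", variant), similarity("John", variant))
```

Note that plain Levenshtein distance counts a transposition ("John" versus "Jhon") as two edits. A comparator of this kind only addresses typographical variation; it does not resolve the name variation and name ambiguity that the quoted passage goes on to describe.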

Notes

The source is not mentioned anywhere in the thesis.

Reviewers
(Hindemith), WiseWoman
