VroniPlag Wiki

Nm/Fragment 215 11

Type: Verschleierung (disguised plagiarism)
Editor: Hindemith
Reviewed: Yes
Investigated work:
Page: 215, Lines: 11-31
Source: Malin_etal_2005
Page(s): 119, 120, Lines: 30ff; 1ff
Investigative data mining is increasingly performed on networks constructed from personal name relationships extracted from text-based documents. In such networks, a node corresponds to a particular name and an edge specifies the relationship between two names. Before such a network can be analyzed for centrality, grouping, or intelligence gathering purposes, the correctness of the network must be maximized. Specifically, it must be decided when two pieces of data correspond to the same entity or not. Failure to ensure correctness can result in the inability to discover certain relationships or the cause of learning false knowledge.

Names are not unique identifiers for specific entities and, as a result, there exists many confounders to the construction of correct networks. Firstly, the data may consist of typographical error. In this case, the name “Nasrullah” may be accidentally represented as “Nasarullah” or “Nasurullah”. There exists a number of string comparator metrics to account for typographical errors, many of which are in practice.

However, even when names are free of typographical errors, there are additional confounders to data correctness. For example, there may occur name variation, where multiple names correctly reference the same entity or same name correctly references multiple entities i.e., there can exist name ambiguity.

Link analysis is increasingly performed on networks constructed from personal name relationships extracted from text-based documents

[...] [P. 120] [...]

In such networks, a vertex corresponds to a particular name and an edge specifies the relationship between two names. Before such a network can be analyzed for centrality, grouping, or intelligence gathering purposes, the correctness of the network must be maximized. Specifically, it must be decided when two pieces of data correspond to the same entity or not. Failure to ensure correctness can result in the inability to discover certain relationships or cause the learning of false knowledge.

Names are not unique identifiers for specific entities and, as a result, there exist many confounders to the construction of correct networks. Firstly, the data may consist of typographical error. In this case, the name “John” may be accidentally represented as “Jon” or “Jhon”. There exist a number of string comparator metrics (Winkler, 1995; Cohen et al., 2003;Wei, 2004) to account for typographical errors, many of which are in practice by various federal statistical agencies, such as the U.S. Census Bureau. However, even when names are devoid of typographical errors, there are additional confounders to data correctness. For instance, there can exist name variation, where multiple names correctly reference the same entity. Or, more pertinent to our research, there can exist name ambiguity, such that the same name correctly references multiple entities.

Notes

The source is not mentioned anywhere in the thesis.

Reviewers
(Hindemith), WiseWoman
