In this lecture we will present key developments in Authorship Attribution (AA) methodology. More specifically, we will examine a very successful and widely used stylometric feature in AA studies: the n-gram. We will investigate n-grams at the character and word levels and explore their quantitative properties. We will also discuss the tokenization methods available for these units, with reference to previous studies. In addition, we will give an overview of the most effective machine learning algorithms used in AA (Support Vector Machines and Random Forests). The presentation will be non-technical, focusing on the concepts underlying these algorithms.
Event details
- Event start
- 16 May 2019, 15:50
- Venue
- nám. Jana Palacha 2, Praha 1 (room 18)
- Organizer
- Ústav Českého národního korpusu FF UK
- Event type
- Conferences and lectures
- Attachments
- Poster