Croatian Dependency Treebank HOBS
ID: 310
The Croatian Dependency Treebank is part of the Croatian National Corpus (i.e. the Croatian part of the Croatian-English Parallel Corpus, CW2000), in which 4,626 sentences (118,529 tokens) are planned to be manually annotated at the analytical layer, following the Prague Dependency Treebank formalism adapted to Croatian. The corpus currently contains 3,465 sentences (88,045 tokens). It is published under the CC-BY-NC-SA license.
Distribution
Availability: Available - Restricted Use
Licence: CC-BY-NC-SA
Restrictions: Academic - Non Commercial Use, Attribution, Share Alike
Execution location: hidden
Distribution Access/Medium: Downloadable
Distribution rights holders:
IPR Holder
Contact Person
Monolingual text corpus
Languages: Croatian
Language Script: Latn
Linguality
Linguality type: Monolingual
Size
Character encoding: UTF-8
Annotation
Segmentation
Lemmatization
Segmentation
Segmentation level: Sentence
Morphosyntactic Annotation - B Pos Tagging
Syntactic Annotation - Treebanks
Segmentation
Segmentation level: Paragraph
Resource Creation
Creation started: 06/01/2007
Funding Project: Central and South-East European Resources (CESAR)
Funding Types: EU Funds, National Funds
Funders: European Commission (50%), University of Zagreb, Faculty of Humanities and Social Sciences (50%)
Project duration: 02/01/2011 - 01/31/2013
Metadata
Created: 07/30/2012
Last Updated: 02/04/2013
Version
Version: 1.0
Last Updated: 07/30/2012
Documentation
Agić, Željko. Pristupi ovisnosnom parsanju hrvatskih tekstova / PhD thesis. Zagreb : University of Zagreb, Faculty of Humanities and Social Sciences, 2012-07-09, 216 p.
Tadić, Marko. Building the Croatian Dependency Treebank: the initial stages. // Suvremena lingvistika. 33 (2007), 63; 85-92.