Icelandic Parsed Historical Corpus
Sögulegur íslenskur trjábanki IcePaHC
About 1,000,000 words of Icelandic text from every century between the 12th and the 21st inclusive, annotated for phrase structure, part-of-speech tagged, and lemmatized.
Distribution
Availability: Available - Restricted Use
Licence: LGPL
Restrictions: Attribution, Share Alike
User Nature: Academic, Commercial
Download location: hidden
Distribution Access/Medium: Accessible Through Interface, Downloadable
Execution location: hidden
Attribution Details: Wallenberg, Joel, Anton Karl Ingason, Einar Freyr Sigurðsson and Eiríkur Rögnvaldsson. 2011.
Icelandic Parsed Historical Corpus (IcePaHC).
Version 0.9. http://www.linguist.is/icelandic_treebank
Distribution rights holders:
IPR Holder
Contact Person
Monolingual text corpus
Languages: Icelandic
Language Script: Latn
Linguality
Linguality type: Monolingual
Text Format
Size: 73,014 sentences; 1,000,000 tokens
Character encoding: UTF-8
Modalities
Annotation: Syntactic Annotation - Treebanks
Tagset: http://www.linguist.is/icelandic_treebank/Tagset
StandOff: False
Segmentation level: Sentence, Word
Standard practices conformance: Penn Treebank
Theoretic Model: Phrase structure
Annotation Mode: Mixed
Annotation Manual: http://www.linguist.is/icelandic_treebank/Icelandic_Parsed_Historical_Corpus_%28IcePaHC%29#Annotation_guidelines
Time Coverage: 12th to 21st centuries
Geographic coverage: Iceland
Metadata
Created: 11/29/2011
Last Updated: 01/25/2013
Version
Version: 0.9
Last Updated: 08/29/2011
Usage
Foreseen Use: NLP Applications
Use NLP Specific: Other
Actual Use: NLP Applications
Use NLP Specific: Other
Documentation
Document Type: In Proceedings
Eiríkur Rögnvaldsson, Anton Karl Ingason, Einar Freyr Sigurðsson, Joel Wallenberg, The Icelandic Parsed Historical Corpus (IcePaHC), http://www.lrec-conf..., LREC 2012, 2012
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Joseph Mariani, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Book Title: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
ISBN: 978-2-9517408-7-7
Keywords: Icelandic, Faroese, treebank, parsed corpus, annotation
Document Language: English