SemCor
SemCor corpus in English and Romanian.
Distribution
Availability
Available - Restricted Use
Licence
MS Commons BY-NC-ND
Restrictions:
Inform Licensor, No Derivatives, No Redistribution
Distribution Access/Medium:
Accessible Through Interface
User Nature:
Academic, Commercial
Contact Person
Dan Tufiş
http://www.racai.ro/...
Research Institute for Artificial Intelligence, Romanian Academy
RACAI
Director of the Research Institute for Artificial Intelligence, Romanian Academy
Casa Academiei, Calea 13 Septembrie nr. 13, etaj 3, Bucureşti, România, 050711
050711 Bucharest
Romania
Tel.: 0040 21 3188103
Fax: 0040 21 3188142
http://www.racai.ro/
text
Bilingual text corpus
Languages
Romanian (175,603 Tokens)
English (178,499 Tokens)
Linguality
Linguality type:
Bilingual
Multi-linguality type:
Parallel
Size
354,102 Tokens
Character encoding
UTF-8
Modalities
Written Language
Annotation
Segmentation
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
XCES
Annotation Mode:
Automatic
Annotation Tools:
TTL Web Service:
http://ws.racai.ro/t...
Lemmatization
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
XCES
Annotation Mode:
Automatic
Annotation Tools:
TTL Web Service:
http://ws.racai.ro/t...
Syntactic Annotation - Shallow Parsing
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Annotation Mode:
Automatic
Annotation Tools:
TTL Web Service:
http://ws.racai.ro/t...
Morphosyntactic Annotation - POS Tagging
Tagset:
Morpho-Syntactic Descriptors: http://nl.ijs.si/ME/V4/msd/html/index.html
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
XCES
Theoretic Model:
Hidden Markov Models
Annotation Mode:
Automatic
Annotation Tools:
TTL Web Service:
http://ws.racai.ro/t...
Semantic Annotation - Word Senses
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
XCES
Annotation Mode:
Manual
Annotation Tools:
WSD Tool
Syntacticosemantic Annotation - Links
StandOff:
False
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
XCES
Theoretic Model:
Lexical Attraction Models
Annotation Mode:
Automatic
Annotation Tools:
LexPar
Metadata
Created:
11/28/2011
Last Updated:
02/01/2013
Source:
METANET4U
Documentation
Document Type:
Manual
RACAI, SemCor Corpus, http://ws.racai.ro:9...
Keywords:
SemCor corpus, word sense disambiguation, POS-tagged, lemmatized, chunked, linked
Document Language:
English