AntConc and Word Sketch Engine) to build and query/extract different linguistic patterns (concordance, collocates, keywords, n-grams, ... and concgram, BTW). See gitbook.
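To make these patterns concrete, here is a minimal sketch in plain Python of what a concordancer computes behind the scenes. It is not how AntConc or Sketch Engine are implemented. The tokenizer, the function names, the window sizes, and the sample sentence are all assumptions made for illustration, and the collocate count is raw co-occurrence frequency only (no association measure).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())

def concordance(tokens, node, window=3):
    """Return keyword-in-context (KWIC) lines as (left context, node, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

def ngrams(tokens, n=2):
    """Count every contiguous n-gram in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def collocates(tokens, node, window=2):
    """Count words that occur within +/- window tokens of the node (raw frequency)."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
    return counts

# Hypothetical two-sentence "corpus" used only for demonstration.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)

for left, node, right in concordance(tokens, "sat", window=2):
    print(f"{left:>10} [{node}] {right}")

print(ngrams(tokens, 2).most_common(3))
print(collocates(tokens, "sat", window=2).most_common(3))
```

Keywords are left out because they need a second (reference) corpus to compare frequencies against.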
interpretative, linguistic, or paralinguistic information to a digitized corpus of written and/or spoken data.
treebank: syntactically annotated corpus)
An annotation scheme should contain at least:
An infrastructure for developing and deploying software components that process human language. GATE helps scientists and developers in three ways
Walk through basic learning modules
language resources)
Processing Resources
Applications and Runtime Parameters
Annotations
Data Stores
and Saving Applications
[Ref] 1. People's Daily (人民日報) 2. United Daily News (聯合報)
A comparable corpus can be defined as a corpus containing components that are collected using the same sampling frame and similar balance and representativeness (cf. McEnery, 2003: 450), e.g. the same proportions of the texts of the same genres in the same domains in a range of different languages in the same sampling period.
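The "same proportions of the same genres" part of this definition can be checked mechanically. Below is a rough sketch of one way to do it. The `(genre, text)` document representation, the function names, and the 5% tolerance are all assumptions chosen for illustration, not part of McEnery's definition, and domain and sampling period would need the same kind of check.

```python
from collections import Counter

def genre_proportions(docs):
    """Map each genre to its share of the documents in one subcorpus.

    `docs` is a list of (genre, text) pairs (a hypothetical representation).
    """
    counts = Counter(genre for genre, _ in docs)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}

def is_comparable(corpus_a, corpus_b, tolerance=0.05):
    """True if every genre's share differs by at most `tolerance` between the two subcorpora."""
    pa, pb = genre_proportions(corpus_a), genre_proportions(corpus_b)
    genres = set(pa) | set(pb)
    return all(abs(pa.get(g, 0.0) - pb.get(g, 0.0)) <= tolerance for g in genres)

# Toy subcorpora with placeholder texts; only the genre labels matter here.
zh = [("news", "..."), ("news", "..."), ("editorial", "...")]
en = [("news", "..."), ("news", "..."), ("editorial", "...")]
skewed = [("news", "..."), ("editorial", "..."), ("editorial", "...")]

print(is_comparable(zh, en))      # same genre mix in both languages
print(is_comparable(zh, skewed))  # genre mix differs
```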