Computational Linguistics Laboratory at Katanov State University of Khakasia

This website was created and is maintained by the Computational Linguistics Laboratory (CLL) at Katanov State University of Khakasia (KSU) in Abakan, Russia. It provides information about the laboratory's activities and products.

This site is outdated. The latest information about the CLL's activities is available at http://vetsky.narod2.ru


Our aim

The CLL conducts non-commercial theoretical and applied research in information retrieval, text summarization, data mining, computer-assisted language learning (CALL), and corpus linguistics. The research is supported by federal and local grants.


Laboratory info

The CLL at KSU was founded in 2002 to conduct work in the following areas.

  1. Applied linguistics research and the development of computer systems for Computer Assisted Language Learning and Instruction. Ten such systems have been created so far (see the Products section). A classification of software used in foreign language learning and teaching is given in [12].
  2. Automatic text summarization research. V. Yatsko (last name also spelt "Iatsko"), the head of the CLL, is the author of the symmetric summarization concept that underlies PASS and ETS and makes it possible to produce coherent and adequate summaries. For details, see Our Publications [1-4]. The ROS system, which summarizes Web pages specified by the user in a continual mode, is under development. In 2008 we released the Universal Summarizer (UNIS), which has an automatic text classification function. Once a text is classified as scientific, publicistic, or fiction, UNIS applies algorithms optimized for that text type to significantly improve the quality of the resulting summaries.
  3. Evaluation of Internet information retrieval systems. The depth of user's search [5] and reference dictionary concepts are being developed to evaluate automatic text summarization systems as well as Internet information retrieval systems [11].
  4. Discourse analysis. The integrational discourse analysis concept [6-8] distinguishes between surface and deep levels of discourse structure. We are currently investigating various types of possessive discourse and the linguistic features of possessive relations, differentiating between alienable and inalienable possession [9].
  5. Computer learner corpora research project. This ongoing project aims at 1) creating corpora of texts (dictations, expositions, compositions, etc.) produced by Russian-speaking learners of English; 2) creating tools for error tagging and automatic analysis of these corpora; 3) contrastive analysis of Russian learner corpora against corpora produced by speakers of other languages. The project is in line with research done by Granger et al. [10].
  6. Linguistic Toolbox (LIT). LIT provides the user with a set of instruments for linguistic analysis, such as a tokenizer, text splitter, tagger, dictionary comparer, wordlist, and concordancer. With these instruments the user can obtain statistical data about a text, annotate it with POS tags, and conduct various types of searches. LIT supports English and Russian; its prototype version was ready in March 2008. A Beta version is to be released by the end of the year.
  7. Data/text mining. We are developing algorithms for mining chat logs and blogs with the aim of preventing undesirable events, for example acts of violence. The TEXOR system, which performs such mining, is available online. We recently completed a commercial project on sentiment mining, creating a system that recognizes and analyzes users' opinions about commercial products. The system relies on an ontology and a linear grammar that we developed specifically for this project.
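
The instruments named in item 6 (tokenizer, wordlist, concordancer) can be illustrated with a minimal Python sketch. This is not LIT's implementation, which is not described on this page; the function names tokenize, wordlist, and concordance, the tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration only; they do not reproduce LIT.

def tokenize(text):
    """Split text into lowercase word tokens.

    The pattern matches runs of Unicode letters and digits, with an
    optional internal apostrophe, so it handles both English and Russian.
    """
    return re.findall(r"[^\W_]+(?:'[^\W_]+)?", text.lower())

def wordlist(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()

def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

sample = "The summary is short. The summary keeps the main ideas of the text."
tokens = tokenize(sample)

for word, freq in wordlist(tokens)[:3]:
    print(word, freq)

for line in concordance(tokens, "summary", width=2):
    print(line)
```

A wordlist gives the frequency statistics mentioned above, while the concordance shows each occurrence of a search word in its immediate context. POS tagging and dictionary comparison are left out of this sketch because they need language-specific resources.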

Copyright © 2006-2010 CLL at KSU
Contact us
Address: 90 Lenin Street, office 236,
Abakan, Russian Federation, 655017
E-Mail: Support: iatsko@gmail.com, Webmaster: slavay@khsu.ru

Phone: +7(3902)260227
Fax: +7(3902)243364