Accession Number : AD0769560
Title : RAND Corporation Data in Systran. Volume 2.
Descriptive Note : Final rept. 1 Feb 72-1 May 73,
Corporate Author : LATSEC INC LA JOLLA CALIF
Personal Author(s) : Toma,Peter P. ; Kozlik,Ludek A.
Report Date : AUG 1973
Pagination or Media Count : 400
Abstract : The report presents some empirical linguistic findings based on a million-word Russian corpus with syntactic annotations. The corpus, consisting of Russian mathematics, physics, cybernetics, astrobotany and physiology texts, was produced by the Rand Corp., Santa Monica, California, and converted for use by SYSTRAN language-analysis processing procedures. Since all syntagmas are explicitly marked in the Rand data base, little or no contextual reference is necessary in order to establish semosyntactic relationships that may be utilized as the most essential components of an automatic parser for S+T text. Volume II deals with text statistics, the bulk of which are high-frequency wordlists in descending frequency order as well as in alphabetical order, for both individual and combined subject matters. (Modified author abstract)
Descriptors : (*Machine translation, *Russian language), Computational linguistics, English language, Syntax, Computer programming
Subject Categories : Linguistics
Distribution Statement : APPROVED FOR PUBLIC RELEASE