RiTa is a free/open-source library for writing in programmable media. It provides functions for simple language processing and generation tasks without the overhead or complexity of a full NLP stack. Designed to be small and easy to use, it runs in a variety of environments, including the browser, p5.js, Processing, Node and Android. Features include grammar and Markov-based text generation, tools for inflection, conjugation, stemming and tokenization, as well as analysis of English features such as part-of-speech, phonemes, syllables, and stresses. RiTa's customizable lexicon can be searched via partial matches on combinations of any of the features listed above. It also includes a letter-to-sound engine for analysis of unknown words, and a small scripting language designed for writers (RiScript).

Java

JavaScript

Processing

Node.js

Android

p5.js

Observable

RiTa: from Old Norse, meaning to mark, scratch, or scribble

RiTa Version 2


Version 2 is a full rewrite of the library that is simpler, faster, more consistent, and more easily maintained. However, it is NOT backwards-compatible with all programs written for version 1.x. Nearly all of the functionality of v1 is still available (plus many new features), but much of the organization has changed. Migrating an older work to v2 is generally quite straightforward (and I'm happy to help if you run into problems)...

Key Features
  • Integration of the RiScript scripting language, designed for writers
  • Letter-to-sound engine for feature analysis of arbitrary words (with or without a lexicon)
  • Smart lexicon search for words matching part-of-speech, syllable, stress and rhyme patterns
  • Heuristic algorithms for inflection, conjugation, stemming, tokenization, and more
  • Powerful new options for generation via grammars and Markov chains
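To give a rough sense of what grammar-based generation involves, here is a minimal sketch in plain JavaScript. It does not use RiTa and is not RiTa's implementation or grammar format: the `rules` object, the `$name` symbol convention, and the `expand()` helper are all assumptions made for illustration.

```javascript
// Minimal context-free grammar expansion (illustrative only, not RiTa's code).
// Each rule maps a symbol name to a list of alternatives; "$name" inside an
// alternative refers to another rule (an assumed convention for this sketch).
const rules = {
  start: ['$subject $verb'],
  subject: ['the writer', 'the machine'],
  verb: ['scribbles', 'marks'],
};

// Pick one alternative for `symbol`, then recursively expand any $references.
// `random` is injectable so the choice can be made deterministic when testing.
function expand(rules, symbol = 'start', random = Math.random) {
  const options = rules[symbol];
  if (!options) throw new Error(`No rule for symbol: ${symbol}`);
  const chosen = options[Math.floor(random() * options.length)];
  return chosen.replace(/\$(\w+)/g, (_, name) => expand(rules, name, random));
}

console.log(expand(rules));
```

Each call picks one alternative per rule at random, so repeated calls produce different sentences from the same rule set.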

Important Changes
  • Nearly all features of the library are now static functions on the RiTa object. So you can access them as RiTa.XXX(), e.g., RiTa.phones(), RiTa.tokenize(), RiTa.conjugate(), and so on. This means that there is only one object to import into your programs.
  • The RiMarkov and RiGrammar objects are significantly improved, and are now accessed via the RiTa.markov() and RiTa.grammar() static functions. RiTa grammars are still compatible with JSON, but can now include all the features of RiScript.
  • RiScript is a new scripting language that allows authors to integrate static text with simple generative primitives (choices, sequences, transforms, etc.). RiScript is designed specifically for writers (for example, no quotes are required for strings) and can be run via RiTa.evaluate() or included in any RiTa grammar.
  • RiTa v2 is available for Java (via Maven and GitHub Packages) and JavaScript (via npm and UNPKG). It is also available, of course, as a library for Processing and p5.js.

Note: you can find an archive of version 1.x here...

Text-generation via Markov-chains and context-free grammars (with built-in support for RiScript)
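For readers new to the idea, Markov-chain generation can be sketched in a few lines of plain JavaScript. This is not RiTa's implementation and does not use its API: `buildModel()` and `generate()` are hypothetical helpers, and the whitespace tokenizer is a simplifying assumption.

```javascript
// Minimal word-level Markov chain (illustrative only, not RiTa's code).
// The model maps each run of n consecutive words to the words seen after it.
function buildModel(text, n = 2) {
  const tokens = text.split(/\s+/).filter(Boolean);
  const model = new Map();
  for (let i = 0; i + n < tokens.length; i++) {
    const key = tokens.slice(i, i + n).join(' ');
    if (!model.has(key)) model.set(key, []);
    model.get(key).push(tokens[i + n]);
  }
  return model;
}

// Starting from an n-word seed, repeatedly append a word observed after the
// last n words. `random` is injectable so output can be made deterministic.
function generate(model, seed, maxWords = 20, random = Math.random) {
  const out = seed.split(' ');
  const n = out.length;
  while (out.length < maxWords) {
    const choices = model.get(out.slice(-n).join(' '));
    if (!choices) break; // dead end: no observed continuation
    out.push(choices[Math.floor(random() * choices.length)]);
  }
  return out;
}

const model = buildModel('the cat sat on the mat and the cat ran off', 2);
console.log(generate(model, 'the cat', 8).join(' '));
```

A larger `n` keeps the output closer to the source text, while a smaller `n` gives looser, more surprising recombinations.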

Smart lexicon search to match part-of-speech, stress, syllable and letter count, soundex, and rhyme (regex) patterns

Modules for tokenization, inflection, stemming, verb conjugation, and concordance/KWIC model creation
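A KWIC (keyword-in-context) model lists each occurrence of a word together with its neighboring words. The sketch below shows the basic idea in plain JavaScript. It is not RiTa's implementation: the `kwic()` helper, its `width` parameter, and the whitespace tokenizer are assumptions made for illustration.

```javascript
// Minimal keyword-in-context (KWIC) listing (illustrative only, not RiTa's code).
// Returns one string per occurrence of `keyword`, with up to `width` words of
// context on either side.
function kwic(text, keyword, width = 2) {
  const tokens = text.split(/\s+/).filter(Boolean);
  const target = keyword.toLowerCase();
  const lines = [];
  tokens.forEach((tok, i) => {
    // Compare case-insensitively, ignoring punctuation attached to the token
    if (tok.toLowerCase().replace(/[^a-z0-9']/g, '') === target) {
      const start = Math.max(0, i - width);
      lines.push(tokens.slice(start, i + width + 1).join(' '));
    }
  });
  return lines;
}

const sample = 'The cat sat on the mat. The dog saw the cat run.';
console.log(kwic(sample, 'cat', 2));
```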

Customizable lexicon with a powerful letter-to-sound algorithm for analysis of unknown words

Taggers for a range of granular features, including phonemes, stress, part-of-speech, syllables, etc.

Runs in Node or the browser, with or without p5.js and Processing (also on Android)

Reference