RiTa is an open-source library for writing in computational environments. It provides functions for simple language processing and generation tasks without the overhead or complexity of a full NLP stack. Designed to be small and easy to use, it runs in a variety of environments including the browser, p5.js, Processing, Node.js and Android. Features include grammar and Markov-based text generation, tools for inflection, conjugation, stemming and tokenization, as well as analysis of English features such as part-of-speech, phonemes, syllables, and stresses. RiTa's customizable lexicon can be searched via partial matches on combinations of any of the features listed above. It also includes a letter-to-sound engine for analysis of unknown words, and a small scripting language designed for writers (RiScript).

Java, JavaScript, Processing, Node.js, Android, p5.js, Observable

RiTa: from Old Norse, meaning to mark, scratch, or scribble

RiTa Version 3


Version 3 is a rewrite that is simpler, faster and more powerful, but it is NOT backwards-compatible with all programs written in previous versions. Nearly all of the previous functionality is still available (plus many new features), but much of the organization has changed. Migrating an older work to v3 is generally straightforward (and I'm happy to help if you run into problems).

Key Features
  • Integration of the RiScript scripting language, designed for writers
  • Letter-to-sound engine for feature analysis of arbitrary words (with/without a lexicon)
  • Smart lexicon search for words matching part-of-speech, syllable, stress and rhyme patterns
  • Heuristic algorithms for inflection, conjugation, stemming, tokenization, and more
  • Powerful new options for generation via grammars and Markov chains

Important Changes
  • Nearly all features of the library are now static functions on the RiTa object. So you can access them as RiTa.XXX(), e.g., RiTa.phones(), RiTa.tokenize(), RiTa.conjugate() and so on, which means there is only one object to import into your programs.
  • The RiMarkov and RiGrammar objects, now accessed via RiTa.markov() and RiTa.grammar(), are significantly improved. RiTa grammars are still JSON-compatible, but can now include all the features of RiScript. Grammars also now use the RiScript choice primitive via square brackets, e.g., [ unicorn | wizard | elf ], rather than parentheses.
  • RiScript is a scripting language, designed specifically for writers, that enables the easy integration of static text with algorithmic elements (choices, sequences, transforms, gates, etc.). RiScript can be run via RiTa.evaluate() or included in any RiTa grammar. Read more about the details here.
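
To give a rough feel for what the square-bracket choice primitive does, here is a minimal sketch in plain JavaScript. It is not RiTa's implementation and does not use the library: the function name `expandChoices` and its `rng` parameter are hypothetical, and it covers only simple (optionally nested) choices, none of RiScript's other features.

```javascript
// Illustrative sketch only: NOT how RiTa/RiScript is implemented.
// expandChoices and its rng parameter are hypothetical names.
function expandChoices(text, rng = Math.random) {
  // Matches an innermost "[ a | b | c ]" group (no brackets inside it).
  const choice = /\[([^\[\]]*)\]/;
  let match;
  while ((match = choice.exec(text)) !== null) {
    // Split the group on "|" and pick one option at random.
    const options = match[1].split('|').map((s) => s.trim());
    const pick = options[Math.floor(rng() * options.length)];
    // Splice the chosen option back in place of the bracketed group.
    text =
      text.slice(0, match.index) +
      pick +
      text.slice(match.index + match[0].length);
  }
  return text;
}

console.log(expandChoices('The [ unicorn | wizard | elf ] sang.'));
```

Passing a fixed `rng` (for example `() => 0`) makes the result repeatable, which is convenient when testing.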

Note: you can find archives of versions 1.x and 2.x here.

Text-generation via Markov-chains and context-free grammars (with built-in support for RiScript)
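
For readers new to Markov-chain generation, the underlying idea can be sketched in a few lines of plain JavaScript. This is a toy word-level model written for illustration, not RiTa's RiMarkov: `buildModel`, `generate`, and their parameters are hypothetical names, and it has none of the library's options.

```javascript
// Toy word-level Markov model; an illustration, NOT RiTa's RiMarkov.
// Maps each run of `order` words to the words observed to follow it.
function buildModel(tokens, order = 2) {
  const model = new Map();
  for (let i = 0; i + order < tokens.length; i++) {
    const key = tokens.slice(i, i + order).join(' ');
    if (!model.has(key)) model.set(key, []);
    model.get(key).push(tokens[i + order]);
  }
  return model;
}

// Extends `seed` one word at a time by sampling a recorded follower
// of the most recent seed.length words; stops at a dead end or maxTokens.
function generate(model, seed, maxTokens, rng = Math.random) {
  const out = seed.slice();
  const order = seed.length;
  while (out.length < maxTokens) {
    const followers = model.get(out.slice(-order).join(' '));
    if (!followers) break; // context never seen in training text
    out.push(followers[Math.floor(rng() * followers.length)]);
  }
  return out.join(' ');
}

const sample = 'the cat sat on the mat and the cat ran off'.split(' ');
console.log(generate(buildModel(sample, 2), ['the', 'cat'], 8));
```

Because followers are stored with repetition, words that follow a context more often are proportionally more likely to be chosen.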

Smart lexicon search to match part-of-speech, stress, syllable and letter count, soundex, and rhyme (regex) patterns

Modules for tokenization, inflection, stemming, verb conjugation, and concordance/KWIC model creation
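
As a rough illustration of what a keyword-in-context (KWIC) view contains, the sketch below lists each occurrence of a word with a few words on either side. It is a plain-JavaScript approximation, not RiTa's concordance module; the `kwic` function, its `width` parameter, and the whitespace tokenization are all assumptions made for the example.

```javascript
// Minimal keyword-in-context sketch; NOT RiTa's concordance/KWIC module.
// Returns one line per occurrence of `keyword`, with up to `width`
// words of context on each side and the keyword in square brackets.
function kwic(text, keyword, width = 3) {
  const tokens = text.split(/\s+/).filter(Boolean); // naive tokenization
  const target = keyword.toLowerCase();
  const lines = [];
  tokens.forEach((tok, i) => {
    if (tok.toLowerCase() !== target) return;
    const left = tokens.slice(Math.max(0, i - width), i).join(' ');
    const right = tokens.slice(i + 1, i + 1 + width).join(' ');
    lines.push(`${left} [${tok}] ${right}`.trim());
  });
  return lines;
}

console.log(kwic('the cat sat on the mat', 'the', 2));
```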

Customizable lexicon with a powerful letter-to-sound algorithm for analysis of unknown words

Taggers for a range of granular features, including phonemes, stress, part-of-speech, syllables, etc.

Runs in Node or the browser, with or without p5.js and Processing (also on Android)

Reference