github Speech-Rule-Engine/speech-rule-engine v4.0.0
SRE Version 4: Now in TypeScript

2 years ago

This is the full conversion of Speech Rule Engine to TypeScript, with a partial re-implementation of some features such as rule and locale handling. Below is a (very likely incomplete) list of all the changes. Please see the acknowledgements after the TLDR.

TLDR

  • SRE moves to ES6 using TypeScript and webpack:
    • API now uses promises for engine setup
    • Single bundle file for both node and browser
    • Support for alternative bundlers
  • New locales for Norwegian (Bokmål and Nynorsk), Swedish, and Catalan
  • Support for two dimensional formula layout in Nemeth Braille
  • Major rewrite of rule handling
    • Smaller memory footprint of indexed rules
    • Smaller locale files
  • All localisation now in a dedicated repository sre-l10n
    • Bespoke YAML format for speech rules for easier translation
    • CrowdIn support for simple message translations
  • New API methods for generating word representations of numbers, ordinals, and vulgar fractions
  • Internet Explorer support deprecated

Acknowledgements

  • NumFocus for a "Small Development Grant" that financed a one week sprint to make the initial conversion to TypeScript possible
  • TextHelp for their support on
    • refactoring the rule engine, redesigning the rule format and CrowdIn integration
    • localisations into Nordic languages
  • Statistical Institute of Catalonia (Idescat) for providing the Catalan translations
  • American Action Fund for supporting the ongoing Nemeth work.
  • MathJax for their continuing support of the system.

TypeScript Conversion

  • The entire code base has been converted from JavaScript in Google Closure Syntax to TypeScript based on ES6 standard.
  • Bundling is done using webpack.
  • Code is fully cycle free and easily usable with other bundlers.
  • Support for some alternative bundlers has been added (rollup and esbuild). For more information see the README.
  • Code is formatted with prettier
  • Code is linted with eslint
  • All code adheres to the latest JSDoc documentation conventions.

Code Structure and Building

  • Sources are in ts directory.

  • The src directory and all legacy JavaScript code has been removed.

  • Building sre is now done with

      npx tsc; npx webpack
    
  • The bundled file is in lib/sre.js. It works both in node and in a browser. Simply include the file sre.js in your website in a script tag.

  • Consequently the sre_browser.js bundle no longer exists.

  • Transpiled JavaScript files are in the js directory, which is created on the fly.

  • Script sre4node.js has been removed as SRE libraries can be loaded from the js directory directly.

  • The Makefile is now exclusively for building the unicode mapping files, one per locale.

  • The mathmaps subdirectory with locale sources has been moved to the top level.

npm Package

Structure of the package remains nearly unchanged with two exceptions:

  • lib/sre_browser.js is no longer available.
  • js directory with JavaScript files is contained in the distribution for easier integration into third party projects.

Mathmaps Directory

  • Unicode mappings are again in files with a .json extension.
  • Likewise compiled locale mappings are in a single .json file in lib/mathmaps.
  • Use make all to create the mathmaps.
  • JSON minimization is done via an intermediate step that generates .min files. This is handled in the Makefile and ensures that only newly altered files have to be minimized.
  • Each locale now contains a messages subdirectory. It contains the messages used for generating alphabets, font names, etc. Note that in the combined, minified version of the locale .json, messages always need to come first.
  • Each locale now also contains a rules subdirectory. It contains the speech rule sets.
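
The ordering constraint on the combined locale file can be illustrated with a small sketch. It is not the Makefile logic; the `LocaleEntry` shape, the `combineLocale` helper, and the example paths are hypothetical, and only the messages and rules directory names come from the notes above.

```typescript
// Hypothetical entry: a file path inside the locale plus its parsed JSON.
interface LocaleEntry {
  path: string;
  content: unknown;
}

// Combine entries into one object, placing everything from messages/ first.
// String keys keep their insertion order, so JSON.stringify preserves it.
function combineLocale(entries: LocaleEntry[]): { [path: string]: unknown } {
  const isMessage = (e: LocaleEntry) => e.path.indexOf('messages/') === 0;
  const ordered = entries
    .filter(isMessage)
    .concat(entries.filter((e) => !isMessage(e)));
  const combined: { [path: string]: unknown } = {};
  for (const e of ordered) {
    combined[e.path] = e.content;
  }
  return combined;
}

const combinedLocale = combineLocale([
  { path: 'rules/mathspeak.json', content: {} },
  { path: 'messages/alphabets.json', content: {} },
  { path: 'rules/clearspeak.json', content: {} },
]);
console.log(Object.keys(combinedLocale));
```

The messages entry ends up first regardless of the input order, while the remaining entries keep their relative order.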

Locales and Rule Handling

New Locales

  • Norwegian (Nynorsk and Bokmål) support for all rule sets
  • Swedish support for all rule sets. Still experimental.
  • Catalan support for all rule sets except Clearspeak

2D Nemeth output

  • Nemeth support for two-dimensional layout based on the existing Nemeth rule set, handled via a new layout renderer.
  • Add to setupEngine:

      markup: 'layout'

  • Or run on the command line with

      ./bin/sre -b braille -c nemeth -k layout
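
As a rough sketch, the command line above might correspond to the following feature object for setupEngine. Only the markup entry is stated in these notes; mapping -b to modality and -c to locale is an assumption, so check the README before relying on it.

```typescript
// Feature object mirroring the command line flags shown above.
const nemethLayoutFeatures = {
  modality: 'braille', // assumed counterpart of -b braille
  locale: 'nemeth',    // assumed counterpart of -c nemeth
  markup: 'layout',    // from the notes: selects the new layout renderer
};

console.log(JSON.stringify(nemethLayoutFeatures));
```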

New Speech Rule Handling

  • Introduces inheritance of speech rules from an abstract base locale.
  • Speech rules are separated into dedicated precondition and action files. A speech rule is only formed when an action is given for an existing precondition.
  • The idea is that effectively only actions have to be localised. While new rules or preconditions can still be added, the majority can be inherited from the common base locale.
  • Speech rule sets are minimised as much as possible. However, locales can alter rules by adding new preconditions, ignoring existing ones, or overwriting base actions with localised ones.
  • Reduction of size for locale files.
  • Smaller memory footprint for indexed rules.
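
The inheritance scheme described above can be sketched roughly as follows. This is a simplified model, not SRE's implementation or file format: the types, the `buildRules` helper, and the rule strings are all hypothetical.

```typescript
// Hypothetical shapes: rule name -> precondition or action string.
type RuleMap = { [name: string]: string };

interface BaseLocale {
  preconditions: RuleMap;
  actions: RuleMap;
}

interface LocaleRules {
  preconditions?: RuleMap; // extra preconditions added by the locale
  ignore?: string[];       // base rules the locale drops
  actions: RuleMap;        // localised actions, overriding base actions
}

interface SpeechRule {
  name: string;
  precondition: string;
  action: string;
}

// A rule is only formed when an action exists for a known precondition.
function buildRules(base: BaseLocale, locale: LocaleRules): SpeechRule[] {
  const preconditions: RuleMap = { ...base.preconditions, ...(locale.preconditions || {}) };
  for (const name of locale.ignore || []) {
    delete preconditions[name];
  }
  const actions: RuleMap = { ...base.actions, ...locale.actions };
  const rules: SpeechRule[] = [];
  for (const name of Object.keys(preconditions)) {
    const action = actions[name];
    if (action !== undefined) {
      rules.push({ name, precondition: preconditions[name], action });
    }
  }
  return rules;
}

const baseLocale: BaseLocale = {
  preconditions: { fraction: 'self::fraction', sqrt: 'self::sqrt', superscript: 'self::superscript' },
  actions: { sqrt: 'say "root"' },
};

const germanLocale: LocaleRules = {
  ignore: ['superscript'],
  actions: { fraction: 'say "durch"', sqrt: 'say "Wurzel"' },
};

const germanRules = buildRules(baseLocale, germanLocale);
console.log(germanRules.map((r) => r.name + ': ' + r.action));
```

In this model the locale supplies only actions and a short ignore list, which is why localised rule files can stay small.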

New Localisation Support

  • All localisation now in a dedicated repository sre-l10n
  • Dedicated YAML format for speech rule actions.
  • Support for CrowdIn Localisation of symbols, functions and units.
  • Automatic update of speech rules from the sre-l10n repository.
  • No more changes to locales in this repository

API Changes and Additions

Promise based processing

The functionality for loading locales and updating the engine has been refactored to use ES6 promises. This changes the asynchronous behaviour of the engine, which client code will have to take into account.

Changes to Setup Functions and File Loading

  • In particular, the following API methods in system.ts have changed:
    • engineReady() returns a Promise that resolves as soon as the engine is ready for processing.
    • setupEngine() is now an asynchronous function that returns a Promise that resolves as soon as the engine is ready for processing.
    • Other methods that return promises are the file loading methods file.toSpeech, file.toSemantic, ...
  • The engine is considered ready for processing when all necessary rule files have been loaded for the current locale and the engine is done updating other internals, like the rule indexing structure, the constraint structures, etc.
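
Client code therefore has to wait for the setup promise before translating. The following is a minimal sketch of that pattern against a stand-in engine, so that it runs without SRE installed. The `EngineLike` interface, the `speakWhenReady` helper, and the stub are hypothetical; the method signatures, including a synchronous toSpeech, are assumptions rather than the documented API.

```typescript
// Minimal structural type for the parts of the API used here.
interface EngineLike {
  setupEngine(features: { [key: string]: string }): Promise<unknown>;
  engineReady(): Promise<unknown>;
  toSpeech(mathml: string): string;
}

// Configure the engine, wait until it is ready, then translate.
function speakWhenReady(
  engine: EngineLike,
  features: { [key: string]: string },
  mathml: string
): Promise<string> {
  return engine
    .setupEngine(features)
    .then(() => engine.engineReady())
    .then(() => engine.toSpeech(mathml));
}

// Stand-in engine for demonstration only; it is not SRE.
function makeStubEngine(): EngineLike {
  let ready = false;
  return {
    // Becomes ready asynchronously, like loading rule files would.
    setupEngine: () => Promise.resolve().then(() => { ready = true; }),
    engineReady: () => Promise.resolve(),
    toSpeech: (mathml: string) => {
      if (!ready) throw new Error('engine not ready');
      return 'speech for ' + mathml;
    },
  };
}

const pendingSpeech = speakWhenReady(makeStubEngine(), { locale: 'de' }, '<math><mi>x</mi></math>');
pendingSpeech.then((speech) => console.log(speech));
```

Calling toSpeech on the stub before the setup promise resolves throws, which mirrors the kind of ordering mistake the promise-based API is meant to make explicit.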

Custom Load methods

  • A custom method for loading locales can now be specified.
  • The custom load method can be passed to the feature vector in setupEngine.
  • In the browser it can be defined in the SREfeature variable.

For more information see the README.
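
A custom load method might look roughly like the sketch below. The loader signature (locale code in, promise of JSON text out), the `cachedLoader` wrapper, and the in-memory `bundledLocales` source are assumptions for illustration; the README describes the signature SRE actually expects.

```typescript
// Assumed loader shape: takes a locale code, resolves to that locale's JSON text.
type LocaleLoader = (locale: string) => Promise<string>;

// Wrap any loader so each locale is requested at most once.
function cachedLoader(load: LocaleLoader): LocaleLoader {
  const cache = new Map<string, Promise<string>>();
  return (locale: string) => {
    let entry = cache.get(locale);
    if (entry === undefined) {
      entry = load(locale);
      cache.set(locale, entry);
    }
    return entry;
  };
}

// Stand-in source of locale files; a real loader might read from disk or a CDN.
const bundledLocales: { [locale: string]: string } = {
  en: '{"messages": {}, "rules": {}}',
};

let loadCalls = 0;
const loadBundled: LocaleLoader = (locale) => {
  loadCalls++;
  const json = bundledLocales[locale];
  return json !== undefined
    ? Promise.resolve(json)
    : Promise.reject(new Error('no mappings for locale ' + locale));
};

const localeLoader = cachedLoader(loadBundled);
localeLoader('en').then((text) => console.log(Object.keys(JSON.parse(text))));
```

Note that this wrapper also caches failed loads; a production loader would probably evict rejected promises so that a retry is possible.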

New API Functions and Features

  • Four new API functions for translating numbers, ordinals and vulgar fractions into word representations in the respective locales.
  • Sub locales are exposed (e.g., for different reading of numbers in the same language).
  • New corresponding options for CLI frontend.

Revamping the Rule Engine

Changes to data structures, speech rules and code structure.

Simple Speech Rules

Simple speech rules for Unicode symbols, functions and units are now handled separately from regular speech rules. That is, the data structure MathSimpleStore no longer inherits from MathStore. While this requires some additional logic for parsing, looking up, and selecting simple rules, it avoids the memory cost of functionality that simple rule stores never required.

Speech Rule Stores

Classes with interface SpeechRuleStore no longer have a trie for indexing speech rules. They are exclusively a container for storing rules together with a common context. Rule lookup can only be done via findRule.
In particular, stores no longer provide a lookupRule method that matches rule applicability with respect to a given DOM node.
Rule lookup is now done on tries only.

Speech Rule Engine

The core engine no longer uses a SpeechRuleStore to look up rules. Previously an active store would have been selected or constructed as a combination of stores, and the store's trie would be used for looking up rules. Now the engine uses a single trie only.

Speech rules are immediately sorted into the trie when a locale is loaded. While the trie can still be pruned, there is no longer any combining of rule stores (into the active store) or reindexing of tries. The rule engine can therefore no longer be furnished with only a selection of speech rule stores. It will always work with all rules of all locales currently loaded.
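
The single-trie design can be sketched as below. This is a deliberately simplified model, not SRE's trie: the `RuleTrie` class is hypothetical, rules are keyed only by an assumed constraint path (locale, modality, domain, style), and the matching against DOM nodes that the real engine performs is left out.

```typescript
interface Rule {
  name: string;
  action: string;
}

class TrieNode {
  children = new Map<string, TrieNode>();
  rules: Rule[] = [];
}

class RuleTrie {
  private root = new TrieNode();

  // Sort a rule into the trie under its constraint path.
  add(path: string[], rule: Rule): void {
    let node = this.root;
    for (const key of path) {
      let next = node.children.get(key);
      if (!next) {
        next = new TrieNode();
        node.children.set(key, next);
      }
      node = next;
    }
    node.rules.push(rule);
  }

  // Return the rules stored exactly at the given path.
  lookup(path: string[]): Rule[] {
    let node: TrieNode | undefined = this.root;
    for (const key of path) {
      node = node.children.get(key);
      if (!node) return [];
    }
    return node.rules;
  }

  // Remove the whole subtree below a prefix, e.g. all rules of one locale.
  prune(prefix: string[]): boolean {
    if (prefix.length === 0) return false;
    let node: TrieNode | undefined = this.root;
    for (const key of prefix.slice(0, -1)) {
      node = node.children.get(key);
      if (!node) return false;
    }
    return node.children.delete(prefix[prefix.length - 1]);
  }
}

// Rules of every loaded locale live in the same trie.
const ruleTrie = new RuleTrie();
ruleTrie.add(['en', 'speech', 'mathspeak', 'default'], { name: 'sqrt', action: 'say "root"' });
ruleTrie.add(['de', 'speech', 'mathspeak', 'default'], { name: 'sqrt', action: 'say "Wurzel"' });

console.log(ruleTrie.lookup(['de', 'speech', 'mathspeak', 'default']));
```

Because every locale shares one structure, dropping a locale is a single prune on its prefix rather than a rebuild of a combined store.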

Locale Messages

Code for locales that was included in the compiled version has been considerably reduced. Only methods for number string generation and alphabet combinations remain. The latter are often shared between multiple locales. Consequently the growth of the size of sre.js should be small when adding new locales.

Messages for locales have been refactored into three categories, included in a new messages subdirectory and in the locale JSON structures:

  • alphabets: Strings for Greek and Latin alphabets and corresponding prefixes.
  • messages: Messages for MathSpeak, fonts, embellishments, roles, etc.
  • numbers: Strings necessary for generating numbers.

Bug fixes

Version 4 contains a number of bug fixes, some introduced during the conversion. However, the issue tracker has been very much neglected during the conversion period. Hopefully this will change in the near future.

Deprecation Notes

  • The deprecated -i option has been removed.
  • The sre_browser.js library is no longer necessary and no longer created.
  • Support for Internet Explorer has been removed. That is, the IE mappings file in the npm repository will no longer be updated. You can still use it, but it will not get any new locale updates and might stop working in some future release.

This will be the last release in this repository. It will move to the Speech-Rule-Engine organisation in the future.
