rasa 2.3.0 on Python PyPI

Improvements

#5673: Expose diagnostic data for action and NLU predictions.
Add diagnostic_data field to the Message
and Prediction objects, which contain
information about attention weights and other intermediate results of the inference computation.
This information can be used for debugging and fine-tuning, e.g. with RasaLit.
For examples of how to access the diagnostic data, see here.
#5986: Use the TrainingDataImporter interface to load the data in rasa test core.
Failed test stories are now referenced by their absolute path instead of the relative path.
#7292: Improve error handling and Sentry tracking:
- Raise MarkdownException when training data in Markdown format cannot be read.
- Raise InvalidEntityFormatException error instead of json.JSONDecodeError when the entity format is invalid
  in training data.
- Gracefully handle empty sections in endpoint config files.
- Introduce ConnectionException error and raise it when TrackerStore and EventBroker
  cannot connect to 3rd party services, instead of raising exceptions from 3rd party libraries.
- Improve rasa.shared.utils.common.class_from_module_path function by making sure it always returns a class.
  The function currently raises a deprecation warning if it detects an anomaly.
- Ignore MemoryError and asyncio.CancelledError in Sentry.
- rasa.shared.utils.validation.validate_training_data now raises a SchemaValidationError when validation fails
  (this error inherits jsonschema.ValidationError, ensuring backwards compatibility).
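The backwards-compatibility point in the last item can be illustrated with a minimal sketch. It deliberately avoids importing Rasa or jsonschema: `ValidationError` and `SchemaValidationError` below are local stand-ins for the real classes, and the toy `validate_training_data` rule (requiring an `nlu` key) is purely hypothetical. What it shows is that code written to catch the parent error keeps working when the function starts raising the subclass.

```python
class ValidationError(Exception):
    """Local stand-in for jsonschema.ValidationError (not the real class)."""


class SchemaValidationError(ValidationError):
    """Local stand-in for the new error type; it subclasses the old one."""


def validate_training_data(data: dict) -> None:
    """Toy validator with a made-up rule: a top-level "nlu" key is required."""
    if "nlu" not in data:
        raise SchemaValidationError("missing required key 'nlu'")


def legacy_caller(data: dict) -> str:
    """Caller written before the change; it only knows about ValidationError."""
    try:
        validate_training_data(data)
    except ValidationError as error:
        # The subclass is caught here because of the inheritance relationship.
        return f"invalid: {error}"
    return "valid"


if __name__ == "__main__":
    print(legacy_caller({}))
    print(legacy_caller({"nlu": []}))
```

New code can catch the narrower error type, while existing `except` clauses for the parent type continue to match.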
#7303: Allow PolicyEnsemble in cases where calling an individual policy's load method returns None.

#7420: User message metadata can now be accessed via the default slot
session_started_metadata during the execution of a
custom action_session_start.

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import ActionExecuted, EventType, SessionStarted
from rasa_sdk.executor import CollectingDispatcher


class ActionSessionStart(Action):
    def name(self) -> Text:
        return "action_session_start"

    async def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[EventType]:
        # The user message metadata is exposed through this default slot
        metadata = tracker.get_slot("session_started_metadata")

        # Do something with the metadata
        print(metadata)

        # The session should begin with a `session_started` event and an
        # `action_listen`, as a user message follows
        return [SessionStarted(), ActionExecuted("action_listen")]

#7579: Add BILOU tagging schema for entity extraction in end-to-end TEDPolicy.
#7616: Added two new parameters constrain_similarities and model_confidence to machine learning (ML) components - DIETClassifier, ResponseSelector and TEDPolicy.
Setting constrain_similarities=True adds a sigmoid cross-entropy loss on all similarity values to restrict them to an approximate range in DotProductLoss. This should help the models to perform better on real world test sets.
By default, the parameter is set to False to preserve the old behaviour, but users are encouraged to set it to True and re-train their assistants as it will be set to True by default from Rasa Open Source 3.0.0 onwards.
The model_confidence parameter affects how the model's confidence for each label is computed during inference. It can take three values:
1. softmax - Similarities between input and label embeddings are post-processed with a softmax function, so that the confidences for all labels sum to 1.
2. cosine - Cosine similarity between input and label embeddings. Confidence for each label will be in the range [-1,1].
3. inner - Dot product similarity between input and label embeddings. Confidence for each label will be in an unbounded range.
Setting model_confidence=cosine should help users tune the fallback thresholds of their assistant better. The default value is softmax to preserve the old behaviour, but we recommend using cosine as that will be the new default value from Rasa Open Source 3.0.0 onwards. The value of this option does not affect how confidences are computed for entity predictions in DIETClassifier and TEDPolicy.
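To make the difference between the three values concrete, here is a small pure-Python sketch that computes all three kinds of confidence for a toy input embedding and three toy label embeddings. The function name `confidences`, the two-dimensional vectors and the label names are illustrative assumptions, not Rasa code. In particular, applying the softmax to dot-product similarities is an assumption of this sketch.

```python
import math
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def confidences(
    input_emb: List[float], label_embs: Dict[str, List[float]], mode: str
) -> Dict[str, float]:
    """Toy version of the three confidence modes (not Rasa's implementation)."""
    inner = {label: dot(input_emb, emb) for label, emb in label_embs.items()}
    if mode == "inner":
        # Raw dot products: unbounded range.
        return inner
    if mode == "cosine":
        # Dot product divided by both vector norms: range [-1, 1].
        input_norm = math.sqrt(dot(input_emb, input_emb))
        return {
            label: inner[label] / (input_norm * math.sqrt(dot(emb, emb)))
            for label, emb in label_embs.items()
        }
    if mode == "softmax":
        # Softmax over the similarities: values sum to 1.
        highest = max(inner.values())  # subtract the max for numerical stability
        exps = {label: math.exp(value - highest) for label, value in inner.items()}
        total = sum(exps.values())
        return {label: value / total for label, value in exps.items()}
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    labels = {"greet": [3.0, 4.0], "goodbye": [-4.0, 3.0], "deny": [-3.0, -4.0]}
    for mode in ("softmax", "cosine", "inner"):
        print(mode, confidences([3.0, 4.0], labels, mode))
```

With cosine, an orthogonal label scores 0 and an opposite one scores -1 regardless of how many labels exist, which is why a fixed fallback threshold is easier to reason about than with softmax, where every score depends on all other labels.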
With both the above recommendations, users should configure their ML component, e.g. DIETClassifier, as
```
- name: DIETClassifier
  model_confidence: cosine
  constrain_similarities: True
  ...
```
Once the assistant is re-trained with the above configuration, users should also tune fallback confidence thresholds.
Configuration option loss_type=softmax is now deprecated and will be removed in Rasa Open Source 3.0.0. Use loss_type=cross_entropy instead.
The default auto-configuration is changed to use constrain_similarities=True and model_confidence=cosine in ML components so that new users start with the recommended configuration.
#7817: Use simple random uniform distribution of integers in negative sampling, because
negative sampling with tf.while_loop and random shuffle inside creates a memory leak.
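As a rough illustration of the idea, the sketch below draws negative candidate indices from a uniform distribution of integers in a single pass, with no loop-and-shuffle step. It uses Python's standard `random` module rather than TensorFlow, and the function name and parameters are hypothetical. A drawn index may repeat or coincide with the positive label; the sketch does not handle such collisions, and how Rasa treats them is not covered by the note above.

```python
import random
from typing import List


def sample_negative_indices(
    num_candidates: int, batch_size: int, num_negatives: int, rng: random.Random
) -> List[List[int]]:
    """Draw `num_negatives` candidate indices per example, uniformly at random.

    Sampling is with replacement, so duplicates are possible.
    """
    return [
        [rng.randrange(num_candidates) for _ in range(num_negatives)]
        for _ in range(batch_size)
    ]


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(sample_negative_indices(num_candidates=10, batch_size=4, num_negatives=3, rng=rng))
```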
#7848: Added support to configure exchange_name for pika event broker.
#7867: If MaxHistoryTrackerFeaturizer is used, invert the dialogue sequence before passing
it to the transformer so that the last dialogue input becomes the first one and
therefore always has the same positional encoding.
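The effect of the inversion can be seen in a small sketch. The helper `with_positions` is hypothetical and ignores padding and featurization entirely; it only pairs each dialogue turn in the max-history window with the position index a positional encoding would be derived from.

```python
from typing import List, Tuple


def with_positions(turns: List[str], max_history: int, invert: bool) -> List[Tuple[int, str]]:
    """Keep the last `max_history` turns and pair each with its position index."""
    window = turns[-max_history:]
    if invert:
        # Most recent turn first, so it always receives position 0.
        window = window[::-1]
    return list(enumerate(window))


if __name__ == "__main__":
    short = ["greet", "ask_weather"]
    long = ["greet", "ask_weather", "inform_city", "thank", "goodbye"]
    for dialogue in (short, long):
        print("original:", with_positions(dialogue, max_history=3, invert=False))
        print("inverted:", with_positions(dialogue, max_history=3, invert=True))
```

Without inversion, the most recent turn sits at position 1 in the short dialogue and position 2 in the long one. With inversion, it is at position 0 in both.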

Bugfixes

#7420: Fixed an error when using the endpoint GET /conversations/<conversation_id:path>/story
with a tracker which contained slots.
#7707: Add an option to TEDPolicy to configure whether extracted entities should be split by comma (",") or not. Fixes
a crash when this parameter is accessed during extraction.
#7710: When switching forms, the next form will always correctly ask for the first required slot.
Before, the next form did not ask for the slot if it was the same slot as the requested slot of the previous form.
#7749: Fix a bug where RulePolicy's loop-handling predictions were overwritten by the end-to-end TEDPolicy.
#7751: When switching forms, the next form is cleanly activated.
Before, the next form was correctly activated, but the previous form had wrongly uttered
the response that asked for the requested slot when slot validation for that slot
had failed.
#7829: Fix a bug in incremental training when passing a specific model path with the --finetune argument.
#7867: Fix the role of unidirectional_encoder in TED. This parameter is only applied to
transformers for text, action_text and label_action_text.

Miscellaneous internal changes

#7420, #7515, #7574, #7601