prj#01

Curves Ahead

09/2024

The Goal

Building on prj#00, this new prj#01 project pushes representation further by using spline curves as output.

I adapted new tools and the Pipeline from the Labs. The ML model learns from structured story elements (actors, locations, actions, emotions, ...) and generates splines that trace their evolution through each scene.

This project works with Seq2Seq.

The R&D Kitchen

I started with a question and a brainstorming session using OpenAI chat (see below for a sample of the original question).

The journey to encode, decode, and infer from the decoder turned into quite an interesting process of 'coworking' with the AI. It all started with a simple prompt to describe the problem.

From there, the kitchen heated up: Seq2Seq models, variable-length output sequences, GRU/LSTM Encoder-Decoders with Attention, Message-passing GNNs, LSTMs, RSNNs, and more all came up for analysis and discussion.

After some back-and-forth, the decision landed on Seq2Seq, a recipe refined through a few iterations into a candidate pipeline.

Adding more tools to the Labs

Prepared curves from story events: generated embeddings from text
Generated splines from handcrafted references for training: opted for SVGs
Trained spline Seq2Seq pipeline with Keras:
- Built training model:
  - Encoder inputs → LSTM → encoder LSTM.
  - Decoder inputs → LSTM → decoder LSTM (with initial_state=encoder_states).
  - Explored various models and event pipelining approaches
- Inference model:
  - The encoder model is derived as an intermediate output from training.
- Inference predict:
  - Loop for up to time_max_len steps, calling decoder_model.predict at each step
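
As a rough illustration of the "Inference predict" step, the loop can be sketched in plain Python. This is a sketch under assumptions, not the project's code: `decoder_step` is a hypothetical stand-in for a call to `decoder_model.predict`, and the stop signal that ends a curve early is only one assumed way of getting variable-length output.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
State = Tuple[float, ...]
# Hypothetical stand-in for decoder_model.predict:
# (previous point, state) -> (next point, stop probability, new state)
DecoderStep = Callable[[Point, State], Tuple[Point, float, State]]


def decode_spline(encoder_state: State, decoder_step: DecoderStep, time_max_len: int,
                  start_point: Point = (0.0, 0.0),
                  stop_threshold: float = 0.5) -> List[Point]:
    """Generate control points one step at a time, feeding each output back in."""
    points: List[Point] = []
    prev, state = start_point, encoder_state  # the decoder starts from the encoder states
    for _ in range(time_max_len):  # hard cap on the number of control points
        point, stop_prob, state = decoder_step(prev, state)
        if stop_prob >= stop_threshold:  # assumed stop signal: the curve ends early
            break
        points.append(point)
        prev = point
    return points


def toy_decoder_step(prev: Point, state: State) -> Tuple[Point, float, State]:
    """Toy decoder for demonstration only: state[0] counts the points left to emit."""
    remaining = state[0]
    next_point = (prev[0] + 1.0, prev[1] + 0.5)
    stop_prob = 1.0 if remaining <= 0 else 0.0
    return next_point, stop_prob, (remaining - 1,)


curve = decode_spline((3.0,), toy_decoder_step, time_max_len=10)
print(curve)  # [(1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]
```

With a real model, the toy step would be replaced by a function that wraps the decoder prediction and returns the updated LSTM states, so that each step continues from where the previous one stopped.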

Below is a diagram of the inference model. Note that the training architecture differs from the one used during inference: in this case, the inference encoder reuses the Encoder Input Layer and Encoder LSTM Layer from the trained model.

Version 01.0: Splines per scene

After training the model with handcrafted images representing scenes (each spline represents one element of the structured story: actor, action, emotion, ...), the system was (supposedly) able to infer new splines from new texts.
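
Since the handcrafted references were SVGs, their curves have to be turned into point sequences before training. The following is a minimal sketch of one way to do that, not the project's actual tooling: the function names are hypothetical, each `<path>` is assumed to carry an `id` naming its story element, and only absolute path commands whose arguments are coordinate pairs (M, L, C, S, Q, T) are handled.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

SVG_NS = "{http://www.w3.org/2000/svg}"
NUMBER = re.compile(r"[-+]?(?:\d*\.\d+|\d+\.?)(?:[eE][-+]?\d+)?")
SUPPORTED = set("MLCSQTZz")  # Z/z take no arguments, so they are harmless


def path_points(d: str) -> List[Tuple[float, float]]:
    """Return the (x, y) pairs found in a path's `d` attribute, in order."""
    commands = set(re.findall(r"[A-Za-z]", NUMBER.sub(" ", d)))
    if not commands <= SUPPORTED:
        raise ValueError(f"unsupported path commands: {sorted(commands - SUPPORTED)}")
    values = [float(v) for v in NUMBER.findall(d)]
    if len(values) % 2:
        raise ValueError("odd number of coordinates in path data")
    return list(zip(values[0::2], values[1::2]))


def svg_splines(svg_text: str) -> Dict[str, List[Tuple[float, float]]]:
    """Map each <path> (by id, or by position if it has none) to its points."""
    root = ET.fromstring(svg_text)
    return {
        path.get("id", f"path{i}"): path_points(path.get("d", ""))
        for i, path in enumerate(root.iter(SVG_NS + "path"))
    }


example_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path id="actor" d="M 0 0 C 10 20, 30 40, 50 0"/>'
    '<path id="emotion" d="M0,5 L20,15"/>'
    "</svg>"
)
print(svg_splines(example_svg))
```

The resulting lists have different lengths per element, which is exactly the variable-length output that motivated Seq2Seq in the first place.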

- A new SplineSeq2SeqContentRule to produce splines from story elements

- This uses splineSeq2SeqAi which calls the prediction model

- Finally a new SplineContentRenderer to visualize the output
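
The internals of SplineContentRenderer are not shown here, so the following is only a hypothetical sketch of what a rendering step could look like: it draws a smooth curve through predicted control points as an SVG path. The function name `spline_to_svg_path` and the choice of a uniform Catmull-Rom spline converted to cubic Bézier segments are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def spline_to_svg_path(points: Sequence[Point], stroke: str = "black",
                       stroke_width: float = 2) -> str:
    """Return an SVG <path> that passes smoothly through the given points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Duplicate the end points so every segment has two neighbours.
    padded: List[Point] = [points[0], *points, points[-1]]
    parts = [f"M {points[0][0]:.2f} {points[0][1]:.2f}"]
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1], padded[i], padded[i + 1], padded[i + 2]
        # Uniform Catmull-Rom to cubic Bezier control points.
        c1 = (p1[0] + (p2[0] - p0[0]) / 6, p1[1] + (p2[1] - p0[1]) / 6)
        c2 = (p2[0] - (p3[0] - p1[0]) / 6, p2[1] - (p3[1] - p1[1]) / 6)
        parts.append(
            f"C {c1[0]:.2f} {c1[1]:.2f}, {c2[0]:.2f} {c2[1]:.2f}, "
            f"{p2[0]:.2f} {p2[1]:.2f}"
        )
    d_attr = " ".join(parts)
    return (f'<path d="{d_attr}" fill="none" '
            f'stroke="{stroke}" stroke-width="{stroke_width}"/>')


element = spline_to_svg_path([(0, 0), (10, 10), (20, 0)], stroke="crimson")
print(element)
```

Wrapping one such element per story element inside an `<svg>` root would give one scene; colour and stroke width are the obvious hooks for variations in the renderer view.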

And a new representation of a story, in this case a few chapters of "Winnie the Pooh".

Version 01.1: Coloured scenes

Inspired by Sol LeWitt’s “Wavy Brushstrokes” (1995), I updated the Pipeline with a new Rule and a new Renderer view.

This goes back to the representation of the story "Cinderella":

Version 01.2: Filling spaces

Inspired by Bridget Riley, this version uses a new Renderer view with the same Rules.

This goes back to the representation of the story "Cinderella":


And finally, this was the original prompt mentioned in "The R&D Kitchen" that I provided to OpenAI chat to start the analysis discussion:

...

- Input is composed of: JSON as [ "event": {"name": <string>, ... "emotions": <list of strings>} ]

- Output is composed of multiple variables for each field in the event input (1-1):

- For each field in the "event," a list of x, y control points represents key values.

Note that the list of control points can vary across types, e.g., [ "event_curves": {"name": <list of control points>, ... "emotions": <list of control points>} ]

To train the model, I can provide JSON with events and their respective event_curves.

I was considering producing embeddings for each event field and training a multivariable regression model to produce event_curves per field.

However, I realized I could train N multivariable regression models —one for each event type — and then predict N times, once for each event type.

But this still has two problems:

1) The N trained models will not consider correlation information from all input fields for a given event, as each model will be independent.

2) I don’t see a multivariable regression model dynamically producing a list of control points with a variable size.

What alternative options can I consider to solve these two problems? (Using other ML techniques is fine with me.)

....