hackerm0m 🚢

4y ago

Building Quality Containers Systempoetics.com

Today I read a paper on self-attention transformers in neural networks.

http://nlp.seas.harvard.edu/annotated-transformer/

Forget Norvig & Russell; read Ursula Le Guin to learn about Machine Learning.

Here's why

#machinelearning #ursulaleguin #scifi

🧵👇

In Le Guin's The Left Hand of Darkness, Foretellers tell the future. But the future they tell depends on the question that is asked.

Machine learning is all about the question you ask the model.

The question is the WHY.

Traditional computational problems, mostly analytic problems, are all about the HOW.

They are about breaking something down into constituent steps and doing those steps quickly.

A self-attention #transformer parallelizes steps that used to run sequentially.

There is a small impact on quality but an enormous gain in speed.

Self-attention is a HOW innovation, not a machine learning innovation.

It optimizes the steps of an algorithm.
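
A minimal sketch of why the steps can run side by side, assuming a stripped-down scaled dot-product self-attention in which queries, keys, and values are all the raw input vectors (a real transformer adds learned projections and multiple heads). The function names `softmax` and `self_attention` are illustrative, not taken from the referenced page. The thing to notice is that each output row is computed from the inputs alone, never from the previous row's output, so the loop below could be run for every position at once.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Simplified self-attention over a list of equal-length vectors.

    Assumption: queries = keys = values = x (no learned projections).
    """
    d = len(x[0])
    output = []
    for query in x:  # each iteration is independent of the others
        # Score this position against every position in the sequence.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights = softmax(scores)
        # The new vector is a weighted average of all input vectors.
        output.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return output


if __name__ == "__main__":
    for row in self_attention([[1.0, 0.0], [0.0, 1.0]]):
        print(row)
```

In a recurrent network, step t has to wait for step t-1. Here no row waits for any other, which is why practical implementations replace the loop with a single matrix multiplication.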

BUT just like we cannot separate form from content, we cannot separate solution from speed.

Self-attention works well with language problems, like translation.

On these problems, it does not sacrifice quality.

In self-attention applied to linguistic problems, we keep track of position: where each token sits in the sequence, like the position of a letter in a word.

The position is mapped to a sinusoidal curve.

$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)$$

(see tweet 1 for the reference)
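
A small sketch of that mapping in plain Python, under a few assumptions: the function name `positional_encoding` is made up for illustration, `d_model` is taken to be even, and the odd dimensions use the matching cosine, which comes from the transformer paper rather than from the formula quoted above. It takes one number (the position) and returns a point with `d_model` coordinates.

```python
import math


def positional_encoding(pos: int, d_model: int) -> list[float]:
    """Map a single position to a d_model-dimensional point.

    Even dimensions follow sin(pos / 10000^(2i / d_model)).
    Odd dimensions use the cosine of the same angle (an addition
    from the transformer paper, not from the formula in the thread).
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    encoding = []
    for i in range(d_model // 2):
        angle = pos / (10000 ** (2 * i / d_model))
        encoding.append(math.sin(angle))  # dimension 2i
        encoding.append(math.cos(angle))  # dimension 2i + 1
    return encoding


if __name__ == "__main__":
    for pos in range(3):
        print(pos, positional_encoding(pos, 4))
```

Each pair of dimensions oscillates at a different frequency, so nearby positions land on nearby points while distant positions are still distinguishable.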

Mapping from a position in a word to a curve is profound. I want to pause for a moment. Mapping is a symbolic gesture that allows us to extract more meaning.

We move from a lower-dimensional space to a high-dimensional space: a letter's position to a point on a curve in space and time.

When I confront a problem or a block, I move to a higher-dimensional space.

I become aware of my current context and move into an overview of that context.

When I return to the space of the problem, my location and perspective have changed.

Owen Barfield wrote in Saving the Appearances not to mistake the model for reality. And Gregory Bateson, among others, wrote not to confuse the map with the geography.

But what is the geography in machine learning?

There is no geography in machine learning. We are mapping an imaginary world.

Some imaginary worlds are more useful than others.

When the Foretellers in Le Guin tell the future, they also move into a higher-dimensional space - an alternate reality.

The question frames the space of this alternate reality.

For true machine learning innovation, we need to look at the quality and speed of answers and the quality of the questions.

I'm doing #solidity this month, but maybe next month, I'll do #machinelearning.
