Thinking about typing mathematical notation on a phone

2022-05-15

This post is incomplete; I'll change it and add to it as I work on this problem.

I'd really like to be able to type mathematical notation on my phone.

There are many contexts in which I'd like to do that: writing blog posts; posting on social media; taking notes for myself.

At the moment, whenever I want to toot something on mathstodon.xyz with maths in it, I wait until I'm at my PC, because writing TeX on a touchscreen keyboard is such a pain.

I'd also like to question some of the fundamentals of how TeX works, and investigate whether there are other ways that work just as well or better.

I've had a go at implementing some of these ideas.

Video

Here's a screen recording I made on 2022-05-19, talking about the problem, trying to type some TeX, and then typing with my new system.

Observations

Characters that TeX uses a lot are hard to type on a phone: \ and {} in particular.

Phone keyboards typically show the alphabet and a couple of punctuation symbols on the main screen, with toggles to get to further screens. You can usually type numbers by long-pressing or swiping the top row of letters, or by switching to the second screen, where there are some of the more common non-letter symbols.

There are lots of different layouts for phone keyboards, with different choices about which keys get on the first screen, and what's available with long presses.

. and , are the only easy punctuation characters to type on a phone keyboard.

Some keyboards insert a . automatically when you press space twice. They'll also put a space after punctuation automatically, and capitalise after some punctuation.

Phones can autocomplete words, saving key presses and fixing typos, but making it hard to type non-words.

But autocorrect works in whatever language you want, so I think that keywords have to be localised.

Phone keyboards can type emoji and some other non-ASCII characters much more easily than physical keyboards.

Existing systems

Typed notations have to strike a balance between these objectives:

easy to type
easy to read
looks like conventional handwritten notation
concise
broad domain
conveys semantics

System	Easy to type on phone	Easy to read	Looks like conventional handwritten notation	Concise	Broad domain	Conveys semantics
TeX	no	yes	no	no	yes	ish
AsciiMath	yes	yes	ish	no	no	ish
MathML	no	no	no	no	yes	no/yes
APL	no	no	ish	yes	no	yes
Mathematica	ish	yes	no	no	yes	yes

For a notation used to produce typeset maths, I'd put those objectives in this order, from most to least important:

broad domain
easy to type
easy to read
concise
looks like conventional handwritten notation
conveys semantics

Things that don't go well in TeX

(This is not a complete list)

The default mode is to produce italic letters: you have to mark commands with a \.

This means that if you type cos instead of \cos, you get $cos$, which most people don't notice isn't the same as $\cos$.

Brackets aren't always stretchy, and \left and \right are so long that I'm tempted not to use them.

The LaTeX replacements for TeX commands with unusual grouping rules, like \frac to replace \over, lead to code that doesn't read the way mathematicians speak. I reckon \over could be redeemed by making it stick to the immediately adjacent tokens, instead of absorbing everything to the next curly brace.
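To make the grouping complaint concrete, here is a small TeX illustration of how \over behaves today compared with \frac. The expressions are made up for the example.

```latex
% Plain TeX's \over splits the whole enclosing group in two:
$$ 1 + a \over b + 2 $$      % numerator is "1 + a", denominator is "b + 2"

% so braces are needed to limit what it absorbs:
$$ 1 + {a \over b} + 2 $$

% LaTeX's \frac takes its two parts as explicit arguments instead:
$$ 1 + \frac{a}{b} + 2 $$
```

With an \over that only stuck to its immediate neighbours, the first line would mean the same as the second.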

Common structures

You should be able to produce just about any typesetting you like, but things that you do often should be more convenient.

Does TeX get this prioritisation right?

Subscripts and superscripts
Fractions - place two expressions on top of each other, with a horizontal line between them.
Brackets - normally
Grids - e.g. matrices, vectors, aligned equations
Operators
Different typefaces - italic, bold, upright, blackboard bold, fraktur
Environments - in TeX, \begin{…} … \end{…} is used a lot to change the syntax of the contents.

Decisions and questions

Input shouldn't be case-sensitive (but displayed case is important: how to insist on case?)

It's easy to type spaces, so encourage spaces to separate things. (but the number of spaces mustn't be important - phone keyboards love to add spaces)

Everything gets converted to lower-case; have to explicitly opt in to upper-case with a modifier.

Words for everything, but support symbols too (if things like × are easy to type)

Put a . before a string of non-space characters to represent a symbol, e.g. a variable name.

, works similarly to ., but produces upper-case.

Replace curly braces for grouping with keywords: open = {, close = }. For convenience, and = }{, because writing close open to separate two groups doesn't look like it makes sense.

My attempt at a new notation

So far I've spent an afternoon on it, and it applies some simple rewrite rules to convert to TeX.

I don't think this is sustainable: I'm sure that at some point I'll want to diverge from TeX's syntax enough that the conversion will become less straightforward.

But for now, this lets me easily use structures from TeX that I haven't thought about in detail yet.

Here's how it works:

Tokenise

First, the code is tokenised.

There aren't many token types. I think that it's a good idea to minimise the number of token types.

Dot symbol

Regex: /^(\.\s*(\w+?))(?:\W|$)/

A dot, followed by optional space characters, then a string of word (non-space, non-punctuation) characters. Converted to lower-case.

Produces a symbol. Equivalent to just typing lower-case word characters unmodified in TeX.

e.g. .n, .speed

Comma symbol

Regex: /^(,\s*(\w+?))(?:\W|$)/

A comma, followed by optional space characters, then a string of word (non-space, non-punctuation) characters. Converted to upper-case.

Produces a symbol. Equivalent to just typing upper-case word characters unmodified in TeX.
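As a rough sketch of how the dot symbol and comma symbol rules could be applied in JavaScript: the two regexes are the ones given above, but the helper name readSymbol and the shape of its return value are made up for illustration, and this isn't the actual implementation.

```javascript
// Sketch only: readSymbol is a hypothetical helper.
// The two regexes are copied from the Dot symbol and Comma symbol rules.
const DOT_SYMBOL = /^(\.\s*(\w+?))(?:\W|$)/;
const COMMA_SYMBOL = /^(,\s*(\w+?))(?:\W|$)/;

function readSymbol(input) {
  let m = DOT_SYMBOL.exec(input);
  if (m) {
    // Group 1 is the text consumed; group 2 is the name without the dot.
    return { name: m[2].toLowerCase(), consumed: m[1].length };
  }
  m = COMMA_SYMBOL.exec(input);
  if (m) {
    return { name: m[2].toUpperCase(), consumed: m[1].length };
  }
  return null; // not a dot or comma symbol
}

console.log(readSymbol('.Speed eq 1')); // { name: 'speed', consumed: 6 }
console.log(readSymbol(', n'));         // { name: 'N', consumed: 3 }
```

The case of what was typed doesn't matter, and neither does a space that the keyboard inserts after the dot or comma.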

Dot literal

Regex: /^\.\./

Produces a literal dot.

Comma literal

Regex: /^,,/

Produces a literal comma.

Plain text

Regex: /^("([^"]*)")/

Produces a string of plain text.

Equivalent to \text{…} in TeX.

Whitespace

Regex: /^(\s+)/

Ignored!

Number

Regex: /^(\d+(?:\.\s*\d+)?)/

The second real localisation problem: for convenience, I'd like to type numbers without putting a dot in front, but that means that I have to say what the decimal separator is.

So this might go, and you'll have to put a dot in front of a number, unless I can think of a clever way of dealing with decimal separators (maybe everything from a digit to the next whitespace is a single token?)
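Here is a sketch comparing the current number rule with the "everything from a digit to the next whitespace" idea. Both function names are hypothetical, and stripping the space after the decimal point is an assumption about what should happen to it.

```javascript
// Sketch only: both function names are hypothetical.

// The current rule: digits, then optionally a dot, optional spaces and more digits.
function readNumber(input) {
  const m = /^(\d+(?:\.\s*\d+)?)/.exec(input);
  if (!m) return null;
  // Assumption: drop any space the keyboard inserted after the decimal point.
  return { text: m[1].replace(/\s+/g, ''), consumed: m[1].length };
}

// The alternative: everything from a digit up to the next whitespace
// is one token, whatever separator it contains.
function readNumberToWhitespace(input) {
  const m = /^(\d\S*)/.exec(input);
  return m ? { text: m[1], consumed: m[1].length } : null;
}

console.log(readNumber('3. 14 plus 1'));            // { text: '3.14', consumed: 5 }
console.log(readNumber('3,14 plus 1'));             // { text: '3', consumed: 1 }
console.log(readNumberToWhitespace('3,14 plus 1')); // { text: '3,14', consumed: 4 }
```

The trade-off is that the alternative copes with any decimal separator, but a space added automatically after the separator would split the number in two.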

Keyword

Regex: /^(\w+)/

Any string of word characters.

If it's a defined keyword, it's replaced with whatever TeX code the definition specifies; otherwise a \ is prepended and we assume it's a TeX command.

Symbol

Regex: /^(\W+)/

Any string of non-word characters.

At the moment, I assume these are all valid TeX. At least in MathJax, things like × and ÷ are OK.
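Putting the token types together, a tokeniser along these lines might look like the following. This is a minimal sketch, not the real code: the names tokenise and RULES, the token type names, the order the rules are tried in, and the capture groups added to the two literal regexes are all assumptions.

```javascript
// Sketch only: RULES, tokenise and the token type names are hypothetical.
// Each rule pairs a token type with the regex given for it above.
// Capture group 1 is always the text consumed from the input.
const RULES = [
  ['dotLiteral',   /^(\.\.)/],
  ['commaLiteral', /^(,,)/],
  ['dotSymbol',    /^(\.\s*(\w+?))(?:\W|$)/],
  ['commaSymbol',  /^(,\s*(\w+?))(?:\W|$)/],
  ['text',         /^("([^"]*)")/],
  ['whitespace',   /^(\s+)/],
  ['number',       /^(\d+(?:\.\s*\d+)?)/],
  ['keyword',      /^(\w+)/],
  ['symbol',       /^(\W+)/],
];

function tokenise(source) {
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let type, m;
    for (const [t, regex] of RULES) {
      m = regex.exec(rest);
      if (m) { type = t; break; }
    }
    // Between them, the keyword and symbol rules match any character,
    // so m is never null here.
    rest = rest.slice(m[1].length);
    if (type === 'whitespace') continue; // ignored
    let value = m[1];
    switch (type) {
      case 'dotLiteral':   value = '.'; break;
      case 'commaLiteral': value = ','; break;
      case 'dotSymbol':    value = m[2].toLowerCase(); break;
      case 'commaSymbol':  value = m[2].toUpperCase(); break;
      case 'text':         value = m[2]; break;
      case 'number':       value = m[1].replace(/\s+/g, ''); break;
      case 'keyword':      value = m[1].toLowerCase(); break;
    }
    tokens.push({ type, value });
  }
  return tokens;
}

const show = (src) => tokenise(src).map((t) => t.type + ':' + t.value).join(' ');
console.log(show('open. i eq 1 close'));
// keyword:open dotSymbol:i keyword:eq number:1 keyword:close
```

One thing this sketch exposes: the symbol regex is greedy over all non-word characters, so an input like `+ .x` is read as the symbol `+ .` followed by the keyword `x`. A narrower pattern that stops at whitespace, dots, commas and quotes would avoid that, though that's a suggestion rather than something the rules above specify.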

Most of the structural features are implemented as keywords.

Keyword	Replaced with
`lb`	`\left(`
`rb`	`\right)`
`eq`	`=`
`plus`	`+`
`minus`	`-`
`times`	`\times`
`divide`	`\div`
`sup`	`^`
`sub`	`_`
`from`	`_`
`to`	`^`
`open`	`{`
`close`	`}`
`and`	`}{`
`bold`	`\mathbf`
`italic`	`\mathit`
`roman`	`\mathrm`
`bb`	`\mathbb`
`frak`	`\mathfrak`
`op`	`\operatorname`
`l`	`&`
`e`	`\\`

When converting to TeX, dot/comma symbols and number literals are always wrapped in curly braces, so they're treated as atoms.

A nice side-effect of this is that environments become slightly easier to type, omitting curly braces:

begin.matrix … end.matrix
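The conversion step could be sketched like this. It assumes tokens shaped like { type, value }, which is a made-up representation, and it joins the pieces with single spaces, which is also an assumption. KEYWORDS holds only a subset of the table above, and toTeX is a hypothetical name.

```javascript
// Sketch only: toTeX and the token shape ({ type, value }) are hypothetical.
// KEYWORDS holds a subset of the keyword table.
const KEYWORDS = new Map([
  ['lb', '\\left('], ['rb', '\\right)'],
  ['eq', '='], ['plus', '+'], ['minus', '-'],
  ['sup', '^'], ['sub', '_'], ['from', '_'], ['to', '^'],
  ['open', '{'], ['close', '}'], ['and', '}{'],
  ['l', '&'], ['e', '\\\\'],
]);

function toTeX(tokens) {
  return tokens.map(({ type, value }) => {
    switch (type) {
      case 'dotSymbol':
      case 'commaSymbol':
      case 'number':
        return '{' + value + '}'; // always wrapped, so treated as an atom
      case 'text':
        return '\\text{' + value + '}';
      case 'keyword':
        // Defined keywords are replaced; anything else is assumed
        // to be the name of a TeX command.
        return KEYWORDS.has(value) ? KEYWORDS.get(value) : '\\' + value;
      default:
        return value; // literals and symbols pass through unchanged
    }
  }).join(' ');
}

const kw = (value) => ({ type: 'keyword', value });
const dot = (value) => ({ type: 'dotSymbol', value });

// Tokens for: begin.matrix .a l .b e .c l .d end.matrix
console.log(toTeX([
  kw('begin'), dot('matrix'), dot('a'), kw('l'), dot('b'), kw('e'),
  dot('c'), kw('l'), dot('d'), kw('end'), dot('matrix'),
]));
// \begin {matrix} {a} & {b} \\ {c} & {d} \end {matrix}
```

Because .matrix is wrapped in braces like any other dot symbol, it ends up as the argument of \begin and \end without any braces being typed.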

Here's the definition of standard deviation written in this system, on my Samsung tablet:

Sigma eq sqrt open frac 1 ,N sum from open. i eq 1 close to ,N lb .x sub .I minus mu rb to 2 close

That produces

σ = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(x_{i} - μ)}^{2}}

Note that a couple of things were capitalised as I typed (Sigma and .I) without affecting the output, and the space between the dot and the i just after open doesn't matter either.

So...

I don't believe I'm making something that will realistically be used by many people, or even by myself, but I think I've identified some important points and I'm sure it'll inspire something in my real work down the line.