Thinking about typing mathematical notation on a phone
This post is incomplete; I'll change it and add to it as I work on this problem.
I'd really like to be able to type mathematical notation on my phone.
There are many contexts in which I'd like to do that: writing blog posts; posting on social media; taking notes for myself.
At the moment, whenever I want to toot something on mathstodon.xyz with maths in it, I wait until I'm at my PC, because writing TeX on a touchscreen keyboard is such a pain.
I'd also like to question some of the fundamentals of how TeX works, and investigate whether there are other ways that work just as well or better.
I've had a go at implementing some of these ideas.
Video
Here's a screen recording I made on 2022-05-19, talking about the problem, trying to type some TeX, and then typing with my new system.
Observations
Characters that TeX uses a lot are hard to type on a phone: `\` and `{}` in particular.
Phone keyboards typically show the alphabet and a couple of punctuation symbols on the main screen, with toggles to get to further screens. You can usually type numbers by long-pressing or swiping the top row of letters, or by switching to the second screen, where there are some of the more common non-letter symbols.
There are lots of different layouts for phone keyboards, with different choices about which keys get on the first screen, and what's available with long presses.
`.` and `,` are the only easy punctuation characters to type on a phone keyboard.
Some keyboards insert a `.` automatically when you press space twice.
They'll also put a space after punctuation automatically, and capitalise after some punctuation.
Phones can autocomplete words, saving key presses and fixing typos, but making it hard to type non-words.
But autocorrect works in whatever language you want, so I think that keywords have to be localised.
Phone keyboards can type emoji and some other non-ASCII characters much more easily than physical keyboards.
Existing systems
Typed notations have to strike a balance between these objectives:
easy to type
easy to read
looks like conventional handwritten notation
concise
broad domain
conveys semantics
System | Easy to type on phone | Easy to read | Looks like conventional handwriting | Concise | Broad domain | Conveys semantics |
---|---|---|---|---|---|---|
TeX | no | yes | no | no | yes | ish |
AsciiMath | yes | yes | ish | no | no | ish |
MathML | no | no | no | no | yes | no/yes |
APL | no | no | ish | yes | no | yes |
Mathematica | ish | yes | no | no | yes | yes |
For a notation used to produce typeset maths, I'd put those objectives in this order, from most to least important:
broad domain
easy to type
easy to read
concise
looks like conventional handwritten notation
conveys semantics
Things that don't go well in TeX
(This is not a complete list)
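The default mode is to produce italic letters: you have to mark commands with a `\`.
This means that if you type `cos` instead of `\cos`, you get three italic letters, set as if they were the product of the variables c, o and s, rather than the upright function name.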
Brackets aren't always stretchy, and \left and \right are so long that I'm tempted not to use them.
The LaTeX replacements for TeX commands with unusual grouping rules, like `\frac` to replace `\over`, lead to code that doesn't read the way mathematicians speak.
I reckon `\over` could be redeemed by making it stick to the immediately adjacent tokens, instead of absorbing everything to the next curly brace.
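A small illustration of the grouping difference, with expressions chosen only as examples. `\over` splits its whole enclosing group into numerator and denominator, so the braces go around the fraction; `\frac` puts the command first and takes two braced arguments.

```latex
% Plain TeX: \over takes everything in the enclosing group,
% so this is (a + b) / (c + d), written in the order you'd say it.
{a + b \over c + d}

% LaTeX: the command comes first, then numerator and denominator.
\frac{a + b}{c + d}

% To get a + (b / c) + d with \over, the fraction needs its own group.
a + {b \over c} + d
```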
Common structures
You should be able to produce just about any typesetting you like, but things that you do often should be more convenient.
Does TeX get this prioritisation right?
Subscripts and superscripts
Fractions - place two expressions on top of each other, with a horizontal line between them.
Brackets - normally
Grids - e.g. matrices, vectors, aligned equations
Operators
Different typefaces - italic, bold, upright, blackboard bold, fraktur
Environments - in TeX, `\begin{…} … \end{…}` is used a lot to change the syntax of the contents.
Decisions and questions
Input shouldn't be case-sensitive (but displayed case is important: how to insist on case?)
It's easy to type spaces, so encourage spaces to separate things. (but the number of spaces mustn't be important - phone keyboards love to add spaces)
Everything gets converted to lower-case; have to explicitly opt in to upper-case with a modifier.
Words for everything, but support symbols too (if things like `×` are easy to type)
`.` before a string of non-space characters to represent a symbol, e.g. variable.
`,` works similarly to `.`, but produces upper-case.
Replace curly braces for grouping with: `open` = `{`, `close` = `}`, and for convenience `and` = `}{` instead of `close open`, which doesn't look like it makes sense.
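A rough sketch of how the case, spacing and grouping decisions above might fit together. The function name `rewriteGrouping` and the `GROUPING` table are made up for illustration; only the three mappings themselves come from the list above.

```javascript
// Hypothetical helper: the names here are not from any real implementation.
// Only the three grouping mappings are taken from the text.
const GROUPING = { open: '{', close: '}', and: '}{' };

function rewriteGrouping(input) {
  return input
    .trim()
    .split(/\s+/) // any run of spaces counts as a single separator
    .map((word) => {
      const key = word.toLowerCase(); // keywords are matched case-insensitively
      return Object.prototype.hasOwnProperty.call(GROUPING, key)
        ? GROUPING[key]
        : word; // anything else is passed through unchanged
    })
    .join(' ');
}

console.log(rewriteGrouping('frac Open  a and b   close'));
// frac { a }{ b }
```

The stray capital letter and the doubled spaces, both things a phone keyboard likes to add, make no difference to the result.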
My attempt at a new notation
So far I've spent an afternoon on it, and it applies some simple rewrite rules to convert to TeX.
I don't think this is sustainable: I'm sure that at some point I'll want to diverge from TeX's syntax enough that the conversion will become less straightforward.
But for now, this lets me easily use structures from TeX that I haven't thought about in detail yet.
Here's how it works:
Tokenise
First, the code is tokenised.
There aren't many token types. I think that it's a good idea to minimise the number of token types.
- Dot symbol
  Regex: `/^(\.\s*(\w+?))(?:\W|$)/`
  A dot, followed by optional space characters, then a string of word (non-space, non-punctuation) characters. Converted to lower-case.
  Produces a symbol. Equivalent to just typing lower-case word characters unmodified in TeX.
  e.g. `.n`, `.speed`
- Comma symbol
  Regex: `/^(,\s*(\w+?))(?:\W|$)/`
  A comma, followed by optional space characters, then a string of word (non-space, non-punctuation) characters. Converted to upper-case.
  Produces a symbol. Equivalent to just typing upper-case word characters unmodified in TeX.
- Dot literal
  Regex: `/^\.\./`
  Produces a literal dot.
- Comma literal
  Regex: `/^,,/`
  Produces a literal comma.
- Plain text
  Regex: `/^("([^"]*)")/`
  Produces a string of plain text.
  Equivalent to `\text{…}` in TeX.
- Whitespace
  Regex: `/^(\s+)/`
  Ignored!
- Number
  Regex: `/^(\d+(?:\.\s*\d+)?)/`
  The second real localisation problem: for convenience, I'd like to type numbers without putting a dot in front, but that means that I have to say what the decimal separator is.
  So this might go, and you'll have to put a dot in front of a number, unless I can think of a clever way of dealing with decimal separators (maybe everything from a digit to the next whitespace is a single token?)
- Keyword
  Regex: `/^(\w+)/`
  Any string of word characters.
  If it's a defined keyword, it's replaced with whatever TeX code the definition specifies, otherwise it's prepended with a `\` and we assume it's a TeX command.
- Symbol
  Regex: `/^(\W+)/`
  Any string of non-word characters.
  At the moment, I assume these are all valid TeX. At least in MathJax, things like `×` and `÷` are OK.
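The token rules above can be sketched as a small loop that tries each regex in turn against the start of the remaining input. This is an illustrative sketch, not the actual implementation: the rule names, the `tokenise` function and the `{ type, value }` token shape are assumptions. The symbol rule is also narrowed from `/^(\W+)/`, so that a run of symbols doesn't swallow the dot or comma that starts the next token.

```javascript
// Rules are tried in this order. The regexes are the ones listed above,
// except `symbol`, which is narrowed here (an assumption) to stop before
// whitespace, '.', ',' and '"'.
const RULES = [
  ['dotLiteral',   /^(\.\.)/,                (m) => '.'],
  ['commaLiteral', /^(,,)/,                  (m) => ','],
  ['dotSymbol',    /^(\.\s*(\w+?))(?:\W|$)/, (m) => m[2].toLowerCase()],
  ['commaSymbol',  /^(,\s*(\w+?))(?:\W|$)/,  (m) => m[2].toUpperCase()],
  ['text',         /^("([^"]*)")/,           (m) => m[2]],
  ['whitespace',   /^(\s+)/,                 (m) => null],
  ['number',       /^(\d+(?:\.\s*\d+)?)/,    (m) => m[1].replace(/\s+/g, '')],
  ['keyword',      /^(\w+)/,                 (m) => m[1].toLowerCase()],
  ['symbol',       /^([^\w\s.,"]+)/,         (m) => m[1]],
];

function tokenise(source) {
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let matched = false;
    for (const [type, regex, toValue] of RULES) {
      const m = regex.exec(rest);
      if (!m) continue;
      const value = toValue(m);
      if (value !== null) tokens.push({ type, value }); // whitespace is dropped
      // Consume group 1 only: the dot/comma symbol regexes also match one
      // trailing non-word character, which belongs to the next token.
      rest = rest.slice(m[1].length);
      matched = true;
      break;
    }
    if (!matched) throw new Error(`Cannot tokenise: ${rest}`);
  }
  return tokens;
}

const example = tokenise('.x  sub. I eq 3. 5 ×,n');
console.log(example.map((t) => `${t.type}:${t.value}`).join(' '));
// dotSymbol:x keyword:sub dotSymbol:i keyword:eq number:3.5 symbol:× commaSymbol:N
```

The capital `I` and the spaces after the dots are the sort of thing a phone keyboard adds by itself; neither changes the tokens that come out. In this sketch a stray `.`, `,` or unmatched `"` raises an error rather than being passed through.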
Most of the structural features are implemented as keywords.
Keyword | Replaced with |
---|---|
… | … |
When converting to TeX, dot/comma symbols and number literals are always wrapped in curly braces, so they're treated as atoms.
A nice side-effect of this is that environments become slightly easier to type, omitting curly braces:
begin.matrix … end.matrix
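The conversion step described above might look something like this. It is only a sketch under stated assumptions: the `{ type, value }` token shape, the `toTeX` function and the `KEYWORDS` table are hypothetical, and of the table's entries only `open`, `close` and `and` come from the text; `eq` and `sub` are guesses added for the example.

```javascript
// Hypothetical keyword table. open/close/and are from the text above;
// the entries marked "assumed" are guesses for illustration only.
const KEYWORDS = new Map([
  ['open', '{'],
  ['close', '}'],
  ['and', '}{'],
  ['eq', '='],  // assumed
  ['sub', '_'], // assumed
]);

function toTeX(tokens) {
  return tokens
    .map(({ type, value }) => {
      switch (type) {
        case 'dotSymbol':
        case 'commaSymbol':
        case 'number':
          return `{${value}}`; // wrapped in braces so TeX treats it as one atom
        case 'text':
          return `\\text{${value}}`;
        case 'keyword':
          // Defined keywords are replaced; anything else is assumed to be
          // the name of a TeX command.
          return KEYWORDS.has(value) ? KEYWORDS.get(value) : `\\${value}`;
        default:
          return value; // literals and symbols pass through unchanged
      }
    })
    .join(' '); // the space also ends a control word such as \alpha
}

const tokens = [
  { type: 'keyword', value: 'begin' },
  { type: 'dotSymbol', value: 'matrix' },
  { type: 'dotSymbol', value: 'x' },
  { type: 'keyword', value: 'sub' },
  { type: 'number', value: '1' },
  { type: 'keyword', value: 'end' },
  { type: 'dotSymbol', value: 'matrix' },
];
console.log(toTeX(tokens));
// \begin {matrix} {x} _ {1} \end {matrix}
```

Because `begin` and `end` aren't in the table, they fall through to `\begin` and `\end`, and the brace-wrapped dot symbol supplies the environment name.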
Here's the definition of standard deviation written in this system, on my Samsung tablet:
Sigma eq sqrt open frac 1 ,N sum from open. i eq 1 close to ,N lb .x sub .I minus mu rb to 2 close
That produces the usual typeset formula for the population standard deviation.
Note that a couple of things were capitalised, and the space between the dot and the `i` just after `open` doesn't matter.
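For reference, the input above corresponds to something like the following TeX. The keyword expansions this assumes (`eq` to `=`, `from` and `sub` to `_`, `to` to `^`, `lb` and `rb` to round brackets, `minus` to `-`) are guesses read off the example, and the converter's real output would have extra braces around the symbols and numbers.

```latex
\sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} ( x_i - \mu )^2 }
```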
So...
I don't believe I'm making something that will realistically be used by many people, or even by myself, but I think I've identified some important points and I'm sure it'll inspire something in my real work down the line.