Every toot I've ever written, in alphabetical order
In the shower this morning, one of those stupid thoughts formed itself in my mind:
I wonder what the entire corpus of toots looks like, in alphabetical order?
I can sort of answer this: I'm one of the admins of mathstodon.xyz, a big-ish Mastodon instance. I could run a query to get the text from every toot the server is aware of.
The immediate concern is whether this could reveal any private information. I don't think it would, unless I did it twice in quick succession and someone recreated a private toot from the difference between the two. But to be safe, I've done it on only my own posts.
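To make the "difference between the two" worry concrete, here is a minimal sketch with made-up posts. The function names and sample strings are hypothetical, not anything from the real server. It suggests that comparing two sorted dumps would give away only the bag of characters that was added, so an attacker would still have an anagram to solve rather than the toot itself.

```python
from collections import Counter

def sorted_chars(posts):
    """Concatenate posts, drop all whitespace, and sort the characters."""
    return ''.join(sorted(''.join(''.join(p.split()) for p in posts)))

def leaked_chars(before, after):
    """Characters that appear in the later dump but not in the earlier one."""
    diff = Counter(after) - Counter(before)
    return ''.join(sorted(diff.elements()))

# Made-up example data: two public posts and one private post.
public = ['hello world', 'maths is fun']
private = 'secret plan'

before = sorted_chars(public)              # dump taken before the private post
after = sorted_chars(public + [private])   # dump taken just after it
print(leaked_chars(before, after))         # prints "aceelnprst"
```

The output is the letters of the private post in sorted order, with the spaces and the word order gone.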
I ran this query on the Postgres database:
sudo -u postgres psql mastodon_production -c "copy (select text from statuses where account_id=1) to '/tmp/cp_statuses.csv';"
(A normal Mastodon user, who doesn't have direct database access, could request an export of their account data and wait a few minutes.)
That produced a CSV file with one row per post, each containing the text of that post.
Then I sorted that alphabetically in Python:
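import csv
import re

# Assumes /tmp/cp_statuses.csv has been copied into the current directory.
# Read every field of every row and join them into one long string.
with open('cp_statuses.csv') as f:
    rows = list(csv.reader(f))
text = ''.join(sum(rows, []))

# Strip all whitespace, sort the characters, and write them out 80 per line.
with open('cp_statuses_sorted.txt', 'w') as f:
    ordered = ''.join(sorted(re.sub(r'\s', '', text)))
    for i in range(0, len(ordered), 80):
        f.write(ordered[i:i+80] + '\n')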
And that produced this: the text of every Mastodon post I've ever written, in alphabetical order.
So there you go!
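For anyone trying this from an account export rather than the database, here is a rough sketch of the same idea. It rests on assumptions not confirmed above: that the export archive contains an ActivityPub-style outbox.json, that posts sit under orderedItems as Create activities, and that each post's text is HTML in object.content. The helper names are made up, and the tag stripping is deliberately crude.

```python
import html
import json
import re

def load_outbox(path):
    """Load the outbox file from an unpacked export (file name is an assumption)."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)

def post_texts(outbox):
    """Yield the plain text of each original post in a parsed outbox."""
    for item in outbox.get('orderedItems', []):
        obj = item.get('object')
        # Assumed layout: boosts carry a URL string here, not a post object.
        if item.get('type') != 'Create' or not isinstance(obj, dict):
            continue
        # 'content' is assumed to be HTML: strip tags, then decode entities.
        yield html.unescape(re.sub(r'<[^>]+>', '', obj.get('content') or ''))

def sort_characters(texts):
    """Drop all whitespace and return the remaining characters, sorted."""
    return ''.join(sorted(re.sub(r'\s', '', ''.join(texts))))

# A tiny hand-made stand-in for the real file, for illustration only.
sample = {'orderedItems': [
    {'type': 'Create', 'object': {'content': '<p>b a</p>'}},
    {'type': 'Announce', 'object': 'https://example.com/some/boost'},
    {'type': 'Create', 'object': {'content': '<p>d &amp; c</p>'}},
]}
print(sort_characters(post_texts(sample)))  # prints "&abcd"
```

With a real archive, something like sort_characters(post_texts(load_outbox('outbox.json'))) would take the place of the sample data. The result could then be written out 80 characters per line, as in the script above.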