Archive for the ‘typography & language’ Category.

Word Games, part 3

What do you mean I haven’t posted in 3 months!? …oh, I guess I’ve made a bunch of half-written drafts, but never finished any of them. I should get on that.

In the meantime, here’s a word game for you. How many words can you name in 10 minutes that contain all 5 vowels? Y’s are optional but stylish. It’s okay if the words contain some vowels multiple times, or if the vowels are out of order. My word list is in the first comment.

Ligature Alternatives in LaTeX

I’ve been corresponding with Dario Taraborelli and Will Robertson, and we have concluded a couple things about LaTeX and alternative glyphs for ligatures: Don’t bother reading behind this cut if you don’t use LaTeX →

Reading the Canterbury Tales, old school

After my previous encounters with the Canterbury Tales, I’ve decided to read the entire thing. As before, I think that reading a modern translation loses the interesting parts of the language and screws up the poetry, so I want to read it in Middle English. And just because I can, I’m reading (a copy of) a 600-year-old handwritten version. Those are images of the Hengwrt (“HENG-urt”) manuscript, which was probably written sometime between 1400 and 1410 (note that Chaucer himself died in 1400, but it’s hard to get closer to when he was alive). The images are very high quality; click the “all sizes” button towards the top to enlarge it a bit, then the “original resolution” link to see its true glory. I wish there was a way to set the default resolution higher, but I don’t know if that’s possible.

Sure, I could read a version with original spelling in a modern font, and towards the end I probably will. But for the moment, the novelty of reading a handwritten manuscript hasn’t worn off, so I’m persisting. It took a bit of time to get the hang of the handwriting and grammar, so in the interests of helping others follow in my footsteps, here is an illustrated guide to reading the manuscript.

Learn about letters and the alphabet, grammar, spelling and vocabulary, and other tips for reading Middle English manuscripts behind this cut →

Vannevar Bush would be proud

Experiment terminated early

Conclusion: poetry is hard. No, I think I said that wrong, so I’ll try again. Writing poetry is hard! And totally underappreciated! I set out to see how difficult it was several weeks ago, when I decided that I would start writing my posts in metered verse, slowly turning them more and more into poetry until I got through 5 posts or someone noticed, whichever came first. I began by discussing the latest in gay marriage debates and airline liquid bans in dactylic verse (though I relaxed the sentence length restrictions for the second paragraph). I then wrote about XeTeX in iambic pentadecameter (intended to be 3-line stanzas of iambic pentameter, but I couldn’t handle word breaks at the end of lines). Both of these posts took several hours to compose over multiple days, which took much longer than I expected. I have tried several times to compose a subsequent post with word breaks where line breaks should be, but couldn’t do it before finding several more postworthy things, and got severely backlogged in stuff I wanted to write about. So, as of today, I’m abandoning this experiment and admitting defeat. I can’t do it: writing poetry is way too hard. And the worst part is that the things with which poets concern themselves are so subtle that no one notices unless you point and say, “look at the structure of this language; it is unusual.” They put all this effort into amazing, clever literary constructions, and we, the unwashed masses, don’t even notice what they do most of the time. I certainly didn’t appreciate it until I tried it out. Attempting this has given me a newfound respect for poetry and poets. I had no idea it was so difficult, and I’m amazed that others are so skilled at it.

To any LaTeX people using Macs:

Get XeTeX on your box and tell me what you think of TeXing with the font Zapfino (which has been installed on every Mac). My first impression, looking round the Tubes, is that it’s gorgeous, with its swashes and alternatives for every character. However, it appears to need a bunch of extra macros, and it might be problematic understanding everything. I wish that I could try this out myself, except Zapfino is prohibitively costly just to mess around a bit. So anyone who has it without shelling out a dime, if you could give me your opinion I’d appreciate it much. I guess I ought to simply buy a Mac (despite the cost) because I’ve wanted one for ages and they look so very nice.

Yes, they’re all real words (the first sentence notwithstanding)

We’ve all heard of people discussing whether or not ‘gruntled,’ ‘whelmed,’ ‘combobulated,’ &c are real words. The only thing I can add to that debate is that ‘whelmed’ is a real word, and it means ‘engulfed or submerged.’ What I’m more intrigued by are words that actually have valid counterparts that no one ever uses.1 This topic came to my attention as I sat on my balcony enjoying the clement weather, and investigating it has turned my understanding of English etymology into a ravelled web once again. I want to say I’m exasperated by this, but I never started with any asperity, so I haven’t really run out of it yet.

Even more interesting, though, are the false positives. I know many people whom I consider experts in their fields, but I doubt many of them are former perts. I imagine that most discomfiture is not due to a lack of comfits. I may be decanting odd words at you, but it’s better than canting them. What a strange language we speak!

1: I admit, the dangling preposition has its place. I considered writing “By what I’m more intrigued are words…” but that was too much even for me.

English, a silent language

Lots of letters can be silent in English, but I’m missing a few. Can anyone help fill in the rest of this table? Proper nouns are cheating. So is “February.”

Edit: words in italics were found by minorninth at this page. To be honest, though, I’d like better ones for Q and Y.

  • A – boat
  • B – numb
  • C – scissors
  • D – adjective
  • E – made
  • F –
  • G – gnaw
  • H – through
  • I – piece
  • J –
  • K – knew
  • L – talk
  • M – mnemonic
  • N – damn
  • O – leopard
  • P – psychic
  • Q – lacquer
  • R –
  • S – island
  • T – often
  • U – fugue
  • V –
  • W – who
  • X – faux
  • Y – prayer
  • Z – rendezvous

Does anyone else keep these lists?

Words I’ve managed to use in conversation:

  • pecuniary
  • smarmy (which sounds just like it means)
  • brickbat
  • offing
  • stripling
  • apposite
  • phthisis
  • juxtapose

Words I’m still trying to work into the conversation:

  • lickspittle
  • pulchritude (which sounds nothing like what it means)
  • disingenuous
  • phantasmagoric
  • ontic
  • perfidy
  • the phrase “I’m not really into Pokémon”


My name, written in Hindi, written in Unicode:

ऐलन डेिवडसन

Yeah, that’s right—real programmers code in binary (or hexadecimal, if they get lazy). The coolest thing about this is that if I had been more confident, I could have done it without getting help from the Internet. But I wasn’t, so I double-checked stuff online. I’m still not entirely sure I got it right, so if you or someone you know is familiar with the Devanagari alphabet, please double-check my spelling. I have written this so that people who don’t have Hindi vowel-rendering turned on (which I suspect is the majority of my readers) will see this correctly, while anyone who actually has a computer set up to read Hindi/Sanskrit/&c will think the ि and व should be swapped. I’m aware of the problem, but can’t fix it for everyone.
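For anyone who wants to see the ि/व ordering issue without squinting at glyphs, here is a minimal Python sketch. The helper name describe is made up for this illustration; the code points and names come from Python's standard unicodedata module. It prints the two characters in the order I typed them (vowel sign first) and in the order a Hindi-aware renderer expects (consonant first).

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    # describe is a hypothetical helper for this post, not a library function.
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# Visual order, as typed above: the vowel sign comes before the consonant.
visual_order = "\u093f\u0935"
# Logical order, as Unicode expects: the consonant first, then its vowel sign.
logical_order = "\u0935\u093f"

for label, text in [("visual", visual_order), ("logical", logical_order)]:
    print(label, describe(text))
```

A renderer with Devanagari shaping draws the vowel sign to the left of the consonant it follows in the data, which is why the two audiences see different things.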

Unicode is surprisingly intricate: like x86 machine code, UTF-8 (the most common encoding of Unicode, since it’s backwards compatible with ASCII) and UTF-16 use variable-length encodings for characters. In UTF-8, this means that common character sets like ASCII take up less room than uncommon ones like Braille (which is not as widespread on the Internet as it is elsewhere). Unicode text files often start off with a Byte-Order Mark, which records the byte order (endianness) in which the text was encoded and, as a side effect, hints at which encoding is in use; these BOMs are partly why it’s such a universal encoding system. Unicode actually raises some pretty challenging questions in terms of “alphabetical” sorting and accent placement, and even presents some security problems by opening the way for homograph phishing attacks (for instance, see this Shmoo article on IDN attacks, which mentions that www.pа can be registered with a Cyrillic first ‘а’ and could be full of scams. Yes, I have written both the URL and the ‘а’ with the actual Cyrillic letter).
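Here is a rough Python sketch of those three points, using only the standard codecs and unicodedata modules. The helper names (encoded_sizes, starts_with_utf16_bom, script_of) are my own inventions for illustration, and guessing a script from the first word of a character's name is a crude assumption, not a real homograph detector. Note that the size difference between ASCII and Braille shows up in UTF-8; in UTF-16 both take two bytes, and only characters outside the Basic Multilingual Plane take four.

```python
import codecs
import unicodedata

def encoded_sizes(ch):
    """Return the number of bytes ch occupies in a few Unicode encodings."""
    # The "-le" variants are used so that no byte-order mark is counted.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

def starts_with_utf16_bom(data):
    """True if the bytes begin with a UTF-16 byte-order mark of either endianness."""
    return data[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

def script_of(ch):
    """Crude script guess: the first word of the character's Unicode name."""
    return unicodedata.name(ch).split()[0]

# An ASCII letter, a Braille pattern, and a character outside the BMP.
for sample in ("A", "\u2801", "\U0001F600"):
    print(unicodedata.name(sample), encoded_sizes(sample))

# Python's plain "utf-16" codec writes a BOM in the machine's native byte order.
print(starts_with_utf16_bom("A".encode("utf-16")))

# Latin 'a' and Cyrillic 'а' look alike but are different code points.
print(script_of("a"), script_of("\u0430"), "a" == "\u0430")
```

The last line is the whole homograph problem in miniature: two strings that look identical on screen compare as different, so a lookalike domain is a different domain.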

Yes, it’s totally dorky to learn about Unicode, but it’s actually kinda cool at the same time.