Archive for the ‘Computer Science & Coding’ Category.

Behold!

My name, written in Hindi, written in Unicode:

ऐलन डेिवडसन

Yeah, that’s right—real programmers code in binary (or hexadecimal, if they get lazy). The coolest thing about this is that if I had been more confident, I could have done it without getting help from the Internet. but I wasn’t, so I double checked stuff online. I’m still not entirely sure I got it right, so if you or someone you know is familiar with the Devanagari alphabet, please double check my spelling. I have written this so that people who don’t have Hindi vowel-rendering turned on (which I suspect is the majority of my readers) will see this correctly, while anyone who actually has a computer set up to read Hindi/Sanskrit/&c will think the ि and व should be swapped. I’m aware of the problem, but can’t fix it for everyone.

Unicode is surprisingly intricate: like x86 machine code, UTF-8 (the most common encoding of Unicode, since it’s backwards compatible with ASCII) and UTF-16 use a variable-length encoding for characters, so that common character sets like ASCII take up less room than uncommon ones like Braille (which is not as widespread on the Internet as it is elsewhere). Unicode text files typically start off with a Byte-Order Mark, which describes the basic unit size of characters along with the endianness of the machine on which it was encoded; these BOMs are partly why it’s such a universal encoding system. Unicode actually raises some pretty challenging questions in terms of “alphabetical” sorting and accent placement, and even presents some security problems by opening the way for homograph phishing attacks (for instance, see this Shmoo article on IDN attacks, which mentions that www.pаypal.com can be registered with a Cyrillic first ‘а’ and could be full of scams. Yes, I have written both the URL and the ‘а’ with the actual Cyrillic letter).

Yes, it’s totally dorky to learn about Unicode, but it’s actually kinda cool at the same time.

For all you computer security types

I’m sure all the CS people reading this (and maybe even some of the non-CS types!) are familiar with buffer overflow attacks, and know how to both protect against them and exploit them in other people’s code, or at least have a vague idea about how to do it. However, fewer people have heard of format string attacks. Here’s a fairly detailed explanation, but I’ll summarize:

If, in your C or C++ code, you write printf(foo) (where foo is typically a const char*), it will just print foo to the screen. The one exception here is when foo contains the percent sign, in which case it prints corresponding things from the stack. If there are more %’s in the string than there are other things in the stack frame, it will begin printing out previous parts of foo itself. If foo was defined as input from a clever yet malicious user, they can craft strings that do nasty things to your program. Most importantly, they can read from (using %08x) and even write to (using %n) arbitrary locations in memory. Given that, they can pretty much do anything they want on your machine. Nifty!

The simple and obvious way to avoid this attack is to change all instances of printf(foo) in your code to printf("%s", foo) instead. The less obvious but much better solution is to not code in C or C++ ever again, and instead use a modern, high-level language like Python or Java (or if you’re Michael and worry about the speed of your program, use an actual low-level language like Assembly).

Potpourri (and remember to vote!)

There is a fantastic tech talk about how to teach computer science to kids. Too often, they see the name and think it’s about programming, and are consequently turned off to the subject. This kiwi teaches CS without using a computer, but has all sorts of fun, hands-on activities for kids to do as they learn about sorting and compression algorithms, error-correcting codes, DFAs, and other parts of CS. If you ever need to inspire kids, this video is definitely worth a watch!

Speaking of videos to watch, check out this Dove commercial. I’ve gotta give them props for that.

On a newsier topic, Bush has begun to admit that the war in Iraq is going poorly and is starting to accept the parallels between this war and Vietnam. Might this be the beginning of someone in the Republican party taking a look at reality and then accepting responsibility for what they’ve screwed up? Not likely, but a man can dream, can’t he?

By the way, please, please register to vote (and then actually vote) in the elections on November 7. In California, you need to register (which can be done at your local DMV) by October 23 (this coming Monday). As John Stewart once quipped, “this country is run by extremists because moderates have shit to do.” However, voting doesn’t take up much of your time, and can help shape which direction the country will go, even if it’s still being run by extremists. No matter which parties/candidates you support, please vote. and please take 10 minutes and read up on the parties/candidates you plan to vote for, and make sure that they really do represent your interests; too often people are elected by an ignorant population that doesn’t realize what it’s doing. You want to vote for the communist party? That’s fine, so long as you know what they stand for and agree with it. You wanna vote for someone because the politicians tell you to? that’s not so good.

So learn about your favourite party, and then vote for them!

Protected: More info about my job

This content is password protected. To view it please enter your password below:

The shoes all live in Shoetown, Lamps Plus has lamps and more…

I was surprised to discover that less than 2 miles from my house is a shop called the Giant Robot Store! Unfortunately, they seem to sell only comic books and Japanese action figures. Oh, well. I think I’m gonna try getting into another FIRST program here… it looks like the Lego League actually focuses on autonomous robots, which could be pretty great. In case you haven’t heard of them, The Bobs are a pretty fun a cappella group, who inspired the title of this post.

I read an article the other day about Chrysalis, a company that helps the homeless find jobs. They believe that many homeless people want jobs, but cannot get them because they have no home address (which they can’t afford without a job…). Chrysalis finds them a local, unskilled job (and gives the employing company a guarantee that they will have a worker). I believe that many of the people seen on the streets advertising local restaurants and businesses with large signs are employed through Chrysalis. According to their website, people who go through their program really change their lives. They operate throughout the western LA area. I think I might print out some “business” cards about Chrysalis to give to the people on the street who ask me for money.

Finally, I’ve mentioned tries to several different people (it’s a data structure that’s kind of like a cross between a tree and a hash table). If you’re interested in learning more about them, here is the first paper I read about them. It’s not the original trie paper, but it’s a pretty good overview, if you overlook the fact that this person doesn’t really know English. On a semi-related note, here’s the original paper on V-Lists, which are like a cross between a linked list and an array.

A bit too advanced…

I figured out the “answer” to my TeX question from yesterday—it’s undecidable! There is no way I can get my code to perform correctly and still finish up in a finite amount of time. The way to show this is similar to the proof that the halting problem is undecidable.

A quick sketch of the proof →

An advanced TeX question

Also posted to tex-latex (page can be found here)

Try typing in the following code:

x  y % two spaces between x and y

% The argument of \eatandx is ignored.
% Note the space after the x.
\newcommand{\eatandx}[1]{x }
\eatandx{z} y

The second paragraph outputted has an extra space between the x and the y. I’m pretty sure this is caused by the fact that \eatandx takes an argument. Ideally, I’d like both paragraphs to look the same in the PDF (like “x-space-y”). Is there any way for the space at the end of the result of \eatandx and the space just after it to be combined into a single space (so it looks like the two spaces from the first paragraph that turned into a single space in the output)? I don’t think \unskip has the right behavior, because \eatandx{z}y should also become “x-space-y” due to the space after the x.

I suspect that this behavior happens because TeX parses the spaces in the first paragraph with its “eyes” (to use Knuth’s term), while macro expansion occurs later (possibly in either the “mouth” or the “bowels” of TeX, I think?). This would cause the first line to become the token list “x-space-y” while the second becomes “x-space-space-y” instead. However, I have no idea how to correct this and get the behavior I want (or even if that’s possible).

Any insight would be most welcome. Thanks very much!

My own personal Joe, Sentenza, and Tuco, if you will…

The good news: the president of Diebold resigned after enough problems with the ethics of his company, his products, and his personal business dealings. I maintain that if people are hell-bent on electronic voting (and I personally am hell-bent against it), the system should be transparent and open-source so that anyone can both verify that it is correct and formulate improvements to the system.

The bad news: President Mahmoud Ahmadinejad of Iran claims that the Holocaust never happened. This is significantly less forgivable than claiming that Israel is a blot in the middle of the Arab world that should be wiped off it. It makes my blood boil to hear people say things like this. I might be able to understand the claims if the Holocaust had happened centuries ago, but some of the people who were in it are still alive today! Is President Ahmadinejad actually trying to claim that my grandfather did not get shipped to France to fight the Nazis? Is he claiming that 20 million Russians did not actually die in some fictional “World War,” and have merely been hiding in their basements lo these past sixty years? He certainly seems to be purporting that everything from Kristallnacht to Auschwitz is an elaborate hoax. Argh!

The ugly news: I have 3 tests to take, a 10-page paper to write, 4 assignments to grade (with 20-ish people turning in each one), 7-ish ACM problems to code up, and 3 more grad school apps that need to be finished this week. I wish I could say I’m almost home, but I’m not.