Are German words longer than English ones?

There are many jokes about the Germans, but one of the few I can post here is that oft-told adage that at the present rate of increase, eventually the German language will consist of a single, extremely long word.

Ok, so that’s a bit mad, but a lot of people think that words in German are on average longer than those in English. But is it really true? I have wondered about this for a long time, but it’s actually quite hard to get sensible data on this sort of subject. I mean, who wants to search through a dictionary counting all the letters?

Well, actually, now it’s possible to do just that, and really easily. I just got the latest version of Mathematica - Version 7 - which includes a whole host of curated data not available in previous versions. One of the new datasets is a German dictionary to go with the English one we’ve had since Version 6. To see what languages are available, we just type:

In: DictionaryLookup[All]

which gives:

Out: {"Arabic", "BrazilianPortuguese", "Breton", "BritishEnglish",
"Catalan", "Croatian", "Danish", "Dutch", "English", "Esperanto",
"Faroese", "Finnish", "French", "Galician", "German", "Hebrew",
"Hindi", "Hungarian", "IrishGaelic", "Italian", "Latin", "Polish",
"Portuguese", "Russian", "ScottishGaelic", "Spanish", "Swedish"}

It’s also easy to look up how many words are in each dictionary. For example, to see how many words are in the German dictionary, we just type:

In: DictionaryLookup[{"German", "*"}] // Length

Out: 76155

which looks like a sensible number of words to do a comparison. The English dictionary has 92,518 words, by the way. Breton (yes, really!) has only 32,733 words… I’d never thought of Breton people as being particularly concise.
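
To get the sizes of several dictionaries in one go, something like the following sketch should work. The names langs and counts are made up for illustration, and the numbers you get will depend on the dictionary data shipped with your version.

```mathematica
(* Languages to compare; names as they appear in DictionaryLookup[All] *)
langs = {"English", "German", "Breton"};

(* Pair each language with the number of entries in its dictionary *)
counts = {#, Length[DictionaryLookup[{#, "*"}]]} & /@ langs;

(* Display as a small table, largest dictionary first *)
TableForm[Reverse[SortBy[counts, Last]]]
```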

Anyway, the built-in functions in Mathematica mean that the question that has bugged me for ages can now be answered in two lines of code:

In: Mean[StringLength[DictionaryLookup[{"English", "*"}]]] // N

Out: 8.39372

In: Mean[StringLength[DictionaryLookup[{"German", "*"}]]] // N

Out: 11.6281

So there you have it: quantitative evidence that German words are longer than English ones, on average over 3 letters longer, which is quite a lot if you ask me! Some of the words are much longer, as you can see from the accompanying plot. If you think my methodology is flawed, please let me know with your quantitative results!
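
The mean is only one summary, so it may be worth looking at the whole distribution of word lengths. The following is a minimal sketch of one way to do that, not necessarily how the plot was produced; the names english and german are invented here. Bear in mind that a dictionary counts every word once, however rarely it is used, so this says nothing about the length of words in everyday text.

```mathematica
(* Word lengths (in characters) for every entry in each dictionary *)
english = StringLength[DictionaryLookup[{"English", "*"}]];
german = StringLength[DictionaryLookup[{"German", "*"}]];

(* Median is less affected by a handful of very long words than the mean;
   Max gives the length of the longest entry *)
{Median[#], Max[#]} & /@ {english, german}

(* Overlaid histograms with bins one letter wide, scaled to fractions
   so the different dictionary sizes do not distort the comparison *)
Histogram[{english, german}, {1}, "Probability",
 ChartLegends -> {"English", "German"},
 AxesLabel -> {"letters", "fraction of words"}]
```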

English versus German

Torres is a floppy-haired, diving, git

So I finally made it to Anfield for the first time yesterday, for the Liverpool/Fulham Premiership game. Now, I’m neither a Fulham nor a Liverpool supporter: my wife’s uncle has a Fulham season ticket but couldn’t make the game, so I went with my father-in-law. The atmosphere at Anfield is really great - you can see why they call it a fortress, and the sight of the Kop from the visitors’ end was something else - especially with a capacity crowd of around 45,000 around them. It made up for the bitter cold that blew in around half an hour into the match.

The match itself was pretty good for a 0-0 draw. Fulham outplayed Liverpool during the first half, and Rafa must have given the home side a good talking to during the break, as it was more even in the second half. Little Andy Johnson was the man of the match for me (it’s amazing how he holds his own against guys much bigger than him) but Riera was pretty good too.

Only one thing let down the game, and that was Fernando Torres. What is it about imported players from southern Europe (think Ronaldo too here)? Torres is really talented, and made some clever little passes, but why does the overpaid tosser like to munch quite so much grass? I haven’t seen so much diving since the Beijing Olympics! Never mind the cynical ones where he was looking for penalties, the worst was when he was at least ten feet from any Fulham player, and he literally threw himself into the mud. Paintsil’s complaint earned him a sneaky headbutt from Torres while the ref wasn’t looking.

It disappoints me to see how brazen the cheating can be amongst some of these guys - I can’t imagine doing that in front of 45,000 people, even if most of them this time would have turned a blind eye. And Torres is not nearly as good-looking as he obviously thinks he is. He should get a proper haircut, and stop cheating so much.

Big Sister is Watching You - Part 1

As a new parent, I am worried about the safety of my child, as any normal parent would be. I am also concerned about civil liberties and privacy. However, these two concerns are colliding in various ways in our society right now, and you should be worried about it. Let me try to convince you that your lovely UK government wants to treat all of you - yes, even you there on the third row - as a terrorist threat. In this post, I will give you the first example: children’s fingerprinting.

Fingerprinting of children is growing at an astonishing rate in England’s schools. Touting fingerprinting as a way to make access to library books simpler (just put your thumb on the reader), Micro Librarian Systems, for example, sell a fingerprint recogniser for primary schoolchildren. On their website, they use the following enticements as to why it’s a good idea to buy and use such a system in the school library:

No more lost or damaged reader cards!

No more lending of ID cards between borrowers!

No more bar codes being washed or tumble dried!

In other words, spend about £20,000 on a fingerprint system, fingerprint all the children in the school, and it will make it slightly easier to control the ditzy little b*stards’ reading habits. After all, it’s too hard to write down what books they’ve borrowed in something as simple as a book, and the children need their fingerprints scanned so the teacher can remember what their names are.

Sounds unconvincing said like that, doesn’t it? Unfortunately, plenty of schools have been taken in by the technobabble the companies spout about convenience, and about 2 million children have now had their fingerprints recorded, mostly without the knowledge, and therefore without the consent, of parents. And it’s not just for library books; such systems are also being used to keep track of lunch payments, again with the vague notion of improving efficiency, however that is defined.

Of course, the companies that sell these systems claim that the fingerprint data is secure, but having seen the entire Child Benefit database disappear on a CD a few months ago, I don’t believe that for a moment, and neither should you. And neither did Fionna Elliot, who has campaigned for parents to be informed that the security of their children’s identity is being compromised. As a surprisingly well-informed Daily Mail article points out, schools are even worse than governments at taking care of personal data, and most teachers are unaware that a scrapped fingerprint computer would be very handy for ID thieves. Once your fingerprints are copied by someone, you can’t use them to identify yourself any more. Think it’s hard to copy a fingerprint? Actually, it’s quite easy, and can be done with a few simple ingredients.

LTKA has more information on all of this, but the message is pretty simple: tell your kids that if their teacher asks them for their fingerprints, they should say no. This message, and the fact that fingerprinting children is illegal anyway (it’s a breach of human rights, goes against the Data Protection Act, and can breach the 2002 Education Act), is finally getting through to some people, and councils are starting to distance themselves from possible litigation.

What does central Government have to do with fingerprinting in schools? UK.Gov is subsidising fingerprinting technology formerly through the DFES curriculum online project, and now through Harnessing Technology funds. This is seen by a number of people as softening up our younger population to accept intrusive identity checks as a normal part of life, in preparation for the dreaded Identity Cards.

By the way, it’s not just kids that are being asked to present their thumbs to identify themselves. Parents at a nursery in Kent are also being asked, as are those at a nursery in Swansea. Whatever happened to using your eyes and brain to identify the person coming through that front door? And anyway, as anyone who has used a security gate knows, you can just tailgate behind someone who is authorised to get in. Totally pointless.