dirk hovy

2017/09/19, 06:46AM
BLOG: Conference sizes in NLP are growing exponentially, which will likely affect how we review, organize, and experience conferences in the future. Some thoughts based on my observations at ACL and EMNLP.

2016/10/22, 04:59AM
BLOG: I had some time and analyzed the US presidential debates from a quantitative point of view. Turns out the candidates differ even beyond their messages.

“I service society by rocking.”
School of Rock


At very irregular intervals, you’ll find impressions, updates, and random thoughts about what I am doing. The blog does not claim to be complete, up-to-date, or meaningful. I tried to post all entries in both languages, and recently switched to English only, but there are some that are only available in one or the other language. In those cases, there was probably too little time, or I was just lazy and all that translating a hassle ;)

Some Thoughts on the Future of NLP Conferences

(2017/09/19, 06:46AM)
If you have been to ACL in Vancouver or followed the news on Twitter, you know that it was the biggest conference of its kind. And EMNLP, held in Copenhagen in September, could again claim the same for itself, too, with more than 1200 participants and more long paper submissions than ACL. Given the recent interest in all things NLP, this development is not too surprising, and if current trends hold, we should expect further growth and more record attendances to come on a regular basis.

Personally, I think that is a good thing: there are still plenty of open questions in NLP, and the field as a whole can only benefit if more people devote their thoughts to the questions in our field. However, it will (have to) affect the way we hold conferences.

While we are still a long way away from conference sizes as seen in medicine and economics, with up to 10,000 participants, we will likely soon regularly reach attendance of 2000 and more participants (as is already happening in the ML community). Even at this level, it is clear that our conference model needs to be open to growth.
As far as I can tell from my involvement in EMNLP, this growth will affect (at least) three aspects of conferences: reviewing, organization, and structure of the actual conference.

There has been a lively debate about the state of reviewing in the community (something I like a lot about NLP as a field: people are always willing to tinker with the status quo). Without weighing in on the arXiv debate (which deserves its own discussion), I think it is becoming clear that reviewing is both a bottleneck and a control mechanism. It is getting ever harder to find reviewers for all submissions, and the quality of the reviews varies a lot. If we reach 2000, 3000, or even 5000 submissions (remember that there will always be way more submissions than papers), we will need enormous PCs to guarantee three reviews. Some people have entertained the idea of open reviewing, but I believe there is a danger that it becomes a popularity contest rather than a fair quality assessment. So far, reviewing is voluntary, but thankless, despite attempts at reviewing prizes. One incentive could be the requirement to review if you submit: since most papers have more than one author, this could go a long way to reducing the reviewer problem. It would not address quality, but it is hard to see how to tackle this better.

Even if we solve the reviewer problem, we are still left with the decision of how to accept papers: right now, conferences aim for 20-25% accepted papers per area. To me, this is the crucial measure to control the ultimate conference size. If we keep the current ratio and submissions keep growing, so will conference sizes. The alternative is to set a size limit instead: the top N papers (by score, meta-review, and random tie break) across all areas get in, the rest are rejected. The problem with this approach is of course that it will drop acceptance rates precipitously, and make it so much harder for good work to get published, but it would allow us to have an upper bound on the size.
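To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. The cap of 500 papers is a made-up number for illustration (it is what a 25% rate yields at 2000 submissions); the submission counts are the ones mentioned above.

```python
# Compare a fixed acceptance rate with a fixed cap on accepted papers.
# CAP is a hypothetical limit chosen for illustration only.
RATE = 0.25
CAP = 500

def accepted_fixed_rate(submissions, rate=RATE):
    """Number of accepted papers when a fixed share of submissions gets in."""
    return round(submissions * rate)

def rate_under_cap(submissions, cap=CAP):
    """Acceptance rate when at most `cap` papers get in."""
    return min(cap, submissions) / submissions

for n in (2000, 3000, 5000):
    print(n, accepted_fixed_rate(n), f"{rate_under_cap(n):.1%}")
```

Under the fixed rate, the program grows from 500 to 1250 papers as submissions go from 2000 to 5000; under the cap, the acceptance rate falls from 25% to 10%.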

As for organization, I do enjoy the personal touch that the involvement of community members as general, local, publication, and other chairs brings to conferences. However, from my own experience I can say how hard it is to combine it with your day job as a researcher, and once we pass the 2000 mark, it will become almost prohibitive. Given how tight a schedule many researchers already have, this does not sound like a feasible way forward. Certain positions should always be held by researchers, of course (program and area chairs, for example), but it should be possible to outsource some of the other positions.
A relatively straightforward solution to this problem would be another full-time position in the ACL exec, to support the people who are already there, and to professionalize certain responsibilities currently held by researchers (for example local, handbook, or publicity chair). The increase in salary costs for ACL would most likely be offset by the increased participant numbers.
This would help reduce variability: none of us are trained conference organizers, and since it is always somebody new picking it up as they go, outcomes vary quite a bit. The permanent members of ACL provide guidance and continuity, but while they are doing an outstanding job, they too feel the brunt of increasing conference sizes (probably even more than others). An additional position would help address this. Fewer organizers would help streamline communication.

Lastly, larger attendance numbers will change the actual structure of the conferences as we know them. We might have to envision a completely new model, where conferences become more of a discussion forum to exchange ideas than a presentation medium.
Increasingly, the parallel track model is reaching its limits (although parallel poster sessions seem to work well), and we will either have to introduce even more parallel sessions (unpopular, since attendees will have more conflicts of presentations scheduled at the same time), extend the conference duration (also unpopular, especially for members with family and small kids), or shorten talks. Personally, I am for a 5min talk limit: as it is, most talks cannot convey all of the information contained in the paper anyway, so there is little reason for them to be so long. The best a talk can do is serve as an appetizer for the paper, which you then want to read in your own time. That, however, can be accomplished in 5min just as well as in 12 or 15. I think the exchange in Q&A sessions is great and should be kept as much as possible, but I am not sure it will be feasible. A better exchange is the direct chat during poster sessions, so increasing poster acceptances is an obvious solution (although maybe not an easy one). Personally, I increasingly like the focused topicality and exchange of workshops, but I am not sure it will be possible to scale them (even at an average workshop size of 50 participants, we would have to have dozens of workshops, which again raises the time vs. parallelism problem, not to mention space issues).

Either way, the growing field will present us with new challenges and affect the way we do conferences. However, I am confident that we will find a way as a community, and am more curious than concerned. The only thing that is for certain is that conferences as we know them will become a thing of the past.

It’s All About the He Said, She Said―A Quantitative Analysis of the Three Presidential Debates

(2016/10/22, 04:59AM)

The question of what constitutes an acceptable sentence is a matter of taste, and would elicit very different answers from a moral philosopher, a linguist, and a logician. In natural language processing (NLP), the answer is much simpler (or simplistic), and quantifiable: any sentence that could be generated with some probability from a language model.

As the US presidential debates have drawn to a close, much has been said about acceptable and unacceptable language. While NLP is woefully ill-equipped to make moral decisions on what the candidates said, it is pretty useful to analyze how much was said, and how unusual it is. So I spent an afternoon analyzing the transcripts of the three debates, and quantified the findings.

I downloaded the three transcripts, separated out the answers of the two candidates, split them into sentences, and analyzed them with a language model. Without going into too much technical detail: a language model is a statistical model which has been induced from a large collection of text. To use the model, we give it a sentence and ask “How likely is it to generate this sentence?” The model then returns a probability between 0 and 1. 0 means that the model would never produce this sentence, and 1 that it always would. In practice, neither extreme really occurs, and the numbers fall somewhere in between.
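For readers who want to see the idea in code, here is a minimal sketch of a language model in Python. It is a toy bigram model with add-one smoothing, trained on three invented sentences; it is not the SRILM setup used for the analysis, and all function names are made up for this example.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Count contexts and word pairs over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
        vocab.update(tokens[1:])                 # everything the model can predict
    return contexts, bigrams, vocab

def sentence_probability(sentence, model):
    """Probability of a sentence as a product of add-one smoothed bigram probabilities."""
    contexts, bigrams, vocab = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
    return prob

# Invented training sentences, for illustration only.
model = train_bigram_model([
    "the movie was great",
    "the food was great",
    "the movie was long",
])
p_familiar = sentence_probability("the movie was great", model)
p_scrambled = sentence_probability("great was movie the", model)
print(p_familiar, p_scrambled)
```

The familiar word order gets a much higher probability than the scrambled one, and both values lie strictly between 0 and 1.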

The exact numbers depend on what and how much text you used to train the model, how many words in sequence you look at, and how similar your training data was to the texts you analyze. I used a 5-gram SRILM trained on a corpus of 2,584,929 English review sentences. No, this might not be the best model one could use, and if I used it in an application, I would certainly train on something closer to the debates. However, I used the same model for all three debates and both candidates, so independent of the absolute values we get, we can compare the two politicians quantitatively.

So what do we learn?

First of all, Donald Trump says more (1950 sentences, compared to Clinton’s 1136), but he uses fewer words: the median Trump sentence has 11 words, a Clinton sentence 16. The graph below shows the relative distribution of sentence lengths for each candidate (I accounted for the fact that they uttered different numbers of sentences).

Because the bars are sometimes a little hard to see in front of each other, I also overlaid them with a smoothed curve (kernel density estimator). The dotted lines show the respective median length in words.
We can see that Trump utters more short sentences (under 15 words), and few longer sentences. Clinton, on the other hand, has a lot more of her sentences in the 15-30 word range.
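The length comparison can be sketched in a few lines of Python. The sentences below are invented stand-ins, not lines from the transcripts, and the sketch leaves out the smoothed curve; dividing by each speaker's sentence count is one straightforward way to account for the different numbers of sentences.

```python
from collections import Counter
from statistics import median

def sentence_lengths(sentences):
    """Length of each sentence in whitespace-separated tokens."""
    return [len(s.split()) for s in sentences]

def relative_distribution(lengths):
    """Share of sentences per length, so speakers with different totals are comparable."""
    counts = Counter(lengths)
    return {length: count / len(lengths) for length, count in sorted(counts.items())}

# Invented example sentences, for illustration only.
speaker_a = ["We will win .", "It is a mess .", "Believe me .", "We have no choice ."]
speaker_b = ["I think we have to look at what the evidence tells us .",
             "That is not what I said ."]

lengths_a = sentence_lengths(speaker_a)
lengths_b = sentence_lengths(speaker_b)
print(median(lengths_a), relative_distribution(lengths_a))
print(median(lengths_b), relative_distribution(lengths_b))
```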

What about the language model? Let’s first look at the likelihood of the sentences. Some explanation: since the probabilities get very small and hard to distinguish, the likelihood of the sentence is typically given as the logarithm of the probability. That makes it a larger, but negative number. The closer the number is to 0, the more likely a sentence is under the model.
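A small Python sketch shows why the logarithm is easier to read. The per-word probabilities are hypothetical, and base 10 is an assumption made for this example.

```python
import math

def log_likelihood(word_probs):
    """Sum of base-10 log probabilities, which equals log10 of their product."""
    return sum(math.log10(p) for p in word_probs)

# Hypothetical per-word probabilities, for illustration only.
short_sentence = [0.1, 0.05, 0.2]
long_sentence = short_sentence * 4

print(math.prod(short_sentence), log_likelihood(short_sentence))
print(math.prod(long_sentence), log_likelihood(long_sentence))
```

The raw probabilities shrink to tiny fractions very quickly, while the log values stay in a readable range: roughly -3 for the short sentence and -12 for the long one.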

We again see some noticeable differences: Trump’s sentences are usually more likely than Clinton’s. This is an effect both of the words the two use and of the sentence length (longer sentences become less and less likely), and we have already seen that there are noticeable differences in sentence length.

So let’s normalize each sentence likelihood by the sentence length. That gives us the average log probability per word (note that the x-axis scale is much smaller than before).
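As a sketch of this normalization in Python, with made-up log probabilities and word counts:

```python
def logprob_per_word(logprob, num_words):
    """Average log probability per word of a sentence."""
    return logprob / num_words

# Hypothetical values, for illustration only.
short_score = logprob_per_word(-6.0, 3)    # short sentence with unusual words
long_score = logprob_per_word(-24.0, 16)   # long sentence with common words
print(short_score, long_score)
```

The long sentence has the lower total log probability (-24 vs. -6), but the higher per-word average (-1.5 vs. -2.0), so after normalization length alone no longer decides the comparison.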

Even here, on a per-word-basis, we see that the model is more likely to produce Trump sentences rather than Clinton sentences (you can actually use language models to generate sentences, often to great comical effect, but there isn’t enough training data for each candidate to really come up with much. I tried).

So what do the different sentences look like? Well, the two highest scoring sentences (measured by logprob/word) for each candidate are “Because I was a senator with a Republican president .” (Clinton) and “Horribly wounded .” (Trump). The most “average” sentences are “But let s not assume that trade is the only challenge we have in the economy .” (Clinton) and “When we have $ 20 trillion in debt , and our country s a mess , you know , it s one thing to have $ 20 trillion in debt and our roads are good and our bridges are good and everything s in great shape , our airports .” (Trump). Both of these buck the length-trend. The least likely sentences of each candidate, however, do follow what we have seen before: “Donald thinks belittling women makes him bigger .” (Clinton) vs. “Trump Foundation , small foundation .” (Trump).

So independent of what the candidates are talking about, the way they talk can help us separate them to some extent. In fact, if we use only the number of words and logprob as features, we can train a logistic regression classifier that distinguishes the two candidates with an accuracy of over 65% (10-fold cross-validation). That’s only slightly better than the majority class (about 63% accuracy) and again not good enough to build a system, but interesting given that we have not even looked at what the candidates are saying.
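The majority-class number follows directly from the sentence counts given earlier: a classifier that always answers "Trump" is right for every Trump sentence and wrong for every Clinton sentence. In Python:

```python
trump_sentences = 1950
clinton_sentences = 1136

# Accuracy of always predicting the more frequent speaker.
majority_baseline = max(trump_sentences, clinton_sentences) / (
    trump_sentences + clinton_sentences
)
print(f"{majority_baseline:.1%}")
```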

Does this tell us anything about the likely outcome in November? No. But it shows that the differences between the candidates’ rhetoric styles go beyond what they say in a quantifiable way: sentence length and predictability.

Science’s genius complex

(2016/07/06, 04:27PM)
In a recent article in the New Yorker, James Surowiecki outlined how, back in the 1960s, professional athletes considered strength training akin to cheating: either you were good at sports, or you weren’t―training had nothing to do with it. Practice was seen as just a way to stay in shape, not to get better. Today, this notion sounds quaint, naive, and a little bit stupid. We expect professional athletes (and any remotely serious amateurs) to have a rigorous training regimen, including fitness, nutrition, and rest schedules.

When it comes to scientists, however, we still think along the same lines as the athletes in the last century. Da Vinci, Einstein, Curie: We like to think that these people had an innate “gift”, a knack for science, that they were just brilliant and needed no training. Yes, they were extremely smart, but nothing could be further from the truth. Da Vinci and Tesla had sophisticated sleep schedules to maximize efficiency, Einstein arranged his personal life around his science (with no regard for the people around him), and Curie literally worked herself to death.

The notion of inherent brilliance, however, does not only pertain to the all-time greats. Scientists are generally portrayed as geniuses, who possess a preternatural insight and operate on a different plane from mere mortals. This is an outdated and elitist notion, and to perpetuate that stereotype is not only disrespectful to all the hard-working researchers, but also damaging to science.

Every great academic I know has indeed an unusually good understanding of their subject matter, but mostly, they do because they work a substantial amount. And the more accomplished they are, the more they work. None of them could get by on talent alone, neither to get where they are, nor to maintain that level.

There is of course no well-known training regimen for scientists (What is the equivalent of endurance training for researchers? How should one eat and rest to achieve maximum performance?).
However, many researchers I know exercise regularly, both to maintain their health, and as counterbalance to their academic routine. And while not many academics eat an athlete’s diet, many of them follow scrupulous caffeination rituals.

More importantly, though, the best researchers are constantly finding ways to identify and improve their weaknesses. And the only way to do so is by investing time. Lots of time.
A job at a prestigious university these days comes with the implicit understanding (from both sides) that you put in 80+ hours a week, not necessarily that you are brilliant. Work-life balance be damned.

This approach has some serious side effects, with alcoholism and burn-out unusually common among academics. Yet we still try to make it look easy on the outside, slave to the genius fallacy. We hope to convince people of our brilliance, while simultaneously fighting back the impostor syndrome, and wondering how the others do it so effortlessly. Truth is: they don’t.

This is even more infuriating since academia is already set up as a series of escalating training rounds, and would benefit from acknowledging that. The genius complex is holding us all back, devalues hard work, and makes it difficult for young researchers to accept their limits, to acknowledge that their accomplished colleagues got to where they are by years of hard work and scrupulous training, rather than by mere natural talent.

Sports and music have abandoned the genius notion in favor of dedicated training and hard work, and consequently, performance has improved across the board over the last few decades. And while the overall quality of science has improved by the same mechanism, we still cling to an outdated notion that brings more harm than good. It’s time we abandon it as well.

Fun with Movie Titles

(2015/09/18, 10:16AM)
As mentioned before, I sometimes use my academic knowledge of natural language processing for purposes other than research.

A friend recently told me about an entertaining game where you gerundivize words in movie titles (i.e., you add -ing at the end) to completely change the meaning. Some of my PG-13 favorites include “Jurassic Parking”, “Ironing Man” and “2001: Spacing Odyssey” (you can get equally entertaining, yet NSFW results with other titles, but I leave that as an exercise to the reader).

Being the spoilsport I am, I decided that it would be fun to see how much NLP could help me with that. It certainly won’t be able to decide whether an altered title is funny or not (that’s how lame AI still really is), but it would at least help me generate all possible versions.

I downloaded a list of the 1000 best movies ever (at least according to The New York Times) and then took each title, went through it word by word, and checked (against the Brown corpus) whether adding ’-ing’ to the end resulted in an English word. If so, I printed it out.

The only tricky part was to deal with the spelling alterations for gerunds: -e is always removed (“take” becomes “tak-ing”), consonants are usually doubled (“put” becomes “put-t-ing”, but “model” becomes “model-ing”). For the latter case, I actually generate both the duplicated and a non-duplicated version (the rules for the game are not quite clear on what to do here). That’s how I got “Beverly Hills Coping”, which I think sounds much funnier than “Beverly Hills Copping”.
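The procedure can be sketched roughly as follows in Python. This is not the downloadable script: the function names are made up for this example, and a tiny hand-made word set stands in for the Brown corpus.

```python
VOWELS = set("aeiou")

def ing_candidates(word):
    """Possible -ing spellings of a word under the rules described above."""
    word = word.lower()
    if word.endswith("e"):
        return [word[:-1] + "ing"]          # "take" -> "taking"
    candidates = [word + "ing"]             # "model" -> "modeling"
    if word[-1].isalpha() and word[-1] not in VOWELS:
        candidates.append(word + word[-1] + "ing")  # "put" -> "putting"
    return candidates

def gerund_titles(title, lexicon):
    """All variants of a title in which one word is replaced by a known -ing form."""
    words = title.split()
    results = []
    for i, word in enumerate(words):
        for candidate in ing_candidates(word):
            if candidate in lexicon:
                results.append(" ".join(words[:i] + [candidate.capitalize()] + words[i + 1:]))
    return results

# Hand-made stand-in for a real word list; an assumption for this sketch.
LEXICON = {"parking", "ironing", "coping", "copping", "counting"}

for title in ["Jurassic Park", "Iron Man", "Beverly Hills Cop"]:
    print(gerund_titles(title, LEXICON))
```

With a real word list, the doubled-consonant candidates that are not English words (such as "parkking") are filtered out by the lexicon check in the same way.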

I didn’t check for grammaticality of the entire title, which results in nonsense such as “Alice Doesn’t Living Here Anymore”, while other results are just minor changes in meaning (“Dial M for Murder” vs. “Dialing M for Murder”). Some of the results are pretty entertaining, though: “The Counting of Monte Cristo”, “Lasting Tango in Paris”, “Gosford Parking”, “Gone with the Winding”, “Lone Staring” (creep!), “Odd Man Outing”, “Oliver Twisting”, “Totaling Recall”, or “Body Heating”. You can download the script here and play around with it to get the full list, or modify it with your own titles.

And that’s it, I wasted another perfectly good hour using NLP for my personal entertainment.

How usable is sentiment analysis?

(2015/09/18, 02:57AM)
Recently, I was interviewed by two students for a study on the business application of a natural language processing technique called sentiment analysis. Sentiment analysis takes as input a text (which can be anything from a sentence, a tweet, or a paragraph, up to an entire document), and tries to predict the general attitude expressed therein: usually divided into positive, negative, or neutral.

For many businesses, this is an appealing application, since it promises to detect how people think about the company’s products and services, and because it can potentially be used to evaluate stock options.
However, potential and reality differ, and as far as I see it, there are currently three problems that limit the general applicability of sentiment analysis, and its commercial use:

1. The labels
The labels are fairly coarse (positive, negative, neutral), while there is still an ongoing debate in psychology on how many basic emotions there are (see here). More fine-grained labels (Facebook lets you label your status with more than 100 “emotions”) might provide better leverage, but the question is: what would they be? Another problem is the relation between text and labels: we recently had a paper accepted which shows that a common approach to labeling (by using ratings) is not strongly correlated with the text, i.e., models (and humans) can’t guess correctly how many stars somebody gave, only based on the review text. This is largely what we expect of our statistical approaches, though. Which brings us to the second problem:

2. The models
The models are usually trained on a particular domain (say, movie reviews), where they learn that certain features are indicative, say for movies the word ’hilarious’. However, when applied to another domain (say, restaurant reviews), this word does not at all indicate positive sentiment (a “hilarious” meal might not be what we hope for).
In technical terms, models overfit the training data. For a negative result on this, see here. Models need to be better regularized, i.e., de-biased from the training data, in order not to put too much faith in spurious features. Which finally brings us to the third problem:

3. The features
The problem with most approaches is the reliance on individual words, rather than on global sentiment and a deeper understanding of the text. Many models still rely on predefined word lists, or dictionaries, but individual words do not do the problem justice. Things like negation, sarcasm, or metaphors can completely distort the sense of a phrase, even though the individual words seem unambiguous. Even seemingly clear positive or negative words can often express both sentiments when seen in context, cf. “sincere” in “sincere answer” vs. “sincere condolences”, or “cold” in “cold look” vs. “cold beer” (see Flekova et al.). This doesn’t even begin to cover the problem that different age and gender groups express positive or negative sentiment very differently, yet the models treat all text as coming from the same demographics.
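To illustrate the word-list problem, here is a deliberately naive scorer in Python. The two word lists are invented for this example and do not come from any real sentiment dictionary.

```python
# Tiny invented word lists, for illustration only.
POSITIVE = {"good", "great", "sincere"}
NEGATIVE = {"bad", "cold", "terrible"}

def lexicon_sentiment(text):
    """Count positive and negative words and ignore all context."""
    tokens = text.lower().split()
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for phrase in ["the food was not good", "a cold beer", "sincere condolences"]:
    print(phrase, "->", lexicon_sentiment(phrase))
```

The scorer calls “the food was not good” positive and “a cold beer” negative, because it only ever sees the individual words.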

In sum, our approaches are currently too simplistic to capture the complexity of entire texts, thus making results brittle. The over-reliance on individual words and the lack of model regularization exacerbate this problem.
This is not to say that sentiment analysis does not work at all, but all of this limits the commercial use of sentiment analysis to fairly clearly denominated domains (see also the assessment of Larson and Watson).
To make sentiment analysis viable for a wider range of contexts, though, we have to start improving all of the three areas above.

Wow: Such Meme, Much NLP, Very Generate!

(2015/09/01, 11:29AM)
I love natural language processing, I really do. I think it has the potential to make the world a little bit better, and like all things worth exploring, it also has the potential to do evil. But it can be tiring wielding all this awesome technology for such serious causes, and sometimes, a man just wants to have a little fun with his subject. After all, why not get a laugh out of all the time invested?

Turns out, you can have a lot of fun with NLP, and it usually only takes a few lines of code and some data. An internet age ago (i.e., last year or so), the Doge meme took the Web by storm, and after seeing it enough, I realized that it followed a certain pattern, and that much of the humor derived from the ungrammatical use of intensifiers with certain word types (I’m a lot of fun at parties, I swear). Essentially, the pattern is
“Wow, such ADJECTIVE! Much NOUN! Very VERB!”

All I had to do now was to get all nouns, verbs, and adjectives out of an annotated corpus (I used the Brown corpus, which comes with the NLTK library), and then randomly pick one of each category to fill the slots. Oh, yeah, and I converted all verbs to their infinitive (i.e., ’jumped’ and ’jumping’ become ’jump’).
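A stripped-down sketch of the idea in Python looks like this. It is not the downloadable script: the short hard-coded word lists stand in for the words pulled from the Brown corpus, and the verbs are assumed to be in their infinitive already.

```python
import random

# Hand-picked stand-ins for corpus-derived word lists; an assumption for this sketch.
ADJECTIVES = ["underground", "scandinavian", "quiet"]
NOUNS = ["intentions", "concrete", "weather"]
VERBS = ["smell", "constitute", "jump"]

def doge_line(rng=random):
    """Fill the three slots of the pattern with randomly chosen words."""
    return "Wow, such {}! Much {}! Very {}!".format(
        rng.choice(ADJECTIVES), rng.choice(NOUNS), rng.choice(VERBS)
    )

print(doge_line())
```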

The results range from the mundane and stupid to the insightful and funny. My favorites are the jaded look on counter culture in “Wow, such underground! Much intentions! Very smell!” and the strangely place-appropriate “Wow, such scandinavian! Much concrete! Very constitute!”

And that’s it. The entire script is 6 lines of Python code. You can download the script here and play around with it. Enjoy!

Are our models ageist?

(2015/07/10, 04:07AM)
A number of comedies hinge on the premise of a young and an old person switching bodies. When a 45-year-old business woman says “’sup dudes?”, we find this funny (at least some of us), because it goes against our expectations of how people speak. We do have fairly clear ideas of how 45-year-old business women speak, and how teenage guys do, and that these two ways are not the same. To us, this is such a common and intuitive fact about language that breaking it intentionally can have comedic value.

For the models we use in natural language processing (NLP), however, this fact is not at all clear. To them, all language is the same, because we have not taught them about the difference. When I say “taught”, I don’t mean that we sat them down and explained how language works, of course. We train a model by presenting it with a bunch of input sentences, and with the correct output analyses we expect for them. If the machine has seen enough of these input and output pairs, it can learn a function that maps from an input sentence to the output analysis.

The problem is that almost all of these training pairs came from newspaper articles from the 80s. And that most of these articles were produced by (and for) a specific demographic, which is, broadly speaking, old, white, and male.

We would expect 60-year-old men to have difficulties understanding “the kids these days”, and so it’s no surprise that our models have exactly the same difficulties. When we give them input sentences from today’s teenagers (e.g., Twitter), they produce incorrect analyses. Of course, tweets are written very differently from newspaper articles, and so for a while now, the field has investigated the influence of the genre on performance. However, genre is not the whole picture: if we feed the models sentences from older people, they do a lot better than on sentences from younger people, even when the genre is the same for both groups.

Again, this is not too surprising, since language changes with every generation. What is surprising, however, is the depth and magnitude of this change. Younger people do not just use different words than older people. If it was that simple, we could just have the machine learn a number of new words. It turns out, however, that the differences go deeper: younger people even put their words together in ways very different from older people.

In order to test this, we ignored the actual words and looked instead at pairs of the words’ parts of speech (noun, verb, adjective, etc.). So “Dan cooks”, “Mary writes”, or “Frank measures” all become “NOUN VERB”. We found that in both German and English, the pairs from the older group were much more similar to the training data than the pairs of the younger group were. In other words: younger people use word combinations that are unlike anything our models have seen before. Older people don’t. Consequently, our models are much better at predicting part of speech sequences for the older group.
Pairs of parts of speech are one thing, but linguistically speaking, they are still a fairly shallow phenomenon.
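The pair comparison can be sketched in Python as follows. The tag sequences are invented toy data, and the similarity measure (the share of a group's tag pairs that also occur in the training data) is an assumption for this example, not necessarily the measure used in the paper.

```python
from collections import Counter

def pos_bigrams(tag_sequences):
    """Count adjacent part-of-speech pairs over all tagged sentences."""
    counts = Counter()
    for tags in tag_sequences:
        counts.update(zip(tags, tags[1:]))
    return counts

def coverage(group_sequences, training_counts):
    """Share of a group's tag pairs that also occur in the training data."""
    group_counts = pos_bigrams(group_sequences)
    total = sum(group_counts.values())
    seen = sum(c for pair, c in group_counts.items() if pair in training_counts)
    return seen / total if total else 0.0

# Invented toy data, for illustration only.
training = pos_bigrams([["NOUN", "VERB"], ["DET", "NOUN", "VERB"]])
group_a = [["NOUN", "VERB"], ["DET", "NOUN", "VERB"]]
group_b = [["INTJ", "NOUN"], ["NOUN", "VERB"]]

print(coverage(group_a, training), coverage(group_b, training))
```

In this toy setup, every pair from the first group has been seen in training, but only half of the pairs from the second group.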

We also looked at the deep syntactic structure of grammatical functions (subject, verb, object, etc.), where words do not have to be adjacent, but can be on opposite ends of the sentence.
These analyses were interesting in two ways: from a linguistic perspective, and from an NLP perspective.

Linguists have suspected for a long time that syntax changes with age. However, since syntax is very complex, this was hard to prove: we can put words together in a great number of ways, and you have to observe lots and lots of examples to see even the most common ones. Even then, it is hard to pin down the exact differences, if you don’t know what constructions you are looking for. We got around both problems by analyzing millions of sentences from people whose age we know. Among those, we selected only the most frequent syntactic constructions and compared them. That way, we did not have to specify beforehand which constructions to look for. The pattern analysis was of course less than perfect (remember, our models are biased), but by analyzing large enough numbers and by focusing on frequent constructions, we were able to get enough reliable observations to find significant differences. We expect the differences to be even more pronounced if the analyses were better.

The result of all this is that even the word-order (syntax) of young people is radically different from the older group. So different, in fact, that seeing a certain construction can give the machine a good clue as to how old the person is. Just as it would us humans.
This does of course not mean that one group uses a syntactic construction and the other group doesn’t. It just means that one group uses a construction statistically significantly more often than the other group.

And the differences don’t just extend to age: we found similar differences again between men and women. What’s even more startling is the fact that these patterns occur in up to 12 different Indo-European languages.

The other way in which these findings were interesting, namely for NLP, was that it showed that our models do pick up on demographic differences, albeit in a bad way. There is, however, nothing inherently ageist to the model algorithms: they are not consciously aware of these differences. They simply transform input sentences into output analyses. However, due to their training, they pick up on the language characteristics of the training data. And when the models get new inputs, they expect the language to be the same as before. This shows how brittle our models are, and how susceptible to the language characteristics (the bias) in the training data.

In fact, we found that our models not only did consistently worse on data from young people (in both German and English), but that they also performed worse and worse the more markers of African-American vernacular English (AAVE) were in a text. (They did not, however, perform worse on different genders―at least.)

So you got a bunch of bad analyses, you could say―so what!

Indeed, if it was just for academic purposes, this would be annoying, but on the whole inconsequential. However, NLP models are increasingly used as go-to tools for unstructured data analysis, both in business and political analysis. If all of these models expect language to come from old white men, and then perform poorly on texts from other demographic groups, we risk systematically ignoring or, even worse, disadvantaging these groups.

Luckily, there are ways to prevent this problem. For one, we can simply train our models on data from more and more demographic groups. In a recent paper, I showed that if we encode age and gender in the models, we get better performance, even under very controlled settings. This means that there is enough signal in the language of different demographic groups that our models can learn to differentiate, and to produce better analyses on a variety of tasks and languages.

This requires, though, that we have enough samples from all demographic groups, and their correct analyses. Both assumptions are unrealistic (for now), because collecting the data and producing the correct analyses takes a lot of time and effort. What’s more, there are dozens of demographic variables: age, gender, education, ethnicity, class, income, etc., and we are only starting to see which ones impact our models.

If we want to address the problem in earnest, we can’t afford to encode each of these variables explicitly. Instead, we can just tell our models to expect demographic differences, and let them figure out the rest themselves.

In the future, we need to find ways to automatically detect all kinds of variations, and to reduce the impact of them on training our models. We need to teach our models that language varies along demographic lines, but that all of these variations are valid.

Not only will we improve the quality of our models, we will also produce fairer analyses that benefit everyone equally.

The papers I described can be found here, here, here, and here.

What I do: Learning whom to trust

(2015/01/11, 11:02AM)
Here, I’ll try to explain another paper I have worked on in generally understandable terms. This time, it’s about learning whose opinion we can trust.

Say you are thinking of watching “Interstellar”. You have seen mixed reviews and want to poll your friends for an opinion. So you ask 6 people. If five say “don’t bother” and one “yay”, it’s pretty clear. This is called “majority voting”.

However, more often than not, you will get a tie, i.e., three people say “don’t bother” and three “yay”. This gets worse the more answer options there are: imagine asking six people whether to see “Interstellar”, “The Hobbit”, “Fury”, or “Unbroken”.

One way to break ties is by flipping a coin (or rolling a die, if you have more than two answer options). However, this has a 50% or higher chance of picking the wrong answer.
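To make the voting idea concrete, here is a minimal sketch of majority voting with a random tie-break. The function name and the example answers are made up for illustration.

```python
import random
from collections import Counter

def majority_vote(answers, rng=random):
    """Return the most frequent answer; break ties with a random pick."""
    counts = Counter(answers)
    top = max(counts.values())
    # All answers that share the highest count (more than one means a tie).
    tied = sorted(a for a, c in counts.items() if c == top)
    return rng.choice(tied)

# Five against one: a clear majority.
print(majority_vote(["no", "no", "no", "no", "no", "yay"]))  # no

# Three against three: the result is a coin flip.
print(majority_vote(["no", "no", "no", "yay", "yay", "yay"]))
```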

If you knew, however, that one of your friends simply likes all movies and always says yes, and another one has the same taste as you and said no, you would be able to weigh their answers differently. Instead of flipping a coin to break ties, you’d use a weighted average to get the best answer.
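A small sketch of what weighing the answers could look like, assuming we already had a trust weight for each friend. The names and weights are invented, and strictly speaking this is a weighted vote (each answer collects the weights of the people who gave it) rather than an average.

```python
from collections import defaultdict

def weighted_vote(answers, weights):
    """answers: person -> answer; weights: person -> how much we trust them."""
    scores = defaultdict(float)
    for person, answer in answers.items():
        scores[answer] += weights[person]
    return max(scores, key=scores.get)

# Hypothetical friends: Ann likes every movie, Dee shares your taste.
answers = {"Ann": "yay", "Bob": "yay", "Cem": "yay",
           "Dee": "no", "Eli": "no", "Fay": "no"}
weights = {"Ann": 0.1, "Bob": 0.5, "Cem": 0.5,
           "Dee": 0.9, "Eli": 0.5, "Fay": 0.5}

print(weighted_vote(answers, weights))  # no
```

Unweighted, this would be a three-to-three tie; with the weights, “no” wins.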

In Natural Language Processing, we often ask people for their opinion. Usually, it’s about the part of speech (noun, verb, adjective, etc) of words, or whether a sentence is positive, negative, or neutral. This is called annotation. The annotated text is then used to train statistical models. If the annotation is wrong, the models won’t work as well.

We assume that most annotators are generally trustworthy, but that some annotators get it wrong. Either because they did not pay attention to the explanation, or because they don’t care about the task and just want to get paid. If we could down-weigh bad annotators, we would get better annotations and thus better statistical models.

Unfortunately, we usually don’t know the annotators, and thus we don’t know how much to weigh each annotator’s answer. If we did, we could just compute a weighted average over the annotators and get the most likely correct answer.
If we already knew the correct answers, we could count how often each annotator gave the correct answer, and use that fraction as weight.
But we don’t know the correct answer, and we don’t know the weights, so we are stuck in a circular problem.

The way to address this circular problem is by using an algorithm called Expectation Maximization (EM). It works in two steps that are repeated until we reach a satisfying answer.
Initially, we give each annotator some random weight. In the first step, we then calculate the most likely answers based on the weights. Now we can compute how many answers each annotator got right, and assign them a weight based on that fraction. This is the second step.
With the new weights, we re-calculate the most likely answers, count again how many answers each annotator gets right, and update the weights. And again, and again. At some point, the weights don’t change much any more from round to round, and we are done. We now have weights and can compute the most likely final answers.
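The two steps can be sketched in a few lines. This is a simplified illustration of the loop described above, not the actual MACE model: it uses hard answers and plain agreement fractions, and the function names, the toy data, and the stopping threshold are all assumptions.

```python
import random
from collections import defaultdict

def weighted_answer(labels, weights):
    """Most likely answer for one item: the label backed by the most weight."""
    scores = defaultdict(float)
    for label, w in zip(labels, weights):
        scores[label] += w
    return max(sorted(scores), key=scores.get)

def estimate_weights(annotations, rounds=50, tol=1e-6, seed=0):
    """annotations: one list per item, holding one label per annotator."""
    rng = random.Random(seed)
    n_annotators = len(annotations[0])
    weights = [rng.random() for _ in range(n_annotators)]  # random start
    for _ in range(rounds):
        # Step 1: most likely answers under the current weights.
        answers = [weighted_answer(item, weights) for item in annotations]
        # Step 2: new weight = fraction of items where the annotator agrees.
        new_weights = [
            sum(item[a] == ans for item, ans in zip(annotations, answers))
            / len(annotations)
            for a in range(n_annotators)
        ]
        change = max(abs(n - o) for n, o in zip(new_weights, weights))
        weights = new_weights
        if change < tol:  # weights barely moved: we are done
            break
    answers = [weighted_answer(item, weights) for item in annotations]
    return weights, answers

# Toy data: the third annotator always answers "A".
items = [["A", "A", "A"], ["B", "B", "A"], ["B", "B", "A"],
         ["A", "B", "A"], ["B", "B", "A"]]
weights, answers = estimate_weights(items)
print(weights, answers)
```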

We also use an additional technique, called Variational Bayes EM, that essentially tells the model that people either do a good job or they don’t care, but nobody cares a little. This is called a “prior belief”, or just “prior”. Technically, this works by adding pseudo-counts, i.e., when computing how many answers each annotator got right and wrong, we add a small number (we used 0.5) to both. The reason why this works is complex, but it essentially relativizes the influence of the counts a bit, unless they are pretty strong. In the end, it prevents the model from giving too low weights to actually good annotators and improves performance.
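Here is a sketch of the pseudo-count arithmetic only. The real Variational Bayes update is more involved and is not reproduced here; the function name and the example counts are made up, and only the value 0.5 comes from the description above.

```python
def smoothed_weight(n_right, n_wrong, pseudo=0.5):
    """Fraction of right answers after adding a pseudo-count to both counts."""
    return (n_right + pseudo) / (n_right + n_wrong + 2 * pseudo)

# With very few observations, the pseudo-counts matter a lot ...
print(smoothed_weight(1, 0))    # 0.75 instead of 1.0
# ... with many observations, they barely change the fraction.
print(smoothed_weight(90, 10))  # about 0.896 instead of 0.9
```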

Using the final weights for each annotator, we can compute a likelihood for each answer under the model (i.e., given that particular set of weights, how likely is a particular answer). The product of all answer likelihoods gives us a measure for how good the model is. Ideally, we would like to have a model with high likelihood.

Since we started out with random weights, there is a chance that a bad annotator ends up with a high weight. It’s a small chance, because the training process tends to correct that, but it exists. To eliminate that chance, we run the training several times, every time with different starting weights. Once a training run finishes, we measure the overall likelihood of the model. We then pick the final set of weights that resulted in the highest overall likelihood.
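The restart idea can be sketched independently of the model itself. In this sketch, `train` stands for any training routine that takes a random number generator and returns the final weights plus the likelihood of each answer; `toy_train` is a made-up stand-in so the example runs on its own. Summing logs gives the same ranking as multiplying the likelihoods, but avoids very small numbers.

```python
import math
import random

def log_likelihood(answer_probs):
    """Log of the product of all answer likelihoods."""
    return sum(math.log(p) for p in answer_probs)

def best_of_restarts(train, n_restarts=10, seed=0):
    """Run `train` several times and keep the run with the highest likelihood."""
    rng = random.Random(seed)
    best_score, best_weights = None, None
    for _ in range(n_restarts):
        weights, answer_probs = train(rng)  # each run starts somewhere else
        score = log_likelihood(answer_probs)
        if best_score is None or score > best_score:
            best_score, best_weights = score, weights
    return best_score, best_weights

def toy_train(rng):
    """Hypothetical stand-in for a real training run."""
    w = 0.5 + 0.5 * rng.random()
    return [w], [w, w, w]

score, weights = best_of_restarts(toy_train, n_restarts=5, seed=1)
print(score, weights)
```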

We implemented all this in a model and called it MACE, which stands for Multi-Annotator Competence Estimation, because a) that describes what it does and b) we thought it sounded funny to say “Learning whom to trust with MACE” (yes, this is how scientists work).

When we tested MACE on data sets where we already knew the correct answer, we found that MACE correctly finds more than 90% of the answers, while majority voting (with coin flipping to break ties) does much worse.

In real life, we of course don’t know the correct answers, but we found in several annotation projects that the MACE answers produce better statistical NLP models than when using majority voting annotations. We also found that annotators who get a low weight usually also produce bad work, while the ones with high weights produce good annotations.

Since we have probabilities for each answer, we can also choose to focus on the ones with high probabilities. If we do that, we see that the accuracy for those answers is even higher than for all. This is interesting for the case where we have some more annotations than we need, but would like to know that the ones we choose are of especially high quality.
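A minimal sketch of that filtering step, assuming the model output is available as a mapping from item to answer and probability. The data layout, the names, and the 0.9 threshold are illustrative assumptions, not the actual MACE output format.

```python
def confident_answers(predictions, threshold=0.9):
    """Keep only the answers whose probability reaches the threshold."""
    return {item: label
            for item, (label, prob) in predictions.items()
            if prob >= threshold}

# Hypothetical model output: item -> (answer, probability).
predictions = {
    "sentence 1": ("positive", 0.97),
    "sentence 2": ("negative", 0.55),
    "sentence 3": ("neutral", 0.91),
}
print(confident_answers(predictions))
```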

When asking people to annotate, we can also smuggle test questions in there where we know the correct answer. These are called control items (because we can control how good the annotators are). That way, we can sort out bad apples even more accurately. If we use even just a few control items in MACE, accuracy goes up even further.

When I gave a talk about MACE, one of the listeners asked what would happen if my annotators were a bunch of monkeys: would MACE still find the “experts”? The answer is no, but it’s a good question, and we actually did test how many “good” annotators the model needs to find good answers. We simulated 10 annotators and varied the number of good ones: those would get 95% of the answers correct (this is roughly the percentage of the best annotators in real life). The rest of the simulated annotators would pick an answer at random or always choose the same value. We found that even with just three or four good annotators, MACE was much better at recovering the correct answer than majority voting. Luckily, in real life, it is pretty unlikely to have such a bad crowd of annotators. Just don’t use monkeys.
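A rough sketch of how such a simulation could be set up. Only the 10 annotators and the 95% figure come from the description above; the four answer options, the number of items, and the way bad annotators alternate between random and constant answers are assumptions. The sketch only measures the majority voting baseline and does not include MACE.

```python
import random
from collections import Counter

LABELS = ["A", "B", "C", "D"]  # hypothetical answer options

def simulate_annotations(n_items=1000, n_annotators=10, n_good=3,
                         p_correct=0.95, seed=0):
    """Return the true answers and one row of annotator labels per item."""
    rng = random.Random(seed)
    truth = [rng.choice(LABELS) for _ in range(n_items)]
    rows = []
    for gold in truth:
        row = []
        for a in range(n_annotators):
            if a < n_good:
                # Good annotator: correct with probability p_correct.
                if rng.random() < p_correct:
                    row.append(gold)
                else:
                    row.append(rng.choice([l for l in LABELS if l != gold]))
            elif a % 2 == 0:
                row.append(rng.choice(LABELS))  # picks an answer at random
            else:
                row.append(LABELS[0])           # always chooses the same value
        rows.append(row)
    return truth, rows

def majority_accuracy(truth, rows, seed=0):
    """Accuracy of majority voting with random tie-breaking."""
    rng = random.Random(seed)
    hits = 0
    for gold, row in zip(truth, rows):
        counts = Counter(row)
        top = max(counts.values())
        guess = rng.choice(sorted(l for l, c in counts.items() if c == top))
        hits += guess == gold
    return hits / len(truth)

truth, rows = simulate_annotations(n_good=3)
print(majority_accuracy(truth, rows))
```

Varying `n_good` from 0 to 10 shows how quickly the baseline degrades as the crowd gets worse.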

Whether we have control items or not, we can use MACE to get better answers for our annotation projects, and learn in the process which annotators are doing a good job.

The paper I explained here is this one, and MACE can be downloaded here.

What I do: Significance Testing

(2015/01/10, 08:19AM)
As much as I love languages, one of the things that frustrated me in linguistics was the seeming arbitrariness of theories. There was no way of knowing which one was better than another. That did not stop people from arguing about exactly that, but there was no way of proving it.
One of the things that most drew me to natural language processing was the possibility to measure and quantify how good a model (and thereby its underlying linguistic theory) was. I was overjoyed. Unfortunately, nothing is that easy.

It turns out that the closer you look, the more difficulties there are. However, there are also solutions. One of them is significance testing. It’s very powerful, but very easy to misunderstand.

While it is easy to compare two models A and B on the same data set and decide which one is better, this says very little about which model is better in general. Model B might be better on this particular data set, but bad on all others (this is called overfitting). We can get a better picture if we compare the two models on more data sets and average over them. Most likely, however, the difference between two good models will be small.

So even if we used several data sets, there is still a chance that the difference between A and B is just due to some unaccounted peculiarities, or pure coincidence. This chance gets smaller with more data, but it does not disappear. Can we quantify that chance? This is what significance tests are for.

In general, significance tests estimate the probability that the claim that the difference between the models is not coincidence is false. I.e., how likely am I wrong when I say “the difference is not due to chance”. This probability is the p-value. A p-value of 0.01 thus means: even though we have shown that the difference between the models is not coincidence, there is a 0.01=1% chance that we were wrong. If our significance test value is lower than this 0.01, then we can say that the difference is “statistically significant at p=0.01”.

Naturally, the lower the p-value, the better. The important point is that significance is binary: either your result is significant at a certain p-value or it isn’t. This is why this list of descriptions for failed significance tests is rather hilarious.

Ok, great. So does that mean if I see a significant result at a small p-value in a paper, the model is good? Unfortunately, no. Because there are a lot of things that can influence the p-value. Here are some.

The most obvious is the test we use. Some tests only work if your data is normally distributed, i.e., if you plot the data, it looks like a bell shape. This is almost never the case in language. Most data looks like a Zipf curve, i.e., it has a steep decline and then a long tail. Any test that makes the normal-distribution assumption is thus bound to give a wrong result.

A good significance test to compare two models is bootstrap sampling: pick a random sample from the data (instances can be repeated) and compare the two models on that. Do this 10,000 times or so. Count how often B is better than A and divide that by 10,000. That’s your p-value. If the result is small, A is probably a better model.
It does not matter how your data is distributed, this gives us a good estimate.
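The recipe translates almost directly into code. This is a minimal sketch, assuming each model’s output has been reduced to a list of 1 (correct) and 0 (wrong) over the same test instances; the function name and the toy data are made up, and counting ties as “B is not better” is my assumption. Other bootstrap variants exist.

```python
import random

def bootstrap_p_value(correct_a, correct_b, n_samples=10_000, seed=0):
    """Fraction of resampled data sets on which model B beats model A."""
    rng = random.Random(seed)
    n = len(correct_a)
    b_better = 0
    for _ in range(n_samples):
        # Draw n instances with replacement (instances can be repeated).
        idx = [rng.randrange(n) for _ in range(n)]
        score_a = sum(correct_a[i] for i in idx)
        score_b = sum(correct_b[i] for i in idx)
        if score_b > score_a:
            b_better += 1
    return b_better / n_samples

# Toy example: A gets 80 of 100 instances right, B gets 70.
model_a = [1] * 80 + [0] * 20
model_b = [1] * 70 + [0] * 30
print(bootstrap_p_value(model_a, model_b))
```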

Ok, so are we done now that we have a good test? Again, no. There are more factors, and even if we pick a certain p-value threshold and report significance, we could be wrong.

Say my models analyze sentences. Maybe I need to restrict my analyses to short sentences (say, less than 20 words) for computational reasons. If A does better than B on this sample, I still have no idea whether it will also be better on longer sentences. My significant result is thus only valid for sentences shorter than 20 words. Unless I say this explicitly, my significant result is misleading. If I wanted to deceive people into thinking my model is great, I could look at different lengths and choose to just report the one that gives me a significant result.

Another issue is the measure I use to compare the models. When analyzing the performance of two models on sentences, I can look at how many sentences each gets right, or at how many words. Or I can just look at verbs. Or rather than the correct items, I can look at the error rate of a certain category. Or a whole number of other measures. All of these can be interesting, but if I get a significant result for one measure, it does not mean I get a significant result for all the others. If I was an unscrupulous researcher, I could test all measures and then just report the ones that look best.

Typically, the larger the data and the bigger the difference, the easier to get a low p-value. Somewhat counterintuitively, this does not mean that increasing the sample size will give a significant result. Maybe I just add more examples where A and B are the same, or more where the weaker model is stronger, and so the differences wash out.

Ultimately, all that a positive significance test can tell us is that the difference between models for this particular data set, under the given conditions, for the given measure is significant at a certain level. That’s a lot of qualifications.
The best we can do under these circumstances is to use several data sets, several measures, a clear description of what conditions we used, and an appropriate significance test with a low p-value.

That way, when we say A is significantly better than B, we can be more sure that others will be able to replicate that. It’s not much. But it’s much better than guessing.

The paper I am talking about here is this one. If you got interested, please see the references for a number of other good papers on the subject.

How to be a Good Grad Student

(2013/08/19, 12:53AM)
After almost 5 years, I am finally done with grad school! I just started a postdoc, so this is probably a good time to look back. Two friends recently asked me what my “grad school story” was. What had I learned during my PhD, apart from the obvious technical and academic skills? What were the things I wish I’d known before I started? It got me thinking: what would I tell myself if I got to go back in time? Here is what I came up with:

Take breaks
When I started grad school, I believed exhaustion, all-nighters, and 14h-days were hallmarks of a good grad student. Well, turns out they are not! I worked 12-14h every day, 8h on weekends. After 3 months, I was tired all the time and got sick on a biweekly basis (flu, stomach bugs, etc.). After 6 months, I was burned out, deeply unhappy, constantly sick, and seriously considered quitting. Worst of all: my productivity had constantly decreased. It was time to rethink my beliefs.
Grudgingly, I learned to accept those limitations. I concentrated productive work (coding, writing) in my peak hours, and used the rest for “busy work” (reading, setting up experiments). Good time management is one of the key skills to learn in your PhD, and one of those they never teach you. It is simply impossible to produce quality work without taking a break every now and then. I have heard estimates that you can only be really productive for 6 hours each day.
The most important part of that were the breaks! I exercised, walked, had lunch with friends, read the newspaper. I was still thinking about my work. But getting some distance from it helped me see errors I overlooked when working constantly. Cutting back on my hours not only made me happier and healthier, it also made me more productive.
It also helped me to set myself an end time. I had some of my most productive times when I was dependent on a shuttle service and could only work until 5pm. I made every minute count, and went home in the evening without regrets. I got a lot done. Working 14 hours straight did not accomplish half as much―and felt a lot worse.

Know your advisor
When I started out in grad school, I thought of my advisor as this superhuman being who knows everything. Apparently, I am not alone. This honeymoon period can last a while. Inevitably, though, everyone reaches a point where they disagree with their advisor. It can be a bit of a shock to learn that advisors are only human, too.
At the end of the day, though, it is good to remember that an advisor is the person who keeps you in business. They speak up for you in quals and screenings, vouch for you academically, introduce you to the right people, and help pay your tuition, travel expenses, and conference fees. They are busy people, and it is not their job to hold your hand, nor help you with the daily nitty-gritty. They have, however, spent a lot of time in the field and can help you find the right direction. It is up to you what you make of it, though. I know some people who have been discouraged by their advisor from pursuing a project, only to find a paper on it at the next conference.
It helps to know what their strengths are and benefit from them, and find somebody else for the things they cannot teach you. The latter are often hands-on solutions and technical issues. Most of their hands-on experience is several years old and might be outdated, especially in a fast-paced field like computer science. That’s what fellow grad students are for….
It is your job to keep your advisor happy. Do the project work, help with classes, and listen to their counseling. But decide for yourself what applies to you. Part of your PhD is becoming your own researcher.

Talk to people
Even though it often felt like it, it was important for me to realize that I was not alone in the PhD! I was surrounded by other grad students and researchers, either directly or in my general community. When I started, I was lacking many of the computer science skills my peers had. I had the choice to either envy them or learn from them. The latter worked much better. I have learned more from water cooler talks and by asking colleagues than I have from most classes. I also learned a ton from collaborating on papers, and it’s less work for everyone. Internships and visits are another great way to meet other people and get exposed to new ideas. I went to IBM and CMU, both for 3 months, and came back invigorated and full of new ideas and impressions.
Learning how to talk to people also means giving good presentations. We need to share our ideas to get plenty of feedback. It helped me to find out how others perceived my research and to check whether that’s what I wanted to convey. If they didn’t get it, I reminded myself that it was probably my fault for not explaining it well enough: the audience is always right. This is especially true when talking to non-scientist friends: if I could explain it to them, I knew I had gotten to the core of the problem (this is sometimes also called the elevator pitch: can you explain your work to someone in the time you spend together on an elevator?). It’s difficult, but it helped me to think outside the box: there is always something in your work that relates to people’s everyday experience (even if it is remote). I think it also helps with writing papers: if you can explain what your work is about in a few simple sentences, people will be more willing to read your paper. Even scientists like a simple explanation better than a convoluted one.
The more specialized my work got, the more important I found it to keep an open mind in general. I found that somebody who works on something completely different can often offer an objective view or an alternative approach to the problem. I made it a point to go to talks outside my area, read papers on related topics and general science. What others do can be as interesting as your own work (but don’t fall into the trap of thinking what you do is less interesting than everybody else’s work). Also, it helped me overcome the misconception that being opinionated is equivalent to being smart. It’s an easy mistake to make, but it’s still wrong. And yes, I did it, and I’m not proud.
Last but not least, keeping an open mind helped me to learn other things as well. Even though I felt challenged with all the demands of my research, I found that over time, my mind got used to challenges. It made it easier to pick up some new non-scientific skills along the way (I learned dancing, cooking, and how to ride a motorcycle), and I’m glad I did (again: it helps to take a break sometimes).
It is impossible to be a good researcher when you never leave your room.

Get a hammer
In fact, get a whole toolbox! Early on, I was told to find an approach, algorithm, data set, resource, or other method that I liked. For me, this was the EM algorithm: I love how you can solve a circular problem (if I knew X, I could solve Y, and vice versa) by just starting out somewhere and then refining your model step by step. It’s similar to how children learn about the world, and it can help with a range of problems.
Once I had that, I started looking for problems I could solve with it. I applied it to problems you cannot solve with it. That helped me understand why it works for some and not others. I learned a lot both about the problem and my hammer. It also expanded my technical expertise and helped me produce results more efficiently (and thus write more papers).
It’s important not to get too hung up on one thing, though! Not everything is a nail, and nobody likes a zealous one-trick pony. While it sometimes seems that academia rewards single-mindedness, it often leads those people down a path of no return when the paradigm shifts or their technique becomes obsolete. I learned to accept the limitations and explored alternatives.
I tried to put as many things in my toolbox as possible, and to learn when to use them. This is an immensely fun part of the PhD, and I don’t even think I’m done yet.

This is probably the most important point. When I was so fed up with the program that I considered quitting, I paused and thought about why I put myself through this. Why did I do a PhD? And for whom? I realized that I was not in it for my advisor, for my family, or society as a whole, I wasn’t doing this for others―I was doing this only for myself. Because it is what I always wanted to do! If I didn’t become the next superstar in my field, so what? I was in it because I loved it. Not every second of it, for sure, but as a whole: that was enough to make those difficult times pale to insignificance in the grand scheme of things. Around that time, I went to a talk by Tom Mitchell, on how to predict what people had read by looking at their brain images, and I remember walking out thinking “There are so many more cool things I haven’t even started on, I can’t possibly quit now” (this is another reason why it is good to keep an open mind and check out other fields).
When you’re in a PhD program, you are doing something very few people get the chance to do: you are at the cutting edge of research and work with interesting people on cool problems every day. Everybody gets down once in a while, and pretty much everybody considered quitting at some point. It’s good to remember what you’re excited about. And that you have every right to be excited!

So that’s it. This is what got me through my PhD. If I had to do it all over again, this is what I would focus on.

There are of course other good documents out there on how to make it through grad school. One of the better ones is this one by Hanna Wallach and Mark Dredze. Check it out.

MACE available for download

(2013/04/05, 05:22PM)
Our software package MACE (Multi-Annotator Competence Estimation) is out! It provides competence estimates of the individual annotators and the most likely answer to each item. All you need to provide is a CSV file with one item per line.
In tests, MACE’s trust estimates correlated highly with the annotators’ true competence, and it achieved accuracies of over 0.9 on several data sets. Additionally, MACE can take annotated control items into account, and provides thresholding to exclude low-confidence answers. Feel free to check it out. Comments welcome!

Fake social network names won’t protect your privacy

(2012/11/21, 01:15PM)
I have noticed that some of my friends (mostly Germans) use a fake name on social networking sites. This started a few years ago, when it became clear that a) the security of these sites isn’t exactly Fort Knox and b) their business model includes selling your data. I assume therefore that the fake names are meant to protect your private information. While I understand the sentiment, I think this is futile, and just makes it harder for your friends to find you. Here is why.

The basic problem might be anthropomorphizing companies. If we assume that social networking companies use the same approach to searching for our information as you and I do, a false name could throw them off. (It would be tempting at this point to speculate about the age-old belief that knowing somebody’s name gives you power over them, but that’s beside the point here.)
However, these companies don’t use humans to search for your data―they use machine learning. And for that, a fake name is just one little piece of data. One of many…

Say you just opened an account and put down a nickname. Can you ensure that all your friends will address you with that name, mention you with that name, and that you will sign all messages with it? Are all your stated relatives using the same moniker? If you ever found a long-lost friend on the site and wanted to contact them, can you avoid sending a message saying “Hey long-lost friend, this is really Dirk Hovy, I am using a nickname, but I would like to re-connect.”?
If you answered “no” to any of these, all you managed is to make it a little harder, but not impossible, to get your real name. You more or less openly provided a decryption key that voids all your attempts at keeping your name safe. Just because you put something into a private message does not mean it is invisible. It’s just data, after all.

Even if you managed to keep all of your communications under control: are you sure your account is not linked to any other sites that contain your name? Did you not sign up with this account when you bought something, have you not liked something with it, or linked it to some other account that contains your true name? If you have done any of the above, it will be the easiest thing in the world to find your true name and link it to your data. It is all a matter of connecting the dots. There are whole industries and research branches devoted to it, and the more dots there are, the easier it gets.

I’m not trying to sound Orwellian, and I don’t mean to imply that those companies are evil by their nature. But their―more or less publicly―stated objective is that in exchange for letting you use their service, they get your data and sell it for profit. They are not in it for philanthropic reasons. They have bills to pay. You implicitly bought into that model when you signed up. You might have even explicitly agreed to it, provided you read all 25 pages of the end user agreement and were able to decipher the legalese. One can object to that model, but one cannot ignore the fact that it is reality.

The most secure option is obviously to not use any social networking sites, or the internet, for that matter. While this is 100% safe, it is also not very realistic.

So in the absence of that option, it is probably better to be more aware of what we put out there, and how easily it can be found. And if it is out there, it will be found and used. Don’t try to hide from a person if a machine is looking for you.
Using a nickname just makes it hard for your friends to find you.

Trimming Papers

(2012/09/25, 11:54AM)
Writing your papers in LaTeX is great and you should definitely do it. It makes everything better (with the possible exception of grammar), but you have to trust it with the formatting. This is where it gets tricky. Most papers have a page limit, and while LaTeX makes sure everything lines up perfectly, it does not care about how many pages it takes. Trimming the paper to a certain page limit thus becomes a familiar headache before every deadline. Luckily, you don’t have to rewrite the whole paper to make the limit. Here are some simple tricks I found helpful to save a lot of space.

Don Metzler showed me a great and easy technique to trim your paper considerably:
- find all paragraphs that have three or fewer words on the last line
- shorten those paragraphs so that the last words advance into the previous line
You can leave paragraphs with more than three words on the last line alone, so instead of rewriting everything a bit, you can focus on a few paragraphs.
This only eliminates one line per paragraph, but due to the way LaTeX spaces out paragraphs over the page, this actually shortens the overall paper quite a bit. Treat three or four paragraphs that way and you might cut your paper by half a page.

So, how do you shorten those paragraphs? A good way is to get rid of redundant or “empty” expressions. One that I found myself using way too often is “especially in the case of”, as in “This is annoying, especially in the case of long paragraphs”. The expression is perfectly grammatical, but we can convey the same meaning by just using “especially for”, as in “This is annoying, especially for long paragraphs”. “Especially” already singles out a special case, so we don’t have to say it again. We don’t lose any information, but save three words and, more importantly in LaTeX, three white spaces. LaTeX spaces out words evenly across each line, mainly by varying the size of spaces (it also varies character spacing, but to a lesser degree). So having fewer characters and white spaces shortens the line, which in turn shortens the paragraph, which in turn shortens the page, which in turn allows you to keep your page limit.

Other phrases that can be shortened: verb plus nominalization, if there is a proper verb for it. I find myself using lots of these constructions. Instead of saying “we used this for our evaluation”, just make it “we evaluated this”.
The Chicago Manual of Style also identifies these candidates:
“due to the fact that” = “because”
“in connection with” = “of”, “about”, or “for”
“at this (point in) time” = “now”

The best way to save space, however, is to delete useless phrases. Many papers include a paragraph which starts with “The remainder of this paper is structured as follows:…”. I automatically skip ahead if I see this. In an 8-page paper, you do not need an overview: I can get that by just flipping through. After all, that’s what section titles are for. And do sentences like “We first introduce the problem in Section 1” or “Section 5 concludes the paper” really add anything to my understanding? Do they need to tell me that the section titled “Evaluation” will “present the evaluation of the experiments”?
Leaving this overview-paragraph out saves a lot of space, and does not take anything away from the content.

Of course it’s good to pay attention to these things while writing, and express yourself as clearly and succinctly as possible. But a few of these cases will creep in anyways. And when it is time to trim the paper, they are a good starting point.
If you have more tips or suggestions, please share! Let’s make meeting page limits in LaTeX less scary.

The Art of Good Presentations

(2012/09/14, 03:41PM)
I have talked before about how important it is for scientists to express themselves well, and the most important aspect of that is to give good talks. I am far from being a good speaker, but I am a little obsessed with learning what makes one.

So I recently went to a workshop on presentation, and came away with some good tips:
- use a dark background. It is much easier on the eyes of your audience, broadens your screen real estate, and prevents you from casting weird shadows when you stand in front of it (some people dislike it, though, because it’s so dark)
- shape your talk like a glass: start broad and then narrow to the details (the cup of the glass), stay on them for a while (the stem), and end broadly (the foot)
- maximize the axis space of graphs to fill as much of the screen as possible. Push the legend and title into the graph area, in blank spaces
- do not use a laser pointer. If you want to point something out, circle it on the slide

One of the best ways to get better is to watch good presentations and note what they do. Here are a few presentations I particularly enjoyed, and what I think makes them interesting:
- Dick C. Hardt on “Identity 2.0”. I have no idea what “Identity 2.0” is, and I don’t think it caught on, but the rapid-fire presentation style is fascinating and easy to prepare. Though hard to pull off…
- Guy Kawasaki’s talk for entrepreneurs uses minimal slides, and a lot of great lines. Some of what he says is even relevant for presentations, but mostly, it is fun to watch and easy to follow.
- Chip Kidd talks about book covers, but he drives home an important point: show or tell, but not both―your audience is not stupid. “And they deserve better.”
- The previous talks are about big ideas, and thus a bit abstract. Hans Rosling shows how you can take hard data and make its presentation palatable and fun. This takes a lot of work in preparation, but it shows you that you don’t always need the same old boring graphs.
- Similarly, David McCandless shows how information can be conveyed in interesting and appealing ways. Maybe not always achievable for the average scientist, but worth thinking about, and looking at.

What comes through for me in all the good talks is this: keep it simple. Use pictures more than words. Your slides should be secondary to your talk. They do not need to be interpretable without you. That’s what a paper or a handout is for.

I recently tried cutting text as much as possible in my proposal talk, and got very positive feedback about the slides. It is harder for scientific presentations than for general talks, since you want to convey a lot of detail and nuances, but it helps to focus the attention. I plan to reduce to the max.

Orange Chicken

(2012/04/15, 10:06PM)
Together with my roommate, I found a great way to prepare chicken. It cooks quickly, stays moist, and is almost impossible to mess up. I made variations of it four times during the last week. Here goes:
- Take chicken breast, pat dry and cut into small strips.
- Put strips in a ziplock bag and add salt and a few table spoons of corn or tapioca starch.
- Heat a pan, add oil.
- Shake chicken strip in a strainer to get rid of excess starch.
- Fry the chicken in pan.

You can stop here and eat the delicious, juicy chicken. The starch creates a thin layer of insulation between meat and pan, so that the chicken cooks more evenly and doesn’t dry out. You can add other spices to the starch, if you are so inclined. Allspice is pretty awesome.

Or you can go in to make an orange chicken that beats the crap out of anything you get at a Chinese fast-food restaurant:

- Mix the juice from 3 limes with the same amount of orange juice and some fermented chili sauce.
- Add to the cooked chicken strips.
- Reduce until strips are just coated with a thin film.


In Other Words

(2012/03/29, 10:40PM)
While writing on my thesis and various papers, I found that there sometimes is a disconnect between my perception and what others make of the same data. I started thinking about why that is and how it could be solved. I found a simple, yet effective solution: have other people tell you their version of the story. Here is why.

One of the most important aspects of research is communicating your ideas. It does not help the world if you are brilliant but cannot convey your thoughts. It is also one of the most difficult tasks. What you want and what your average reader wants differs slightly, and while you know your needs, in the end, it is the needs of your readers you have to cover in order to convey your idea.

By the time you are ready to publish, you have spent a lot of time setting up experiments, tweaking parameters, searching related work, and collecting data points. You have devoted a sizeable part of your life to this, you know all the details, and you are very attached to the outcome. You want the world to know how much work it was, and to be able to understand it in all its complexity.

The awful truth is: most people do not care about the details of how you reached your final results. At all. They want a take-home message they can readily understand themselves and relate to others. And they should get one!

Dwelling on the details might make your paper very reproducible, but it is also a surefire way to drive away your readers. They will soon lose interest and skip the details, trying to find what they are looking for. Or stop reading altogether. If this happens, all your work was basically in vain. They won’t get your idea, and they won’t tell others about it.

So how can you meet your readers’ needs?

A solution that worked surprisingly well for me was to simply ask them. Tell your friends/colleagues the general problem, give them a few data points, and ask them what they think the paper looks like (obviously, don’t give them your version yet). You’d be surprised how much the stories can differ.

Your friends are unburdened by the details, and still able to see the forest instead of the trees. If they ask you for more information, supply it. You will learn which parts only you saw (because you spent so much time on it), and you can go back and make them clear(er).

Pay attention to how they would present your findings. What do they emphasize, what do they leave out, what is the story they spin? If they reach another conclusion, maybe you need to give them more information, or you have to re-evaluate yours. Don’t reject their outline thinking they did not understand it. If they don’t, neither will your reviewers!

If you do it with enough people, you will find things that pop up again and again, and the holes that need to be filled.

This solution is obviously not foolproof. You have to be able to let go of some parts you really liked, and you have to be able to draw attention to some important points your helpers might have skipped. It cannot guarantee you an accepted paper, but it will help you to make it more readable, and convey your idea better. Also, it’s a good way to let your friends know what you’re working on.

Science and Showmanship

(2011/10/16, 04:11PM)
German researchers have drafted a position paper in which they demand science be decelerated in order to improve its quality. Their points are (Die Zeit article 4/14/2011):
- worldwide reduction of publications to allow scientists to survey the field and ensure quality
- research needs a basic funding, yet cannot be economically evaluated like a company
- funding should be based on content, not projected success
- authors should only appear on papers if they contributed to it
- scientists have to write their own grant proposals, no agencies should do that or even correct the scientists
- experiments need to be more transparent and reproducible
- good science is only possible with long-term grants

While I agree with most of the ideas (I do think that a base funding would be a Very Good Thing for a couple of less flashy disciplines, and I do agree that science should be about substance first), I take issue with the latent notion that science is too fast, too competitive, and that presentation is overrated.

Science is all about ideas, even half-baked ideas, and, more importantly, sharing them. No major work was created by one person out of thin air, but resulted from building on what other people have done before, however small it was. If those other people had waited to publish it until they thought it was complete, it might have never seen the light of day. Or, more likely, it would have, but published by somebody else, who was not as hesitant. Of course you should wait until you are reasonably sure your results are sound, but there is a point where it turns into procrastination. If you do not publish, nobody knows you are brilliant (they also won’t know if you are clueless…)

Part of a scientist’s skill set is to navigate and assess the body of work in his or her field. There are increasingly more tools to help you achieve that. Scientists know which journals are hard to get into, and which ones will print anything as long as it has a title. Researchers will assess work also based on where it is published. Both quality and quantity matter. Someone who has had only one paper in 10 years, but in Nature or Science, is not much better than someone who has cranked out four papers a year in obscure journals over the same time span. Granted, the first guy has substance, but who tells me he could do it again? With the other one I know at least what he was up to, and that his ideas were bad. Luckily, most people will lie somewhere in the middle. So the flood of publications is actually a boon rather than a bane.
By artificially restricting the number of publications, you do not necessarily improve quality (transparency and fairness of acceptance criteria is an issue unto itself), but take away a lot of breadth and information.

And, yes, science is about presentation: if your idea is too complicated to explain, chances are it is not worth explaining anyways. Some people maintain that you should be able to explain your whole research idea during an elevator ride. A lot of the great ideas are exceedingly simple, and a lot of good papers are good because they explain their point well. A brilliant mind that cannot communicate its brilliance is of no use to the academic world, least of all to itself. The fact that the occasional showman gets a grant although his ideas are not very deep should not stop us from rewarding good presentations!

You might not like it, but I am sure that fast, competitive, and presentable science improves our general knowledge and understanding of the world. Artificial boundaries and regulations do not. The times when researchers could sit in their study and worry about one thing for years are gone. Now, you have to go out and present it, for money, for visibility, and ultimately also for the advancement of science.

One wish

(2011/04/26, 06:12PM)
One thing I wish I were, apart from brilliant, is fascinated by boring things.
Think about it: it would have so many advantages. Like that linear algebra class you had in high school when you could barely stay awake, and now you try to remember how to invert a matrix. Or the list of all the resources that everybody on your project agrees would be really useful to have, but nobody wants to actually sit down and compile it, because the thought alone makes half your brain fall asleep.

If you were excited by all of that, you could get a lot of good work done. On the downside, you might also become the go-to guy for everyone with a boring task. Hey, you can’t have everything! At least you wouldn’t be bored…

Language change

(2011/04/15, 12:53PM)
After some deliberation, I decided to write my future posts in English only, to speed up my blogging frequency.

Translation took up too much time, and in some cases prevented me from posting at all. I will catch up on these posts now.

Since my German readers have excellent English skills, this solution leaves nobody out.

You can expect more posts in the future.

Remembering the Dead, 1

(2010/11/20, 05:44PM)
My grandmother could never throw anything away. Occasionally, my mom and her sisters would clean out the pantry, and my cousins, siblings, and I would stand by and bet on the oldest item. A ten-year-old ketchup bottle? A pack of custard powder several years over its due date? Or maybe something that had to be bought with food stamps?
My grandmother had a very different approach to food than we do. She raised six kids on a budget, during and after the war. If something grew moldy, she would cut out the soiled parts and declare the rest perfectly edible, and, in fact, eat it. I don’t think she ever got sick. She told us that dirt made you healthy, and my mother related to me that as kids they believed one pound was the acceptable amount of dirt per year.
My grandmother had learned cooking professionally, and if I say professionally, I mean efficient, not fancy. For our family dinners, she cooked for an army, and her meatloaves and patties are legendary. My mom’s were good, but these were heavenly, probably because my grandma believed that more fat was good in anything. She was a round woman with rosy cheeks, and in my memory, she always wore a flowery dress and a grey wig. She would roll down hills with us kids and could pop out her dentures, which we thought was the coolest thing ever. If we were grumpy, she would give us “laughing pills”, what other people called Mentos.
Her two worst memories were the night they bombed the neighboring city, when the horses screamed in their stables (“Have you ever heard horses scream”, she would say with a shudder), and the Polish soldier who stole her strawberries. When she grew old, she was cared for by a Polish nurse, and I think Poland was able to redeem itself in her opinion. After the death of my grandfather, she got very sad and only wanted to be reunited with him.
She died during the night, with one eye open and a surprised look on her face.
I often think of her when I cook, and what she would have done. I collect and filter the grease whenever I fry bacon. I buy cheap cuts of meat, like shoulder or chicken hearts. I think fat makes everything better. When I cook, I cook for an army.

New York I love you, but you’re bringing me down

(2009/12/15, 10:15PM)
Like millions before me, I arrived in New York with a sense of wonder. I had been reading on the train ride into Penn Station, so my first glimpse of the city was when I stepped out onto Seventh Avenue. It was a clear midwinter afternoon, and the low sun illuminated the tips of the skyscrapers and filled the streets with a soft light. I was immediately captivated. I had been meaning to come here for a long time, and finally I had made it.
Unlike millions before me, however, I had not come from distant shores to build a living here. I was only visiting for an afternoon from New Jersey.

My first action was to find a Starbucks, something I imagined to be a little easier in New York. Eventually, I succeeded, got a coffee, and left through the backdoor into some sort of mall. Uniformed pages showed people around, a group of women was taking pictures of the ceiling. I started to wonder…
Only when stepping out onto Fifth Avenue and glancing up the facade did I realize that I had just unknowingly visited the Empire State Building. They should put up signs…

New York is very different from LA. There are no Empire State Buildings, for a start. Also, New York is ruled by pedestrians. As soon as the cars slow down, they start crossing the street, no matter whether it’s red. If you do that in LA, you’ll get fined for jaywalking. Here, you only get fined for not walking.
Nobody asked how I was doing or wished me a good day, and I adapted quickly to the environment. I rolled my eyes at people slowing down, I snarled at people stopping to watch the shop windows. I’m sure New Yorkers have their bad reputation because of visiting Angelenos who enjoy being rude for a day.

Ok, not true, I did neither of the above. Without looking at anyone, I just walked at a quick pace up the street, headed for Central Park before it got dark. I saw Macy’s windows and the Rockefeller Center Christmas tree. I stopped at a cathedral that stood in stark and comical contrast to the high-rises around it, seeking some rest from the bustling streets, but it was just as busy inside as outside. I strolled through the beginning of Central Park and got a bad espresso and hot chocolate at a hip cafe. Eventually, I swam anonymously through the crowds down Broadway, washed into Penn Station, and boarded the next train back to New Jersey. A state that is much nicer than its bad reputation. I’m almost sure it’s due to visiting Angelenos…

Belated Birthday Wishes

(2009/12/02, 08:44PM)
Sometimes, you see things only from a distance. Birthdays, for example (whoever had their birthday lately knows that I need a little longer). When the Wall fell, I was eight. At that time it seemed nothing special. I only remember that everyone was very excited and watched a lot of TV. This was rare (the watching television), and therefore had to mark something special. I was told that there had been a border in Germany, which was now gone. That did not impress me much. When we went on holiday in France, there was a border, too, and that was always a lot of fun. Also the fact that they spoke German on the other side of the border I found little remarkable. I had an aunt in Austria, and there was a frontier, too, and on the other side they spoke German. At eight, geopolitics is still rather simple…

Only when I look back now, things seem more remarkable. And more complex. Germany and Europe are what they are, last but not least because of those days in the fall, when my family watched a lot of TV, and everything that ensued. And not, as some here would have you believe, because a senile ex-actor proclaimed “Tear down that wall” (even less because of a third-rate star with fake chest hair humming a silly little song about freedom. But then, nobody but him believes that anyways). And it is good the way it is. At least, from a distance, it does not look half as bad as one would have it at home.

Sometimes you only miss things from a distance. Things that you have not previously noticed, or found ridiculous. Many little things: long train rides through wooded hills, deli-meat-specialist saleswomen, autumn fires, bakeries, Feierabend beer, shop talk about football, the deep rooted belief that everything in this world can be solved efficiently in a very specific way.
But of course, mainly the loved ones you left behind, in that reunified country on the other side of the globe.

And before you know it, you find yourself sentimentally murmuring what Hoffmann von Fallersleben wrote down nearly 170 years ago, Einigkeit und Recht und Freiheit… Happy Birthday, Germany! Be well, and stay as you are. You are ok the way you are.

Like I said, I always need a little longer for birthdays…

Food Nerd

(2009/09/24, 01:00AM)
For some time now, I have been a subscriber to the magazine “Cook’s Illustrated”. A friend described it as “food porn”, but the focus is less on sensual pleasure than on scientific analysis (you know that a food magazine is serious if they print in two columns, and only in B/W).
Recipes are varied ingredient by ingredient, tested and the outcome reported, until you get the best possible result. In addition, you get tricks and tips and kitchen tool reviews.

My absolute favorite by now is a recipe for ricotta. Super easy, fast, and very delicious! And a worthy replacement for the unobtainable Quark. Here’s my version (original in Cook’s Illustrated Oct 2009):

Heat a gallon of whole milk with a tablespoon of salt to 185°F, or until the surface ripples slightly. Take off the flame and add 1/3 cup of lemon juice. Let stand for 5 minutes. If the consistency is not curdly enough, add another tablespoon of lemon juice. Repeat until there are no more changes in the consistency. Skim off the mass with a strainer and put it in a colander lined with a kitchen towel. Put over a bowl and leave overnight in the fridge.

The next morning, you have fresh ricotta, somewhere between cream and cottage cheese. Tested in pasta and with honey for dessert.

Guten Appetit!


(2009/09/06, 03:47PM)
And then, suddenly, it is summer. The days are only a little bit hotter, yet the nights are warm and mediterranean. If Angelenos sat outside, this is when they’d do it. Yet people are fleeing LA for the long weekend, clearing the freeways, leaving the city behind.

Fires rage on the hills encircling it. At night, you can see their orange glow on the slopes. By day, an unwavering pillar of smoke marks their position. It mingles with the smog of downtown and tints the sunsets pink and orange. It rains soot over the city and can be smelled as far as the coast. And it greets the people coming back into LA as they fly through it: You might leave for a weekend, but the city is still here. With its fires.

And so is summer…

Science vs Engineering

(2009/08/20, 06:18PM)
There are two kinds of researchers: scientists and engineers. Faced with a problem, the scientist will say “How interesting” and proceed to abstract and classify it, develop experiments to reproduce it, and come up with a theory to understand it.
Faced with a problem, the engineer will say “How can we solve this” and proceed to measure and discretize it, build a model, refine and rework it, until it solves the problem.
Thanks to science, our understanding of the problem has increased. Thanks to engineering, we have one problem less. Ideally, these two disciplines should work hand in hand. Scientists analyze the problem to help understand it, with an eye on possible solutions. Then engineers use that knowledge to solve the problem more efficiently. And indeed, scientists often try to sound more engineering, and engineers more sciency. But that is mostly wishful thinking.

In reality, the two camps know little and think even less of one another. Scientists easily get absorbed and sidetracked by fascinating details, producing knowledge for the sake of knowing. Engineers get just as fascinated, yet with task-specific details, tailoring their solutions so exactly to the problem at hand that they have to start almost from scratch when faced with a similar one.
“What do I care why it works, as long as it does”, says the engineer. And more often than not, it involves some hack to get there.
“What do I care how it works, as long as we learn something”, says the scientist. Yet sometimes, even that is not guaranteed. The main difference between the two is the nature of the outcome. Engineering sells better, science―not so much.

Linguistics is clearly a science (I could not think of any product linguistics has fostered). Computer science, despite the name, is mostly engineering (unless it’s theoretical CS). Somewhere in the middle, there’s Computational Linguistics, and there, waving, is little me…
In this ambivalent field, one seems to have to choose a side. Being a linguist by training and nature, I am primed on sciences. Yet getting a PhD in CS often incurs being asked to solve engineering problems which have a linguistic component. It’s like trying to make me an engineer with a knack for language. It took me the better half of a year to realize that that is not who I am: I want to use the toolkit of CS to understand linguistic problems. A scientist with a knack for engineering, maybe. A fine distinction, yet an important one. At least for the person who makes it… But what does it matter, as long as we can bridge the gap between the two!

Thanks to engineering, this is the first post I wrote in English and then translated back into German. And thanks to science, I was able to know which parts I should correct. And why…

Bratwurst mit Sauerkraut

(2009/08/12, 09:59PM)
I love eating! And I love cooking. Perhaps even more than eating. Whoever is surprised by this does not know me well. Cooking for me is far more exciting than eating the final product. Perhaps because it has no more secrets. The interesting part is to figure out if everything went smoothly, as I had hoped for. And rather than for me alone, I cook for others―the more the better. Eating is the most direct way to show someone you like them, and I haven’t yet met anyone who does not appreciate that.

Food is―besides language―one of the most salient features of a culture. But more often than not, the most famous dish is not everyday food. When it comes to national cuisine, we all go back to stereotypes: as a German, of course, I love beer, miss bread with crust, and would die for bratwurst with sauerkraut.
Darkly, people prophesied before my arrival in LA that I would be a wobbling lardass in no time, thanks to a diet of hamburgers, French fries and donuts. What else do Americans eat…?

But wait! Americans love good food. They celebrate it! There is a TV station which broadcasts nothing else, there are countless journals, websites and local groups, sharing insider tips, recipes and restaurants. And California is especially ideal for this. You get everything fresh: meat, fish, vegetables, fruit―the producers are only a few minutes drive away. Accordingly, there are Farmer’s Markets in each district, and the supermarkets carry everything your heart desires. And at any time.
Each immigrant group has brought their own recipes, adjusted them to the local produce, and all peek into their neighbor’s pots. Americans are new to the international market of cooking traditions, but they have no inhibitions, they’ll try everything. And without the ballast of tradition, they pick from each recipe the best and continue from there on.

Californian wine can easily keep up with Bordeauxs, but costs only half as much. And, it has to be said: American beer is excellent! I have found a lot of great beers, often by small local breweries, which are not exported. What you get in Germany is what nobody here drinks. (And no, German beer is not automatically the best in the world. Purity law or not: some insipid brews I have chugged out of patriotism were no advertisement for the German brewing tradition).

Downtown, there is a restaurant with the beautiful name “Wurstküche”, and from Knack to Bratwurst and rattlesnake-rabbit links, they serve everything you can cram into guts.
Only the bread here still needs practice. No crust and fuzzy consistency: this can pass as “bread-like pastry” at the most. I’m doing my part to educate and provide home-baked goodies for my colleagues. Multigrain with spices, coriander with apricot and pistachio, or chocolate with cranberries: I have adjusted my palette to the local palate, spread the recipes and wait for it to bear fruit. The reactions are unanimously positive. The only setback: The crust was too crusty, I was told. But there I am not willing to make compromises! As I said, we still practice …

The problem with German stereotypes, actually, is that although they are not representative, they all apply to me… Bread, sausage and beer? Bring it on!
My eaten Donuts, on the other hand, I can count on a few fingers, for the Hamburgers I need two hands. In fact, I have lost 10lbs within 3 months, simply by eating more consciously (and a little stress). Even in America there is salad…

Movie World

(2009/06/29, 11:08PM)
Apparently, I just live down the road from the bar that was the inspiration for Moe’s tavern in the Simpsons. I will check that out…
Can I talk to Mr. Freely? First name I.P…

Insights of a travelling salesman

(2009/04/28, 10:51PM)
When I arrived here, my attitude was the same as every European’s fresh off the boat―a feeling of calmly assumed cultural superiority. The same kind of feeling you have towards high school kids talking about poetry, or the new guy in your office droning on about workflow. Experience, time, let’s face it: History is on your side! Surely, coming from a continent that has such a diverse culture, such a cornucopia of wars, famines, great thinkers and glorious artists makes you a more sophisticated human being than these youngsters? I laughed at “historical buildings” that are barely 200 years old, I chuckled at the subject of American history.

But working in America is a humbling experience. People get up early and go to bed late. There is no sentimentality lost, work has to be done, no matter what. The only thing you are judged by is the impact of your work. And you realize: All your sophistication, culture and history buy you nothing! The people who came here more often than not did so because they wanted to leave their old lives behind. Together with the history, the culture, and the prejudices. Coming here was making a clean slate: your religious beliefs, your philosophical views were secondary to your ability to make a life. History was what you made for yourself. You can knowledgeably talk about medieval poetry, Romanticism and dialectic? Good for you! Now, concerning that deadline…

And then, what kind of history would a German and a Chinese immigrant have to share? The first point in time they both could relate to was the time they arrived here. What each of them had thought of as historical facts was just an interesting story from another place and time to the other. Even if you did not want to leave the past behind, it was something you shared with a much smaller group, something private and reserved for special occasions. So while many Americans treasure their heritage and take interest in the countries of their ancestors, they do so in their spare time. History is something that happened in the past, but we are living now!
And just in case it becomes history some day, we better do a good job in the meantime…

Picture this…

(2009/04/12, 09:36PM)
Picture the late morning sun over LA, lazily shining through the open front door of a single storey house. As it crosses the threshold, it passes waves of Tango music, floating out into the Easter Sunday. As it hits the battered hardwood floor, legs move swiftly through the beam, wearing high heels and dancing shoes. On a table in the kitchen wait banana pancakes, tamales, fruit, and orange juice for the dancers.
What a perfect way of celebrating Easter…

After rain comes sunshine

(2009/03/02, 10:41PM)
“You write so infrequently”, I hear quite often. “I have too much to do,” I then reply. That is true, but there is of course more. Everyone likes to hear good news, or at least exciting ones. And in recent weeks, there was not much of either of them. Not only the global economy, also my life showed signs of unhappy development. And the Californian sun was behind thick clouds.

Back from Germany, I got really aware of how far away I am. Not only geographically, also mentally. My attempt to tell people here about Germany showed me how much I’ve taken for granted, how little things I questioned or had consciously perceived. How does the German insurance system work, what exactly does the Bundespräsident do, why are there Haupt and Real schools (and how do they translate), and where is the difference between the Bundesrat and Bundestag?

At the same time I am still a foreigner here: I know no American lullabies, was spared from high school hell, and so far I have never thought about my credit report. I felt like sitting between all chairs. I even thought my English was deteriorating.

In addition, my research inched forward slowly, I long puzzled over my schedule, and there were some things in my private life I had to set straight. Nothing great, but in sum unnerving. My morale was struck. To make matters worse, in February, I had an accident. No physical consequences, but the car was out, and that in LA is synonymous with disaster. The low point was reached.

Perhaps a good thing, because from there on it went uphill. After a shock, you can see things more clearly: I have reorganized my week, redefined my targets, and concentrated on fewer things, but with more energy. And that did it. I am happier and more content than before, university and job are fun again, and the thing with the car was also taken care of. The insurance was helpful and friendly, many people have given me advice and practical help, and since yesterday I am the glad owner of a bright red Nissan Versa. I am still a foreigner, but I am not the only one. And it can be quite charming, too. Just what the deal with Hauptschulen is escapes me still…

So, this is it! After every rain there’s sunshine, even and especially in California…

Have a nice Vorurteil!

(2009/01/31, 08:37PM)
One of the most frequently encountered prejudices against America in Germany is the superficiality: In America, nobody is really friendly, that is just superficial. Service in restaurants lasts only until the check is brought. Admittedly, the greetings in America are much more cordial than in Germany, and still the phrases are less committing. Nobody is really interested in “how I am” (I started relating at length how I felt once and was met with blank stares). Is that necessarily worse, though? I had to un-train myself wishing random people a nice day when I was in Germany, just to avoid being eyed with suspicion.

I don’t really know what people in Germany expect: If I meet someone in an elevator, I don’t want to share their most intimate thoughts, a “How is it going” is sufficient. It does not hurt to say something friendly, and it is always nice to hear it. If I enter a restaurant, I do not intend to make friends for life, I just want to be served promptly and correctly. Maybe with a smile, why not?

And though people look down on the American attitude, nobody is really fond of the “emotionally authentic” German service. Maybe waiters there are more authentic, yet if I have to wait 20min for some sourpuss to bring me the espresso I ordered twice, which then consistently does not turn up on the check, I have to say that I don’t give a damn about emotional depth and authenticity! After all, German waiters do not even have to worry about the expected tip being subtracted from (already minimum) wages, so it should be much easier for them to smile sometimes. The friendliness of American service personnel might be partially due to the fear of losing one’s job, but what difference does the motivation for friendly service ultimately make for the customer? I do not need to question someone’s psychological motivation if he treats me nicely. And just for the sake of completeness it should be mentioned here that Americans are indeed capable of genuine friendliness and helpfulness…

So as long as it only concerns everyday encounters and not interpersonal relations, I am all for a little bit more superficiality!
Have a nice day!

See Europe in two weeks…

(2009/01/04, 11:38AM)
Europeans often make fun of the American tendency to see Europe in five days (“If it is Tuesday, this must be Paris”). Of course you cannot fully appreciate the complexity and nuances of a country in so short a time, but by now I know why one would still want to do it. And Europeans usually have more than two weeks’ worth of holidays per year…
If you have only a restricted time, you try to cram as much into it as possible―if you like to travel, countries, if you like people, meetings.

Over the last two weeks I have had a meeting marathon which sometimes got me quite dizzy (“If I meet X today, it must be Tuesday”), and still I did not manage to see everyone I wanted. A fortnight is far too short a time for all the wonderful people you meet over almost three decades…

This always leaves the feeling, though, of just scratching the surface and not doing everyone justice. You cannot fully appreciate the complexity and nuances of a person in so short a time, and still you try. Because even a brief meeting is better than none. And you take home so much more than Facebook or Skype could ever tell you.

To those I met: Thanks for the time with you, it was great to see you again! And to those I did not manage to meet: Please don’t hold it against me, it was not on purpose. Just on a tight schedule…
In any case I would be stoked to see you in LA at one point. Maybe if you are on a US trip? In that case, why not make it a Tuesday?

Speak in tongues

(2008/11/27, 05:08PM)
Language is one of the things you always take with you, no matter where, when, or what your baggage restrictions are. Our phonological system is hard to fool (or, as Christoph says “phonology always works”). If you don’t believe it, try for one hour to exchange all Fs and Ks. Should you succeed, you are a genius. If not, you’ll have a lot of fun.
This leaves us with the realisation, though, that we will always be spotted as foreigners: German final devoicing and the whole trouble with “th” and “wh” are dead giveaways, and my “vowels are not American”, as a friend pointed out.

Language is not only grammar and vocabulary, but also pronunciation subtleties: American “sh”s have less friction than German ones, less rounding, and the “a”s in “aber” and “garden” are absolutely not the same. Over the past few months, I have been identified as South African, British, or, well: German, but nobody ever seriously considered me to be American.
Even as a linguist you cannot beat your own system: The brain happily abstracts and throws everything into neat categories. Don’t bother it with details…
I have tried to pick up a few Chinese phrases, but my Chinese friends have either smiled politely or sadly shaken their heads. Even when I thought that I had repeated everything I heard, I hadn’t, since Chinese not only uses sounds, but also tones, and if you are not trained, you frankly don’t hear them…
It is only slightly consoling that on the other side, foreigners never get a German “ch” right.

It gets even worse when we get to meaning: Subconsciously, one builds up a fine-grained taxonomy of meaning nuances: Language is like a well-worn rapier, which can pinpoint a meaning and win an argument.
Only when you start arguing in another language do you realize you are suddenly handling a club. Sure, you can hit at the general meaning area, and you can win an argument, provided you hit first and hard. Yet it has no elegance or style, and too often, one is left searching for the right words to express a thought.

There are words, though, that I would like to have in both languages: “random” is such a word. I know that it can be translated as “zufällig”, but that does not cut it in a sentence like “That comment was so random!” And why does English not have an equivalent for “doch”? “Yes, it *is*” is clumsy, and does not give you the satisfaction of proving someone wrong with just one word (I was also told that “jein” should be introduced, being an indecisive mix of “yes” and “no”).

One of the biggest obstacles in learning English is the fact that over the years, it has acquired Scandinavian, Germanic and Romance influences and mixed them all up. There is irregular inflection, yet not consistently: goose-geese and foot-feet, yet not moose-meese or wood-weed. I try to advocate the innovative use of “one shoop, two sheep”, yet people seem reluctant to take it on.
The only way out of this seems to me founding my own language which incorporates all these wonderful concepts. As a result, nobody will understand what I am saying any more, but I guess that is the price you have to pay if you want to express yourself clearly…

Shake it, baby

(2008/11/14, 12:55PM)
Yesterday, we had an earthquake drill. At 10:00am, we were supposed to “drop, cover, and hold on” (does “duck and cover” sound familiar?). Since I am all for this kind of prevention, I participated. Apparently, I was about the only person who did… After five rather dull minutes (no people playing wounded, no paramedic coming to check on me), a rather puzzled colleague inquired what I was doing and whether I was ok. Apart from the fact that the space under my desk is rather dark and claustrophobic, I was, and I would have been in case of a real earthquake. That is, unless the building collapsed. In that case, you should sit next to your desk and hope that the falling debris forms a cave around it.
Given that I work on the fourth floor of a 12 storey building, I am not convinced that would help much…
It might be better to be in an elevator, since those swing freely in a concrete shaft inside the building (outer walls are the most dangerous ones). Unless the cable snaps or a fire breaks out.
I guess you just have to hope the architects did their best and you manage to stay away from the windows and outer walls when an earthquake hits. And if “drop, cover and hold on” helps―so be it.
I’d still rather not try it…

News from Behind the Mirror’s Glass

(2008/11/08, 05:35PM)
I am happy to announce that the little spider living behind my left rearview mirror―despite a major car wash―continues to weave her net every day.
In case you were worried…

It’s over

(2008/11/04, 09:36PM)
As a friend said: “I only believe when Fox announces Obama as president”. Sen. McCain just acknowledged Obama’s victory in a fair and moving speech―and Fox confirmed it…
That’s it: America has a new president!
Probably not all will be well now, but hopefully many things a lot better. Thanks for voting, America. I am a happy alien…

A classic

(2008/11/01, 04:27PM)
Since the first mention of my car here, it has not been washed. As LA is a very dusty city, however, it was about time to change that. I thus welcomed the fact that a sorority of the local college held a car wash in my street to raise money for a school for blind children. Getting my car washed for a small fee for a good cause―that seemed like a pretty good deal to me.

Probably a bit too good to be true. For about two hours, I was the happy owner of a shiny car. Then, the long awaited rain set in and washed all the dirt out of the air―and onto my car.

In retaliation, I went and had my hair cut―that was also long overdue, yet it won’t be ruined by the next downpour…
That’ll show the weather…


(2008/10/25, 07:24PM)
Behind the left rear mirror of my car lives a little spider, which comes out every night and weaves her net between the door and the mirror. And every morning when I get into my car, I remove the net.
It is nothing personal from my side, and in secret, I even admire her persistence, but it has become some sort of ritual.
Maybe there is something I can learn from it: Get out every morning and weave your web…
Or probably I just have to start to wash my car properly.

At least…

(2008/10/17, 04:38PM)
The last few weeks have been pretty intense (workload and stress level-wise). The next weeks will see more of that, but at least I am better prepared now.

Also, I found out that I can see the Hollywood sign from my office window.
Well, that’s something…

The town that wasn’t there

(2008/09/18, 10:38PM)
Ok, so I am in LA: Everything is loud, big, and exciting. I finally have a room and all the accompanying bits and pieces, and life goes its way. Besides all those organizational matters, I at least did not have to worry about getting to know the city. Which is an advantage. Not a very individual one, granted. We all know it. We all have been here. At least if we had a television…

The A Team, Baywatch, Beverly Hills 90210, and countless other series portray LA, even without stating that explicitly. Hollywood is not only a place, but a trademark and―to some―a way of life.
Musicians from Cypress Hill over Presidents of the USA to Sheryl Crow or System of a Down sing about the city, telling tales about the hard life in South Central or the bars in Downtown, and A Tribe Called Quest in an older song bemoaned the loss of their wallet in El Segundo.

Yes, LA is exactly as we always imagined it to be. We have all been here, we know how it looks. Unlike Bielefeld, it exists; it has a place in our heads. That also means, however, that everyone has an exact picture of how I live. These pictures differ somewhat, but an astounding number of them involve pools, stars, yet also thugs and shopping carts.

My reality is somewhat different: Sure, everything is as seen in the movies, all the places do exist. There are thugs, there are stars, and there are ghettos and glamour side by side. Reality, however, is less glamorous: I go to university in South Central, El Segundo is just a 15min bike ride away, Beverly Hills and I are separated by just a few numbers in our ZIP code, and Hollywood is a dingy quarter full of tourists, hookers, and souvenir shops. I live in two cities…

Up close, the city loses much of its screen character and becomes something else: A tangle of streets and houses, of beach and highways. It is many small cities in one, all full of interesting, busy, and mostly nice people from all over the world. It is a myth that keeps re-inventing itself, an unsentimental giant with a disposition for drama. A contradiction in itself.

No, this LA is certainly not a pretty city, yet an exciting and―mostly―a friendly one.
And currently my home.

The Art of Self-Contradiction

(2008/09/08, 09:49PM)
Currently reading a book which states that you should not repeat yourself.
To make sure I got that, they say it several times…

My house, my car, my…

(2008/08/24, 05:22PM)
“When I arrived, all I had were two bags full of stuff, and now I am the owner of…”

We all know the stories starting out like that. They are part of the topos America, just as the rich uncle and the dish washer turned millionaire.
In my case, however, the story is not that compelling yet―I have neither founded a global trading empire nor bought a villa with celebrity neighbours in Beverly Hills. In fact, I do not even have a room yet.
Yet my possessions have grown considerably since the notorious two bags.

Since yesterday, I own a new iPhone and―just borrowed, though―an old car. What the car lacks in glamour is made up for by the phone: It is chic, shiny, easy to use out of the box―and devours battery like there is no tomorrow.
The car, in contrast, is rather energy efficient, an old Honda that does not need much.

I hate to admit it, yet with both I am pretty much in line with the current L.A. trend: While a year ago the streets were full of cars that consumed more fuel on starting than a weekend trip through Europe did, fuel efficiency is now the new cool. And with the radical consistency that characterizes Americans and especially Californians, they have changed the look of their streets.
I do not dare to tell people that they still pay half the price they would in Germany―if they set their minds on energy conscious living, who am I to stop them? My car will play its part, and my conscience can be at peace. Yet is my phone a green one?

So, having entered the world of the propertied with all the entailing moral consequences, all I lack now is a room to put all that stuff into. And a moral guideline to buy stuff…

If a bit under the weather…

(2008/08/22, 02:46AM)
One of the questions posed most frequently to me over the last few days was “How’s the weather?” Quite understandable if you are from a country that―due to changing meteorological conditions―has something like weather.

Here in L.A., however, the weather forecast is about as interesting as election results from a totalitarian regime: You know beforehand what it’s going to be. In this case: 80°F, plus or minus 5, a little cooler in the mornings and evenings.

That may sound great, yet after a few days you get used to it. You also find out quickly that the clouds you see in the morning will have disappeared by noon.
As exciting as the city is otherwise, the weather is rather bland.

At least this has the advantage that you can spend more time worrying about other things than umbrellas and warm socks. For example forms. But that is another story. One you just have to weather…


(2008/08/16, 04:00PM)
Well, I knew that das passieren würde, but when it strikes, you feel immer a little lost. After zwei oder three Tagen, your Gehirn does not know genau, which language to take. Es ist no longer German, aber it ist not yet Englisch. You concentrate on eine Sprache, but you keep switching back und vor.
It does nicht mal help if your host understands beide Sprachen, because you will not know, welchen Ausdruck man what zuordnen muss. I feel wie ein Aphasiker, der things zwar beschreiben can, yet is unable to recollect the Namen. It is schlimm genug, einen expression in the foreign language nicht zu wissen, yet I forgot them in either der beiden Sprachen!
At least this will sort sich out after a few days, aber until then I will re-enact the babylonian Sprachgewirr in meinem head…


(2008/08/14, 07:15PM)
After a lot of toil and trouble, much waiting and even more forms, I have finally arrived in Los Angeles. Let’s start…


(2008/08/14, 04:28PM)
Two bags. That’s it. Two bags, max. 32kg each; the airline does not allow more. Two bags to take everything with you. Not only the stuff you need, but also everything that separates travelling from living.
It’s strange to confine your former life to a certain weight and some baggage dimensions, but what can you do...

