I like the approach of the authors of this paper. They first automatically identify phrases that occur most often in Republican speeches and least often in Democratic speeches: phrases like "Boy Scouts," "War on Terror," and "Death Tax." Then they do the same for Democratic speeches, finding phrases like "Trade Agreements," "American People," and "Tax Breaks." Finally, they compare how frequently newspapers use these phrases.
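The idea can be sketched in a few lines. This is a toy version only: the speeches and the scoring rule (a raw frequency difference) are my own stand-ins, not the paper's actual selection statistic, and the function names are hypothetical.

```python
from collections import Counter

def bigrams(text):
    # Split a speech into two-word phrases, lowercased
    words = text.lower().split()
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]

def partisan_phrases(rep_speeches, dem_speeches, top_n=3):
    # Count bigram frequencies within each party's speeches
    rep = Counter(p for s in rep_speeches for p in bigrams(s))
    dem = Counter(p for s in dem_speeches for p in bigrams(s))
    # Score each phrase by how lopsided its usage is between parties
    # (positive = Republican-leaning, negative = Democratic-leaning)
    phrases = set(rep) | set(dem)
    score = {p: rep[p] - dem[p] for p in phrases}
    rep_top = sorted(phrases, key=lambda p: -score[p])[:top_n]
    dem_top = sorted(phrases, key=lambda p: score[p])[:top_n]
    return rep_top, dem_top

def slant(article, rep_top, dem_top):
    # Net count of partisan phrases appearing in a newspaper article
    grams = bigrams(article)
    return sum(g in rep_top for g in grams) - sum(g in dem_top for g in grams)
```

Given a pile of labeled speeches, `partisan_phrases` would surface phrases like those above, and `slant` would place a newspaper on the left-right axis by which party's vocabulary it echoes.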
They "find that the New York Times, Los Angeles Times, and Washington Post are similar to one another and to a fairly liberal congressperson; find that USA Today is somewhat closer to the center than these papers... the Washington Times is significantly to the right of the other newspapers they consider... we identify the Wall Street Journal as fairly right-leaning."
These seem like reasonable results to me.
When I talk about "machine learning," this is the kind of "understanding" that is going on: an ability to surface statistical truths. Obviously, in this case the machine understood nothing about the phrases beyond their degree of political slant. But one could imagine a much more ambitious project to understand the connotations of English words in general. If a machine can pull the same connotations out of a sentence that a human would, while also representing its literal meaning, what is missing from understanding? Only the subjective experience, I think.