Quite often I find myself afraid of the future. But along with the fear comes some self-criticism: I feel like a primate that refuses, no matter what, to leave its comfort zone. Well then, read the text below and judge whether or not the fear is unfounded:
… scientists are finding—to their surprise—that useful information can actually be mined from the tedium of the blogosphere.
Andrew Gordon and his colleagues at the University of Southern California’s Institute for Creative Technologies in Los Angeles have been trying to teach computers about cause and effect. Computers are not good at dealing with causality. They can identify particular events but working out relationships is more difficult. This is particularly true when it comes to using computers to analyse the human experience.
But it turns out that computers can learn a lot about causality by reading personal blogs. Of the million or so blog entries that are written in English every day, most are comments on news, plans for activities, or personal thoughts about life. Roughly 5% are narratives telling stories about events that have recently happened to the author.
The Economist's article explains how this data mining works:
To enable their computer system to learn from blogs, the team followed a two-step process. The first step was for humans to flag thousands of blog entries as either “story” or “not story”. People use different words with different frequencies when they are telling stories, as compared with other forms of discourse. By tallying up the frequencies of parts of speech such as pronouns (I, she, we) and past-tense verbs (went, said, thought) in these flagged blogs, it is possible to distinguish between the two types—regardless of what the story is actually about, says Dr Gordon. His computer system could then look at other blog entries and work out whether they were narrative or not.
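To make that first step concrete, here is a minimal sketch of a "story" versus "not story" classifier. It is not Dr Gordon's system: the word lists, the crude "-ed" rule for past tense, the four flagged sample entries and the nearest-centroid rule are all assumptions made for illustration. A real system would presumably use a proper part-of-speech tagger and thousands of flagged entries.

```python
import re

# Hypothetical, tiny word lists; a real system would use a part-of-speech tagger.
PRONOUNS = {"i", "me", "my", "we", "us", "our", "he", "she", "him", "her", "they", "them"}
IRREGULAR_PAST = {"went", "said", "thought", "was", "were", "had", "did", "saw", "got", "came"}


def features(text):
    """Return (pronoun frequency, past-tense frequency) for a blog entry."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return (0.0, 0.0)
    pronouns = sum(w in PRONOUNS for w in words)
    # Crude assumption: a word ending in "ed" counts as a past-tense verb.
    past = sum(w in IRREGULAR_PAST or (w.endswith("ed") and len(w) > 3) for w in words)
    return (pronouns / len(words), past / len(words))


def train(flagged):
    """flagged: list of (text, is_story) pairs. Returns the mean feature vector per class."""
    sums = {True: [0.0, 0.0, 0], False: [0.0, 0.0, 0]}
    for text, label in flagged:
        p, v = features(text)
        sums[label][0] += p
        sums[label][1] += v
        sums[label][2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def is_story(text, centroids):
    """Classify by whichever class centroid is closer in feature space."""
    p, v = features(text)

    def dist(c):
        return (p - c[0]) ** 2 + (v - c[1]) ** 2

    return dist(centroids[True]) < dist(centroids[False])


# Made-up flagged entries standing in for the human-labelled blogs.
FLAGGED = [
    ("I went to the store and she said we were late.", True),
    ("Yesterday we walked home and I thought about it.", True),
    ("The new budget is a bad idea for the economy.", False),
    ("Plans for next week include a hike and a movie.", False),
]

centroids = train(FLAGGED)
print(is_story("I missed the bus so she picked me up and we talked.", centroids))
print(is_story("The government should invest more in public transport.", centroids))
```

Note that the classifier never looks at what the entry is about, only at how often pronouns and past-tense forms show up, which is the point the article makes.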
The second step was to teach the system to identify causal connections. Here the team used much the same technique. Dr Gordon and his students read thousands of random blog entries and specifically pointed out phrasing associated with causal relationships (such as “I did X so then Y happened”) for the computer to pick up on. Identifying such phrases in blog entries then enables the computer to pick out and categorise those sentences that contain a cause and an effect, such as “I slammed on the brakes but ended up smashing into the car in front of me” or “The doctor scolded me for eating too much fat and risking a heart condition.”
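The second step can be sketched the same way. The article does not list the phrasings the team annotated, so the two connective patterns below ("so" and "because") are assumptions, and a hand-written regular expression is a stand-in for whatever the system actually learned from the annotated entries.

```python
import re

# Hypothetical connective patterns, chosen only for illustration.
CAUSAL_PATTERNS = [
    # "<cause> so (then) <effect>"
    (re.compile(r"^(?P<cause>.+?),?\s+so(?:\s+then)?\s+(?P<effect>.+)$", re.I), "so"),
    # "<effect> because <cause>"
    (re.compile(r"^(?P<effect>.+?)\s+because\s+(?P<cause>.+)$", re.I), "because"),
]


def extract_causal(sentence):
    """Return a dict with cause, effect and marker, or None if no pattern matches."""
    s = sentence.strip().rstrip(".!?")
    for pattern, marker in CAUSAL_PATTERNS:
        m = pattern.match(s)
        if m:
            return {"cause": m.group("cause"), "effect": m.group("effect"), "marker": marker}
    return None


print(extract_causal("I did X so then Y happened"))
print(extract_causal("We were late because the train broke down."))
print(extract_causal("The weather is nice today."))
```

Sentences like the "slammed on the brakes" example in the article would need more patterns than these two, which is presumably why the team annotated thousands of entries by hand.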
Nothing much so far, right? But this is where the future starts to take on frightening and, I dare say, totalitarian features:
The idea is that this will eventually lead to a system that can gather aggregated statistics on a day-by-day basis about the personal lives of large populations—information that would be impossible to garner from any other source. Ultimately, Dr Gordon expects the analysis of personal stories in weblogs to be used much like Google’s flu tracker, but on a much grander scale. Google’s flu-tracking scheme can detect early signs of influenza outbreaks by mining search data for flurries of flu-related search terms in a particular region.
The web could be mined to track information about emerging trends and behaviours, covering everything from drug use or racial tension to interest in films or new products. The nature of blogging means that people are quick to comment on events in their daily lives. Mining this sort of information might therefore also reveal information about exactly how ideas are spread and trends are set.
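The day-by-day aggregation described above can be sketched as a simple count of entries per region and day that mention a watched term, with a day flagged when its count jumps well above the earlier average. This is only an illustration of the idea: the term list, the sample records, the `factor` threshold and the function names are all assumptions, and it is not a description of how Google's flu tracker works.

```python
from collections import Counter
from statistics import mean

# Hypothetical watch list; any topic could be swapped in.
FLU_TERMS = {"flu", "fever", "influenza"}


def daily_counts(records, terms=FLU_TERMS):
    """records: iterable of (day, region, text). Counts texts per (region, day) mentioning a term."""
    counts = Counter()
    for day, region, text in records:
        # Naive whitespace tokenisation; punctuation handling is left out for brevity.
        if set(text.lower().split()) & terms:
            counts[(region, day)] += 1
    return counts


def flurries(counts, region, days, factor=2.0):
    """Return the days whose count exceeds `factor` times the mean of the preceding days."""
    flagged = []
    for i, day in enumerate(days):
        history = [counts[(region, d)] for d in days[:i]]
        if not history:
            continue  # no baseline yet
        baseline = max(mean(history), 1.0)  # floor of 1 so that tiny counts are not flagged
        if counts[(region, day)] > factor * baseline:
            flagged.append(day)
    return flagged


# Made-up sample data.
RECORDS = [
    ("day1", "LA", "I have a fever"),
    ("day2", "LA", "flu season again"),
    ("day3", "LA", "fever all night"),
    ("day3", "LA", "the flu got me"),
    ("day3", "LA", "influenza is going around"),
    ("day3", "NY", "nice weather"),
]

counts = daily_counts(RECORDS)
print(flurries(counts, "LA", ["day1", "day2", "day3"]))
```

Swap the word list for anything else and the same few lines track a different behaviour, which is exactly what makes the idea unsettling.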
Set aside the commercial angle of recommending new products and focus on the possibility of predicting behaviours such as drug use and racial tension, as the article mentions. This is a future in which you are judged not by your actions, but by what you might come to do.
In the hands of an authoritarian government, this is a shitshow of enormous proportions.