Tag Archives: dados

…not least because we are already at the point where the torrent of information generated by people and systems, computerized and connected, no longer has anywhere to be stored, as the chart alongside shows. and this is not new: since 2007, far more information has been created than storage systems can hold. but creation is not all that matters, the flow of information matters too: according to Cisco, global internet traffic grew 45% in one year [between 2009 and 2010], reaching 15 exabytes per month. and the signs are that this flow will be four times larger in 2014, reaching 767 exabytes for the year. bear in mind that an exabyte is one quintillion bytes, a 1 followed by no fewer than 18 zeros… and think about the volume of data going from one point to another on the worldwide network.

…bearing in mind that the latest consolidated data the company has are from 2009, with 2010 about to come out. but think about it: mobile data, the purple part of the histogram, which was barely visible in 2009, will be almost 40 times larger in a mere five years. that has to do with you, me and everyone we know using the mobile data network, or wanting to use it as soon as it shows up in our area or we can afford to pay for it.

the result? a survey [limited, admittedly] by magnify, a specialist in “information curation”, found that…

Via Sílvio Meira
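For anyone who wants to check the numbers quoted above, here is a quick back-of-the-envelope sketch. It only uses the figures from the excerpt (15 exabytes per month, 767 exabytes for 2014, mobile data growing almost 40 times in five years). Two things are my own assumptions and not in the original: the monthly figure is annualized by simply multiplying by 12, and the mobile growth is treated as exactly 40 times with a constant yearly rate.

```python
# One exabyte is one quintillion bytes: a 1 followed by 18 zeros.
EXABYTE = 10**18

# Figures quoted in the excerpt.
monthly_2010_eb = 15   # exabytes per month
annual_2014_eb = 767   # exabytes for the whole year of 2014

# Assumption: annualize the monthly figure by multiplying by 12.
annual_2010_eb = monthly_2010_eb * 12

# How many times larger the projected 2014 flow is.
growth_2010_to_2014 = annual_2014_eb / annual_2010_eb

# The same 2014 projection expressed per month and in raw bytes.
monthly_2014_eb = annual_2014_eb / 12
monthly_2010_bytes = monthly_2010_eb * EXABYTE

# Assumption: "almost 40 times in five years" taken as exactly 40,
# spread over a constant yearly growth factor.
mobile_growth_total = 40
mobile_years = 5
mobile_growth_per_year = mobile_growth_total ** (1 / mobile_years)

print(f"2010 traffic per month: {monthly_2010_bytes:,} bytes")
print(f"2010 annualized: {annual_2010_eb} EB")
print(f"2014 per month: {monthly_2014_eb:.1f} EB")
print(f"growth 2010 to 2014: {growth_2010_to_2014:.2f}x")
print(f"mobile data, implied yearly factor: {mobile_growth_per_year:.2f}x")
```

Under those assumptions the "four times larger" claim holds up (the ratio comes out a bit above four), and mobile data would be roughly doubling every year.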

For Varian, everything – including his culinary choices – can relate to data. Last year, while looking to buy a pepper shaker online, he hit upon the idea of a Google Price Index (GPI). It uses Google’s web shopping database to create a daily measure of inflation. It could, one day, be a complement – or competitor – to the official, yet less frequent, Consumer Price Index (CPI).

There’s a systemic gap, Varian points out, between the low-frequency data employed by governments and the high-frequency data of business. Government is working on it, though. “It’s now using supermarket scanner data to predict inflation rates,” notes Varian. How did it predict them before? “They used to send people out with notebooks to write it down.”

More communication between government and business clearly benefits both, says Varian. Business can provide more real-time data. “If you look at most businesses now, pretty much everyone – think of UPS, FedEx, MasterCard – has a real-time database. And that’s powerful.” Government can, in turn, aggregate information, giving businesses insight into their industry and the economy as a whole.

If you don't know who Hal Varian is, shame on you.

Via Think Quarterly, a hell of a publication from Google, produced in England.

Firefox announced in January plans to build a tool called “Do Not Track”, letting users ask that their browsing data not be u$ed by certain companies. And mind you, the tool still depended on the company agreeing not to use that data, even when the user had asked for it not to be used. But not even that was enough for the idiots in the industry to get it. At a meeting with the industry to present the new tool, Mozilla's CEO, Gary Kovacs, heard things like:

‘You’re breaking the web. It’s an economic model,’

‘If you do this, you’re single-handedly breaking the web. It’ll be a great place for a non-profit, but you don’t understand the web.’

Then the CEO went ahead and tore into the short-sighted crowd:

‘So you’re telling me your entire business model is based on your users not knowing what you’re doing with them? Is that how it works?’

The answer was a deafening silence. Gary went on:

‘I’ll assume that’s a no. So then your reaction must be that you don’t think you can create an experience great enough that they’ll actually overtly subscribe to it. Is that true?’

Cue more silence. And cue more Gary:

‘So what else do we have to talk about? Why don’t we talk about how we solve this problem?’

All that's left for me is to close the post with the cute little monkey.
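To make the "the company still has to agree" part concrete: the Do Not Track preference travels as an HTTP request header, `DNT: 1`, and it is up to the site receiving it to honor it. Below is a minimal sketch of what honoring it could look like on the server side. The function names `wants_no_tracking` and `handle_request`, and the `record_visit` callback, are hypothetical names I made up for illustration; this is not Mozilla's code or any real site's implementation.

```python
def wants_no_tracking(headers):
    """Return True when the request carries the header DNT: 1."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


def handle_request(headers, record_visit):
    """Serve a page, calling record_visit only if the user did not opt out.

    record_visit is a hypothetical callback standing in for whatever
    tracking the site would otherwise do.
    """
    if wants_no_tracking(headers):
        # Honoring the header is voluntary: nothing forces a site to take this branch.
        return "served without tracking"
    record_visit(headers)
    return "served with tracking"


# Usage: a plain list stands in for the tracking log.
tracking_log = []
print(handle_request({"DNT": "1"}, tracking_log.append))
print(handle_request({"User-Agent": "example"}, tracking_log.append))
print(len(tracking_log))
```

The whole mechanism hinges on that one `if`: a site that skips the check keeps tracking as before, which is exactly the weakness pointed out above.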

Ever heard of this? The Guardian explains:

Every month more evidence piles up, suggesting that online comment threads and forums are being hijacked by people who aren’t what they seem.

The anonymity of the web gives companies and governments golden opportunities to run astroturf operations: fake grassroots campaigns that create the impression that large numbers of people are demanding or opposing particular policies. This deception is most likely to occur where the interests of companies or governments come into conflict with the interests of the public. For example, there’s a long history of tobacco companies creating astroturf groups to fight attempts to regulate them.

After I wrote about online astroturfing in December, I was contacted by a whistleblower. He was part of a commercial team employed to infest internet forums and comment threads on behalf of corporate clients, promoting their causes and arguing with anyone who opposed them.

Like the other members of the team, he posed as a disinterested member of the public. Or, to be more accurate, as a crowd of disinterested members of the public: he used 70 personas, both to avoid detection and to create the impression there was widespread support for his pro-corporate arguments. I’ll reveal more about what he told me when I’ve finished the investigation I’m working on.

The most interesting part is that, about halfway through the article, the company HBGary Federal gets a mention. This company was hired by Bank of America to undermine Wikileaks, which holds confidential and embarrassing data about the bank and promises to make the shit hit the fan soon. And they went about it by tearing up the articles of the American constitution one by one. The best part is that one of the ways of undermining Wikileaks was to go after Anonymous, a group of vigilante hackers who were attacking companies that boycotted Wikileaks. HBGary unmasked some of its members and got payback in the form of a break-in into its systems and the publication of data showing just how unethically they work. This Wired story tells it all. Below, a bit of HBGary's methodology:

• Companies now use “persona management software”, which multiplies the efforts of each astroturfer, creating the impression that there’s major support for what a corporation or government is trying to do.

• This software creates all the online furniture a real person would possess: a name, email accounts, web pages and social media. In other words, it automatically generates what look like authentic profiles, making it hard to tell the difference between a virtual robot and a real commentator.

• Fake accounts can be kept updated by automatically reposting or linking to content generated elsewhere, reinforcing the impression that the account holders are real and active.

• Human astroturfers can then be assigned these “pre-aged” accounts to create a back story, suggesting that they’ve been busy linking and retweeting for months. No one would suspect that they came onto the scene for the first time a moment ago, for the sole purpose of attacking an article on climate science or arguing against new controls on salt in junk food.

• With some clever use of social media, astroturfers can, in the security firm’s words, “make it appear as if a persona was actually at a conference and introduce himself/herself to key individuals as part of the exercise … There are a variety of social media tricks we can use to add a level of realness to fictitious personas.”

In conclusion:

Software like this has the potential to destroy the internet as a forum for constructive debate. It jeopardises the notion of online democracy. Comment threads on issues with major commercial implications are already being wrecked by what look like armies of organised trolls – as you can sometimes see on guardian.co.uk.

The internet is a wonderful gift, but it’s also a bonanza for corporate lobbyists, viral marketers and government spin doctors, who can operate in cyberspace without regulation, accountability or fear of detection. So let me repeat the question I’ve put in previous articles, and which has yet to be satisfactorily answered: what should we do to fight these tactics?

With a name like that, of course I'm talking about Hans Rosling.

Via UoD

A taste of a fresh PDF from The Economist about data and the so-called information deluge.

A kick-ass analysis of what could be done with LinkedIn:

The company has some 70 million members. That’s data on 70 million careers. Conceivably, the company could provide a service showing each one of us the paths that others took when they were in the same position we’re in now. It could diagram where those choices led. … “Maybe he ends up deciding to be a high school math teacher,…” Nishar [Vice President of Products and Services at LinkedIn] says. In that case, he could find current math teachers who have followed that path and debrief them.

That passage is so kick-ass it reminded me of the most kick-ass talk I've ever seen in my life, which I've already posted here. The way the data is worked is stunning.

Danah Boyd is required reading for understanding concepts such as privacy and transparency, which are so much in vogue these days and whose importance will certainly grow in the coming years. In this talk, she tackles both topics against the backdrop of the “opengov” concept.

1. Information is power, but interpretation is more powerful

2. Data taken out of context can have unintended consequences

3. Transparency alone is not the great equalizer

Time has a kick-ass story on data-driven recommendation systems, like Netflix's. One part, about their effect on the long tail, caught my attention:

The general effect of recommendation engines on shopping behavior is a hot topic among econometricians, if that’s not an oxymoron, but the consensus is this: they introduce us to new things, which is good, but those new things tend to be a lot like the old things, and they tend to be drawn from the shallow pool of things other people have already liked. As a result, they create a blockbuster culture in which the same few runaway hits get recommended over and over again. It’s the backlash against the “long tail,” the idea that shopping online is all about near infinite selection and cultural diversity. It has a bad habit of eating its own tail and leaving you back where you started.

Obviously, this isn't limited to shopping:

How far will it go? Will we eventually surf a Web that displays only blogs that conform to our political leanings? A social network in which we see only people of our race and religion? Our horizons, cultural and social, would narrow to a cozy, contented, claustrophobic little dot of total personalization.