Archive for February, 2011

Games and Decisions | take #02

February 12, 2011
  • Understanding incentives in online news distribution? What about “social sharing” incentives? Consider the “like” game on Facebook and derive its equilibrium conditions.
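One way to make the “like” game concrete: treat it as a two-player game where liking costs a bit of attention but pays off when reciprocated. The payoff numbers below are purely illustrative assumptions (not derived from any Facebook data) – under them the game becomes a coordination game with two pure equilibria:

```javascript
// Illustrative 2-player "like" game: each user chooses to Like (1) or not (0).
// Assumed payoffs: liking an inactive peer costs attention (-1); mutual
// liking pays off in reciprocal visibility (+2). These numbers are made up.
const payoff = {
  A: [[0, 0], [-1, 2]], // payoff.A[a][b] = A's payoff when A plays a, B plays b
  B: [[0, 0], [-1, 2]],
};

// A pure-strategy profile (a, b) is a Nash equilibrium when neither
// player can gain by unilaterally switching their own action.
function pureNashEquilibria(pA, pB) {
  const eq = [];
  for (const a of [0, 1]) {
    for (const b of [0, 1]) {
      const aBest = pA[a][b] >= pA[1 - a][b];
      const bBest = pB[b][a] >= pB[1 - b][a];
      if (aBest && bBest) eq.push([a, b]);
    }
  }
  return eq;
}

console.log(pureNashEquilibria(payoff.A, payoff.B)); // [ [0, 0], [1, 1] ]
```

With these assumed payoffs both “nobody likes” and “everybody likes” are stable – which is exactly the kind of equilibrium condition the note asks about.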

daily snapshot | 10 february 2011

February 10, 2011
  • Reverse incentive – bootstrap the writing by introducing a “motivational” example as the core problem
  • Cloudera | Flume – handling Thrift dependencies -> we should be able to build without system-level Thrift support
  • Trouble with the ANTLR runtime and interpreting in-line Flume query syntax – even basic syntax errors result in unreadable exceptions – something to work on
  • Stream splitting using arbitrary keys – “signaled” at decorator time, with the sink writing data to multiple outputs
  • Millennial generations, used to an ever-changing reality, began to sense the world as transient, unstable, capable of changing to the tune of their ideas. For the first time – no big questions, no irony in existence. Just like science a century ago, technology brought in a new religion – a belief in the potential for change of the very fiber of things – controllably manipulating the senses, twisting the data, creating new worlds. Eternal contempt for the flesh once again sensed a way out. It is clearly visible how the media of the time held a genuine belief that they had made it this time – that the filters and colors put in place to create these synthetic realities were finally making us free – that we had won. So what happened? Why, only a decade later, are we back waging the old war, touching decaying bodies, sensing the aging skin and learning to walk again? Why is it that we are once again trembling in fear, thankful for every fragment of sensation we can get our hands on? And once again dancing in oblivion, waiting for new hope to arrive?
  • flume node_nowatch -n dump -c "node: src | snk;"
  • Adding new Flume decorators? Standard ones are defined at the reserved-keyword level.
  • Adding new data sources? SourceFactoryImpl holds a static list of source->class mappings – not available at the conf level
  • Establishing simple host-to-host console pipe:

host01$ flume node_nowatch -1 -s -n dump -c 'dump: console | agentBESink("host02");'
host02$ flume node_nowatch -1 -s -n dump -c 'dump: collectorSource(35853) | console;'
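The key-based stream splitting note above can be sketched without Flume at all – this is not Flume’s actual decorator API, just a minimal illustration of the idea: a decorator inspects a key on each event and routes it to one of several downstream sinks, with a catch-all for unmapped keys. The event fields and sink names are invented for the example:

```javascript
// Hypothetical decorator: route each event to a sink chosen by an
// arbitrary key extracted from the event, falling back for unknown keys.
function splitByKey(keyFn, sinks, fallback) {
  return (event) => {
    const sink = sinks[keyFn(event)] || fallback;
    sink(event);
  };
}

// In-memory "sinks" standing in for real outputs.
const byHost = { web: [], db: [] };
const other = [];

const route = splitByKey(
  (e) => e.host, // arbitrary key extractor
  { web: (e) => byHost.web.push(e), db: (e) => byHost.db.push(e) },
  (e) => other.push(e) // catch-all for unmapped keys
);

[
  { host: 'web', msg: 'GET /' },
  { host: 'db', msg: 'SELECT 1' },
  { host: 'cache', msg: 'HIT' },
].forEach(route);

console.log(byHost.web.length, byHost.db.length, other.length); // 1 1 1
```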

daily snapshot | 5 february 2011

February 5, 2011

  • Experimenting with node.js to build a simple in-memory micro-data aggregation & classification engine
  • “As more and more people start generating social content – game-theoretical analysis becomes a necessity”
  • The event loop as a general means of achieving cost-effective scalability. Distributed event providers/handlers? A means of achieving (decentralized) distributed polling?
  • Impact of JavaScript and the “modern” web on the perceived “efficiency” of modern hardware? As time progresses, we expect Moore’s law to work its magic and keep improving the speed of computing. However, most of the time the average consumer spends on a computer is spent inside the browser, and we are seeing developers get “creative” with all sorts of flashy .js, often built in a non-optimized manner. The result is that, as we move forward, hardware improvements yield only sublinear improvements in the average speed perceived by the consumer.
  • “Unoptimized JavaScript is killing Gordon Moore – one wasted cycle at a time”
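The in-memory aggregation & classification idea can be sketched in a few lines of node-style JavaScript. The rule format, event fields, and labels below are invented for illustration – a real engine would plug in its own classifiers:

```javascript
// Minimal sketch: classify each event with the first matching rule,
// then count events per (class, source) bucket, all in memory.
function makeEngine(rules) {
  const buckets = new Map();
  return {
    ingest(event) {
      const rule = rules.find((r) => r.match(event));
      const cls = rule ? rule.label : 'unclassified';
      const key = `${cls}:${event.source}`;
      buckets.set(key, (buckets.get(key) || 0) + 1);
    },
    snapshot() {
      return Object.fromEntries(buckets);
    },
  };
}

const engine = makeEngine([
  { label: 'error', match: (e) => e.level === 'error' },
  { label: 'slow', match: (e) => e.ms > 500 },
]);

engine.ingest({ source: 'api', level: 'error', ms: 10 });
engine.ingest({ source: 'api', level: 'info', ms: 900 });
engine.ingest({ source: 'ui', level: 'info', ms: 20 });

console.log(engine.snapshot());
// { 'error:api': 1, 'slow:api': 1, 'unclassified:ui': 1 }
```

Since everything runs on the single event loop, ingestion is naturally serialized – no locking needed, which is part of the cost-effective-scalability appeal noted above.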

daily snapshot | 4 february 2011

February 4, 2011
  • Game-theoretical “Web Algorithms”? Social “games” clearly represent an ideal playground for experimental exploration of the theoretical concepts.
  • Yahoo! Econ & Social Sci research group
  • Using game theory to understand news dynamics. If we assume competition between news sources in terms of timing and content, then building an ideal “news aggregator” service requires understanding the competitive relations between data sources. More generally, this extends into a paradigm for designing any service that aggregates partially-overlapping data from a number of sources in the same market. Learning both the general equilibrium relations between sources and their short-term impact on new data can help us rank and classify the data, and describe its inter-dependencies and the “timeline” of its lifecycle.
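A toy model of the aggregation problem above: merge partially-overlapping feeds, track which source published each story first, and rank by earliest timestamp. Who-was-first counts are exactly the raw material for learning the timing relations between sources. Source names, story IDs, and timestamps are all made up:

```javascript
// Merge overlapping feeds: keep the earliest sighting of each story,
// remember which source had it first, rank by publication time.
function mergeFeeds(feeds) {
  const stories = new Map();
  for (const [source, items] of Object.entries(feeds)) {
    for (const { id, t } of items) {
      const cur = stories.get(id);
      if (!cur || t < cur.t) stories.set(id, { id, t, firstBy: source });
    }
  }
  return [...stories.values()].sort((a, b) => a.t - b.t);
}

const merged = mergeFeeds({
  reuters: [{ id: 'quake', t: 100 }, { id: 'vote', t: 300 }],
  ap: [{ id: 'quake', t: 120 }, { id: 'merger', t: 200 }],
});

console.log(merged.map((s) => `${s.id} first by ${s.firstBy}`));
// [ 'quake first by reuters', 'merger first by ap', 'vote first by reuters' ]
```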