Tachyon 1.0 rebranded as Alluxio, famed for giving Baidu a 30X performance boost

February 26, 2016

Chinese search giant Baidu claims to have got a 30X performance boost by using Tachyon, an open source project from AMPLab, in a 1000-node cluster that fetches data remotely from different data centers. It says query time dropped from 100-150 seconds to 5 seconds by leveraging Tachyon's distributed caching mechanism, with 50TB of RAM serving as distributed memory. This is one of the success stories told at the 1.0 release and rebranding of Tachyon as Alluxio (www.alluxio.com) this week.

Screen Shot 2016-02-25 at 5.03.48 PM

Alluxio solves a network bottleneck in big data processing. As depicted in the diagram above, it provides an abstraction over various filesystems and a data foundation for processing engines like Spark, Impala, Flink, HBase, Presto, …

Alluxio 1.0 comes with a few good announcements. One I like is the Alluxio key-value API for immutable pairs. We can now store KV pairs without applications needing to access the data as files/blocks.
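To make "immutable pairs" concrete, here is a toy sketch of write-once key-value semantics in plain Python. This is not the Alluxio API: the class `ImmutableKVStore` and its methods `put`, `seal` and `get` are hypothetical names made up for illustration, and the rule that reads only start after sealing is my assumption.

```python
class ImmutableKVStore:
    """Toy write-once store (hypothetical, not Alluxio's API)."""

    def __init__(self):
        self._data = {}
        self._sealed = False

    def put(self, key: bytes, value: bytes) -> None:
        # Pairs can only be added while the store is being built.
        if self._sealed:
            raise RuntimeError("store is sealed; pairs are immutable")
        # A key may be written once and never overwritten.
        if key in self._data:
            raise KeyError(f"key already written: {key!r}")
        self._data[key] = value

    def seal(self) -> None:
        # After sealing, the store is read-only.
        self._sealed = True

    def get(self, key: bytes):
        # Assumption: reads are served only from a sealed store.
        if not self._sealed:
            raise RuntimeError("store must be sealed before reads")
        return self._data.get(key)


store = ImmutableKVStore()
store.put(b"user:1", b"alice")
store.put(b"user:2", b"bob")
store.seal()
value = store.get(b"user:1")
```

The point is that the application works with keys and values directly, without knowing how the pairs are laid out in files or blocks underneath.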

Alluxio 1.0 also comes with native integration with OpenStack Swift and Alibaba’s Aliyun filesystem.

It also provides a security model with users/groups and ACLs.

Anyway, there is a lot to try out, including improvements to Tachyon, sorry, Alluxio deployments with Mesos or YARN.


Notes On Nebula, RightScale and Scalr from their pitches at CloudCamp

November 9, 2011

Yesterday, at CloudCamp, I had an opportunity to hear pitches from Nebula, RightScale and Scalr.

Scalr’s young CEO Sebastian positioned Scalr for auto scaling, recovery and server management for apps. In a lively 5-minute pitch, he covered how Scalr concentrates on app management and handles seamless recovery and auto scaling.

Sean Chuo from RightScale focused on RightScale's positioning as a layer between apps and lower-level infrastructure like servers, storage etc. With its templates, RightScale is a great tool for implementing your own cloud.

The most interesting presentation was from Nebula’s Chris Kemp. He covered the history of the Nebula project at NASA and how it helped tame a huge 7B investment with cloud infrastructure where scientists can use computing on demand and quickly by reducing. Since he is out of NASA now, he told how some senators refused this optimization for fear of losing jobs. He then described how the project got support from Vivek Kundra, and how President Obama’s transparent government project USASpending.gov ran on the cloud, with Mr. President himself as a consumer! He had a picture showing President Obama using the site.

Google’s People Management Rules

March 14, 2011

The NYTimes published an article today on the people management project that Google HR undertook to analyze its managers. This project, called Project Oxygen, came up with the following rules and pitfalls.


Google's 8 Rules - Source: Google / NY Times

Can any Googler comment on this? How is it working out?

Enterprise Wide Testing (EWT) or CST (Customer Scenario Testing)

July 3, 2009

When I was an architect at Annuncio (acquired by PeopleSoft in 2001), we faced an issue: how to assure that customers' production systems run without any problems. Steven Plaza, a smart QA Director, came up with a plan that was later called Enterprise Wide Testing (EWT). The concept was to “continuously” run a few complex scenarios from target customers in a “compressed” way. The scenarios should include all the different types of operations happening on the system, e.g. launching new applications, clearing caches, purging, heavy loads of inbound requests, large concurrent batch programs etc., all pounding on the system at once. The aim is to simulate a production-like scenario. The second aspect is the schedule: if a customer needs, say, 7 or 14 days of uptime without a glitch, how many calendar days would we need when the runs have to be repeated as the product improves and encountered issues get resolved? A solution is to apply a higher load to compress the schedule. For example, to compress 7 days of assurance into 1 day, the loads of all types of operations need to be increased roughly 7 times (assuming the relationship is linear). This approach is very appropriate for assuring the uptime of complex software.
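The compression arithmetic above can be sketched in a few lines of Python. This is only an illustration under the same linearity assumption; the function names and the baseline rates are hypothetical, not numbers from the actual EWT runs.

```python
def compressed_load_factor(target_uptime_days: float, run_days: float) -> float:
    """Load multiplier needed to fit target_uptime_days of activity
    into run_days, assuming load and elapsed time trade off linearly."""
    if target_uptime_days <= 0 or run_days <= 0:
        raise ValueError("both durations must be positive")
    return target_uptime_days / run_days


def scaled_rates(baseline_rates: dict, factor: float) -> dict:
    """Scale every operation type by the same factor."""
    return {op: rate * factor for op, rate in baseline_rates.items()}


# Hypothetical production rates per hour, for illustration only.
baseline = {
    "inbound_requests": 1000,
    "batch_jobs": 4,
    "cache_clears": 1,
}

factor = compressed_load_factor(target_uptime_days=7, run_days=1)
test_rates = scaled_rates(baseline, factor)
```

With these inputs the factor is 7, so every operation type in the compressed run fires at seven times its production rate. If the system does not scale linearly, the factor is only a starting point and has to be tuned against what the runs actually show.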

Oracle launched Fusion Middleware AS11

July 1, 2009



Daily status /SWAT meeting

June 26, 2009

SWAT Meetings
