Do Cloudfoundry and OpenShift have a Future?

August 3, 2011

I am wondering if Cloudfoundry and OpenShift will become success stories (like Amazon Web Services). The idea of an open source application serving platform sounds intriguing; however, the community pages are somewhat deserted and the documentation is poor… After downloading the Cloudfoundry source code, I found that it looks more like a prototype. What do you think?

 

Notes from geekSessions – Network and Infrastructure Scalability

July 27, 2011

Here are my notes from today’s geekSessions 2.2 in San Francisco:

Allan Leinwand, CTO Zynga

Allan gave a short talk on Zynga’s infrastructure, in particular Z Cloud, an Amazon EC2-compatible private cloud. Seems like another proof that AWS is the de facto standard, at least for compute cloud and storage cloud solutions. If you want to build a hybrid cloud solution, better make sure that it integrates with EC2…

Big Switch

Next up was a tech guy from Big Switch who promoted OpenFlow, an open protocol for network virtualization.

Mike Christian, Business Continuity Planning Yahoo!

Mike reminded us that data centers sometimes go down. When you manage 45 of them, the probability is high that one of them disconnects once a week or so, due to a multitude of potential failures: network instability, HVAC failures, UPS failures (apparently a big problem), generator failures – and more mundane issues, such as a leaky roof or a hungry squirrel.

The advice: focus on impact duration, not incident duration. That is, be able to fail over traffic from one DC to another within minutes, use DNS-based Global Server Load Balancing, and degrade service gracefully.

Gleb Budman, CEO Backblaze

Gleb showed how to build an Internet-connected backup server for $5/month. Backblaze targets consumers and small businesses, and does not enforce a storage space limit. Average users store a bit more than 50 GB. Backblaze certainly is cheaper than Amazon S3; on the other hand, it offers only RAID redundancy rather than (geo-)replication, and it lacks other nice things like a Web service API. Well, you get what you pay for. Not everybody needs a cloud.

Cliff Moon, Co-Founder Boundary

Cliff gave a very entertaining talk, complaining about old-fashioned (client, app, OS, and network) monitoring tools and evangelizing the next generation of monitoring tools, like OpenTSDB, and – you guessed it – Boundary.

Open Source PaaS – Solving the Hold-up Problem in Cloud Computing

June 8, 2011

CloudAve contributor Krishnan Subramanian discusses the recent trends in the enterprise PaaS space. He distinguishes three models of service delivery: the Heroku Model, the Amazon Model, and the Federated PaaS Model. The Heroku Model is a closed monolithic platform, the Amazon Model is a closed modular platform, and the Federated PaaS Model is a platform that can be set up across multiple infrastructure providers.

Open Source PaaS and the Hold-up Problem

We all have experiences with the first two models (e.g. App Engine and AWS) and I guess that most people would agree with me that building scalable and robust Web applications has never been easier and more fun. However, both models have a significant disadvantage as they are (must be?) based on proprietary software. Michael Schwarz and Yuri Takhteyev argue that proprietary software creates economic inefficiencies and that open source is a solution to these inefficiencies in some sectors. I strongly believe that the enterprise cloud computing marketplace is one of these sectors.

Schwarz and Takhteyev explain that proprietary software is a source of economic inefficiency because it can lead to the hold-up problem and thus cause underinvestment in complementary technologies. Let me give you an example. You build an application (complementary technology) and deploy it to a closed PaaS. A few months later, the platform service provider sends you a letter to inform you that, unfortunately, the prices of the service offering will rise. If you anticipate this situation, you might not wish to build the application in the first place (underinvestment). The main problem is that once you have built the application, the bargaining power of the platform service provider increases and the provider can negotiate higher prices or support fees ex post.

Why is the hold-up problem a particularly significant problem in cloud computing? Because off-the-shelf software does not work for large-scale applications. This is the reason why every major Internet company built their own (usually proprietary) distributed data stores (GFS, BigTable, Dynamo, Cassandra, …). These software systems are highly customized for specific types of applications. Now back to Schwarz and Takhteyev: proprietary software (binary) is an excludable good; source code is de facto non-excludable (you can copy it). This means that you can only get access to the proprietary software and execute modifications through the software vendor (same argument for services by replacing “software” with “service”). With open source software, you do not depend on the software vendor to implement the features that you need for your specific application. You can go ahead and build it yourself.

By the way, Salesforce explores a different path to solve the hold-up problem: vertical integration of Heroku. I guess this makes sense for Salesforce, considering the closed nature of their platform.

Conclusion

The closed PaaS model creates economic inefficiency, not only because of monopoly pricing but also because of the hold-up problem. The hold-up problem increases the bargaining power of the platform provider when the platform user makes complementary investments. This problem can become particularly unpleasant for enterprises who want to innovate and cannot depend on the platform provider to implement the technologies they need. It can also be a problem for enterprises that compete in a similar sector of the IT industry.

Open Source PaaS (Cloudfoundry, OpenShift) is a solution to this problem. You can set up the platform in a federated environment and migrate to a different infrastructure provider (data center, compute cloud) if the cost structure changes. Moreover, you can modify the software and adjust it to the particular requirements of your applications.

Nevertheless, the Open Source PaaS model can only be a success if the community provides the necessary input to make these platforms as stable as App Engine and Heroku.

CAP Revisited

January 11, 2011

PACELC

I just found out about an excellent blog post on the CAP theorem by Daniel Abadi. I agree with his assertion that the CAP theorem is somewhat confusing, and I like the concept of PACELC. In my opinion, the main problem is an unclear understanding of what C, A, and P actually mean.

My 3 cents on consistency, availability, and partition-tolerance:

Consistency

Imho, it does not make much sense to talk about consistency without a specific application or system architecture in mind. For example, in the case of a relational database, consistency usually equals integrity constraints & atomic transactions; in the case of Dynamo/Cassandra it is specified by the N-R-W configuration of the quorum protocol (or hard-wired consistency level); et cetera.
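
To make the N-R-W remark concrete, here is a minimal Scala sketch of the usual quorum rule: a read is guaranteed to overlap with the latest acknowledged write whenever R + W > N. The names Quorum, Config and readsSeeLatestWrite are made up for this illustration and are not part of Cassandra or Dynamo.

```scala
object Quorum {
  // n = replicas per key, r = replicas that must answer a read,
  // w = replicas that must acknowledge a write (all hypothetical names).
  case class Config(n: Int, r: Int, w: Int) {
    require(n > 0 && r >= 1 && r <= n && w >= 1 && w <= n, "need 1 <= r, w <= n")

    // Every read set intersects every write set exactly when r + w > n.
    def readsSeeLatestWrite: Boolean = r + w > n
  }
}
```

For example, Quorum.Config(3, 2, 2).readsSeeLatestWrite is true, while Quorum.Config(3, 1, 1).readsSeeLatestWrite is false: the second configuration trades consistency for lower latency.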

Availability

In a similar sense, when defining availability, there should be a time constraint (e.g. if the ping does not return after 2 secs, the app is down). Everything in between [0, 2] is the “availability spectrum”, aka latency. Apparently, it is again application-specific what the upper bound should be and when we talk about latency as opposed to availability.
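
As a rough sketch of this idea, the snippet below treats a request as "down" once it exceeds an application-specific upper bound and as "up with some latency" otherwise. The 2000 ms default mirrors the 2 secs used above; the object and type names are hypothetical.

```scala
object Availability {
  sealed trait Status
  case object Down extends Status
  case class Up(latencyMs: Long) extends Status

  // latencyMs = None models a request that never returned.
  // timeoutMs is the application-specific upper bound of the spectrum.
  def classify(latencyMs: Option[Long], timeoutMs: Long = 2000): Status =
    latencyMs match {
      case Some(ms) if ms <= timeoutMs => Up(ms) // inside the spectrum: latency
      case _                           => Down   // beyond the bound: unavailable
    }
}
```

With the default bound, Availability.classify(Some(150L)) yields Up(150), whereas Availability.classify(Some(2500L)) and Availability.classify(None) both yield Down.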

Partition-tolerance

I agree that from a client perspective there is no difference between unavailability due to server failure and unavailability due to network partition. However, different repair mechanisms will be used in either case so it might make sense to differentiate when looking from a system perspective (?)

What do you think?

Is Facebook worth $50Bn?

January 5, 2011

Is the Goldman Sachs valuation of $50Bn for Facebook reasonable? Here is what we know about Facebook:

  • > 500 million active users
  • 50% of active users log on to Facebook on any given day
  • The average user has 130 friends
  • People spend 700 billion minutes per month on Facebook

Sanity check:

According to the Goldman Sachs valuation

  • each active user is worth $100
  • Considering a lifetime of Facebook of T years*
    • for T=1: one useryear is worth $100
    • for T=5: one useryear is worth $20
    • for T=10: one useryear is worth $10
  • each minute/month spent on Facebook is worth 50/700 = $0.0714
  • Considering a lifetime of Facebook of T years*
    • for T=1: each minute is worth $0.0059
    • for T=5: each minute is worth $0.0012
    • for T=10: each minute is worth $0.0006
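
The numbers in this list can be reproduced with a few lines of Scala. This is only a sketch of the arithmetic; it assumes exactly 500 million users and 12 months per year, and the object and method names are invented for the example.

```scala
object FacebookSanityCheck {
  val valuation       = 50e9  // Goldman Sachs valuation in dollars
  val activeUsers     = 500e6 // taken as exactly 500 million
  val minutesPerMonth = 700e9 // minutes spent on Facebook per month

  // Value of one active user: 50e9 / 500e6 = 100 dollars.
  val valuePerUser: Double = valuation / activeUsers

  // Value of one user-year if Facebook lives for t years.
  def valuePerUserYear(t: Int): Double = valuePerUser / t

  // Value of one minute if Facebook lives for t years.
  def valuePerMinute(t: Int): Double = valuation / (minutesPerMonth * 12 * t)
}
```

FacebookSanityCheck.valuePerMinute(10) comes out at roughly $0.0006, i.e. 0.06 cents per user-minute.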

Let’s say, Facebook will be around for another 10 years with a stable user base. Is it reasonable to spend $10 per useryear, or 0.06 cents per userminute, respectively?

What are major sources of Facebook’s revenue?

Crunching numbers:

  • 8 minutes spent on Facebook equals 1 ad impression
  • Every year, the average user consumes 2000 ad impressions via Facebook
  • The current CPM for Facebook ads is max $2 (1 trillion impressions per year * $2/1000 = $2Bn)
  • Considering a reasonable average CPM of $3, this makes a potential future revenue of $6 per user per year
  • Every year, the average user spends $1-$1.6 on Zynga’s virtual goods and ads, out of which Facebook gets $0.3-$0.5
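
Here is a small Scala sketch of how the ad figures follow from the user statistics. It assumes exactly 500 million users; the unrounded result is 2100 impressions per user per year, which the list above rounds to 2000. All names are made up for the illustration.

```scala
object AdRevenue {
  val activeUsers          = 500e6
  val minutesPerMonth      = 700e9
  val minutesPerImpression = 8.0

  // Minutes per user per year, divided by minutes per ad impression.
  val impressionsPerUserYear: Double =
    minutesPerMonth * 12 / activeUsers / minutesPerImpression

  // CPM is the price of 1000 impressions.
  def revenuePerUserYear(cpm: Double): Double = impressionsPerUserYear * cpm / 1000

  def totalRevenuePerYear(cpm: Double): Double = revenuePerUserYear(cpm) * activeUsers
}
```

With a CPM of $3, AdRevenue.revenuePerUserYear(3.0) gives about $6.30 per user per year, in line with the rounded $6 above.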

This rough calculation shows that Goldman might actually be right with their Facebook valuation. Time will show.

Scala – Getting Started (2)

December 31, 2010

Scala by Example

Methods

object HelloWorld {
	def greet(name: String, greeting: String = "Hello") =
		greeting + ", " + name
}

import HelloWorld._

println(greet("Markus"))
println(greet("Markus", "Whazzup"))

Output:
Hello, Markus
Whazzup, Markus

In Scala methods, you can define default values, such as greeting: String = "Hello" in the example shown above. This is a nifty alternative to overloading methods.
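
For comparison, here is a sketch of what the same API looks like with and without default values. The object names HelloOverloaded and HelloDefaults are invented for this example.

```scala
object HelloOverloaded {
  // Without default values, the one-argument form needs its own overload.
  def greet(name: String): String = greet(name, "Hello")
  def greet(name: String, greeting: String): String = greeting + ", " + name
}

object HelloDefaults {
  // One definition covers both call styles.
  def greet(name: String, greeting: String = "Hello"): String =
    greeting + ", " + name
}
```

Both HelloOverloaded.greet("Markus") and HelloDefaults.greet("Markus") return "Hello, Markus". Default values also combine with named arguments, e.g. HelloDefaults.greet(greeting = "Whazzup", name = "Markus").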

Scala – Getting Started (1)

December 29, 2010

There are many operating systems and IDEs. I will cover my favorites briefly.

Installing Scala on Ubuntu (10.04 LTS)

sudo apt-get install scala
wget http://www.scala-lang.org/downloads/distrib/files/scala-2.8.1.final.tgz
sudo tar -C /opt/ -xvzf scala-2.8.1.final.tgz
echo 'export PATH="$PATH:/opt/scala-2.8.1.final/bin"' >> ~/.bashrc

Now re-start your shell.

scala -version

Reference:
http://tipstank.com/2010/11/26/install-scala-on-ubuntu/

Installing Scala on Mac (10.6.5)

First of all: Install MacPorts.

sudo port install scala28

and then

for n in $(ls /opt/local/bin/scala*); do sudo ln -s $n /usr/local/bin/$(echo $n | sed -n -e 's/.*\/\(.*\)-.*/\1/p'); done

Reference:

http://www.feastforeyes.com/2010/08/installing-scala-on-mac/

Installing the Eclipse Plug-in

Go to http://www.scala-ide.org/ and get the “Update Site” URL. Open your Eclipse IDE, click on the “Help” menu, select the “Install new Software” item and paste the URL. Then select the plug-in and install it. (This process can look a bit different depending on the Eclipse IDE version that you use – I tried it with Eclipse for Java EE Helios.) The Eclipse plug-in seems to be somewhat buggy, though…

Installing the TextMate Bundle and Plug-in

1. Download the Scala bundle and install it:

git clone git://github.com/mads379/scala.tmbundle.git
open scala.tmbundle

Add the SCALA_HOME variable in TextMate (Preferences > Advanced > Shell Variables).

Paste the following code into your ~/.ctags file

--langdef=scala
--langmap=scala:.scala
--regex-scala=/^[ \t]*class[ \t]+([a-zA-Z0-9_]+)/\1/c,classes/
--regex-scala=/^[ \t]*trait[ \t]+([a-zA-Z0-9_]+)/\1/t,traits/
--regex-scala=/^[ \t]*type[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
--regex-scala=/^[ \t]*def[ \t]+([a-zA-Z0-9_\?]+)/\1/m,methods/
--regex-scala=/^[ \t]*val[ \t]+([a-zA-Z0-9_]+)/\1/C,constants/
--regex-scala=/^[ \t]*var[ \t]+([a-zA-Z0-9_]+)/\1/l,local variables/
--regex-scala=/^[ \t]*package[ \t]+([a-zA-Z0-9_.]+)/\1/p,packages/
--regex-scala=/^[ \t]*case class[ \t]+([a-zA-Z0-9_]+)/\1/c,case classes/
--regex-scala=/^[ \t]*final case class[ \t]+([a-zA-Z0-9_]+)/\1/c,case classes/
--regex-scala=/^[ \t]*object[ \t]+([a-zA-Z0-9_]+)/\1/o,objects/
--regex-scala=/^[ \t]*private def[ \t]+([a-zA-Z0-9_]+)/\1/pd,defs/

2. Download the SBT plugin and install it (apparently assumes that you use SBT)

Create a shell script with the code posted below next to your sbt-launch-*.jar and create a shell variable SBT_PATH in TextMate (see part 1) that points to the script.

java -Xmx1512M -XX:+CMSClassUnloadingEnabled -Dsbt.log.noformat=true -XX:MaxPermSize=256m -jar "$(dirname "$0")/sbt-launch-0.7.4.jar" "$@"

Reference:

http://www.sidewayscoding.com/2010/08/using-textmate-for-scala-development.html

Playing with the Interactive Scala Interpreter

Open your shell, then type scala. Now you can do awesome things like this:

scala> val foo = "bar"
foo: java.lang.String = bar


scala> println(foo)
bar

Running a Toy Program

object Greeting {
	def main(args: Array[String]) = println("Hello World")
}

That’s it for now. Stay tuned…

WipeCoin Startup

November 24, 2010

As a serial entrepreneur, my goal is to start one company per year. This year, two friends and I invented WipeCoin, a revolutionary cleaning mechanism for smartphones: the combination of a case with a microfiber-coated “coin”. The coin can be clipped off the case for wiping the screen clean. After cleaning the screen, the coin can be clipped back into the case.

Our first product line of WipeCoin coins, plus exclusive cases for iPhone 4, is in the making. You can support us by joining our WipeCoin Facebook fan page, following us on Twitter, and voting for your favorite WipeCoin colors.

Stay tuned…

Gwtissandra: GWT + Hector + Cassandra

November 23, 2010

A Twitter Clone Clone

I built a toy application (Gwtissandra) to better understand Cassandra’s data model and new features of Cassandra 0.7, such as secondary indices and flexible schema updates. Gwtissandra is basically a GWT/Java port of Twissandra, a Twitter clone clone, so to say.

The main parts that Gwtissandra is based on:

  • a Google Web Toolkit (GWT) front-end,
  • a Cassandra Java client, Hector, and
  • a Cassandra server.

Next Steps

I am currently busy with a couple of other projects but I hope that over Christmas I will find a bit of time to

  • clean up the code and make the app look nicer (using MVP or so),
  • port the app to Scala to make it even cleaner and nicer,
  • once there is a stable object-mapper library, try it out and replace awkward queries,
  • set up build scripts etc. (not sure if this is a good idea, though),
  • make a short video/screencast.

If you have suggestions or would like to help, drop me a message and I will come back to you. Thanks.

Cassandra – migrating from 0.6 to 0.7

November 15, 2010

I recently moved from playing with Cassandra 0.6 to playing with Cassandra 0.7 beta. One of the problems I ran into was this one:

keyspace does not exist

This is why:

Prior to 0.7, cassandra loaded a set of static keyspaces defined in a storage-conf.xml file. CASSANDRA-44 added the ability to modify schema dynamically on a live cluster. Part of this change required that we ignore the schema defined in storage-conf.xml. Additionally, 0.7 converts to YAML based configuration.

If you have an existing storage-conf.xml file, you will first need to convert it to YAML using the bin/config-converter tool, which can generate a cassandra.yaml file from a storage-conf.xml file. Once you have a cassandra.yaml, it is possible to do a one-time load of the schema it defines. 0.7 adds a loadSchemaFromYAML method to StorageServiceMBean (triggered via JMX: see https://issues.apache.org/jira/browse/CASSANDRA-1001 ) which will load the schema defined in cassandra.yaml, but this is a one-time operation. A node that has had its schema defined via loadSchemaFromYAML will load its schema from the system table on subsequent restarts, which means that any further changes to the schema need to be made using the system_* thrift operations (see API).

Here is what you can do:

Open your shell, type jconsole. The jconsole opens.

1) Select the org.apache.cassandra.thrift.CassandraDaemon process and connect

2) In the MBeans tab, select org.apache.cassandra.db > StorageService > Operations

3) Click loadSchemaFromYAML

Done.
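
If you would rather not click through jconsole, the same operation can presumably be invoked from code via the standard javax.management API. The following is a hedged sketch, not something I have run against a cluster: the MBean name is taken from the jconsole path above, while the host, the port and the LoadSchema object itself are assumptions that you need to adapt to your node.

```scala
import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

object LoadSchema {
  // MBean as shown in jconsole: org.apache.cassandra.db > StorageService
  // (the exact object name is an assumption, verify it in jconsole).
  val storageService = new ObjectName("org.apache.cassandra.db:type=StorageService")

  // Standard JMX-over-RMI service URL; host and port depend on your setup.
  def serviceUrl(host: String, port: Int): JMXServiceURL =
    new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi")

  def loadSchemaFromYaml(host: String, port: Int): Unit = {
    val connector = JMXConnectorFactory.connect(serviceUrl(host, port))
    try {
      // loadSchemaFromYAML takes no arguments, hence the two empty arrays.
      connector.getMBeanServerConnection.invoke(
        storageService, "loadSchemaFromYAML", Array[AnyRef](), Array[String]())
    } finally {
      connector.close()
    }
  }
}
```

A call would look like LoadSchema.loadSchemaFromYaml("localhost", 8080), where 8080 is only my guess at the JMX port of a default 0.7 installation. As with the jconsole route, this is a one-time operation per node.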

