Finding a needle in 20 million haystacks: CERN’s Computing Grid creates a vision for Cloud Business Services

In order to understand better the business potential of Cloud-esque environments to drive innovation, you could do a lot worse than have a look at the scientists who’ve been using their own version of Cloud for years:  Grid Computing.   When you get into areas such as particle physics, these folks need all the computing grunt and pooled brain-power they can muster to succeed. 

At HfS, we’ve partnered with the Outsourcing Unit at the London School of Economics (LSE) to determine the future potential of Cloud Business Services by studying the needs, concerns, intentions and views of business-line executives, and not solely the IT department. 

There’s been so much noise focused on the technology implications of Cloud, and not enough attention placed on how business executives intend to apply Cloud services within their own business environments.  At the end of the day, some firms will succeed in driving down IT infrastructure costs using Cloud models, but the real momentum will come from the business processes that can be delivered to organizations that have all the associated application workflow and infrastructure already provisioned in the Cloud.    

We’ll be launching a study very shortly with the LSE and will appreciate all of you taking part, but first we wanted to talk about the LSE’s experiences with the Worldwide LHC Computing Grid (WLCG), a global collaboration phenomenon that links grid infrastructures and computer centres worldwide. Its purpose is to distribute, store and analyse the immense amounts of data generated by the Large Hadron Collider (LHC), a gigantic scientific instrument on the Franco-Swiss border at the European Organization for Nuclear Research (CERN), which physicists use to study the smallest known particles. The LHC is the largest scientific instrument on the planet, producing 15 Petabytes (15 million Gigabytes) of data annually, which thousands of scientists around the world access and analyse.

The idea is to provision a data storage and analysis infrastructure for the entire high-energy physics community – not too dissimilar from a Private Cloud environment where users can plug in to the shared environment and access the applications they need, without stacks of IT hardware in the basement to house the data, or IT personnel on site needed to maintain and support the infrastructure. Today, the WLCG combines the computing resources of more than 100,000 processors from over 130 sites in 34 countries, producing a massive distributed computing infrastructure that provides more than 8,000 physicists around the world with near real-time access to LHC data, and the power to process it.
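To make the idea concrete, here’s a toy sketch of the core trick behind that kind of pooled infrastructure: fan independent analysis jobs out to whichever site currently has the most spare capacity. All names and numbers below are illustrative, not CERN’s actual middleware.

```python
# Toy grid scheduler: assign each job to the least-loaded site.
# Purely illustrative -- real grid middleware also handles data
# locality, failures, authentication, and much more.
import heapq

def schedule(jobs, sites):
    """Assign each job to the site with the most spare capacity."""
    # Min-heap of (current load, site name); lowest load pops first.
    heap = [(0, site) for site in sites]
    heapq.heapify(heap)
    assignment = {}
    for job in jobs:
        load, site = heapq.heappop(heap)
        assignment[job] = site
        heapq.heappush(heap, (load + 1, site))
    return assignment

jobs = [f"collision-batch-{i}" for i in range(6)]
sites = ["CERN", "RAL", "FNAL"]
print(schedule(jobs, sites))
```

With six jobs and three sites, each site ends up with two jobs – the point being that no single basement full of hardware has to carry the whole workload.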

OK – that’s a lot of numbers, so we managed to grab the LSE’s Dr Will Venters as he was venturing off the squash-court to his local pub, to explain to us business philistines why this project is so relevant to Cloud services and outsourcing…

Dr. Will Venters, the London School of Economics

Phil Fersht: Will, you’re involved in a fascinating research study that focuses on how thousands of particle physicists around the world are collaboratively using the grid – a microcosm of the cloud – to capture, process and analyze huge volumes of data being produced by CERN, Europe’s particle physics laboratory in Geneva. For readers who aren’t well-versed in particle physics, would you please give us a brief overview so we can understand the importance of using the grid for the data the physicists are working with?

Dr. Will Venters: Sure Phil. Particle physicists recreate the conditions just after the “big bang” and analyze particle collisions to discover the mechanisms by which the universe – and therefore the atoms and molecules that form all matter – came into being. They reproduce these collisions in CERN’s Large Hadron Collider, or LHC, which produces vast numbers of three-dimensional pictures of particle collisions for the physicists to analyze for “new physics” events. One of the most interesting goals is to discover the Higgs boson, the so-called “god particle,” which could provide an explanation for “mass” in the universe, hence linking gravity into the standard physics model. But finding one Higgs is like finding a needle in 20 million haystacks, so the physicists must analyze a massive number of pictures – which equates to 12 to 14 million gigabytes of data per year – if they are going to find enough evidence to prove it actually exists. So the LHC Computing Grid, which consists of many distributed computers, CPUs and disk servers at over 170 computer centers around the world, was created to give 8,000 physicists in 34 countries the ability to draw on very large amounts of computing power to collaboratively review and analyze all the particle collisions created in the LHC.

As a lecturer in information systems at the London School of Economics, I was fascinated by how they coordinate themselves and go about developing and managing this widely distributed resource. I therefore got a research grant and employed a team to follow the particle physics community in the UK and CERN as it developed its grid for the LHC.

Phil Fersht: So what have you observed during your research?

Dr. Will Venters: The particle physics community has a long history of developing quite advanced prototype computer systems. It’s also a community with a long history of collaborative work practice. And they collectively understood the only way they were going to be able to realize the data from the LHC was to get the grid to work. One very interesting thing we observed is they didn’t go about it the way normal project managers would. They approached it as scientists and as a scientific endeavor, rather than as developing a large-scale computer system the way a big systems integrator might. They have very informal organizational structures. There is a strong hierarchy in that somebody is the leader, but they don’t have the power or muscle to drive things. They just use more charisma and soft leadership type techniques in order to drive the project forward. But it’s a project being collectively driven by a very committed group of people. Interestingly, they use pretty un-advanced collaboration tools. They use blogs, wikis and very simple video conferencing – but they use them an awful lot. They’ve developed a way of working with these relatively simple web tools that not only helps pull the project together but also helps hold the sense of community together in a much different way than the formal control type management you might see elsewhere.

Phil Fersht: The dynamics within this collaborative community sound fascinating. Can you talk a bit more about how the scientists go about organizing discussions, learning from each other, sharing findings, etc.?

Dr. Will Venters: We developed a distinctive description of the physicists’ work practices based on the idea of paradox and tensions… the only way we can effectively describe what this community is doing is through the idea of paradox. While they’re individually quite fluid and flexible, they were also quite tightly focused on developing their grid and getting it to work so they could produce data. But that tension was coupled with an anxious confidence born of the community’s long history of previous creative and successful work. One of the things we observed was the idea of learned improvisation – that you don’t improvise just because you can. You actually learn how to do it in the same way you learn to play jazz, and even though jazz is highly improvisational, there are actually themes running through it. Similarly, this community had themes running through it, and the members improvised based on many things they’d done and learned in the past. Another thing we observed was the tension between wanting to organize, control and have strong collaborative structures, versus the need to say, “We’re all clever individuals and work really hard, and so we should all be allowed to have individuality and the ability to work on our own.” When you visit CERN, you see the rocket-science side of things, this massive, great experiment. But parts of CERN are like a 1950s university campus with drab offices and basements filled with old bits of rusting technology. I think that well describes how they are collectively comfortable with accepting bits of imperfection as long as the important parts are working.

Phil Fersht:  When you look at the project in its entirety and where it is today – what has been achieved that wouldn’t have been without the grid?

Dr. Will Venters: They wouldn’t be able to do the extremely high level of precision analysis required without access to the grid. The huge volume of data produced by the LHC needs something in the form of grid technology to allow the physicists to keep track of it and to do the analysis. They couldn’t do it with clusters of computers or individual computers – they would just get lost in a jumble of data.

This project has also driven forward the science agenda in other sciences. In some sense, they’ve shown leadership in how to develop grid computing which has led to new developments in other areas of science.

Phil Fersht: What’s next for this project? Do you think this grid will move into more of a Cloud-based environment, or do you think it’s going to build upon its own infrastructure?

Dr. Will Venters: There is a move to see if the National Grid Service in the U.K. – the academic computing grid, not the electricity network – should become more of a Cloud type of resource for supercomputing. They are looking at whether they should be using cloud for peak demand, when demand outstrips the capacity of even their grid. They are also looking at whether they should be providing a cloud resource to other areas. But once they can do the data analysis out of the LHC, their interest in the development of the grid will start to wane, as working with 12 to 14 million gigabytes of information will become a trivial challenge in the long term. Their experience on previous experiments, and their hope, suggests that 10 years down the line they could buy a commodity piece of hardware, sit it in a machine room and it will probably be able to do the analysis on the LHC data on its own. Then the next experiment will come along demanding something new and different, and they’ll start developing something new themselves.

Phil Fersht:   In terms of the business world and what we see going on commercially with the development of Cloud, etc., what do you think are going to be the key opportunities and challenges for businesses trying to move into these types of collaborative networks?

Dr. Will Venters: I think a huge benefit they have in working in distributed collaborative ways is a sense of working and collaborating together, being open rather than closed. But the challenge is learning how to coordinate a group of individuals who have individual aspirations and motivations toward a higher goal or bigger aim. Another challenge is supporting an unstructured network – what we call a knowledge infrastructure – not only their website, the wikis, the blogs and the communication infrastructure, but also their sense of history and their sense of organizing themselves: who they communicate with and how they organize themselves into clusters of competence around particular areas. But the benefit comes from understanding they don’t need to be constrained by how they organized what they’ve done in the past, and from managing that history and culture alongside these things so they can capitalize on what they know, and develop new knowledge, new techniques and new technologies. I think the knowledge infrastructure around their work is the key part of it, and perhaps something businesses would benefit from learning. But I also think doing so would require a large amount of cultural change to achieve what the particle physicists have. There are dramatic differences in culture and history of collaboration.

Phil Fersht: In many organizations, it can get a bit political when we dare to question the stranglehold that many IT departments have over managing these networks and their infrastructure. Do you think we’re still many years away from these types of Cloud networks becoming a mainstream business reality, or do you think it is closer than we envision, given the speed with which the LHC grid was developed?

Dr. Will Venters: I think at some level it’s a big challenge for business. To put it into perspective, a person I know recently told me a story about being shown an amazing usage-based piece of computer software at CERN, written by a post-doc. My friend asked, “What happens if the post-doc falls under a bus?” The physics professor didn’t even blink, and said, “Well, we would find another post-doc straight away and get him or her to do something different.” The particle physics community is accepting of the incredibly challenging and experimental nature of their work – they readily accept something good enough, kind of messy around the edges, but ultimately very innovative and very new. The concern I have about the debate around cloud for business is that we get too bogged down in safety, in the belief that we must massively mitigate risks. The conservatism you sometimes see in businesses, and particularly in IT departments, will impede speedy cloud adoption. But I think we’re seeing stuff with the cloud happening in innovative parts of businesses; it’s just not necessarily being led from the more conservative IT departments. There is a serious risk that competitors and innovators will collaborate using cloud resources – this risk of competition is something we should consider in our cloud models alongside risks of security, cost, lock-in, etc.

Phil Fersht: Will – thanks for your time with us – we’re excited to be working on this upcoming study with you and the team!

Dr Will Venters (pictured) is a Lecturer in the Information Systems and Innovation Group at the London School of Economics. His research is centered on the development and use of technologies to support collaborative working. He is currently researching the development and use of Grid computing technology among experimental particle physicists for the LHC experiments at CERN. More details on his publications can be accessed here.


7 Comments

  1. Stephen Cohen
    Posted September 1, 2010 at 1:31 pm | Permalink

    Bravo! This is the best discussion on how Cloud can work in real situations I have read – I look forward to reading your study,

    Stephen

  2. Steve Allen
    Posted September 1, 2010 at 1:58 pm | Permalink

    Will,

    Very good insights – especially around the cultural change these environments necessitate for successful collaboration.

    How long has it been taking for the scientists to adapt to using the grid – and for those that are new to it, are they now adapting faster than those early users in the early days of the project?

    Steve

  3. Paul Schneider
    Posted September 1, 2010 at 2:33 pm | Permalink

    Phil and Will –

    It’s amazing how quickly people will seek to find new ways to collaborate, when their prime goal is knowledge and discovery, as opposed to money!

    Great discussion and thanks for airing,

    Paul Schneider

  4. Gaurav
    Posted September 1, 2010 at 8:27 pm | Permalink

    Great article.

    Understanding how environments like the LHC Grid can be applied shows us how service providers can develop shared computing environments for multiple clients, where they can tap into common processing and resources. As Dr Venters points out, the fear of sharing resources and “collaborating” with competitors is one cultural aspect customers need to overcome,

    Gaurav.

  5. Posted September 1, 2010 at 8:35 pm | Permalink

    Gaurav: good point re the “competition” factor putting off business clients. Our recent research shows clients are much happier collaborating and sharing best practices with firms outside of their immediate vertical sector:

    http://www.horsesforsources.com/vertical-silos-070810

    Hence, for vertical Cloud solutions in industries such as Life Sciences, or Financial Services, I’d imagine the early adopters are going to be focused heavily on their data security and “private” Cloud environments,

    PF

  6. Posted September 2, 2010 at 3:09 pm | Permalink

    Thanks for all the comments and support. A couple of questions/issues were raised which I will try to respond to…

    **Steve asked: “How long has it been taking for the scientists to adapt to using the grid – and for those that are new to it, are they now adapting faster than those early users in the early days of the project?”

    This is an interesting one – we are actually writing about this at the moment. They have been developing their Grid for many years, and users have slowly moved to it once they found their work could no longer be done using existing computing systems (e.g. cluster computing). Even though the LHC has only recently started taking data, they have been using their Grid for simulation and Monte Carlo data production (random simulated data to compare new data against).
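    For readers unfamiliar with the term, here is a toy sketch of the Monte Carlo idea – generate lots of random simulated data, then compare it against what you expect – at miniature scale. The classic illustration is estimating pi by throwing random points at a square; this is purely illustrative and nothing like the physicists’ actual simulation code.

```python
# Toy Monte Carlo: sample random points in the unit square and count
# how many fall inside the quarter circle of radius 1. That fraction
# approaches pi/4 as the number of samples grows.
import random

def monte_carlo_pi(n_samples, seed=42):
    random.seed(seed)  # fixed seed so the estimate is reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

print(monte_carlo_pi(100_000))
```

    The physicists’ production runs apply the same principle – huge volumes of randomly generated simulated events – just at a scale that needs a grid rather than a laptop.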

    That said, some of the physicists find the Grid cumbersome to use – and this has led groups of users to develop various user-interface systems for the shared Grid which reflect their needs. Rather than just accepting the failings of the new system (or grumbling about it) they used their own IT skills to respond. So instead of having to learn to live with the problems of the system, they are shaping it to reflect their own needs. Indeed these interface systems are having profound effects on how the Grid is conceived and used.

    This has parallels in industry – it is an example of “democratizing innovation,” as Eric von Hippel calls it, where users in various areas help improve the failings of their systems. Indeed this is perhaps an extreme example of the workarounds we see in most industries – though in CERN’s case the workarounds are serious pieces of software in their own right.
    In summary, they are both adapting the Grid to reflect the needs of their users – and adapting their users to reflect the needs of the Grid!
    ——–
    **Gaurav said : “As Dr Venters points out, the fear of sharing resources and “collaborating” with competitors is one cultural aspect customers need to overcome,”

    I think this is a key point in our analysis – at CERN almost everything is open and up for sharing and they have a culture of collaboration. Some industries have bits of this (I did research involving an architect’s practice once and the architect said “there is little point worrying about intellectual property when another architect can walk inside your innovation to see what it looks like”). What is crucial however is that fear of competition and competitiveness do not lead to ignoring the risk of inaction. Obviously competition law applies – probably a significant reason why, as Phil says, it is often easier to collaborate and share outside the vertical sector.
    Cloud Computing could enable more collaboration and sharing – we know it provides a useful space to work outside the company (how many of us have used Google Docs on projects because it made it easier to share with others outside the company firewall?). We could also see the emergence of specific cloud collaboration services – perhaps Salesforce could support collaborative ventures by allowing pooling of parts of companies’ CRM data while securing other parts away from view. This is however a techno-centric argument – the key challenges to collaborative working are always cultural and social. It isn’t the tools which limit collaboration, it’s the desire and trust – trust both of the other party and of the medium of collaboration.

    Finally, in my current research on Cloud Computing I see many people worrying about Cloud providers losing their IP assets, or opening them up to attack – forgetting that their key assets and intellectual capital leave the office every evening and will often work for a competitor in a few months’ time. This is the way competitors have always shared intellectual capital – through hiring it. And as individuals become more and more mobile and connected using Web 2.0, the ability of firms to control the spread of IP will diminish. Apple is one of the most secretive companies on the planet – yet we all saw the iPhone 4 before release, and we “know” they are working on a TV product not because their cloud provider slipped… but because people talk, lose things – are human.

    Will Venters.

  7. Posted September 2, 2010 at 3:37 pm | Permalink

    PS: If you want to read more of my views on the Cloud see http://utilitycomputing.wordpress.com

    Will Venters.

Trackbacks

  1. [...] their research work with CERN and its parallels with a Cloud business environment (see our recent blog interview), we realized the industry was in dire need of a definitive study that looks at how Cloud computing [...]
