<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.3.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>GridFactory</title>
	<link>http://www.gridfactory.org</link>
	<description>Distributed computing</description>
	<pubDate>Fri, 09 Dec 2011 10:25:05 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
	<language>en</language>
			<item>
		<title>Taking citizen cyberscience to the next level</title>
		<link>http://www.gridfactory.org/2011/08/19/taking-citizen-cyberscience-to-the-next-level/</link>
		<comments>http://www.gridfactory.org/2011/08/19/taking-citizen-cyberscience-to-the-next-level/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 11:32:37 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/08/19/taking-citizen-cyberscience-to-the-next-level/</guid>
		<description><![CDATA[

Recently I&#8217;ve stumbled upon the terms citizen science and citizen cyberscience. The latter term was apparently invented by Shuttleworth Fellow Francois Grey as a label for BOINC-based distributed computing projects like seti@home, folding@home and his own lhc@home. Grey is also behind the Citizen Cyberscience Centre in Geneva - on the web pages of which there&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<img class="alignleft" src='http://www.gridfactory.org/files/2011/08/citizen_science.png' alt='Citizen Science' align="left" />

Recently I&#8217;ve stumbled upon the terms <a href="http://en.wikipedia.org/wiki/Citizen_science">citizen science</a> and citizen cyberscience. The latter term was apparently invented by <a href="http://www.shuttleworthfoundation.org/fellows/francois-grey/">Shuttleworth Fellow Francois Grey</a> as a label for <a href="http://boinc.berkeley.edu/">BOINC</a>-based distributed computing projects like seti@home, folding@home and his own lhc@home. Grey is also behind the <a href="http://www.citizencyberscience.net/">Citizen Cyberscience Centre</a> in Geneva - on the web pages of which there&#8217;s more information to be found about his motivation and ideas.
<br /><br />
All fascinating stuff IMO. Not so much because of the direct scientific impact these distributed computing projects may or may not have, but because of the potential of getting more people interested in science and &#8220;academic&#8221; knowledge. Like the <a href="http://www.khanacademy.org/">Khan Academy</a>, an example of how the Internet really <i>is</i> improving the world.
<br /><br />
Now, what does this have to do with GridFactory?
<br /><br />
A lot actually! I&#8217;ll argue that GridFactory is conceptually a logical evolution of the ideas behind distributed computing software like BOINC (there is also the older <a href="http://distributed.net/">distributed.net</a>). If you visit the <a href="http://boinc.berkeley.edu/">BOINC web site</a> or the <a href="http://en.wikipedia.org/wiki/List_of_distributed_computing_projects">Wikipedia list of distributed computing projects</a>, you&#8217;ll notice that despite what <a href="http://liftconference.com/distributed-computing-distributed-thinking">Grey</a> and <a href="http://www.sciencedaily.com/releases/2011/08/110808115331.htm">others</a> say, distributed computing is not really enabling &#8220;ordinary&#8221; citizens to <i>do</i> science&#8230; yet.
<br /><br />
With BOINC-based systems citizens are passively witnessing their computer crunching away on some professional scientist&#8217;s problem. Yes, that might stimulate interest in the scientific problem at hand and science in general and that seems to be the hope and ambition of the enthusiasts behind these projects. A very commendable ambition IMO.
<br /><br />
But wouldn&#8217;t it be nice if the citizen could actually take a look at the script/code he&#8217;s executing, try to understand it, change some parameters, run the modified code himself on his own PC&#8230; heck, create his own computing grid and run his modified code on all participating computers?
<br /><br />
With GridFactory, all this and more is possible: collaborations are formed and destroyed on the fly; initiators of a collaboration can set up a software catalog (probably starting with a copy of someone else&#8217;s), a shared storage area and a compute cloud using their own, someone else&#8217;s, or even a central common server.
<br /><br />
Of course, if GridFactory were to be used for large-scale, voluntary, distributed computing, like BOINC - with legions of workers carrying out the computations of a few, i.e. used to create a large grid of untrusted workers, it probably would not be a good idea to make it too easy for the legionnaires to mess with the code. In such cases, jobs can simply consist of booting up a locked-down virtual machine from a trusted software catalog.
<br /><br />
That said, one well-known problem of some BOINC projects is precisely the scale and the fact that contributors are so eager to contribute more than their peers, that some cheat and produce fake results. Therefore, smaller collaborations are not necessarily bad. The world may need global collaborations to solve global problems, but some level of fragmentation or compartmentalization may have its merits too. In smaller groups, people know each other and there&#8217;s less incentive to fake results and more opportunity for real collaboration and involvement. Sure, you&#8217;ll not be running on 500&#8242;000 CPUs like seti@home, but there are many &#8220;smaller&#8221; problems out there that deserve attention and that may not need millions of CPU-hours to yield useful results.
<br /><br />
The GridFactory equivalent of the BOINC client is the <a href="/share/">GridWorker</a>. Like BOINC, GridFactory also uses Apache and MySQL on the server side, but where BOINC implements the actual web service as CGI scripts, <a href="/services/">GridFactory&#8217;s web services</a> are implemented as Apache modules and designed to &#8220;talk&#8221; to each other, i.e. pull jobs from each other, and allow horizontal scale-out. But what really sets GridFactory apart from BOINC is the integrated software catalog, the group functionality and the GridPilot GUI for creating and managing compute jobs.
<br /><br />
Finally, I&#8217;ll add that I&#8217;m not saying GridFactory could or should replace BOINC in large-scale voluntary computing projects - it is a far less mature software product; but I <i>am</i> hoping that the ideas and concepts behind GridFactory may serve as a source of inspiration for future developments of citizen cyberscience and help the overall democratization of the enterprise of science.
<br /><br />
<b>The <a href="/vision/">vision</a> of GridFactory encompasses precisely this: democratization of citizen science - allowing citizens to not only passively contribute to science, but to engage and actually <i>do</i> science.</b>]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/08/19/taking-citizen-cyberscience-to-the-next-level/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN School of Computing 2011 was&#8230; awesome</title>
		<link>http://www.gridfactory.org/2011/08/19/cern-school-of-computing-2011-was-awesome/</link>
		<comments>http://www.gridfactory.org/2011/08/19/cern-school-of-computing-2011-was-awesome/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 08:24:25 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[General]]></category>

		<category><![CDATA[GridPilot applications]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/08/19/cern-school-of-computing-2011-was-awesome/</guid>
		<description><![CDATA[





As the pictures above show (from the photo gallery of one of the students), the CERN School of Computing includes a good deal of extracurricular activities, which probably goes a long way toward explaining the good and humorous atmosphere in the classroom.



Having spent a lot of energy preparing the exercise with the great CSC team, [...]]]></description>
			<content:encoded><![CDATA[<br />

<center><a href="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php"><img src="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php?g2_view=core.DownloadItem&#038;g2_itemId=328&#038;g2_serialNumber=2" alt="Lecture" /><img src="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php?g2_view=core.DownloadItem&#038;g2_itemId=73&#038;g2_serialNumber=2" alt="Nyhavn" /><img src="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php?g2_view=core.DownloadItem&#038;g2_itemId=322&#038;g2_serialNumber=2" alt="CSC" /><img src="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php?g2_view=core.DownloadItem&#038;g2_itemId=580&#038;g2_serialNumber=2" alt="Canoeing" /></a></center>

<br /><br />

As the pictures above show (from the <a href="http://cscweb.cern.ch/csc2011/gallery2-csc2011/main.php">photo gallery</a> of one of the students), the <a href="http://csc.web.cern.ch/">CERN School of Computing</a> includes a good deal of extracurricular activities, which probably goes a long way toward explaining the good and humorous atmosphere in the classroom.

<br /><br />

Having spent a lot of energy preparing <a href="/2011/07/07/cern-school-of-computing-2011-exercise-1/">the exercise</a> with the great CSC team, it really was great fun to see my software in the hands of 60 energetic students from all over the world. Here you have the exercise as presented by CSC:
<br /><br />

<a href="http://cernvm.cern.ch/portal/csc#ex2">http://cernvm.cern.ch/portal/csc#ex2</a>

<br /><br />

In just a few minutes, these guys (mostly non-physicists) carried out the vision of GridFactory: created a classroom cloud and started simulating and analyzing 500&#8242;000 collision events in the Large Hadron Collider at CERN.

<br /><br />

The actual student exercise was formulated as a competition (see <a href="http://cernvm.cern.ch/portal/csc#ex2">http://cernvm.cern.ch/portal/csc#ex2</a>) - the winner being the first to create a histogram showing the result of an analysis of 500&#8242;000 simulated TTBar events in the Large Hadron Collider.

<br /><br />

Worker nodes were fired up and jobs submitted. Obviously, if all had been equally fast in doing this and no collaboration was taking place, each might as well have run the exercise on her own PC. Equally clearly, if there had been an overall aim of just generating one histogram with 500&#8242;000 events as fast as possible, all should have fired up worker nodes and only one should&#8217;ve submitted jobs. In the end what happened was that a few were very fast in submitting jobs and quickly 25 different jobs were queued and running, whereas the jobs of the rest were submitted later and ended up sitting in the queue. Interestingly, the ones who won were apparently not among the fast submitters. Instead they realized that all were running with the same credentials on the server and that therefore they could simply download the output files of the first 25 different jobs to finish and make a histogram. Congrats to the winners <img src='http://www.gridfactory.org/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />
]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/08/19/cern-school-of-computing-2011-was-awesome/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN School of Computing 2011 - Exercise 2</title>
		<link>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-2/</link>
		<comments>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-2/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 12:53:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[GridPilot applications]]></category>

		<category><![CDATA[High energy physics]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-2/</guid>
		<description><![CDATA[In this exercise we&#8217;ll solve the same assignment as in exercise 1, but using a prepackaged GridPilot application.


Download and install GridPilot
Start GridPilot and answer the initial questions - enable only the computing system &#8220;GridFactory&#8221; and set the submission host to gridfactory.nbi.dk
Import the app &#8220;ttbar_exercise&#8221;
Select the application/dataset &#8220;ttbar_exercise-100k-events&#8221; and click &#8220;Run&#8221;
Select the application/dataset &#8220;ttbar_exercise-merge&#8221; and click [...]]]></description>
			<content:encoded><![CDATA[In this exercise we&#8217;ll solve the same assignment as in <a href="/2011/07/07/cern-school-of-computing-2011-exercise-1/">exercise 1</a>, but using a prepackaged GridPilot application.
<br /><br />
<ul>
<li>Download and install GridPilot</li>
<li>Start GridPilot and answer the initial questions - enable only the computing system &#8220;GridFactory&#8221; and set the submission host to gridfactory.nbi.dk</li>
<li>Import the app &#8220;ttbar_exercise&#8221;</li>
<li>Select the application/dataset &#8220;ttbar_exercise-100k-events&#8221; and click &#8220;Run&#8221;</li>
<li>Select the application/dataset &#8220;ttbar_exercise-merge&#8221; and click &#8220;Run&#8221;</li>
<li>Open the merged output file with Root: <code>root ttbar-analysis.root</code> - inside Root, type <code>TBrowser b;</code></li>
</ul>]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-2/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN School of Computing 2011 - Exercise 1</title>
		<link>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-1/</link>
		<comments>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-1/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 12:53:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[GridPilot applications]]></category>

		<category><![CDATA[High energy physics]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-1/</guid>
		<description><![CDATA[Exercise 1: LHC Monte Carlo event generation and analysis




One of the final plots.


This exercise was created for the 2011 CERN School of Computing, hosted by the Niels Bohr Institute, University of Copenhagen.

Credits: physics content of this exercise by Jørgen Beck Hansen from the Niels Bohr Institute.



Assignment: Generate 500&#8242;000 LHC t-tbar events with PYTHIA and [...]]]></description>
			<content:encoded><![CDATA[<h1 id="toc-exercise-1-lhc-monte-carlo-event-generation-and-analysis">Exercise 1: LHC Monte Carlo event generation and analysis</h1>

<br /><br />

<div class="picture left"  style="width:267px;"><a rel="lightbox" href='/files/2011/07/canvas-1.png' title="pT diboson distribution. Root plot from the merged histogram file."><img src='/files/2011/07/canvas-1_small.png' /></a>
<br />One of the final plots.</div>

<i>
This exercise was created for the 2011 CERN School of Computing, hosted by the Niels Bohr Institute, University of Copenhagen.
<br /><br />
Credits: physics content of this exercise by Jørgen Beck Hansen from the Niels Bohr Institute.</i>

<br /><br />

<b>Assignment</b>: Generate 500&#8242;000 LHC t-tbar events with <a href="http://home.thep.lu.se/~torbjorn/Pythia.html">PYTHIA</a> and analyze the data: find bosons and top hadrons, run the jet algorithm: cone R&lt;0.7

<br /><br /><br /><br /><br />
<h3 id="toc-prerequisites">Prerequisites</h3>
<br />

<ul>
<li>Download and install VirtualBox.</li>
<li><a href="http://cernvm.cern.ch/portal/downloads">Download</a> and fire up a CernVM. It&#8217;s probably easiest to use the installer.</li>
<li>Configure your CernVM to use the repositories hepsoft, grid and sft and set up a user account.</li>
<li>Log in to your new CernVM.</li>
<li>Set up GridFactory:
<pre>
export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/experimental/gridfactory.org/gridfactory_ui:\
/cvmfs/sft.cern.ch/lcg/external/experimental/gridfactory.org/gridworker:\
/cvmfs/sft.cern.ch/lcg/external/Java/JDK/1.6.0/ia32/bin
</pre>
</li>
<li>Start a gridworker:
<br /><em> in a separate shell:</em>
<pre>
gridworker.sh [-n] https://gridfactory.nbi.dk/db/
</pre>
<em>Hint: If you have X enabled, you can run in graphical mode by leaving out the -n</em>
</li>
<li>Set up Root:
<br />
<em>In a 32-bit VM:</em>
<pre>
source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/i686-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/slc4_ia32_gcc34/root/lib
export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/slc4_ia32_gcc34/root/bin
</pre>
<em>In a 64-bit VM:</em>
<pre>
source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/x86_64-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/lib
export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/bin
</pre>
</li>
<li>Get and unpack the exercise tarball <a href="/files/2011/07/TTBarExercise.tar.gz">TTBarExercise.tar.gz</a>.</li>
</ul>
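<br />
<em>Hint: before moving on, it can help to confirm that the tools used in the rest of this exercise are actually found on your PATH. The following is only a minimal sketch; the function name <code>check_tools</code> is made up for illustration and is not part of GridFactory or Root. It relies on nothing beyond the shell built-in <code>command -v</code>.</em>

```shell
#!/bin/bash
# Illustrative helper (hypothetical name, not part of GridFactory):
# report which of the given commands can be found on the PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if [ -n "$(command -v "$tool")" ]; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return 0
}

# Commands that appear in this exercise
check_tools gridworker.sh psub pstat pget pclean root hadd
echo "$missing tool(s) missing"
```

If anything is reported as missing, re-check the <code>export PATH</code> lines above for the component in question.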

<br /><br />
<h3 id="toc-help">Help</h3>
<br />

<h4>Compile PYTHIA + analysis code</h4>

<pre>
cd TTBarExercise/siscone-2.0.0

./configure --prefix=$PWD/../siscone
make clean
make
make install
cd ..

rm -f *.o
make

g77 -o pythia pythia_exercise.f pythia-6.4.25.f
</pre>

<br />
<h4>Tests</h4>
<ul>
<li>Run PYTHIA: <code>./pythia</code></li>
<li>Analyze PYTHIA output and save result to root histogram: <code>./AsciiReader Eventsgen.ascii</code></li>
</ul>

<br />
<h4>Create batch jobs</h4>

<em>In a 32-bit VM:</em>

<pre>
mkdir jobs
for n in {1..25}; do
cat &gt; jobs/job$n.sh &lt;&lt;EOF
#!/bin/bash
source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/i686-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/slc4_ia32_gcc34/root/lib
export PATH=\$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/slc4_ia32_gcc34/root/bin
./pythia $((19780503 + $n)) 20000
./AsciiReader Eventsgen.ascii
mv AsciiReader.root  ttbar_analysis_$n.root
EOF
sed -i "s|\$n|$n|" jobs/job$n.sh
done
</pre>

<em>In a 64-bit VM:</em>

<pre>
mkdir jobs
for n in {1..25}; do
cat &gt; jobs/job$n.sh &lt;&lt;EOF
#!/bin/bash
source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/x86_64-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/lib
export PATH=\$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/bin
./pythia $((19780503 + $n)) 20000
./AsciiReader Eventsgen.ascii
mv AsciiReader.root  ttbar_analysis_$n.root
EOF
sed -i "s|\$n|$n|" jobs/job$n.sh
done
</pre>
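<br />
<em>Hint: the loops above split the assignment into 25 jobs of 20000 events each, and every job gets a different first argument to <code>./pythia</code>. The sketch below only prints that plan without writing any files. It assumes the first argument is a random seed, which the job scripts do not state explicitly, and the variable names are chosen for illustration.</em>

```shell
#!/bin/bash
# Illustrative only: print the per-job arguments used by the loops above.
base_seed=19780503      # offset taken from the job scripts (assumed to be a random seed)
events_per_job=20000    # second argument to ./pythia in each job
njobs=25                # number of job scripts created

total=0
for n in $(seq 1 "$njobs"); do
  seed=$((base_seed + n))
  total=$((total + events_per_job))
  echo "job$n.sh: seed=$seed events=$events_per_job"
done
echo "total events: $total"
```

The total comes out at the 500&#8242;000 events asked for in the assignment.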

<br />
<h4>Submit batch jobs</h4>

First check that GridFactory is set up correctly:

<pre>
psub -h
</pre>

If all is well, continue with submitting all the jobs:

<pre>
rm -f jobs.txt
for n in {1..25}; do
 psub -b gridfactory.nbi.dk jobs/job$n.sh -i pythia -i AsciiReader -e pythia -e AsciiReader -o ttbar_analysis_$n.root | grep -v submitted >> jobs.txt
 echo "submitted job $n"
done
</pre>

<em>Hint: if you have problems with the above, try running a single job:</em>
<pre>
psub -b gridfactory.nbi.dk jobs/job1.sh -i pythia -i AsciiReader -e pythia -e AsciiReader -o ttbar_analysis_1.root
</pre>

<br />
<h4>Monitor jobs</h4>
<pre>
pstat `cat jobs.txt`
</pre>


<br />
<h4>Get and merge output histograms</h4>
<pre>
mkdir results
for n in `cat jobs.txt`; do
 dir=`echo $n | awk -F / '{print $NF}'`
 mkdir results/$dir
 pget -o results/$dir $n
done

hadd ttbar_analysis.root results/*/*.root
</pre>
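<br />
<em>Hint: merging makes most sense once all 25 output files have arrived. A minimal sketch for counting what has been downloaded so far, assuming the <code>results</code> directory layout created above; the helper name <code>count_root_files</code> is made up for illustration and is not a GridFactory or Root command.</em>

```shell
#!/bin/bash
# Illustrative helper (hypothetical name): count the .root files below a directory.
count_root_files() {
  find "$1" -type f -name '*.root' | wc -l | tr -d ' '
}

expected=25   # number of jobs submitted above
if [ -d results ]; then
  found=$(count_root_files results)
  if [ "$found" -eq "$expected" ]; then
    echo "all $expected histograms present"
  else
    echo "only $found of $expected histograms present"
  fi
fi
```

If fewer files than expected are reported, check the job status again before re-running <code>hadd</code>.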

<br />
<h4>Open the final histogram with Root</h4>
<pre>
root ttbar_analysis.root
.
.
.
root [1] TBrowser b;
</pre>

<br />
<h4>Clean up</h4>
<pre>
pclean `cat jobs.txt`
rm -rf jobs results
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/07/07/cern-school-of-computing-2011-exercise-1/feed/</wfw:commentRss>
		</item>
		<item>
		<title>SSL/certificates ok again.</title>
		<link>http://www.gridfactory.org/2011/05/16/sslcertificates-ok-again/</link>
		<comments>http://www.gridfactory.org/2011/05/16/sslcertificates-ok-again/#comments</comments>
		<pubDate>Mon, 16 May 2011 11:05:52 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[General]]></category>

		<category><![CDATA[GridFactory installation and configuration]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/05/16/sslcertificates-ok-again/</guid>
		<description><![CDATA[I&#8217;ve updated the CA certificates on the public GridFactory servers and their attached worker nodes and made new versions of the software with new CA certificates available for download.]]></description>
			<content:encoded><![CDATA[I&#8217;ve updated the CA certificates on the public GridFactory servers and their attached worker nodes and made new versions of the software with new CA certificates available for download.]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/05/16/sslcertificates-ok-again/feed/</wfw:commentRss>
		</item>
		<item>
		<title>SSL trouble - NorduGrid CA certificate expired 12/5, 12:00</title>
		<link>http://www.gridfactory.org/2011/05/15/ssl-trouble-nordugrid-ca-certificate-expired-125-1200/</link>
		<comments>http://www.gridfactory.org/2011/05/15/ssl-trouble-nordugrid-ca-certificate-expired-125-1200/#comments</comments>
		<pubDate>Sun, 15 May 2011 12:59:20 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[General]]></category>

		<category><![CDATA[GridFactory installation and configuration]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/05/15/ssl-trouble-nordugrid-ca-certificate-expired-125-1200/</guid>
		<description><![CDATA[As the title says, there&#8217;s trouble.

Since I&#8217;ve updated neither the servers nor the software, the GridFactory servers will not trust any NorduGrid client or certificates including their own - so they are effectively down - and also the software is useless until I get things upgraded.

Sorry for any inconvenience.]]></description>
			<content:encoded><![CDATA[As the title says, there&#8217;s trouble.

Since I&#8217;ve updated neither the servers nor the software, the GridFactory servers will not trust any NorduGrid client or certificates including their own - so they are effectively down - and also the software is useless until I get things upgraded.

Sorry for any inconvenience.]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/05/15/ssl-trouble-nordugrid-ca-certificate-expired-125-1200/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Public beta!</title>
		<link>http://www.gridfactory.org/2011/05/09/public-beta/</link>
		<comments>http://www.gridfactory.org/2011/05/09/public-beta/#comments</comments>
		<pubDate>Mon, 09 May 2011 08:28:39 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/05/09/public-beta/</guid>
		<description><![CDATA[Dear grid warriors: more tools are now available to assist you in your battles. The GridFactory software including GridPilot is now publicly available for download. Please take it for a spin and report back - either by commenting on the pages of this site, or by email to gridfactory&#64;gridfactory.org.

Good luck!]]></description>
			<content:encoded><![CDATA[Dear grid warriors: more tools are now available to assist you in your battles. The GridFactory software including GridPilot is now publicly available for download. Please take it for a spin and report back - either by commenting on the pages of this site, or by email to <a href="#" onclick="this.href= 'mai' + 'lto:' + 'gridfactory' + '&#64;' + 'gridfactory.org' ; return true;">gridfactory<!-- @@@ -->&#64;<!-- @@@ -->gridfactory<!-- nospam -->.<!-- nomorespam -->org</a>.

Good luck!]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/05/09/public-beta/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN/ATLAS data processing on grids and GridFactory</title>
		<link>http://www.gridfactory.org/2011/05/07/cernatlas-data-processing/</link>
		<comments>http://www.gridfactory.org/2011/05/07/cernatlas-data-processing/#comments</comments>
		<pubDate>Sat, 07 May 2011 16:48:59 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[GridPilot applications]]></category>

		<category><![CDATA[High energy physics]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/05/07/cernatlas-data-processing/</guid>
		<description><![CDATA[In this post I&#8217;ll report on running the application &#8220;mc09_7TeV.107691&#8230;.&#8221; from the GridPilot app store. In the case of NorduGrid and WLCG, the ATLAS software is preinstalled on the resources. In the case of GridFactory, the jobs run inside a CernVM appliance with ATLAS software loaded through the AFS network file system. The input dataset [...]]]></description>
			<content:encoded><![CDATA[In this post I&#8217;ll report on running the application &#8220;mc09_7TeV.107691&#8230;.&#8221; from the GridPilot app store. In the case of NorduGrid and WLCG, the ATLAS software is preinstalled on the resources. In the case of GridFactory, the jobs run inside a <a href="http://cernvm.cern.ch/">CernVM appliance</a> with ATLAS software loaded through the AFS network file system. The input dataset consisted of 26 files totaling 36 GB, i.e. each input file was rather large. The files were physically located at atlassrm-fzk.gridka.de.
<br /><br />
Timing results are summarized below. There&#8217;s no timing for general NorduGrid because I did not manage to run on clusters other than our own tier-3.
<br /><br />
<center><font color="gray">Summary of runs</font></center>
<br />
<table style="border:1px solid;" rules="all" width="95%">
<tr>
<td></td><td><b>NorduGrid tier-3 cluster (A)</b></td><td><b>WLCG <br /><br /> (B)</b></td><td><b>GridFactory  / virt. - 4 nodes - run 1 (C)</b></td><td><b>GridFactory  / virt. - 4 nodes - run 2 (D)</b></td><td><b>GridFactory - 4 nodes - run 1 (E)</b></td><td><b>GridFactory - 4 nodes - run 2 (F)</b></td><td><b>GridFactory - 1 node <br /> (G)</b></td>
</tr>
<tr><td width="35%">
Average submission time per job (s)
</td>
<td><!--13.15.00 - 13.15.35 - 15.00.56-->1.34</td><td><!--13.17.32 - 13.19.08 - 23.43.00-->3.69</td><td><!--16.08.00 - 16.08.14 - 16.34.35-->0.538</td><td><!--17.06.00 - 17.06.19 - 17.34.22-->0.731</td><td><!--00.58.00 - 00.58.19-01.14.02-->0.731</td><td><!--00.22.00 - 00.22.19 - 00.44.34-->0.731</td><td><!--22.47.00 - 22.47.18 - 23.28.55-->0.692</td>
</tr>
<tr><td>
Summed CPU time (s)
</td>
<td>3259</td><td><!--79732-->7691</td><td><!--13200-->4445</td><td><!--15868-->4390</td><td><!--11242-->2244</td><td><!--9096-->1926</td><td><!--8182-->1887</td>
</tr>
<tr><td>
Summed download time (s)
</td>
<td>-</td><td>72041</td><td>8755</td><td>11478</td><td>8998</td><td>7170</td><td>6295</td>
</tr>
<tr><td>
User real waiting time (submission, processing and data transfer time) (s)
</td>
<td>6356</td><td>37528</td><td>1595</td><td>1702</td><td>1354</td><td>962</td><td>2515</td>
</tr>
<tr><td>
Number of available cores
</td>
<td>~100</td><td>-</td><td>16</td><td>16</td><td>16</td><td>16</td><td>4</td>
</tr>
</table>

<br /><br /><br />

<center>
<img src="http://chart.apis.google.com/chart?chxl=0:|A|B|C|D|E|F|G&#038;chxr=1,0,72041&#038;chxs=0,676767,11.5,-0.167,l,676767|1,676767,11.5,0,lt,676767&#038;chxt=x,y&#038;chbh=12,2&#038;chs=620x400&#038;cht=bvg&#038;chco=BBCCED,7777CC,49188F&#038;chds=0,72041,0,72041,0,72041&#038;chd=t:3259,7691,4445,4390,2244,1926,1887|0,72041,8755,11478,8998,7170,6295|6356,37528,1595,1702,1354,962,2515&#038;chdl=Summed+CPU+time+(s)|Summed+download+time+(s)|User+real+waiting+time+(s)&#038;chma=|4&#038;chtt=Processing+time" width="620" height="400" alt="Processing time" />
</center>

<br /><br />
<b>Notes:</b> Contrary to the <a href="/2011/05/03/cernatlas-simulation-on-grids-and-clouds/">simulation runs</a> and more in line with the <a href="/2010/12/22/cernatlas-n-tuple-boildown-on-nordugrid-wlcg-and-gridfactory/">boildown runs</a>, this time our tier-3 cluster performed substantially better than the average WLCG resource, but still substantially worse than a new desktop PC. In fact, a user is better off running 26 such jobs on such a desktop PC (with 4 Intel i7 cores) - they finish 2.5 times faster than on the tier-3 cluster with 160 cores (of which a few were busy with other jobs). This is partly because each job runs 1.7 times faster on the desktop PC. The remaining gap must be due to the desktop PC having a faster internet connection and/or overhead incurred by the grid system.
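<br /><br />
The two factors quoted above can be reproduced from the table. The following is a minimal sketch, assuming that the desktop PC corresponds to column G (GridFactory, 1 node) and the tier-3 cluster to column A; the helper name <code>ratio</code> is made up for illustration.

```shell
#!/bin/bash
# Illustrative helper (hypothetical name): ratio of two timings, two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# User real waiting time (s): tier-3 cluster (A) vs. assumed desktop PC (G)
wait_speedup=$(ratio 6356 2515)
# Summed CPU time (s): tier-3 cluster (A) vs. assumed desktop PC (G)
cpu_speedup=$(ratio 3259 1887)

echo "waiting time ratio A/G: $wait_speedup"
echo "CPU time ratio A/G:     $cpu_speedup"
```

Under that assumption, the ratios round to the factors of 2.5 and 1.7 mentioned above.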
<br /><br />
On WLCG, 16 out of 26 jobs failed and had to be resubmitted - with various reasons reported: &#8220;user timeout&#8221;  (.es, .it, .tw), &#8220;server responded with and error - transfer aborted&#8221; (.tw), atlas misconfiguration (.ru, .za).
<br /><br />
On the GridFactory cluster, I ran the same production twice with VirtualBox virtualization and twice without. In both cases, the ATLAS software was read from AFS. In the latter case, a noticeable speedup was observed in the second run - this is presumably due to AFS having cached the ATLAS software. In the former case, the speedup was much smaller - probably the cache in the virtual machine is too small to make a difference. Download times varied quite a lot - presumably for reasons outside of our control (the file server, network). Interestingly, the performance penalty incurred by virtualization appears to be almost a factor of 2 - much higher than e.g. in the case of the simulation runs. We ascribe this to the I/O penalty from running in a VirtualBox shared folder on a rather large input file.
]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/05/07/cernatlas-data-processing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN/ATLAS Monte Carlo simulation on grids and clouds</title>
		<link>http://www.gridfactory.org/2011/05/03/cernatlas-simulation-on-grids-and-clouds/</link>
		<comments>http://www.gridfactory.org/2011/05/03/cernatlas-simulation-on-grids-and-clouds/#comments</comments>
		<pubDate>Tue, 03 May 2011 21:53:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[GridPilot applications]]></category>

		<category><![CDATA[High energy physics]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/05/03/cernatlas-simulation-on-grids-and-clouds/</guid>
		<description><![CDATA[In previous posts we saw that I/O bound jobs ran ~3 times faster on standard SATA disks than on network file systems and block devices (GPFS, NFS, EBS). This post reports on CPU bound jobs. I ran standard ATLAS Monte Carlo simulation on both grid and cloud resources: imported the ATLAS simulation app and ran [...]]]></description>
			<content:encoded><![CDATA[In previous posts we saw that I/O bound jobs ran ~3 times faster on standard SATA disks than on network file systems and block devices (GPFS, NFS, EBS). This post reports on CPU bound jobs. I ran standard ATLAS Monte Carlo simulation on both grid and cloud resources: imported the ATLAS simulation app and ran the default 100 small jobs, each generating 100 events and writing a file of about 0.8 MB in size. On the two clouds (C and D - insofar as a virtualized batch system qualifies as a cloud), the jobs ran inside a <a href="http://cernvm.cern.ch/">CernVM appliance</a> with ATLAS software loaded through the CVMFS network file system. The timing results are summarized in the table and chart below.

<br /><br />
<center><font color="gray">Summary of runs</font></center>
<br />
<!-- T3: -->
<!-- NG: 14.31.00 - 14.31.59 - 15.05.21 + 628 -- 2689 | 39272 / 100 --  393-->
<!--WLCG: 14.35.00 - 14.40.43 - 19.35.00 -- 18000 | 11330 / 87 -- 130-->
<!--GF-Irigo: 14.33.30 - 14.34.29 - 15.02.02 +55 -- 1773 | 8390 / 100 --  83.9-->
<!--GF-CERN: 11.58.00 - 11.58.59 - 12.26.20 | 7591/100 -- 75.9-->
<!--GF-CERN-virt: 22.55.00 - 22.55.56 - 23.40.30 | 8557 /100 -- 85.6-->
<table style="border:1px solid;" rules="all" width="95%">
<tr>
<td></td><td><b>NorduGrid tier-3 cluster (A)</b></td><td><b>WLCG (B)</b></td><td><b>GridFactory / Irigo (C)</b></td><td><b>GridFactory / virt. (D)</b></td><td><b>GridFactory (E)</b></td>
</tr>
<tr><td width="35%">
Average submission time per job (s)
</td>
<td>0.59</td><td>6.43</td><td>0.59</td><td>0.56</td><td>0.59</td>
</tr>
<tr><td>
Average running time (s)
</td>
<td>393</td><td>130</td><td>83.9</td><td>85.6</td><td>75.9</td>
</tr>
<tr><td>
User real waiting time (submission, processing and data transfer time) (s)
</td>
<td>2689</td><td>18000 </td><td>1773</td><td>2730</td><td>1700</td>
</tr>
<tr><td>
Number of available cores
</td>
<td>~100</td><td>-</td><td>8</td><td>8</td><td>8</td>
</tr>
</table>
<br /><br />
<b>Notes</b>
<br /><br />
<ul>
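<li>On the GridFactory cluster, switching on virtualization incurred an 11.3% penalty in average running time and a penalty of 37.7% in &#8220;User real waiting time&#8221;. Both figures are expressed as fractions of the virtualized values; relative to the non-virtualized run, the increases are roughly 13% and 61%. The rather substantial waiting-time penalty arises mainly because each job is so short that the fixed cost of running jobs via SSH and staging files in and out of virtual machines dominates.</li>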
<li>On WLCG (gLite) the &#8220;User real waiting time&#8221; was exceedingly long because of the &#8220;tail&#8221; problem mentioned in <a href="/?s=tail&#038;searchsubmit=Find">previous posts</a>: some jobs took a very long time to start.</li>
<li>On WLCG (gLite) 28 out of 100 jobs failed for a variety of reasons - most prominently because the ATLAS setup script was not found in the standard location &#8220;$VO_ATLAS_SW_DIR/software/[release]/setup.sh&#8221;. I don&#8217;t know if I should look for the setup script somewhere else - if someone out there does, please post a comment or drop me a line.</li>
<li>Despite the many cores, the tier-3 cluster did not do too well. Presumably one reason for this is simply that its processors are rather old Xeons, at least compared to the Core i7&#8217;s of the Irigo and GridFactory clusters. Another reason could be that other jobs were running on the cluster - using other ATLAS software releases. If some jobs running different ATLAS releases happened to run on the same (8-core) node, GPFS may have had trouble serving the software. This last hypothesis is supported by the large spread in the CPU times of the jobs: 59 s - 774 s, compared to e.g. 42 s - 132 s on the GridFactory cluster (without virtualization).</li>
</ul>
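The virtualization penalties quoted in the notes can be reproduced from columns D and E of the table. The following is only a sketch: the function and variable names are mine, and the two conventions shown (extra time as a fraction of the virtualized value, versus extra time as a fraction of the non-virtualized value) are an interpretation of how the quoted percentages appear to have been computed.

```python
# Values copied from the "Summary of runs" table, columns D (virt.) and E.
AVG_RUNNING_TIME_S = {"virtualized": 85.6, "native": 75.9}
USER_WAITING_TIME_S = {"virtualized": 2730.0, "native": 1700.0}


def penalty_of_virtualized(virtualized: float, native: float) -> float:
    """Extra time as a percentage of the virtualized value."""
    return 100.0 * (virtualized - native) / virtualized


def increase_over_native(virtualized: float, native: float) -> float:
    """Extra time as a percentage of the non-virtualized value."""
    return 100.0 * (virtualized - native) / native


def summarize(times: dict) -> tuple:
    """Return both percentages, rounded to one decimal place."""
    v, n = times["virtualized"], times["native"]
    return (round(penalty_of_virtualized(v, n), 1),
            round(increase_over_native(v, n), 1))


if __name__ == "__main__":
    print("running time:", summarize(AVG_RUNNING_TIME_S))
    print("waiting time:", summarize(USER_WAITING_TIME_S))
```

The first number of each pair matches the 11.3% and 37.7% quoted above. The second shows the same penalty measured against the non-virtualized baseline, which is the more common convention when reporting overhead.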
<br /><br />

<center>
<img src="http://chart.apis.google.com/chart?chxl=0:|A|B|C|D|E&#038;chxr=1,0,39272&#038;chxs=1,676767,11.5,0,lt,676767&#038;chxt=x,y&#038;chs=620x407&#038;cht=bvg&#038;chco=BBCCED,7777CC&#038;chds=0,39272,0,39272&#038;chd=t:39272,13023,8390,7591,8557|2689,18000,1773,2730,1700&#038;chdl=Summed+CPUtime+(s)|User+real+waiting+time+(s)&#038;chma=|12,5&#038;chtt=Simulation+time" width="620" height="407" alt="Simulation time" />
</center>]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/05/03/cernatlas-simulation-on-grids-and-clouds/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CERN/ATLAS boildown on clouds</title>
		<link>http://www.gridfactory.org/2011/04/15/cernatlas-n-tuple-boildown-on-ec2-and-irigo/</link>
		<comments>http://www.gridfactory.org/2011/04/15/cernatlas-n-tuple-boildown-on-ec2-and-irigo/#comments</comments>
		<pubDate>Fri, 15 Apr 2011 08:32:45 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[GridPilot applications]]></category>

		<category><![CDATA[High energy physics]]></category>

		<guid isPermaLink="false">http://www.gridfactory.org/2011/04/15/cernatlas-n-tuple-boildown-on-ec2-and-irigo/</guid>
		<description><![CDATA[In this post, I&#8217;ll take a look at some more runs of the “atlas_d3pd_boildown” application available in the GridPilot app store. The difference w.r.t. the runs described in a previous post is that  this time I ran on cloud as opposed to grid resources. On dedicated hardware and on two public clouds, Amazon&#8217;s EC2 [...]]]></description>
			<content:encoded><![CDATA[In this post, I&#8217;ll take a look at some more runs of the “atlas_d3pd_boildown” application available in the GridPilot app store. The difference with respect to the runs described in a <a href="/2010/12/22/cernatlas-n-tuple-boildown-on-nordugrid-wlcg-and-gridfactory/">previous post</a> is that this time I ran on cloud as opposed to grid resources. On dedicated hardware and on two public clouds, <a href="http://aws.amazon.com/ec2/">Amazon&#8217;s EC2</a> and <a href="http://www.cabo.dk/produkter-en/irigo-servers-en">Cabo&#8217;s Irigo cloud</a>, I fired up a GridFactory cluster, and changed my preferences to use each one in turn. On the dedicated hardware I made sure the jobs would run in virtual machines (VirtualBox with shared folder) - 2 on each of the 4 worker nodes, each running one job - and on both clouds I ran on 8 worker nodes, each running one job. Notice that in all these cases the jobs run inside a <a href="http://cernvm.cern.ch/">CernVM appliance</a>, but whereas on the two clouds ATLAS software is accessed through the CVMFS network file system, on GridFactory the software is accessed over the AFS network file system. Notice also that to avoid the notorious &#8220;tail&#8221; problem (see previous posts, e.g. <a href="/2011/03/29/pov-ray-ii/">this one</a>), this time I ran a &#8220;private&#8221; cluster on EC2, allowing only my own GridWorkers. On EC2 I chose instances of type &#8220;small&#8221; (1.7 GB of RAM, moderate I/O performance, 1 virtual core). On Irigo and on the dedicated hardware with virtualization I chose a matching setup, with instances with the same amount of RAM and 1 virtual core. On the dedicated hardware without virtualization I ran 2 jobs at a time on each 4-core physical machine. The results are summarized below.
<br /><br />
<center><font color="gray">Summary of runs</font></center>
<br />
<table style="border:1px solid;" rules="all" width="95%">
<tr>
<td width="40%"></td><td width="15%"><b>EC2</b></td><td width="15%"><b>Irigo</b></td><td width="15%"><b>Dedicated hardware with virtualization</b></td><td><b>Dedicated hardware without virtualization</b></td>
</tr>
<tr><td>
Average submission time per job (s)
</td>
<td><!--12.22.00 - 12.26.15 - 15.07.40-->1.24</td><td><!--13.48.00 - 13.53.29 - 14.43.26-->1.09</td><td><!--20.34.00 - 20.36.41 - 21.59.41-->0.531</td><td><!--23.14.00 - 23.16.44 - 00.34.53-->0.541</td>
</tr>
<tr><td>
Summed running time (s)
</td>
<td>55788</td><td>13715</td><td>9166</td><td>7755</td>
</tr>
<tr><td>
Summed CPU time (s)
</td>
<td>9501</td><td>8644</td><td>3220</td><td>2517</td>
</tr>
<tr><td>
User real waiting time (submission, processing and data transfer time) (s)
</td>
<td>9940</td><td>6926</td><td>4980</td><td>4853</td>
</tr>
</table>
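The ratios discussed in the notes below can be approximated directly from this table. The sketch below rests on an assumption of mine that the post does not state: the difference between summed running time and summed CPU time is treated as time spent mostly waiting on I/O, chiefly the download of the input file. The dictionary keys and helper names are likewise only illustrative.

```python
# Figures copied from the "Summary of runs" table, in seconds.
# "virt" and "native" stand for the dedicated hardware with and
# without virtualization.
SUMMED_RUNNING_S = {"EC2": 55788, "Irigo": 13715, "virt": 9166, "native": 7755}
SUMMED_CPU_S = {"EC2": 9501, "Irigo": 8644, "virt": 3220, "native": 2517}


def non_cpu_time(site: str) -> int:
    """Running time not accounted for by CPU time (assumed to be I/O wait)."""
    return SUMMED_RUNNING_S[site] - SUMMED_CPU_S[site]


def ratio(a: float, b: float, ndigits: int = 2) -> float:
    """Rounded ratio a / b."""
    return round(a / b, ndigits)


if __name__ == "__main__":
    # Non-CPU (mostly download) time of EC2 relative to the other setups.
    print("EC2 vs Irigo:", ratio(non_cpu_time("EC2"), non_cpu_time("Irigo"), 1))
    print("EC2 vs virt: ", ratio(non_cpu_time("EC2"), non_cpu_time("virt"), 1))
    # CPU time ratios: EC2 vs Irigo, and virtualized vs non-virtualized.
    print("CPU EC2/Irigo:  ", ratio(SUMMED_CPU_S["EC2"], SUMMED_CPU_S["Irigo"]))
    print("CPU virt/native:", ratio(SUMMED_CPU_S["virt"], SUMMED_CPU_S["native"]))
```

Under that assumption the non-CPU time on EC2 comes out at roughly 9 times that on Irigo and roughly 8 times that on the virtualized dedicated hardware, while the CPU time ratios are about 1.1 (EC2 vs Irigo) and about 1.28 (virtualized vs non-virtualized).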

<br /><br /><br /><br />

<center>
<img src="http://chart.apis.google.com/chart?chxl=1:|EC2|Irigo|GridFactory%2Fvirt.|GridFactory&#038;chxr=0,0,56000&#038;chxt=y,x&#038;chbh=a&#038;chs=520x225&#038;cht=bvg&#038;chco=49188F,7777CC,BBCCED&#038;chds=0,56000,0,56000,0,56000&#038;chd=t:9940,6926,4980,4853|9501,8644,3220,2517|55788,13715,9166,7755&#038;chdl=User+real+waiting+time+(s)|Summed+CPU+time+(s)|Summed+running+time+(s)&#038;chtt=Processing+time" width="520" height="225" alt="Processing time" />
</center>

<br /><br />

<b>Notes</b>:
<br /><br />
<ul>
<li>This was primarily an I/O exercise, with I/O referring to both network I/O (download of the input file) and disk I/O (reading the file).</li>
<li>Irigo and EC2 have comparable disk I/O and CPU performance with Irigo having a ~10% edge.</li>
<li>The virtual machines on both Irigo and EC2 are booted from a shared file system: Irigo from its image store and EC2 from Amazon&#8217;s EBS. Apparently the technology underlying EBS is very similar in performance to that used by Irigo for its image store: NFS (on top of ZFS).</li>
<li>The &#8220;moderate&#8221; network I/O of the &#8220;small&#8221; EC2 instances chosen apparently really is moderate: downloading the input files takes ~9 times longer than on Irigo and ~8 times longer than on the dedicated hardware (at CERN). The faster downloads on Irigo and on the dedicated hardware are likely limited mostly by the performance of the file server (in Germany) where the input files reside.</li>
<li>The disk images of the virtual machines on both EC2 and Irigo are hosted on a shared file system. While this was expected to limit performance, it is still a bit surprising that the raw disks of the dedicated hardware with (VirtualBox) virtualization perform almost 3 times better.</li>
<li>From the last two columns it is seen that the virtualization layer itself apparently incurs ~27% overhead. This is presumably mainly due to the fact that the input data file is read from a VirtualBox shared folder.</li>
</ul>]]></content:encoded>
			<wfw:commentRss>http://www.gridfactory.org/2011/04/15/cernatlas-n-tuple-boildown-on-ec2-and-irigo/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

