<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Sajal Kayan</title>
	<atom:link href="http://www.sajalkayan.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.sajalkayan.com</link>
	<description>No Windows, No Gates, It is OPEN; No Bill, It is FREE</description>
	<pubDate>Sun, 04 Jul 2010 17:33:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>ATTN Big Media: Web page speed matters!</title>
		<link>http://www.sajalkayan.com/attn-big-media-web-page-speed-matters.html</link>
		<comments>http://www.sajalkayan.com/attn-big-media-web-page-speed-matters.html#comments</comments>
		<pubDate>Sun, 04 Jul 2010 14:59:50 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Webmaster Things]]></category>

		<category><![CDATA[pagespeed]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=118</guid>
		<description><![CDATA[I am not a front-end kinda guy, but these days my latest obsession is to test and improve web page loading times. With this blog post, I intend to show, using examples from some selected websites, what they are doing right and what *desperately* needs to be improved. As I learn more about this I [...]]]></description>
			<content:encoded><![CDATA[<p>I am not a front-end kinda guy, but these days my latest obsession is to test and improve web page loading times. With this blog post, I intend to show, using examples from some selected websites, what they are doing right and what *desperately* needs to be improved. As I learn more about this, I will be posting more in-depth posts on these issues.</p>
<h4>Why does it matter?</h4>
<p>These days almost everyone is on high-speed &#8216;broadband&#8217; connections so the sites should load fast. Right? No! With faster connectivity, users get more impatient and expect websites to load faster. </p>
<p>E-commerce websites have done <a href="http://radar.oreilly.com/2008/08/radar-theme-web-ops.html">studies</a> which show that faster sites result in more sales. Google has also been <a href="http://www.youtube.com/watch?v=MStKwEff_kY" title="Velocity 2010: Urs Holzle">urging</a> webmasters to improve their page load times. Page load time is now <a href="http://googlewebmastercentral.blogspot.com/2010/04/using-site-speed-in-web-search-ranking.html" title="Using site speed in web search ranking">a signal for ranking in Google</a>, albeit a minor one, but it still matters.</p>
<p>Overall, faster websites <a href="http://googleresearch.blogspot.com/2009/06/speed-matters.html">result</a> in <a href="http://www.stevesouders.com/blog/2009/07/27/wikia-fast-pages-retain-users/" title="Wikia: fast pages retain users">higher user retention</a>. Faster websites also <a href="http://radar.oreilly.com/2009/07/velocity-making-your-site-fast.html">lower operational costs</a>. Not directly related, but an optimized website is also less susceptible to <a href="http://www.stevesouders.com/blog/2010/06/01/frontend-spof/">Frontend SPOF</a>.</p>
<h4>Analysis of pageload time</h4>
<p>My favorite tool for this is <a href="http://www.webpagetest.org/" title="Pagetest - where web sites go to get FAST!">webpagetest.org</a>. They also have an active <a href="http://www.webpagetest.org/forums/">forum</a> where users can discuss results and exchange tips on making it faster.</p>
<p>Some test results :-</p>
<table border="1" cellspacing="0" >
<tr>
<th>Site</th>
<th>Alexa Rank</th>
<th>First View</th>
<th>Repeat View</th>
</tr>
<tr>
<th colspan="4">My Sites</th>
</tr>
<tr>
<td><a href="http://www.sajalkayan.com/" title="Sajal Kayan">Main blog page</a></td>
<td>436,669</td>
<td><a href="http://www.webpagetest.org/result/100703_2a389e9913355156fe514dad6496fa1f/" rel="nofollow">1.512s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_2a389e9913355156fe514dad6496fa1f/" rel="nofollow">1.038s</a></td>
</tr>
<tr>
<td>A single <a href="http://www.sajalkayan.com/in-a-cdnd-world-opendns-is-the-enemy.html" title="In a CDN’d world, OpenDNS is the enemy!">post</a> page on my blog.</td>
<td>436,669</td>
<td><a href="http://www.webpagetest.org/result/100630_f939860d7469491b4445dbd93d9fd9a2/" rel="nofollow">0.822s</a></td>
<td><a href="http://www.webpagetest.org/result/100630_f939860d7469491b4445dbd93d9fd9a2/" rel="nofollow">0.648s</a></td>
</tr>
<tr>
<td><a href="http://www.thaindian.com/" title="Thaindian News">Thaindian.com Homepage</a></td>
<td>5,368</td>
<td><a href="http://www.webpagetest.org/result/100703_c7ab5a556aa0ce97fef055be59c4a520/" rel="nofollow">0.740s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_c7ab5a556aa0ce97fef055be59c4a520/" rel="nofollow">0.514s</a></td>
</tr>
<tr>
<td>A <a href="http://www.thaindian.com/newsportal/politics/us-vice-president-joe-biden-arrives-in-baghdad-on-surprise-visit_100390256.html" title="U.S. Vice President Joe Biden arrives in Baghdad on surprise visit">story page</a> on Thaindian News</td>
<td>5,368</td>
<td><a href="http://www.webpagetest.org/result/100703_e4da8ccb9799cf9fde59b0b3137aae71/" rel="nofollow">4.517s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_e4da8ccb9799cf9fde59b0b3137aae71/" rel="nofollow">3.411s</a></td>
</tr>
<tr>
<th colspan="4">Sites for Geeks/nerds</th>
</tr>
<tr>
<td><a href="http://stackoverflow.com/">Stack Overflow</a></td>
<td>445</td>
<td><a href="http://www.webpagetest.org/result/100703_1f13821a589ccc393783bcd07c3d8fc6/" rel="nofollow">2.233s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_1f13821a589ccc393783bcd07c3d8fc6/" rel="nofollow">1.734s</a></td>
</tr>
<tr>
<td>A <a href="http://stackoverflow.com/questions/3172298/gcc-xcode-speedup-suggestions">single</a> thread at Stack Overflow</td>
<td>445</td>
<td><a href="http://www.webpagetest.org/result/100703_b935a42bd57a347c9cb8dba3ef94269d/" rel="nofollow">2.846s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_b935a42bd57a347c9cb8dba3ef94269d/" rel="nofollow">2.645s</a></td>
</tr>
<tr>
<td><a href="http://slashdot.org/">Slashdot.Org</a></td>
<td>1,393</td>
<td><a href="http://www.webpagetest.org/result/100703_b81b1c497021b36a776bc5626d4660d3/" rel="nofollow">3.773s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_b81b1c497021b36a776bc5626d4660d3/" rel="nofollow">2.175s</a></td>
</tr>
<tr>
<td>A <a href="http://tech.slashdot.org/story/10/05/29/1355252/How-CDNs-and-Alternative-DNS-Services-Combine-For-Higher-Latency" title="How CDNs and Alternative DNS Services Combine For Higher Latency">single</a> thread at Slashdot</td>
<td>1,393</td>
<td><a href="http://www.webpagetest.org/result/100703_cf6076bf98dc973a471b93ee744498f3/" rel="nofollow">3.817s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_cf6076bf98dc973a471b93ee744498f3/" rel="nofollow">2.100s</a></td>
</tr>
<tr>
<th colspan="4">New media</th>
</tr>
<tr>
<td><a href="http://www.huffingtonpost.com/">The Huffington Post</a></td>
<td>152</td>
<td><a href="http://www.webpagetest.org/result/100703_46337caf586714160d7fa61340a04ed1/" rel="nofollow">6.945s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_46337caf586714160d7fa61340a04ed1/" rel="nofollow">4.044s</a></td>
</tr>
<tr>
<td>A <a href="http://www.huffingtonpost.com/2010/07/03/california-state-workers_n_634695.html" title="California State Workers Brace For Minimum Wage">single post</a> on HuffPo</td>
<td>152</td>
<td><a href="http://www.webpagetest.org/result/100703_f23780e62939a0ba7c00d6e960bef6d2/" rel="nofollow">7.563s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_f23780e62939a0ba7c00d6e960bef6d2/" rel="nofollow">4.416s</a></td>
</tr>
<tr>
<td><a href="http://techcrunch.com/">TechCrunch</a></td>
<td>357</td>
<td><a href="http://www.webpagetest.org/result/100703_ae0469e4f1a4127ff10d71c38178c166/" rel="nofollow">16.462s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_ae0469e4f1a4127ff10d71c38178c166/" rel="nofollow">14.760s</a></td>
</tr>
<tr>
<td>An <a href="http://techcrunch.com/2010/07/03/facetime-and-why-apples-massive-integration-advantage-is-just-beginning/" title="FaceTime and Why Apples Massive Integration Advantage is Just Beginning">individual</a> post on TechCrunch</td>
<td>357</td>
<td><a href="http://www.webpagetest.org/result/100703_aa34704a0c7596e5df80a695bab87022/" rel="nofollow">10.987s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_aa34704a0c7596e5df80a695bab87022/" rel="nofollow">6.764s</a></td>
</tr>
<tr>
<td><a href="http://www.readwriteweb.com/">ReadWriteWeb</a></td>
<td>1,996</td>
<td><a href="http://www.webpagetest.org/result/100703_610dde132450be8c91bd00cc18d63f90/" rel="nofollow">17.766s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_610dde132450be8c91bd00cc18d63f90/" rel="nofollow">11.643s</a></td>
</tr>
<tr>
<td>A <a href="http://www.readwriteweb.com/archives/paperli_gets_investment_for_its_twitter_newspapers.php" title="Paper.li Gets Investment for Its Twitter Newspapers">single</a> post on ReadWriteWeb</td>
<td>1,996</td>
<td><a href="http://www.webpagetest.org/result/100703_ba46f8d118c183e10c8b22635c276630/" rel="nofollow">18.890s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_ba46f8d118c183e10c8b22635c276630/" rel="nofollow">10.408s</a></td>
</tr>
<tr>
<th colspan="4">Big media</th>
</tr>
<tr>
<td><a href="http://news.google.com/">Google News</a></td>
<td>1</td>
<td><a href="http://www.webpagetest.org/result/100704_625c0dac7cbe78df617239a9b5a125bb/" rel="nofollow">3.253s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_625c0dac7cbe78df617239a9b5a125bb/" rel="nofollow">1.365s</a></td>
</tr>
<tr>
<td>Single <a href="http://www.google.com/hostednews/ap/article/ALeqM5hvWEqwq3CrRvaQCmt21MfoYhjZJQD9GO6HI80" title="Petraeus: We are in this to win in Afghanistan" rel="nofollow">story</a> on Google News</td>
<td>1</td>
<td><a href="http://www.webpagetest.org/result/100704_ba30dd802cd14323a2e2e840034cf0ce/" rel="nofollow">2.688s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_ba30dd802cd14323a2e2e840034cf0ce/" rel="nofollow">1.238s</a></td>
</tr>
<tr>
<td><a href="http://news.yahoo.com/">Yahoo! News</a></td>
<td>4</td>
<td><a href="http://www.webpagetest.org/result/100704_170206d154d05ef42b622636c1a58f1c/" rel="nofollow">3.174s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_170206d154d05ef42b622636c1a58f1c/" rel="nofollow">1.707s</a></td>
</tr>
<tr>
<td>Single <a href="http://news.yahoo.com/s/ap/as_afghanistan;_ylt=AhRJzVLEo04_F09GGolXDKes0NUE;_ylu=X3oDMTNlNW5zdHBlBGFzc2V0A2FwLzIwMTAwNzA0L2FzX2FmZ2hhbmlzdGFuBGNjb2RlA21vc3Rwb3B1bGFyBGNwb3MDMQRwb3MDMgRwdANob21lX2Nva2UEc2VjA3luX3RvcF9zdG9yeQRzbGsDcGV0cmFldXN3ZWFy" title="Petraeus: We are in this to win in Afghanistan" rel="nofollow">story</a> on Yahoo! News</td>
<td>4</td>
<td><a href="http://www.webpagetest.org/result/100704_bf2f23029be14f1af65648beac250dfd/" rel="nofollow">5.310s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_bf2f23029be14f1af65648beac250dfd/" rel="nofollow">2.335s</a></td>
</tr>
<tr>
<td><a href="http://www.nytimes.com/">New York Times</a></td>
<td>91</td>
<td><a href="http://www.webpagetest.org/result/100704_51db94d84d53e178ef38e759a5292467/" rel="nofollow">7.038s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_51db94d84d53e178ef38e759a5292467/" rel="nofollow">2.200s</a></td>
</tr>
<tr>
<td>Single <a href="http://www.nytimes.com/2010/07/04/business/04bptax.html" title="As Oil Industry Fights a Tax, It Reaps Billions From Subsidies">story</a> on NYT</td>
<td>91</td>
<td><a href="http://www.webpagetest.org/result/100704_66d6d33534b86ca572123b08e08e9270/" rel="nofollow">4.650s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_66d6d33534b86ca572123b08e08e9270/" rel="nofollow">4.507s</a></td>
</tr>
<tr>
<td><a href="http://www.bbc.co.uk/">BBC</a></td>
<td>40</td>
<td><a href="http://www.webpagetest.org/result/100704_ac802e4531dbd9daab60b0d429a57c72/" rel="nofollow">7.485s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_ac802e4531dbd9daab60b0d429a57c72/" rel="nofollow">3.427s</a></td>
</tr>
<tr>
<td>Single <a href="http://news.bbc.co.uk/2/hi/middle_east/10500869.stm" title="Hezbollah mentor Fadlallah dies in Lebanon">story</a> on BBC</td>
<td>40</td>
<td><a href="http://www.webpagetest.org/result/100704_24f6bc473e625429b56dc26cc9aa3192/" rel="nofollow">7.148s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_24f6bc473e625429b56dc26cc9aa3192/" rel="nofollow">3.851s</a></td>
</tr>
<tr>
<td><a href="http://edition.cnn.com/">CNN</a></td>
<td>58</td>
<td><a href="http://www.webpagetest.org/result/100703_2e1e879aecaf0ec4981d15e2f0deafa7/" rel="nofollow">10.197s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_2e1e879aecaf0ec4981d15e2f0deafa7/" rel="nofollow">2.528s</a></td>
</tr>
<tr>
<td>An <a href="http://edition.cnn.com/2010/WORLD/meast/07/03/munich.mastermind.dead/index.html" title="Suspected Munich massacre mastermind dead, reports say">Article</a> on CNN</td>
<td>58</td>
<td><a href="http://www.webpagetest.org/result/100703_9c67e76e0e27e54b5d0f854a83b8d48b/" rel="nofollow">7.989s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_9c67e76e0e27e54b5d0f854a83b8d48b/" rel="nofollow">3.523s</a></td>
</tr>
<tr>
<td><a href="http://www.msnbc.msn.com/">MSNBC</a></td>
<td>9</td>
<td><a href="http://www.webpagetest.org/result/100704_4ea04da58d2e8576a4d9bd458c5f6bc9/" rel="nofollow">8.172s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_4ea04da58d2e8576a4d9bd458c5f6bc9/" rel="nofollow">6.638s</a></td>
</tr>
<tr>
<td>Single <a href="http://www.msnbc.msn.com/id/38082645/ns/world_news-americas/" title="Drug war casts pall on Mexican elections">story</a> on MSNBC</td>
<td>9</td>
<td><a href="http://www.webpagetest.org/result/100704_454f2eb62ebb00ef25cef5674f00082e/" rel="nofollow">12.064s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_454f2eb62ebb00ef25cef5674f00082e/" rel="nofollow">4.191s</a></td>
</tr>
<tr>
<td><a href="http://www.washingtonpost.com/">The Washington Post</a></td>
<td>316</td>
<td><a href="http://www.webpagetest.org/result/100704_cfba18748623f2dc7253bddd0b22792e/" rel="nofollow">9.878s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_cfba18748623f2dc7253bddd0b22792e/" rel="nofollow">3.828s</a></td>
</tr>
<tr>
<td>Single <a href="http://www.washingtonpost.com/wp-dyn/content/article/2010/07/02/AR2010070204042.html" title="Post policies fuel reader confusion on when writers can offer opinion">story</a> on WaPo</td>
<td>316</td>
<td><a href="http://www.webpagetest.org/result/100704_5f358d626dd80e82baae102a87306cd4/" rel="nofollow">15.435s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_5f358d626dd80e82baae102a87306cd4/" rel="nofollow">4.530s</a></td>
</tr>
<tr>
<td><a href="http://abcnews.go.com/">ABC News</a></td>
<td>492</td>
<td><a href="http://www.webpagetest.org/result/100704_195213d99f8ad424f594121ce0d90ec4/" rel="nofollow">14.761s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_195213d99f8ad424f594121ce0d90ec4/" rel="nofollow">4.205s</a></td>
</tr>
<tr>
<td>Single <a href="http://abcnews.go.com/International/wireStory?id=11082548" title="Van Der Sloot Files Suit Against Initial Lawyer">story</a> on ABC News</td>
<td>492</td>
<td><a href="http://www.webpagetest.org/result/100704_1e67db2628f0d62d3440f9c91d648986/" rel="nofollow">8.697s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_1e67db2628f0d62d3440f9c91d648986/" rel="nofollow">3.037s</a></td>
</tr>
<tr>
<td><a href="http://www.rollingstone.com/">RollingStone</a></td>
<td>3,376</td>
<td><a href="http://www.webpagetest.org/result/100703_66004c5f83aedb432a3def54da60a187/" rel="nofollow">18.700s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_66004c5f83aedb432a3def54da60a187/" rel="nofollow">3.572s</a></td>
</tr>
<tr>
<td>An <a href="http://www.rollingstone.com/politics/matt-taibbi/blogs/TaibbiData_May2010/122137/83512">article</a> on RollingStone</td>
<td>3,376</td>
<td><a href="http://www.webpagetest.org/result/100703_23498c0f96dc6fa6e1f9c415f3b8c100/" rel="nofollow">10.707s</a></td>
<td><a href="http://www.webpagetest.org/result/100703_23498c0f96dc6fa6e1f9c415f3b8c100/" rel="nofollow">2.905s</a></td>
</tr>
<tr>
<td><a href="http://www.foxnews.com/">Fox News</a></td>
<td>201</td>
<td><a href="http://www.webpagetest.org/result/100704_6caae51946f1b26a66543f08c2c2aaa9/" rel="nofollow">26.755s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_6caae51946f1b26a66543f08c2c2aaa9/" rel="nofollow">8.761s</a></td>
</tr>
<tr>
<td>Single <a href="http://www.foxnews.com/world/2010/07/03/van-der-sloot-files-suit-lawyer-charging-posing-public-defender/" title="Van der Sloot files suit against first lawyer, charging him with posing as public defender">story</a> on Fox News</td>
<td>201</td>
<td><a href="http://www.webpagetest.org/result/100704_17b7f402a9d40f3a694a899b8919e371/" rel="nofollow">16.281s</a></td>
<td><a href="http://www.webpagetest.org/result/100704_17b7f402a9d40f3a694a899b8919e371/" rel="nofollow">5.839s</a></td>
</tr>
</table>
<p><strong><em>The Sajal Kayan award</em> for excellence in web development goes to <em>FOX News</em></strong><br />
Test conditions :-</p>
<ul>
<li><em>First View</em> simulates the first time a visitor comes to the website. <em>Repeat View</em> simulates the same user visiting the website again after a while, probably having closed and reopened the browser in between.</li>
<li>All tests were done on <a href="http://www.microsoft.com/windows/internet-explorer/default.aspx">Internet Explorer 8</a>. Yes, I know you hate IE, I hate it more than you! But since the majority of the web uses this browser, we test on IE8. Also, IE8 performs significantly better than its predecessors. Some of the sites tested may take up to 2x the time to load on IE7 due to fewer parallel downloads per host.</li>
<li>The tests were done from a test machine in Dulles, VA, provided by <a href="http://dev.aol.com/">AOL</a>, set to work at DSL speeds (1500 Kbps down, 384 Kbps up and 50ms latency).</li>
<li>The timings above show when the <em>window.onload</em> event was triggered. It is possible that the page was usable well before that. Very rarely, it is also possible that the page needs to do more work after onload in order to be usable.</li>
<li>Each test was repeated 5 times and the average was taken.</li>
<li>I have tried to be as fair as possible in the selection of pages for the test. I didn&#8217;t use pages filled with videos, many images, etc.</li>
<li>Tests were conducted during this weekend (July 3rd and 4th), so traffic to those sites would be much lower than usual.</li>
</ul>
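<p>To put the two timing columns in perspective, here is a small sketch that computes how much of the first-view time is saved on a repeat view, using three rows copied from the table above. The function name <code>repeat_view_saving</code> is made up for illustration.</p>

```python
def repeat_view_saving(first_view_s, repeat_view_s):
    """Return the percentage of first-view time saved on the repeat view."""
    return (first_view_s - repeat_view_s) / first_view_s * 100.0


# (first view, repeat view) in seconds, copied from the table above
timings = {
    "Fox News homepage": (26.755, 8.761),
    "NYT single story": (4.650, 4.507),
    "Thaindian.com homepage": (0.740, 0.514),
}

for page, (first, repeat) in timings.items():
    print(f"{page}: {repeat_view_saving(first, repeat):.1f}% faster on repeat view")
```

<p>A very small saving, as on the NYT story page, suggests that little of the page is being served from the browser cache on the repeat view.</p>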
<p>You can see from the table above that, except for <em>Google News</em>, <em>Yahoo! News</em> and (to some extent) <em>The New York Times</em> and <em>BBC</em>, the other big media websites do not seem to have given a minute of thought to the frontend performance of their web pages. This needs to change. Even Google, one of the biggest evangelists of frontend performance, <a href="http://www.webpagetest.org/result/100704_ba30dd802cd14323a2e2e840034cf0ce/5/performance_optimization/" rel="nofollow">forgot</a> to set correct cache control headers for images on Google News.</p>
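<p>As a rough illustration of what &#8220;correct cache control headers&#8221; means for static images, the sketch below checks whether a response carries a long enough <code>max-age</code>. The function names and the one-week threshold are assumptions made for this example, and the check ignores the older <code>Expires</code> header.</p>

```python
import re

ONE_WEEK_SECONDS = 7 * 24 * 3600  # assumed threshold, not a standard value


def max_age_seconds(headers):
    """Return max-age from the Cache-Control header, or None if not present."""
    # Header names are case-insensitive, so normalise the keys first
    normalised = {name.lower(): value for name, value in headers.items()}
    cache_control = normalised.get("cache-control", "")
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_long_lived(headers, minimum_seconds=ONE_WEEK_SECONDS):
    """True if the response may be cached for at least minimum_seconds."""
    age = max_age_seconds(headers)
    return age is not None and age >= minimum_seconds


print(is_long_lived({"Cache-Control": "public, max-age=31536000"}))
print(is_long_lived({"Cache-Control": "private, max-age=0"}))
print(is_long_lived({"Content-Type": "image/png"}))
```

<p>An image served without such a header may have to be re-requested or re-validated on the repeat view, which adds round trips.</p>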
<p>I place more emphasis on individual story pages since those are the pages which a first time user would encounter first - coming via search, links, tweets etc&#8230;</p>
<p>Now, when browsing un-optimized sites from Thailand, the load times increase dramatically due to a much higher latency and occasional packet loss.</p>
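<p>A back-of-the-envelope model shows why latency hurts un-optimized pages so much: every sequential round trip costs one full RTT, on top of the raw transfer time. The 1500 Kbps bandwidth and 50ms latency come from the test conditions above; the 250ms latency, the 40 round trips and the 500 KB page size are assumptions picked for illustration. The model ignores packet loss, parallel connections and TCP slow start.</p>

```python
def estimate_load_time(round_trips, rtt_ms, page_kilobytes, bandwidth_kbps):
    """Very rough load time in seconds: round-trip waiting plus raw transfer time."""
    latency_cost = round_trips * rtt_ms / 1000.0
    transfer_cost = page_kilobytes * 8.0 / bandwidth_kbps
    return latency_cost + transfer_cost


# Hypothetical page: 40 sequential round trips and 500 KB of content
for label, rtt_ms in (("50ms latency", 50), ("250ms latency", 250)):
    seconds = estimate_load_time(40, rtt_ms, 500, 1500)
    print(f"{label}: about {seconds:.1f}s")
```

<p>In this model the latency cost grows with the number of round trips, so a page with fewer requests suffers less from a distant server.</p>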
<p>Finally, compare each of the above links <a href="http://www.webpagetest.org/video/compare.php?tests=100704_WZY,100704_WZZ,100704_X00,100704_X01,100704_X02,100704_X03,100704_X04,100704_X05,100704_X06,100704_X07,100704_X08,100704_X09,100704_X0A,100704_X0B,100704_X0C,100704_X0D,100704_X0E,100704_X0F,100704_X0G,100704_X0H,100704_X0J,100704_X0K,100704_X0M,100704_X0N,100704_X0P,100704_X0Q,100704_X0R,100704_X0S,100704_X0T,100704_X0V,100704_X0W,100704_X0X,100704_X0Y" rel="nofollow">visually in IE7 in video mode</a>. The WaPo single story is missing due to a bug. For each URL, 3 first-view tests are run from &#8216;Dulles, VA - IE7&#8242; and the median run is used for comparison.</p>
<h4>How to improve?</h4>
<p>Essential reading :-</p>
<ul>
<li><a href="http://developer.yahoo.com/performance/rules.html">Yahoo! : Best Practices for Speeding Up Your Web Site</a></li>
<li><a href="http://code.google.com/speed/page-speed/docs/rules_intro.html">Google : Web Performance Best Practices</a></li>
<li><a href="http://ronnysnewyorkpizza.com/about.html">This awesome list of tools</a></li>
</ul>
<p>An optimizer&#8217;s goal should not be simply to get the total load time low. That&#8217;s important, but more important is the time it takes for the page to become usable. For a news publishing site, this means being able to display the title and body of the story, so the user can start reading the content while the other parts of the page load. For a <a href="http://www.thaindian.com/">Thaindian News</a> story page, this hovers around 0.9 to 1.5 seconds!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/attn-big-media-web-page-speed-matters.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>In a CDN&#8217;d world, OpenDNS is the enemy!</title>
		<link>http://www.sajalkayan.com/in-a-cdnd-world-opendns-is-the-enemy.html</link>
		<comments>http://www.sajalkayan.com/in-a-cdnd-world-opendns-is-the-enemy.html#comments</comments>
		<pubDate>Mon, 17 May 2010 17:19:30 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Webmaster Things]]></category>

		<category><![CDATA[Akamai]]></category>

		<category><![CDATA[benchmark]]></category>

		<category><![CDATA[bind]]></category>

		<category><![CDATA[CDN]]></category>

		<category><![CDATA[dns]]></category>

		<category><![CDATA[google]]></category>

		<category><![CDATA[OpenDNS]]></category>

		<category><![CDATA[Softlayer]]></category>

		<category><![CDATA[true]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=117</guid>
		<description><![CDATA[While many people are happy with using DNS service providers such as OpenDNS, Google, etc&#8230; I will show you here why they may not produce optimal results.
The way most CDNs work is by using DNS routing. When a user attempts to resolve a hostname, the CDN&#8217;s DNS server responds with an IP which is closest [...]]]></description>
			<content:encoded><![CDATA[<p>While many people are happy with using DNS service providers such as OpenDNS, Google, etc&#8230; I will show you here why they may not produce optimal results.</p>
<p>The way most CDNs work is by using DNS routing. When a user attempts to resolve a hostname, the CDN&#8217;s DNS server responds with the IP of the server it considers closest, based on the IP address of the requester. A more detailed insight into the workings of a CDN can be found in an earlier post, &#8220;<a href="http://www.sajalkayan.com/make-your-own-cheap-charlie-cdn.html">Make your own cheap charlie CDN</a>&#8221;.</p>
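<p>A minimal sketch of DNS routing, under heavy assumptions: the lookup table, the region names and the addresses below are all hypothetical (the addresses come from ranges reserved for documentation), and a real CDN would rely on geo-IP data and network measurements rather than a hard-coded table.</p>

```python
import ipaddress

# Hypothetical data, using address ranges reserved for documentation
RESOLVER_REGIONS = {
    ipaddress.ip_network("192.0.2.0/24"): "asia",
    ipaddress.ip_network("198.51.100.0/24"): "us-east",
}
POP_BY_REGION = {"asia": "203.0.113.10", "us-east": "203.0.113.20"}
DEFAULT_POP = "203.0.113.20"


def pick_pop(resolver_ip):
    """Return the POP address to hand out, based on who is asking."""
    address = ipaddress.ip_address(resolver_ip)
    for network, region in RESOLVER_REGIONS.items():
        if address in network:
            return POP_BY_REGION[region]
    return DEFAULT_POP


print(pick_pop("192.0.2.53"))     # a resolver inside the "asia" range
print(pick_pop("198.51.100.53"))  # a resolver inside the "us-east" range
```

<p>Note that the only address the function sees is that of the machine sending the DNS query, which is the resolver and not necessarily the end user.</p>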
<p>I ran my tests from the following locations :-</p>
<ol>
<li><a href="http://www.asianet.co.th/">True</a> - Thailand : My personal internet connection provided by the ISP called True Internet.</li>
<li><a href="http://www.softlayer.com/">Softlayer</a> - United States : A server hosted at Softlayer&#8217;s Washington DC Datacenter.</li>
<li><a href="http://aws.amazon.com/ec2/">EC2</a> - United States : An EC2 instance in Amazon&#8217;s us-east-1c availability zone.</li>
<li><a href="http://aws.amazon.com/ec2/">EC2-EU</a> - Ireland : An EC2 instance in Amazon&#8217;s eu-west-1 region. - Thanks <a href="http://twitter.com/LukeInTH">Luke</a></li>
<li><a href="http://aws.amazon.com/ec2/">EC2-APAC</a> - Singapore : An EC2 instance in Amazon&#8217;s ap-southeast-1a availability zone.</li>
<li><a href="http://www.comhem.se/">Com Hem</a> - Sweden : An ISP in Sweden. - Thanks <a href="http://twitter.com/nadam9">Adam</a></li>
<li><a href="http://www.tataindicombroadband.in/">Tata</a> - India : An ISP in India. - Thanks <a href="http://gaeatimes.com/">Angsuman</a></li>
</ol>
<p>The following DNS servers were used to resolve the domains :-</p>
<ol>
<li><a href="http://www.opendns.com/">OpenDNS</a> (208.67.222.222 , 208.67.220.220 )- Has different caches in multiple locations(Anycasted) - Chicago, Illinois, USA; Dallas, Texas, USA; Los Angeles, California, USA; Miami, Florida, USA; New York, New York, USA; Palo Alto, California, USA; Seattle, Washington, USA; Washington, DC, USA; Amsterdam, The Netherlands and London, England, UK</li>
<li><a href="http://code.google.com/speed/public-dns/">Google Public DNS</a> (8.8.8.8 , 8.8.4.4 ) - &#8220;Google Public DNS servers are available worldwide&#8221; . I think Google has their DNS servers in all countries where they have hosting infrastructure.</li>
<li>Local DNS - The ISP provided DNS in the different locations.</li>
</ol>
<p>The tests were run against the following CDN providers :-</p>
<ol>
<li><a href="http://www.internap.com/">Internap</a> ( cdn.thaindian.com ) - Uses DNS routing. POPs (Point Of Presence) in the following locations : Atlanta; Boston; Chicago; Dallas; Denver; El Segundo; Houston; Miami; New York; Philadelphia; Phoenix; San Jose; Seattle; Washington, DC; Sydney; Tokyo; Singapore; Hong Kong; Amsterdam; London</li>
<li><a href="http://www.akamai.com/">Akamai</a> ( profile.ak.fbcdn.net ) - AFAIK they have a POP in almost all countries including Thailand. Note: Akamai does not entertain sales queries from Thai companies.</li>
</ol>
<h3>Results:-</h3>
<p>1) <strong>Internap</strong> ( using cdn.thaindian.com )</p>
<table border="1" style="font-size: 8pt;">
<tbody>
<tr>
<th>Location</th>
<th colspan="2">OpenDNS</th>
<th colspan="2">Google</th>
<th colspan="2">Local</th>
</tr>
<tr>
<th></th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
</tr>
<tr>
<th>True (Thailand)</th>
<td>64.94.126.65</td>
<td>256</td>
<td>74.201.0.130</td>
<td>365</td>
<td>203.190.126.131</td>
<td style="background-color: #CCFF99">152</td>
</tr>
<tr>
<th>Softlayer (US-East Coast)</th>
<td>69.88.152.250</td>
<td style="background-color: #CCFF99">1.253</td>
<td>74.201.0.130</td>
<td>25.69</td>
<td>69.88.152.250</td>
<td>1.388</td>
</tr>
<tr>
<th>EC2 (US-East Coast)</th>
<td>69.88.152.250</td>
<td>2.144</td>
<td>74.201.0.130</td>
<td>20.229</td>
<td>69.88.152.250</td>
<td style="background-color: #CCFF99">2.094</td>
</tr>
<tr>
<th>EC2 (Europe)</th>
<td>77.242.194.130</td>
<td>13.331</td>
<td>64.7.222.130</td>
<td>159.422</td>
<td>77.242.194.130</td>
<td style="background-color: #CCFF99">12.504</td>
</tr>
<tr>
<th>EC2 (Singapore)</th>
<td>64.94.126.65</td>
<td>202</td>
<td>74.201.0.130</td>
<td>228</td>
<td>202.58.12.98</td>
<td style="background-color: #CCFF99">37.260</td>
</tr>
<tr>
<th>Com Hem (Sweden)</th>
<td>77.242.194.130</td>
<td>40.035</td>
<td>64.7.222.130</td>
<td>189.647</td>
<td>69.88.148.130</td>
<td style="background-color: #CCFF99">36.310</td>
</tr>
<tr>
<th>Tata (India)</th>
<td>64.7.222.130</td>
<td>313.2</td>
<td>64.74.124.65</td>
<td>304.1</td>
<td>203.190.126.131</td>
<td style="background-color: #CCFF99">150</td>
</tr>
</tbody>
</table>
<p>2) <strong>Akamai</strong> ( using profile.ak.fbcdn.net )</p>
<table border="1"  style="font-size: 8pt;">
<tbody>
<tr>
<th>Location</th>
<th colspan="2">OpenDNS</th>
<th colspan="2">Google</th>
<th colspan="2">Local</th>
</tr>
<tr>
<th></th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
<th>IP Returned</th>
<th>Ping to IP (ms)</th>
</tr>
<tr>
<th>True (Thailand)</th>
<td>208.50.77.112</td>
<td>239.4</td>
<td>60.254.185.83</td>
<td>138.9</td>
<td>58.97.45.59</td>
<td style="background-color: #CCFF99">18.88</td>
</tr>
<tr>
<th>Softlayer (US-East Coast)</th>
<td>72.246.31.57</td>
<td>1.312</td>
<td>72.246.31.42</td>
<td>1.262</td>
<td>24.143.196.88</td>
<td style="background-color: #CCFF99">0.877</td>
</tr>
<tr>
<th>EC2 (US-East Coast)</th>
<td>72.246.31.73</td>
<td>2.581</td>
<td>72.246.31.25</td>
<td style="background-color: #CCFF99">1.792</td>
<td>72.247.242.51</td>
<td>1.941</td>
</tr>
<tr>
<th>EC2 (Europe)</th>
<td>195.59.150.139</td>
<td style="background-color: #CCFF99">13.449</td>
<td>92.122.207.177</td>
<td>29.022</td>
<td>195.59.150.138</td>
<td>13.516</td>
</tr>
<tr>
<th>EC2 (Singapore)</th>
<td>208.50.77.94</td>
<td>202</td>
<td>60.254.185.73</td>
<td>71.7</td>
<td>124.155.222.10</td>
<td style="background-color: #CCFF99">7.052</td>
</tr>
<tr>
<th>Com Hem (Sweden)</th>
<td>217.243.192.8</td>
<td>51.73</td>
<td>92.123.69.82</td>
<td>35.972</td>
<td>92.123.155.139</td>
<td style="background-color: #CCFF99">13.212</td>
</tr>
<tr>
<th>Tata (India)</th>
<td>209.18.46.113</td>
<td>300</td>
<td>203.106.85.33</td>
<td>196</td>
<td>125.252.226.58</td>
<td style="background-color: #CCFF99">100.5</td>
</tr>
</tbody>
</table>
<p>The ping timings represent the lag to the destination server from the location in question. I will try to update the results with more locations if I can get shell access to a server or PC in other countries. If you are willing to run the tests for me, please contact me (or post in the comments).</p>
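<p>To read the tables programmatically, the sketch below takes three rows copied from the Akamai table above and reports which resolver led to the lowest ping time at each location. The function name <code>fastest_resolver</code> is made up for illustration.</p>

```python
# Ping times in ms for profile.ak.fbcdn.net, copied from the Akamai table above
akamai_pings = {
    "True (Thailand)": {"OpenDNS": 239.4, "Google": 138.9, "Local": 18.88},
    "EC2 (Europe)": {"OpenDNS": 13.449, "Google": 29.022, "Local": 13.516},
    "Tata (India)": {"OpenDNS": 300, "Google": 196, "Local": 100.5},
}


def fastest_resolver(pings_by_resolver):
    """Return the resolver whose returned IP had the lowest ping time."""
    return min(pings_by_resolver, key=pings_by_resolver.get)


for location, pings in akamai_pings.items():
    best = fastest_resolver(pings)
    slowdown = max(pings.values()) / pings[best]
    print(f"{location}: {best} is best, worst option is {slowdown:.1f}x slower")
```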
<h3>Conclusion</h3>
<p>Using OpenDNS or Google Public DNS may be fast in resolving the DNS, but they do not give the ideal results.</p>
<p>In the case of Global DNS providers, the IP of the original requester is not passed along to the CDN&#8217;s DNS servers so they are unable to route the user to the nearest POP.</p>
<p>As you can see in the result tables above, when using OpenDNS from Thailand to access Facebook&#8217;s static assets, I am directed to a server in the USA, whereas with Google&#8217;s DNS I am directed to a server in Japan, and with my ISP&#8217;s DNS server I access content locally, hosted within my own ISP&#8217;s network!</p>
<p>While the effect on large websites using a CDN is significant, smaller non-CDN&#8217;d websites are also affected. Most websites embed widgets, advertising and other assets which are likely to be CDN&#8217;d.</p>
<p>The solution would be to use your ISP&#8217;s DNS server rather than these global providers. If it really sucks that bad, it&#8217;s fairly simple to set up BIND as a caching recursive resolver to resolve hostnames directly, bypassing the ISP&#8217;s crappy service.</p>
<p><a href="http://www.linkedin.com/in/billf">Bill Fumerola</a>, ex-director of network engineering at OpenDNS <a href="http://forums.opendns.com/comments.php?DiscussionID=1096#Item_7">confirms this problem</a> on OpenDNS forums. </p>
<p>You can run the tests from your own computer using this simple script: <a href="http://www.sajalkayan.com/dnstest.py" target="_blank">dnstest.py</a></p>
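<p>The linked script is not reproduced here. As a rough idea of what such a test involves, the sketch below asks a specific resolver for a hostname using the <code>dig</code> command and keeps only the IP addresses from its output. It is not the code of dnstest.py; the function names are made up, and it assumes <code>dig</code> is installed.</p>

```python
import ipaddress
import subprocess


def build_dig_command(hostname, resolver):
    """Build a dig command that asks one specific resolver for a hostname."""
    return ["dig", "@" + resolver, hostname, "+short"]


def parse_dig_short(output):
    """Keep only the IP addresses from dig +short output, dropping CNAME lines."""
    addresses = []
    for line in output.splitlines():
        candidate = line.strip()
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not an IP address, e.g. a CNAME target or a blank line
        addresses.append(candidate)
    return addresses


def resolve_with(hostname, resolver):
    """Resolve hostname through the given resolver (needs dig and network access)."""
    result = subprocess.run(
        build_dig_command(hostname, resolver),
        capture_output=True, text=True, check=True,
    )
    return parse_dig_short(result.stdout)


# Example, requires network access:
#   resolve_with("profile.ak.fbcdn.net", "208.67.222.222")  # via OpenDNS
#   resolve_with("profile.ak.fbcdn.net", "8.8.8.8")         # via Google Public DNS
```

<p>Pinging each returned IP from the same machine then gives timings comparable to the tables above.</p>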
<p>Here is the named.conf for a recursive server. Set your computer to use 127.0.0.1 as the DNS. - config may differ for you, RTFM and adapt accordingly.</p>
<pre>options {
        directory "/var/named";
        // Listen only on the loopback interface
        listen-on {
                127.0.0.1;
        };
        auth-nxdomain yes;
        // Only answer recursive queries coming from this machine
        allow-recursion {
                127.0.0.1;
        };
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
};

//
// a caching only nameserver config
//
zone "." {
        type hint;
        file "named.ca";
};

include "/etc/named.rfc1912.zones";

include "/etc/named.dnssec.keys";
include "/etc/pki/dnssec-keys/dlv/dlv.isc.org.conf";</pre>
<p>EDIT 1: Inverted the axes and added test data from Europe<br />
EDIT 2: Added test data from Singapore<br />
EDIT 3: Added test data from Sweden<br />
EDIT 4: Added test data from India<br />
EDIT 5: Added link to Bill Fumerola’s explanation of the problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/in-a-cdnd-world-opendns-is-the-enemy.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Simple command to &#8220;watch&#8221; the webserver access log</title>
		<link>http://www.sajalkayan.com/simple-command-to-watch-the-webserver-access-log.html</link>
		<comments>http://www.sajalkayan.com/simple-command-to-watch-the-webserver-access-log.html#comments</comments>
		<pubDate>Fri, 02 Apr 2010 14:34:16 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Linux]]></category>

		<category><![CDATA[Webmaster Things]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=116</guid>
		<description><![CDATA[I am often curious as to what bots are going on my site at any given moment. So much so that I devote one terminal tab to running this script.
save the following as say bot.sh and make it executable :-



#!/bin/bash


watch &#34;grep $1 /path/to/access.log &#124; tail -15&#34;



note: the number after tail can be adjusted depending on [...]]]></description>
			<content:encoded><![CDATA[<p>I am often curious as to which bots are crawling my site at any given moment. So much so that I devote one terminal tab to running this script.</p>
<p>Save the following as, say, bot.sh and make it executable :-</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1"><span class="re3">#!/bin/bash</span></div>
</li>
<li class="li1">
<div class="de1">watch <span class="st0">&quot;grep $1 /path/to/access.log | tail -15&quot;</span></div>
</li>
</ol>
</div>
<p>Note: the number after tail can be adjusted depending on your terminal size&#8230;</p>
<p>Run it on the server as :-</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1">[user@server ~]# ./bot.sh Googlebot</div>
</li>
</ol>
</div>
<p>OR</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1">[user@server ~]# ./bot.sh msnbot</div>
</li>
</ol>
</div>
<p>OR</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1">[user@server ~]# ./bot.sh &lt;suspicious ip address&gt;</div>
</li>
</ol>
</div>
<p>and so on&#8230;.</p>
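<p>For the curious, the grep-then-tail part of the script boils down to &#8220;keep only the last N lines that match&#8221;. A rough Python 3 sketch of the same idea is below. The function names are made up for illustration, and it does a plain substring match rather than grep&#8217;s regular expressions.</p>
<pre>
```python
from collections import deque


def last_matches(lines, needle, count=15):
    """Return the last `count` lines that contain `needle`, oldest first."""
    # A deque with maxlen silently drops its oldest entry once it is full,
    # so only the newest `count` matches survive a single pass over the log.
    return list(deque((line for line in lines if needle in line), maxlen=count))


def show_matches(log_path, needle, count=15):
    """Print the last `count` lines of the file at log_path containing needle."""
    with open(log_path, errors="replace") as log:
        for line in last_matches(log, needle, count):
            print(line, end="")
```
</pre>
<p>Calling <em>show_matches("/path/to/access.log", "Googlebot")</em> once is roughly what a single refresh of the watch command shows.</p>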
<p><strong>UPDATE</strong>: Better <a href="http://www.sajalkayan.com/simple-command-to-watch-the-webserver-access-log.html#comment-1039">alternative</a> suggested by <a href="http://whsgroup.ath.cx/">willwill</a>: follow the log live with tail -f. Since tail -f never exits, it should not be wrapped in watch, and grep needs --line-buffered so that matches show up immediately. Save as bot.sh:-</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1"><span class="re3">#!/bin/bash</span></div>
</li>
<li class="li1">
<div class="de1">tail -f /path/to/access.log | grep --line-buffered <span class="st0">&quot;$1&quot;</span></div>
</li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/simple-command-to-watch-the-webserver-access-log.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Future releases of Firefox to speed page load time considerably?</title>
		<link>http://www.sajalkayan.com/future-releases-of-firefox-to-speed-page-load-time-considerabilly.html</link>
		<comments>http://www.sajalkayan.com/future-releases-of-firefox-to-speed-page-load-time-considerabilly.html#comments</comments>
		<pubDate>Wed, 20 Jan 2010 14:29:58 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Webmaster Things]]></category>

		<category><![CDATA[GAM]]></category>

		<category><![CDATA[google ad manager]]></category>

		<category><![CDATA[site performance]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=115</guid>
		<description><![CDATA[Living in Thailand has its fair share of disadvantages. The most prominent being bad internet and poor response times. In most cases, the packet shaping, caching and filtering mechanisms use by ISPs do more harm than good. A response from a US server may take anywhere between 100 to 1000 ms extra than it should [...]]]></description>
			<content:encoded><![CDATA[<p>Living in Thailand has its fair share of disadvantages, the most prominent being bad internet and poor response times. In most cases, the packet shaping, caching and filtering mechanisms used by ISPs do more harm than good. A response from a US server may take anywhere between 100 and 1000 ms longer than it should (not counting the ping lag, server processing overhead, etc). These days most websites integrate a lot of client-side external scripts and APIs, so lagging responses make for a horrible user experience.</p>
<p>It gets worse with ads: within one ad code I have a default (fallback) ad code, and that too has a default. This means that when an impression is to be filled, the ad network decides whether it can fill it based on the parameters I set. If not, it passes the impression down the chain to another network, and so on until the last network. In my case the chain is mostly 3 networks long. I can&#8217;t increase it as that results in a poorer user experience.</p>
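<p>To see why a longer chain hurts, here is a toy Python sketch of such a fallback chain. The network names and latencies are invented for illustration and are not measurements; the only point is that every network that gets asked adds its own round trip before anything is shown.</p>
<pre>
```python
def fill_impression(chain):
    """Walk a fallback chain of ad networks, tried in order.

    `chain` is a list of (name, can_fill, latency_ms) tuples.
    Returns (name of the network that filled the ad or None, total wait in ms).
    """
    waited_ms = 0
    for name, can_fill, latency_ms in chain:
        waited_ms += latency_ms  # each network asked costs one more round trip
        if can_fill:
            return name, waited_ms
    return None, waited_ms


# Hypothetical 3-network chain where only the last network has an ad to show
chain = [
    ("network_a", False, 400),
    ("network_b", False, 600),
    ("network_c", True, 500),
]
winner, waited_ms = fill_impression(chain)
print(winner, waited_ms)
```
</pre>
<p>With blocking ad scripts, that accumulated wait is time during which the rest of the page sits idle.</p>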
<p><img src="http://www.sajalkayan.com/wmt-performance-chart.png" border="0" alt="Google Webmaster Tools" width="500" height="94" /></p>
<p>Recently, Google started showing average response times in <a href="https://www.google.com/webmasters/tools/home?hl=en">Google Webmaster Tools</a>, so I&#8217;ve started worrying about these things more than I should.</p>
<p>On <a title="Thaindian News" href="http://www.thaindian.com/newsportal/">my site</a>, I have 2 ad blocks (leaderboard and skyscraper, plus another block which shows up on individual story pages) which load before the main content of the page. Recently I moved the ads to Google Ad Manager, which has a <a href="http://www.google.com/support/admanager/bin/answer.py?hl=en&amp;answer=94394">wonderful way of debugging ad loading</a> by adding ?google_debug to the end of the URL.</p>
<p>My first impression of <a href="https://www.google.com/admanager">Google Ad Manager</a> was excellent. My page was no longer held up while the ads loaded, but soon I realized that&#8217;s not an Ad Manager feature; it is <strong>Firefox 3.5.8pre</strong> which is speeding things up.</p>
<p><a rel="lightbox" href="http://www.sajalkayan.com/ff-3.5.8pre-ubuntu-big.png"> <img src="http://www.sajalkayan.com/ff-3.5.8pre-ubuntu.png" alt="" /></a></p>
<p>Browser&#8217;s user agent: <em>Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8pre) Gecko/20100116 Ubuntu/9.04 (jaunty) Shiretoko/3.5.8pre</em> (Click image for full screenshot)</p>
<p>Tests on my laptop, which runs Firefox 3.0.17, show otherwise.</p>
<p><a rel="lightbox" href="http://www.sajalkayan.com/ff-3.0-ubuntu-big.png"> <img src="http://www.sajalkayan.com/ff-3.0-ubuntu.png" alt="" /></a></p>
<p>Browser&#8217;s user agent: <em>Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.16) Gecko/2009121601 Ubuntu/9.04 (jaunty) Firefox/3.0.17</em> (Click on image for full screenshot)</p>
<p>Chrome and IE show no such speed-up either&#8230; They all report the &#8220;Time the page is blocked fetching ads from Google&#8221; as somewhere between 1000 ms and 2500 ms. The variation is irrelevant; it is due to network issues and ad server response times. The bottom line is that these browsers do hold up the page while the ads load.</p>
<p>Maybe this is an improvement in the latest Ubuntu nightly build, or maybe it is a general improvement. Whatever it is, the future is <a href="http://www.getfirefox.com/">Firefox</a> and they are <a href="http://www.mozilla.com/en-US/firefox/fastest/">fast</a>!</p>
<p>So far, there has been no proper way to load ads such that they don&#8217;t block the rest of the page from loading. The 2 ways I know of are very ugly and I don&#8217;t like them :-</p>
<ol>
<li>Load the target ad script from a separate HTML file loaded via an iframe. This costs one extra request per ad code, may screw up ad targeting, etc.</li>
<li>Place a blank div where the ad should go, load the ad in a hidden div below the actual content, and then, using JavaScript trickery, move the contents of the hidden div into the blank ad div. Ugly again, not a neat solution.</li>
</ol>
<p>Of course there is a neat and ideal solution&#8230; which is to build your template in such a way (CSS absolute positioning or something) that the HTML of the content appears earlier in the code than the ad JavaScript&#8230; but again this is cumbersome. <a href="http://www.google.com/support/forum/p/Google+Ad+Manager/thread?tid=4e3c789b45f46902&amp;hl=en">Interesting discussion here</a>.</p>
<p>In an ideal world, all ad networks would be banned from using <em>document.write</em> in their scripts and would use some form of ajax to fetch the banner code after (or while) the rest of the page loads. It&#8217;s not 2001 anymore!</p>
<p>Here is what I request from you: open the following URL, <a href="http://www.thaindian.com/newsportal/?google_debug">http://www.thaindian.com/newsportal/?google_debug</a>, and there should be 1 or 2 popups (in some browsers you may need to disable the popup blocker). Look at the popup which resembles the <a rel="lightbox" href="http://www.sajalkayan.com/ff-3.0-ubuntu-big.png">screenshots</a> above, and report your findings in the comments below. Be sure to wait for the main page to finish loading and don&#8217;t forget to include your full user agent. If you can upload screenshots somewhere then please drop their URLs in the comments too.</p>
<p>The info I need could look like the following example:-</p>
<p><em>User Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8pre) Gecko/20100116 Ubuntu/9.04 (jaunty) Shiretoko/3.5.8pre<br />
Debug: -<br />
7342	Information  	Time the page is blocked fetching ads from Google 0 ms<br />
7343	Information  	Time the page is blocked rendering ads from Google 0 ms</em><br />
Your useragent can be checked <a href="http://whatsmyuseragent.com/">here</a>.</p>
<p>Video of pageload on Google Chrome:-<br />
<object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/M6B9t2tbwx0&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/M6B9t2tbwx0&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>Video of pageload on Firefox 3.6pre(Ubuntu build):-<br />
<object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/4SIMl8IEZsg&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/4SIMl8IEZsg&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p><strong>UPDATE</strong>: I upgraded my main browser to Firefox 3.6 (user agent: <em>Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2pre) Gecko/20100120 Ubuntu/9.04 (jaunty) Namoroka/3.6pre</em>). Same results as 3.5.8pre: it&#8217;s bloody fast and doesn&#8217;t stall the pageload waiting for ads.</p>
<p><strong>UPDATE 2:</strong> Based on the comment by Archit below, the speed improvement is not visible on 3.6rc2. My conclusions are based on the nightly builds by the <a href="https://launchpad.net/~ubuntu-mozilla-daily">Ubuntu Mozilla Daily Build Team</a>.</p>
<p><strong>UPDATE 3:</strong> Added videos</p>
<p><strong>UPDATE 4:</strong> For my site I implemented the hidden div trick, so for now the visual delay should not be noticeable in any browser.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/future-releases-of-firefox-to-speed-page-load-time-considerabilly.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>I, me and Solid State Drives</title>
		<link>http://www.sajalkayan.com/i-me-and-solid-state-drives.html</link>
		<comments>http://www.sajalkayan.com/i-me-and-solid-state-drives.html#comments</comments>
		<pubDate>Sun, 20 Sep 2009 09:47:24 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Gadgets]]></category>

		<category><![CDATA[HDD]]></category>

		<category><![CDATA[SSD]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=114</guid>
		<description><![CDATA[Let me first explain my set-up before the upgrade. I use 2 computers, 1 desktop at office and a laptop at home or while traveling.
Laptop : Lenovo; core 2; 2 GB RAM; regular 5.4k rpm hard disk. Purchased abt 1.5 years ago.
Desktop : (costed same as laptop but purchased only 3 or 4 months ago) [...]]]></description>
			<content:encoded><![CDATA[<p>Let me first explain my set-up before the upgrade. I use 2 computers: a desktop at the office and a laptop at home or while traveling.</p>
<p>Laptop : Lenovo; Core 2; 2 GB RAM; regular 5.4k rpm hard disk. Purchased about 1.5 years ago.</p>
<p>Desktop : (cost the same as the laptop but purchased only 3 or 4 months ago) i7 CPU 920 @ 2.67GHz; 6 GB DDR3; a kickass motherboard; 1TB Seagate HDD (ST31000528AS) - 7.2k rpm</p>
<p>After using the desktop for most of my work, I could no longer work on the laptop, which was significantly slower than the desktop. Having used a <a href="http://en.wikipedia.org/wiki/Solid-state_drive"><strong>Solid State Drive (SSD)</strong></a> on my server for a few months as the MySQL data directory, I decided to see how it could improve things on the laptop.</p>
<p>Over the last few months, I had saved enough to treat myself to some gadgets <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><img src="http://techplore.com/technology/wp-content/uploads/2009/06/Intel-X25-M-80GB-SSD.jpg" alt="Intel X25-M 80GB SSD" /></p>
<p>On reaching Fortune mall, I couldn&#8217;t find SSDs anywhere; no store there had heard of &#8220;Solid State Disk&#8221; or &#8220;SSD&#8221;. In fact one shopkeeper thought I wanted to buy SD cards. So I only upgraded the memory to 4 GB. After this upgrade, things didn&#8217;t speed up much; the laptop just didn&#8217;t lag anymore after opening loads of applications.</p>
<p>I had lost all hope&#8230; and even started thinking about other ways to spend my gadget budget when Twitter came to the rescue in the form of a <a rel="nofollow" href="http://twitter.com/smartbrain/statuses/3925803217">reply</a> from my fav columnist which said &#8220;<em>@sajal at fortune jet has some, but for components, zeer is better these days i feel. It swings.</em>&#8221;</p>
<p>With my hopes up again, I went to that shop. They had a choice between the Intel X25-M (<a href="http://en.wikipedia.org/wiki/Multi-level_cell">MLC</a>) and a few OCI brands. Since I don&#8217;t care about disk space, I was actually looking for the X25-E (<a href="http://en.wikipedia.org/wiki/Single-level_cell">SLC</a>) which is in my server, but they hadn&#8217;t heard of it here in Bangkok, so I settled for an <a title="nice review" href="http://techreport.com/articles.x/15433">80 GB X25-M</a> costing me 13,500 Baht (approx 400 USD). The decision to go for Intel was highly influenced by <em>AnandTech&#8217;s</em> <a href="http://anandtech.com/storage/showdoc.aspx?i=3631">reviews</a>. <em>AnandTech</em> has a series of articles on SSD performance, with benchmarks for various applications. (Dear <em>AnandTech</em> : Will you give me a job? All I need is a chance to play with cool things <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> )</p>
<p>Disadvantages of using SSD in laptop:-</p>
<ol>
<li>Unlike before, you can&#8217;t feel any vibration or hear any noise from the drive, hence you don&#8217;t &#8220;feel&#8221; whether your disk is being accessed or idling.</li>
<li>The disk activity LED hardly lights up&#8230; SSDs idle much more than regular disks since the requested data is returned immediately, thus the transaction is completed before the LED can light up fully.</li>
<li>The laptop (running Ubuntu 9.04) boots up in &lt; 20 seconds including me logging in. This doesn&#8217;t give me enough time to get coffee, pee, etc after reaching home and pushing the on switch.</li>
<li>It is slightly thinner than a regular 2.5&#8243; notebook HDD, hence it sits in my laptop at a very slight angle. Maybe later I should put in a metal sheet or something to compensate.</li>
</ol>
<p>Now, the performance of my laptop is in fact slightly better than the desktop. I am looking forward to the day when SSDs get more commoditized and we start seeing them compete with regular HDDs in terms of cost per GB. If I had that kind of money, I&#8217;d load up ten 64 GB Intel X25-Es in my main desktop <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>My conclusion is that the main bottleneck in laptops is disk I/O. I guess within a year or sooner, we will see most mid-segment laptops coming with SSDs instead of the HDDs they ship with currently.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/i-me-and-solid-state-drives.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Google News ranking factors, 2003 patent revealed</title>
		<link>http://www.sajalkayan.com/google-news-ranking-factors-2003-patent-revealed.html</link>
		<comments>http://www.sajalkayan.com/google-news-ranking-factors-2003-patent-revealed.html#comments</comments>
		<pubDate>Thu, 20 Aug 2009 10:13:34 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[SEO]]></category>

		<category><![CDATA[google]]></category>

		<category><![CDATA[google news]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=113</guid>
		<description><![CDATA[Via WebProNews by Chris Crum
A patent application by Google &#8220;Systems and methods for improving the ranking of news articles&#8221; was Granted on August 18, 2009. The patent was originally filed about 6 years ago on &#8220;September 16, 2003&#8243;. Interesting analysis in human readable language at Seo By The Sea by Bill Slawski.
Before continuing, it is [...]]]></description>
			<content:encoded><![CDATA[<p>Via <a href="http://www.webpronews.com/topnews/2009/08/19/possible-google-news-ranking-factors-revealed-in-patent">WebProNews</a> by Chris Crum<br />
A patent application by Google, &#8220;<a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&#038;Sect2=HITOFF&#038;u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&#038;r=1&#038;p=1&#038;f=G&#038;l=50&#038;d=PTXT&#038;S1=7,577,655.PN.&#038;OS=pn/7,577,655&#038;RS=PN/7,577,655">Systems and methods for improving the ranking of news articles</a>&#8221;, was granted on August 18, 2009. The patent was originally filed about 6 years ago, on September 16, 2003. There is an interesting analysis in human-readable language at <a href="http://www.seobythesea.com/?p=2810">Seo By The Sea</a> by Bill Slawski.</p>
<p>Before continuing, it is better if you read <a href="http://www.seobythesea.com/?p=2810">Bill&#8217;s</a> and <a href="http://www.webpronews.com/topnews/2009/08/19/possible-google-news-ranking-factors-revealed-in-patent">Chris&#8217;</a> posts first.</p>
<p>In spite of this filing being 6 years old, I personally believe some of the theory is still valid today. It is important to know what Google <strong>was</strong> doing in 2003 to better understand what it <strong>may</strong> be doing today.</p>
<p>Abstract of the patent :-</p>
<blockquote><p>A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.</p></blockquote>
<p>I will first discuss the points already established and then try to draw my own conclusions.</p>
<p><strong>Source Rank</strong> : This is a rank given to different news sources. An article from a source having higher &#8220;Source Rank&#8221; would be more likely to rank higher than others. According to the patent, the following metrics go into determining the &#8220;Source Rank&#8221;.</p>
<p><strong>Number of articles produced by the news source during a given time period </strong> : Presumably the more the better, or rather, the more original articles (as opposed to newswire stories) the better.</p>
<p><strong>Average length of an article from the news source </strong> : Presumably, a news source with longer articles would get a better <em>Source Rank</em>.</p>
<p><strong>Breaking news score</strong> : The most interesting aspect. I had a rough feeling this was an important factor, and the patent agrees. I&#8217;ll discuss it in my conclusions below, citing examples. Basically, as per the patent, a news source which publishes news about events that have just occurred gets a higher <em>Source Rank</em>.</p>
<p><strong>Usage pattern</strong> : Tracking click-throughs from Google News search and analyzing that data. All links on Google News are redirected through their forwarder. They have been tracking this data for as long as I can remember.</p>
<p><strong>Human opinion of the news source </strong> : Quite obvious <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><strong>Circulation statistics of the news source</strong> : Circulation stats from various media monitoring agencies.</p>
<p><strong>The size of the staff associated with the news source </strong> : Google recently started showing (where possible) author names in news search. These are detected automagically using some algorithm. I&#8217;m quite sure that they have been tracking these internally for quite some time.<br />
<img src="http://www.sajalkayan.com/Google.png" alt="Google News search result showing Author Name"></p>
<p><strong>The number of news bureaus associated with the news source</strong> : To favour bigger, more established news outlets.</p>
<p><strong>Original named entities appearing in articles produced by the news source</strong> : A named entity is a specific person, place, organization, or thing. The more unique named entities the better. This probably indicates a more in-depth news source.</p>
<p><strong>Number of topics on which the source produces content</strong> : To determine the niche the news source participates in. A news source like <a href="http://www.techcrunch.com/">TechCrunch</a> almost exclusively writes tech-related articles, so Google may determine that TechCrunch is an authority on tech-related topics.</p>
<p><strong>International diversity of the news source </strong> : Checking, based on IP, which countries people visit these sites from via Google News search.</p>
<p><strong>The writing style used by the news source </strong> : Grammar, spelling, readability. Writing style may also help Google determine the target audience (e.g. British vs American English).</p>
<h3>Conclusion</h3>
<p>Although this is not at all related to what Google told us about <a href="http://www.sajalkayan.com/secrets-of-google-newswhat-i-learnt-the-hard-way.html">ranking on Google News</a>, it does provide some nice insight.</p>
<p>Now, what I believe is that <a href="http://www.sajalkayan.com/seo-and-newsgooglecom-run-your-own-news-website.html">Google News</a> also implements what I&#8217;d like to call a <strong>Source Rank per Topic</strong>. The <em>Breaking news score</em> as explained above is applicable on a per-topic basis too. For example, my site had a few stories about an incident just after the major news broke. It got some traffic, then got crowded out by the regular big sources which allegedly have a much higher <em>Source Rank</em>. But from a couple of days later, any follow-ups I did ranked well on Google News. My assumption is that Google sees which sources were the ones to <em>break</em> the particular story and assigns them a temporary (or permanent) authority on the topic.</p>
<p>I have no views on the content length point, but I do agree that more original sentences result in a higher <strong>Source Rank</strong>.</p>
<p>Another point which I don&#8217;t see mentioned, but strongly believe to be an important factor for the <em>Source Rank</em>, is the performance of the website. It&#8217;s basic common sense that if Google is sending a lot of traffic, they don&#8217;t want these people to wait for ages while the overloaded servers of the news site are churning out the pages. Google would rather send them to faster sites. I observed this personally after I implemented a new caching mechanism which made the average random page generation time drop from ~1 s to 50 to 100 ms. Within days my traffic from Google doubled. So even if you are running a small site like mine, it is best to keep your random page load delay as small as possible.</p>
<p>IMHO, Google also considers regular SEO factors when determining the <em>Story Rank</em> for a news source: internal linkage, external linkage, etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/google-news-ranking-factors-2003-patent-revealed.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Prospective search using python</title>
		<link>http://www.sajalkayan.com/prospective-search-using-python.html</link>
		<comments>http://www.sajalkayan.com/prospective-search-using-python.html#comments</comments>
		<pubDate>Wed, 22 Jul 2009 08:56:31 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=112</guid>
		<description><![CDATA[Prospective search, or persistent search, is a relatively less common method of implementing search where the list of keywords is defined, and when provided a single document it determines the list of keywords applicable to it.
This is different from traditional (or &#8220;retrospective&#8221;) search, where many documents are stored into an indexed and when provided with [...]]]></description>
			<content:encoded><![CDATA[<p><strong><a href="http://en.wikipedia.org/wiki/Prospective_search" rel="nofollow">Prospective search</a></strong>, or <strong>persistent search</strong>, is a less common method of implementing search where the list of keywords (queries) is defined up front and, given a single document, the system determines which of those keywords apply to it.</p>
<p>This is different from traditional (or &#8220;retrospective&#8221;) search, where many documents are stored in an index and, when provided with a search term, the search engine returns the list of documents which best match the query.</p>
<p>The best real-world example would be how Google News Alerts (or IMHO categorization/clustering in Google News) works. When a new news story is found by Google, it makes more sense to run a prospective search on the news story to find which alert subscriptions (or news categories) it belongs to, rather than repeatedly searching for all the alerts on their entire index.</p>
<p>Lucene has a <a href="http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/memory/MemoryIndex.html">MemoryIndex</a> class for just this purpose, and I&#8217;ve made a simple implementation in Python using <a href="http://lucene.apache.org/pylucene/">pylucene</a>. MemoryIndex is a special class in Lucene for on-the-fly searching. It can contain only one document, which may have more than one field. This is ideal for prospective search.</p>
<p>Installation and setup of pylucene is out of scope of this post&#8230; <a href="http://lucene.apache.org/pylucene/documentation/install.html">RTFM</a>! (do note that you need to edit the Makefile)</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">import</span> lucene</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">def</span> ProspectiveSearch<span class="br0">&#40;</span>body, terms<span class="br0">&#41;</span>:</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; lucene.<span class="me1">initVM</span><span class="br0">&#40;</span>lucene.<span class="me1">CLASSPATH</span><span class="br0">&#41;</span></div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; index = lucene.<span class="me1">MemoryIndex</span><span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; index.<span class="me1">addField</span><span class="br0">&#40;</span><span class="st0">&quot;content&quot;</span>, body, lucene.<span class="me1">StandardAnalyzer</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">parser</span> = lucene.<span class="me1">QueryParser</span><span class="br0">&#40;</span><span class="st0">&quot;content&quot;</span>, lucene.<span class="me1">StandardAnalyzer</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; matches = <span class="br0">&#91;</span><span class="br0">&#93;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span> term <span class="kw1">in</span> terms:</div>
</li>
<li class="li2">
<div class="de2">&nbsp; &nbsp; &nbsp; &nbsp; score=index.<span class="me1">search</span><span class="br0">&#40;</span><span class="kw3">parser</span>.<span class="me1">parse</span><span class="br0">&#40;</span>term<span class="br0">&#41;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> score &gt; <span class="nu0">0</span>:</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; matches += <span class="br0">&#91;</span>term<span class="br0">&#93;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> matches</div>
</li>
</ol>
</div>
<p>sample usage :-</p>
<div class="dean_ch" style="white-space: wrap;">
<ol>
<li class="li1">
<div class="de1">body = <span class="st0">&quot;hi my name is sajal kayan&quot;</span></div>
</li>
<li class="li1">
<div class="de1">terms = <span class="br0">&#91;</span><span class="st0">&quot;sajal&quot;</span>, <span class="st0">&quot;good&quot;</span>, <span class="st0">&quot;boy&quot;</span>, <span class="st0">&quot;name&quot;</span>, <span class="st0">&quot;sajal AND NOT kayan&quot;</span>, <span class="st0">&quot;sajal AND kayan&quot;</span><span class="br0">&#93;</span></div>
</li>
<li class="li1">
<div class="de1">matches = ProspectiveSearch<span class="br0">&#40;</span>body, terms<span class="br0">&#41;</span></div>
</li>
</ol>
</div>
<p>In this case it returns ['sajal', 'name', 'sajal AND kayan']</p>
<p>Note: initVM() is <a href="http://stackoverflow.com/questions/548493/jcc-initvm-doesnt-return-when-modwsgi-is-configured-as-daemon-mode">giving problems</a> under mod_wsgi.</p>
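<p>If you only want to get a feel for the concept without installing Lucene, the same idea can be sketched in a few lines of plain Python 3. This is a toy, not a replacement for MemoryIndex: it has no analyzer, scoring or query parser, and the (required words, forbidden words) format for stored queries is something made up here for illustration.</p>
<pre>
```python
import re


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prospective_search(body, queries):
    """Return the names of the stored queries that match one document.

    `queries` maps a query name to (required_words, forbidden_words).
    """
    words = tokenize(body)  # "index" the single document once
    matches = []
    for name, (required, forbidden) in queries.items():
        # every required word must be present, no forbidden word may be
        if words.issuperset(required) and words.isdisjoint(forbidden):
            matches.append(name)
    return matches


# The same stored queries as in the pylucene sample usage
queries = {
    "sajal": (["sajal"], []),
    "good": (["good"], []),
    "boy": (["boy"], []),
    "name": (["name"], []),
    "sajal AND NOT kayan": (["sajal"], ["kayan"]),
    "sajal AND kayan": (["sajal", "kayan"], []),
}
print(prospective_search("hi my name is sajal kayan", queries))
# prints ['sajal', 'name', 'sajal AND kayan']
```
</pre>
<p>The document is tokenized once and every stored query is then checked against that small set, which is the whole point of prospective search: the queries are persistent and the document is the thing that changes.</p>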
<p>On my computer, these are the timings I measured for a 244-word document.</p>
<ul>
<li>1,492 queries : 0.79 seconds for the whole script (only 248 ms for the search loop)</li>
<li>14,920 queries : 1.519 seconds</li>
<li>74,600 queries : 3.425 seconds</li>
<li>149,200 queries : 5.552 seconds</li>
<li>298,400 queries : 10.328 seconds</li>
</ul>
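<p>A quick back-of-the-envelope calculation on the timings above shows how the throughput changes with the number of queries. The helper name is just for illustration; the numbers are the ones from the list.</p>
<pre>
```python
# (number of queries, seconds for the whole script) from the list above
benchmarks = [
    (1492, 0.79),
    (14920, 1.519),
    (74600, 3.425),
    (149200, 5.552),
    (298400, 10.328),
]


def throughput(queries, seconds):
    """Queries handled per second of total script run time."""
    return queries / seconds


for queries, seconds in benchmarks:
    rate = throughput(queries, seconds)
    print(f"{queries:>7} queries: {rate:,.0f} queries/second")
```
</pre>
<p>The rate keeps climbing as the batch grows, presumably because the fixed start-up cost of the script is spread over more queries. That fits the first data point, where the search loop itself took only 248 ms of the 0.79 seconds.</p>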
<p>If you know a better method to achieve prospective search in Python, do let me know. I would also be interested to know if any RPC-based search software does this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/prospective-search-using-python.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>BarCampBKK3 - my experience!</title>
		<link>http://www.sajalkayan.com/barcampbkk3-my-experience.html</link>
		<comments>http://www.sajalkayan.com/barcampbkk3-my-experience.html#comments</comments>
		<pubDate>Mon, 25 May 2009 12:20:51 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[BarCampBangkok]]></category>

		<category><![CDATA[Thailand]]></category>

		<category><![CDATA[Bangkok]]></category>

		<category><![CDATA[BarCamp]]></category>

		<category><![CDATA[Barcamp Bangkok]]></category>

		<category><![CDATA[barcampbkk3]]></category>

		<category><![CDATA[geek]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=111</guid>
		<description><![CDATA[Last weekend (23rd and 24th May) I attended BarCamp Bangkok 3, it was an awesome experience&#8230; In this blogpost I intend to outline some of the interesting aspects of it from my viewpoint.

(Photo Credit new_davich on flickr)
Firstly over 700 people registered on the Barcamp Website. At least 550 people showed up at the actual event. That is [...]]]></description>
			<content:encoded><![CDATA[<p>Last weekend (23rd and 24th May) I attended <a href="http://www.barcampbangkok.org/"><strong>BarCamp Bangkok 3</strong></a>, it was an awesome experience&#8230; In this blogpost I intend to outline some of the interesting aspects of it from my viewpoint.</p>
<p><a href="http://www.flickr.com/photos/newdavich/3557413836/"><img src="http://farm4.static.flickr.com/3652/3557413836_397383f4d8_m.jpg" alt="Barcampbkk3 sign board" width="240" height="180" /></a></p>
<p>(Photo Credit <a href="http://www.flickr.com/photos/newdavich/">new_davich</a> on flickr)</p>
<p>Firstly, over <a href="http://www.barcampbangkok.org/whos-coming">700 people registered on the Barcamp Website</a>. At least <strong>550</strong> people showed up at the actual event. That is 550 people registered at the registration desks on Day 1. There may have been more people turning up throughout the day who didn&#8217;t register, and I don&#8217;t yet have the figure for Day 2. This IMHO would make <strong>BarCampbkk3 the biggest BarCamp in ASEAN</strong>. Many people flew in to Bangkok from overseas exclusively for the <a href="http://www.sajalkayan.com/tag/barcamp">BarCamp</a>, from countries including Malaysia, Singapore, Cambodia, Vietnam and Hong Kong. Many were visiting Bangkok for the first time.</p>
<p>Many thanks to <a href="http://www.spu.ac.th/english/index_eng.html">Sripatum University(SPU)</a> for agreeing to be the venue. They were very helpful and even provided us with 20 to 30 volunteers to help with the arrangements.</p>
<p><a href="http://www.flickr.com/photos/newdavich/3557451688/"><img src="http://farm3.static.flickr.com/2425/3557451688_69a5df2afe_m.jpg" alt="BarCampbkk3 Opening Ceremony" width="240" height="180" /></a></p>
<p>Opening Ceremony! - Don&#8217;t be scared, BarCamp isn&#8217;t anything formal.. this is the exception <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> (Photo Credit <a href="http://www.flickr.com/photos/newdavich/">new_davich</a> on flickr)</p>
<p>I collected the following schwag :-</p>
<p><a href="http://www.flickr.com/photos/vii/3560205034/"><img src="http://farm4.static.flickr.com/3371/3560205034_678a036a24_m.jpg" alt="BarCampbkk3 Shirt" width="180" height="240" /></a></p>
<p><strong>BarCamp Bangkok black T-Shirt (Thanks Luke for the awesome design)</strong> - Photo Credit Virak</p>
<p><a href="http://www.flickr.com/photos/preetamrai/3561447929/"><img src="http://farm4.static.flickr.com/3368/3561447929_eb5b0f1ce1_m.jpg" alt="Cloth Bag from SPU" width="240" height="180" /></a></p>
<p><strong>An eco friendly cloth Bag from SPU</strong> (Photo credit <a href="http://www.preetamrai.com/">Preetam Rai</a>)<br />
ATIZ white T-Shirt (if you can find photo ping me)<br />
Yahoo Car hanging thingy. (if you can find photo ping me)</p>
<p><strong>Tech start-ups in Thailand</strong></p>
<p>Among the interesting topics covered were some presentations and a discussion relating to start-ups in Thailand. There were talks focused on financing and other issues faced by startups. The most common factor discouraging Thais and foreigners from setting up a start-up in Thailand is (IMHO) the procedure and red tape involved in setting up and managing a Thai company. <a href="http://www.johnberns.com/">John</a> mentioned a friend who flew to Singapore in the morning and by the afternoon had his company set up and ready for business. So that&#8217;s about 10,000 Baht for the airfare and about S$200 to S$300 (about 4,700 to 7,100 Thai Baht) for formalities, etc. Here in Thailand, even if you know exactly what to do, it would take weeks.</p>
<p><a href="http://twitter.com/proteusguy">Ben</a> from <a href="http://proteus-tech.com/">Proteus Tech</a> gave an interesting talk titled &#8220;How to Create a Successful Technical Startup&#8221;. Proteus Tech is also interested in encouraging potential Thai entrepreneurs. Proteus Tech said in a statement:-</p>
<blockquote><p>&#8220;We hope to organize a startup event to help people understand how to write a business plan and define a business strategy. Then we&#8217;ll have a follow up &#8220;startup gauntlet&#8221; where we give them a chance to present their biz plan and get evaluated + win some seed capital to start.&#8221;</p></blockquote>
<div id="__ss_1484965" style="width: 425px; text-align: left;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=howtocreatetechnicalstartup-090525063933-phpapp02&amp;stripped_title=how-to-create-technical-startup" /><embed type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=howtocreatetechnicalstartup-090525063933-phpapp02&amp;stripped_title=how-to-create-technical-startup" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<p>Ben&#8217;s Presentation - Why didn&#8217;t I see this a few years ago, I learned some of the points the hard way.</p>
<p><strong>Overnight Activities</strong></p>
<p>This was the first Barcamp in Thailand where we stayed at the venue overnight. The evening started with drinks at a nearby pub, after which we returned to the venue. I tried in vain to help people get started in Linux, but it looks like nobody was interested&#8230; We played a couple of rounds of a <a href="http://www.eblong.com/zarf/werewolf.html">Werewolf Game</a>, which was interesting: the foreigners always got nominated to be werewolves and kicked out first&#8230;. <a href="http://murz.wordpress.com/">@murz</a> (tried to) introduce us to a board game &#8220;<a href="http://www.gamecabinet.com/sumo/Issue2/AdelVerpflichtet.html">Adel Verpflichtet</a>&#8221;. The rules were so complex that she had to draw a flowchart to explain it <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Along with <a href="http://www.lopsta.com/">Jan</a>, I did an &#8220;<strong>SEO site clinic</strong>&#8221; where we analyzed volunteers&#8217; websites from an SEO viewpoint. Unlike at the last BarCamp, this was attended by very few people, probably due to a clash in timing with other more popular topics.</p>
<p><strong>Overall it was very exciting to be a part of BarCampBKK3. Looking forward to BarCampBKK4!</strong></p>
<p>Links:-</p>
<p>BarCamp Bangkok Website : <a href="http://www.barcampbangkok.org">http://www.barcampbangkok.org</a><br />
Pics : <a href="http://www.flickr.com/search/?q=barcampbkk3&amp;w=all">http://www.flickr.com/search/?q=barcampbkk3&amp;w=all</a><br />
Slides : <a href="http://www.slideshare.net/search/slideshow?lang=**&amp;submit=post&amp;q=+barcampbkk3&amp;commit=search">http://www.slideshare.net/search/slideshow?lang=**&amp;submit=post&amp;q=+barcampbkk3&amp;commit=search</a></p>
<p>Blogs : <a href="http://blogsearch.google.com/blogsearch?q=barcampbkk3">http://blogsearch.google.com/blogsearch?q=barcampbkk3</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/barcampbkk3-my-experience.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Typical interaction of Windows Vista</title>
		<link>http://www.sajalkayan.com/typical-interaction-of-windows-vista.html</link>
		<comments>http://www.sajalkayan.com/typical-interaction-of-windows-vista.html#comments</comments>
		<pubDate>Mon, 27 Apr 2009 08:37:26 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Humour]]></category>

		<category><![CDATA[microsoft]]></category>

		<category><![CDATA[vista]]></category>

		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=110</guid>
		<description><![CDATA[Vista : Are you sure?
User : Yes
Vista : Are you sure about being sure?
User : Yes
Vista : Are you sure about being sure about being sure?
User : Yes
Vista : Are you sure about being sure about being sure about being sure?
User : Yes
Vista : Are you sure about being sure about being sure about being [...]]]></description>
			<content:encoded><![CDATA[<p>Vista : Are you sure?<br />
User : Yes<br />
Vista : Are you sure about being sure?<br />
User : Yes<br />
Vista : Are you sure about being sure about being sure?<br />
User : Yes<br />
Vista : Are you sure about being sure about being sure about being sure?<br />
User : Yes<br />
Vista : Are you sure about being sure about being sure about being sure about being sure?<br />
User : Yes<br />
Vista : Are you sure about being sure about being sure about being sure about being sure about being sure?<br />
User : Yes<br />
Vista : Are you sure about being sure about being sure about being sure about being sure about being sure about being sure?<br />
User : Grrr&#8230;. Screw you Microsoft!!!!<br />
Vista : Are you sure you want to screw Microsoft?<br />
&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/typical-interaction-of-windows-vista.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Python script to detect bad bots/people faking as Googlebot</title>
		<link>http://www.sajalkayan.com/python-script-to-detect-bad-botspeople-faking-as-googlebot.html</link>
		<comments>http://www.sajalkayan.com/python-script-to-detect-bad-botspeople-faking-as-googlebot.html#comments</comments>
		<pubDate>Sat, 28 Mar 2009 16:25:34 +0000</pubDate>
		<dc:creator>Sajal Kayan</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[Webmaster Things]]></category>

		<category><![CDATA[bots]]></category>

		<category><![CDATA[google]]></category>

		<category><![CDATA[Googlebot]]></category>

		<category><![CDATA[logazier]]></category>

		<category><![CDATA[scraper]]></category>

		<guid isPermaLink="false">http://www.sajalkayan.com/?p=109</guid>
		<description><![CDATA[A script for analyzing my webserver&#8217;s access.log is long overdue; here is a small start. Just recently I noticed a bad bot was attempting to scrape the whole of my site using Googlebot&#8217;s useragent. Since I&#8217;m learning Python, I thought it might be a nice experience to write a simple script which can help me detect [...]]]></description>
			<content:encoded><![CDATA[<p>A script for analyzing my webserver&#8217;s access.log is long overdue; here is a small start. Just recently I noticed a bad bot was attempting to scrape the whole of my site using Googlebot&#8217;s useragent. Since I&#8217;m learning Python, I thought it might be a nice experience to write a simple script which can help me detect these fakers.</p>
<p>The script reads the access log, picks out records matching &#8220;Googlebot&#8221;, then validates them using the techniques described in &#8220;<a href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html">How to verify Googlebot</a>&#8221; on the Google Webmaster Central Blog. It may also be useful, or even fun, to catch other SEOs trying to see your site thru Googlebot&#8217;s eyes.</p>
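<p>The first step, pulling the claimed Googlebot hits out of the log and counting them per IP, could look roughly like the sketch below. It is not the code from logazier.py. It assumes the client IP is the first whitespace-separated field of each log line (as in the sample entry in this post), and count_googlebot_hits is a name made up for illustration.</p>

```python
from collections import Counter


def count_googlebot_hits(log_lines):
    """Count hits per client IP for lines whose text mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        fields = line.split()
        if fields:
            # Assumption: the client IP is the first field of the log line.
            hits[fields[0]] += 1
    # List of (ip, hit_count) pairs, busiest IP first.
    return hits.most_common()


sample = [
    '66.249.71.202 - - [28/Mar/2009:08:59:14 -0500] GET / HTTP/1.1 "200" 17892 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"',
]
print(count_googlebot_hits(sample))
```

<p>For a real log, pass the open file object instead of a list, since iterating over a file yields one line at a time.</p>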
<p>The logic is simple. The IP from which the request is coming in should point to a *.googlebot.com hostname, and in turn that hostname should resolve back to the same IP. The first part can be faked by a smart faker who controls the reverse DNS for their own IP, but the second part cannot (unless they break into Google&#8217;s DNS servers <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> ). This two-step validation is a sure-shot method.</p>
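<p>The two-step check described above can be sketched with Python's standard socket module. This is a minimal illustration, not the code from logazier.py: the function names are made up, and the lookup functions are passed in as parameters only so the check can be tried out without touching real DNS.</p>

```python
import socket


def looks_like_googlebot_host(hostname):
    """True if the hostname is under googlebot.com (trailing dot ignored)."""
    return hostname.rstrip(".").lower().endswith(".googlebot.com")


def verify_googlebot(ip,
                     reverse_lookup=socket.gethostbyaddr,
                     forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, then forward-resolve the name back to the IP."""
    try:
        # Step 1: IP -> hostname (PTR record).
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no reverse DNS at all
    if not looks_like_googlebot_host(hostname):
        return False
    try:
        # Step 2: hostname -> list of IPs; the original IP must be among them.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses


if __name__ == "__main__":
    # Performs live DNS lookups; the result depends on current DNS records.
    print(verify_googlebot("66.249.71.202"))
```

<p>A faker who sets their own reverse DNS to something ending in googlebot.com passes step 1 but fails step 2, because the forward lookup of that name is answered by Google&#8217;s DNS and will not list the faker&#8217;s IP.</p>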
<p>For a Genuine Googlebot request :-</p>
<p>Server log entry :-<br />
<code><strong>66.249.71.202</strong> - - [28/Mar/2009:08:59:14 -0500] GET / HTTP/1.1 &quot;200&quot; 17892 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)&quot; &quot;-&quot;</code><br />
IP : 66.249.71.202</p>
<p>Thus :-<br />
<code># host 66.249.71.202<br />
202.71.249.66.in-addr.arpa domain name pointer crawl-66-249-71-202.<strong>googlebot.com</strong>.<br />
# host crawl-66-249-71-202.googlebot.com.<br />
crawl-66-249-71-202.googlebot.com has address <strong>66.249.71.202</strong><br />
#</code><br />
For now this script outputs the number of hits, the IP, the hostname, and the IP the hostname resolves to:<br />
<code># ./logazier.py<br />
92 - 99.190.96.157 - adsl-99-190-96-157.dsl.pltn13.sbcglobal.net - FAKE - 99.190.96.157<br />
36 - 24.154.150.217 - dynamic-acs-24-154-150-217.zoominternet.net - FAKE - 24.154.150.217<br />
4 - 83.82.191.185 - 5352BFB9.cable.casema.nl - FAKE - 83.82.191.185<br />
4 - 69.64.69.150 - 69-64-69-150.dedicated.abac.net - FAKE - 69.64.69.150<br />
3 - 64.191.54.85 - venus.surfwebhost.com - FAKE - 64.191.54.85<br />
3 - 117.47.205.13 - err - FAKE - err<br />
2 - 218.186.12.202 - cm202.omega12.maxonline.com.sg - FAKE - 218.186.12.202<br />
1 - 96.254.203.143 - pool-96-254-203-143.tampfl.fios.verizon.net - FAKE - 96.254.203.143<br />
1 - 76.160.175.238 - mail.appianllc.com - FAKE - 76.160.175.238<br />
1 - 121.246.166.247 - 121.246.166.247.static-hyd.vsnl.net.in - FAKE - err<br />
1 - 117.196.235.141 - err - FAKE - err</code></p>
<p><strong>The script can be downloaded at : <a href="http://www.sajalkayan.com/logazier/0.0.1/logazier.py">http://www.sajalkayan.com/logazier/0.0.1/logazier.py</a></strong></p>
<p>Upcoming features.</p>
<ol>
<li> Detect other major bots as well - yahoo, msn, alexa, etc&#8230;</li>
<li> Analyze the access.log for bad bot activity even when the bots use regular browser user agents - much more complex than I thought <img src='http://www.sajalkayan.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.sajalkayan.com/python-script-to-detect-bad-botspeople-faking-as-googlebot.html/feed</wfw:commentRss>
		</item>
	</channel>
</rss>
