By Scott M. Fulton, III, Betanews
The Betanews test suite for Windows-based Web browsers is a set of tools for measuring the performance, compliance, and scalability of the processing component of browsers, particularly their JavaScript engines and CSS renderers. Our suite does not test the act of loading pages over the Internet, or anything else that is directly dependent on the speed of the network.
But what is it measuring, really? The suite is measuring the browser's capability to perform instructions and produce results. In the early days of microcomputing, computers (before we called them PCs) came with interpreters that processed instructions and produced results. Today, browsers are the virtual equivalent of Apple IIs and TRS-80s -- they process instructions, and produce results. Many folks think they're just using browsers to view blog pages and check the scores. And then I catch them watching Hulu or playing a game on Facebook or doing something silly on Miniclip, and surprise, they're not just reading the paper online anymore. More and more, a browser is a virtual computer.
So I test it like a computer, not like a football scoreboard. The final result is a raw number that, for the first time on Betanews, can be segmented into three categories: computational performance (raw number crunching), rendering performance, and scalability (defined momentarily). These categories are like three aspects of a computer; they are not all the aspects of a computer, but they are three important ones. They're useful because they help you to appreciate the different browsers for the quality of the work they are capable of providing.
Many folks tell me they don't do anything with their browsers that requires any significant number crunching. The most direct response I have to this is: Wrong. Web pages are sophisticated pieces of software, especially where cross-domain sharing is involved. They're not pre-printed and reproduced like something from a fax machine; more and more, they're adaptive composites of both textual and graphic components from a multitude of sources that come together dynamically. The degree of math required to make everything flow and align just right is severely under-appreciated.
Why is everything relative?
By showing relative performance, the aim of the RPI index is to give you a simple-to-understand gauge of how two or more browsers compare to one another on any machine you install them on. That's why we don't render results in milliseconds. Instead, all of our results are multiples of the performance of a relatively slow Web browser that is not a recent edition: Microsoft Internet Explorer 7, running on Windows Vista SP2 (the slower of the three most recent Windows versions), on the same machine (more about the machine itself later).
I'm often asked, why don't I show results relative to IE8, a more recent version? The answer is simple: It wouldn't be fair. Many users prefer IE8 and deserve a real measurement of its performance. When Internet Explorer 9 debuts, I will switch the index to reflect a multiple of IE8 performance. In years to come, however, I predict it will be more difficult to find a consistently slow browser on which to base our index. I don't think it's fair for anyone, including us, to presume IE will always be the slowest brand in the pack.
Browsers will continue to evolve; and as they do, certain elements of our test suite may become antiquated, or unnecessary. But that time hasn't come yet. JavaScript interpreters are, for now, inherently single-threaded. So the notion that some interpreters may be better suited to multi-core processors has not been borne out by our investigation, nor by the information we've been given by browser manufacturers. Also, although basic Web standards compliance continues to be important (our tests impose a rendering penalty for not following standards, where applicable), HTML 5 compliance is not a reality for any single browser brand. Everyone, including Microsoft, wants to be the quintessential HTML 5 browser, but the final working draft of the standard was only published by W3C weeks ago.
There will be a time when HTML 5 compliance should be mandatory, and we'll adapt when the time comes.
What's this "scalability" thing?
In speaking recently with browser makers -- especially with the leader of the IE team at Microsoft, Dean Hachamovitch -- I came to agree with a very fair argument about our existing test suite: It failed to account for performance under stress, specifically with regard to each browser's capability to crunch harder tasks and process them faster when necessary.
A simple machine slows down proportionately as its workload increases linearly. A well-programmed interpreter is capable of literally pre-digesting the tasks in front of it, so that at the very least it doesn't slow down as much -- if planned well, it can accelerate. Typical benchmarks have the browser perform a number of tasks repetitively. But because JavaScript interpreters are (for now) single-threaded, when scripts are written to interrupt a sequence of reiterative instructions every second or so -- for instance, to update the elapsed time reading, or to spin some icon to remind the user the script is running -- that very act works against the interpreter's efficiency. Specifically, it inserts events into the stream that disrupt the flow of the test, and work against the ability of the interpreter to break it down further. As I discovered when bringing together some tried and true algorithms for our new test suite, inserting one-second or even ten-second timeouts messes with the results.
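To see why, consider the two styles side by side. The sketch below is our own illustration, not code from any of the suites; the chunk size and the one-second delay are arbitrary, and the "progress readout" is simply the document title.

```javascript
// Our own sketch: runAllAtOnce hands the interpreter one long, traceable loop;
// runInChunks breaks the same work up with setTimeout so the page can show
// progress, which inserts events into the stream the tracer must work around.
function runAllAtOnce(total, task) {
    for (var i = 0; i < total; i++) { task(i); }
}

function runInChunks(total, chunkSize, task, onDone) {
    var i = 0;
    function nextChunk() {
        var end = Math.min(i + chunkSize, total);
        for (; i < end; i++) { task(i); }
        document.title = i + " of " + total;   // the progress readout itself
        if (i < total) {
            setTimeout(nextChunk, 1000);       // the one-second timeout that skews results
        } else if (onDone) {
            onDone();
        }
    }
    nextChunk();
}
```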
Up to now, the third-party test suites we've been using (SunSpider, SlickSpeed, JSBenchmark) have performed short bursts of instructions and measured the throughput within given intervals. These are fair tests, but only under narrow circumstances. With the tracing abilities of the JavaScript interpreters used in Firefox, Opera, and the WebKit-based browsers, the way huge sequences of reiterative processes are digested is different from the way short bursts are digested. I discovered this, with the help of engineers from Opera, when debugging the old test battery I've since thrown out of our suite: A smart JIT compiler can effectively dissolve a thousand or a million or ten million instructions that effectively do nothing, into a few void instructions that accomplish the same nothing. So if a test pops up a "0 ms" result, and you increase the workload by a factor of, say, a million, and it still pops up "0 ms"...that's not an error. That's the sign of a smart JIT compiler.
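Here's a minimal illustration of that effect -- our own sketch, not part of DFScale or any third-party battery -- showing a loop with no observable side effects, the kind a tracing JIT is free to collapse:

```javascript
// A do-nothing loop: the body has no observable effect, so a smart compiler
// may eliminate it entirely rather than run ten million iterations.
function doNothingLoop(iterations) {
    var x = 0;
    for (var i = 0; i < iterations; i++) {
        x = x + 1 - 1;   // no side effect; eligible for dead-code elimination
    }
    return x;            // always 0, computable without running the loop
}

var start = new Date().getTime();
doNothingLoop(10000000);                      // ten million iterations...
var elapsed = new Date().getTime() - start;
// ...and on a browser with a smart JIT, elapsed may still read "0 ms".
```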
The proof of this premise also shows up when you vary the workload of a test that does a little something more than nothing: Browsers like Chrome accelerate the throughput when the workload is greater. So if their interpreters "knew" from the start that the workload was great -- rather than being fed short bursts or infinite loops broken by timeouts, whose depth perhaps cannot be traced in advance -- they might actually present better results.
That's why I've decided to make scalability count for one-third of our overall score: Perhaps an interpreter that's not the fastest to begin with is still capable of finding that extra gear when the workload gets tougher; and perhaps we could appreciate that fact if we only took the time to do so.
The newest member of the suite
Decades ago, back when I tested the efficiency of BASIC and C compilers, I published the results under my old pseudonym, "D. F. Scott." In honor of my former "me," I named our latest benchmark test battery "DFScale."
It has ten components, two of which are broken into two sub-components, for a total of 12 tests. Each component renders results on a grid divided by total iteration counts: 20, 22, 24, 30, 100, 250, 500, 1,000, 10,000, 25,000, 50,000, 100,000, 250,000, 500,000, 1,000,000, and 10,000,000. Not every component can go that high. By that, I mean that at high levels, the browser can crash, or hang, or even generate stack overflow errors. Until I am satisfied that these errors are not exploitable for malicious purposes, I will not release the DFScale suite to the public.
On the low end of the scale, I scale some components up where the iteration numbers would otherwise be too low to support meaningful judgments. Conversely, for a test whose difficulty scales exponentially -- and rapidly -- like the Fibonacci sequence, I stick to the low end of the scale.
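For the curious, here's a rough sketch -- ours, not the DFScale source -- of how a single component sweeps its grid; the grid shown is abbreviated:

```javascript
// Each component runs the same task once per iteration count and is timed,
// producing the raw numbers that feed the speed and scalability calculations.
var grid = [1000, 10000, 100000, 1000000];   // abbreviated grid for illustration

function sweepGrid(task) {
    var results = [];
    for (var g = 0; g < grid.length; g++) {
        var start = new Date().getTime();
        task(grid[g]);
        results.push({ iterations: grid[g], elapsedMs: new Date().getTime() - start });
    }
    return results;
}
```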
Some tests involve the use of arrays of random integers, whose size and value range both correspond to the grid numbers. So an array of 10,000 units, for example, would contain random values from 1 to 10,000. Yes, we use the fairer random number generator and value shuffler used by Microsoft to correct its browser choice screen, in an incident brought to the public's attention in early March by IBM's Rob Weir. In fact, the shuffler is one of our tests; and in honor of Weir, we named the test the "Rob Weir Shuffle."
To ensure fairness, we randomize and shuffle the arrays that components will use (except for the separate test of the "Weir Shuffle") at the start of runtime. Then components will sort through copies of the same shuffled arrays, rather than regenerate new ones.
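For readers who want to see what the fuss was about, here's a sketch of both approaches -- our own illustration, not the DFScale code itself: the biased comparator trick Weir criticized, and the unbiased Fisher-Yates shuffle the fair method is modeled on.

```javascript
// The biased approach Weir criticized: sort() with a random comparator
// does not produce uniformly distributed orderings.
function biasedShuffle(arr) {
    return arr.sort(function () { return 0.5 - Math.random(); });
}

// The unbiased Fisher-Yates (Knuth) shuffle: walk the array backward,
// swapping each element with a randomly chosen earlier (or same) slot.
function fisherYatesShuffle(arr) {
    for (var i = arr.length - 1; i > 0; i--) {
        var j = Math.floor(Math.random() * (i + 1));
        var tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
    return arr;
}

// Example: an array of 1..10000, shuffled in place before the sort tests run.
var values = [];
for (var n = 1; n <= 10000; n++) { values.push(n); }
fisherYatesShuffle(values);
```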
Reiterative loops run uninterrupted on each browser, which is something that some browsers don't like. Internet Explorer stops "runaway" loops to ask the user if she wishes to continue; to keep this from happening, we tweaked the System Registry. Chrome puts up a warning giving the user the option to stop the loop, but that warning is on a separate thread which apparently does not slow down the loop.
Each component of DFScale produces a pair of scores, for speed and scalability, which we tally in Excel. The speed score is the average of the iterations per second for each grid number. The scalability score is derived from two elements: First, the raw times for each grid number are plotted on the y-axis, with the numbers of iterations on the x-axis. We then estimate a best-fit line, and calculate the slope of that line. The lower the slope, the higher the score. Secondly, we estimate a best-fit exponential curve, representing the amount of acceleration a JIT compiler can provide over time. Again, the lower the curve's coefficient, the flatter the curve is, and the higher the score.
Rather than attribute arbitrary numbers to that score, we compare the slope and coefficient values to those generated from a test of IE7 in Vista SP2. Otherwise, saying that browser A scored a 3 and browser B scored a 12 would be meaningless -- twelve of what? So the final score on a component is 50% slope, 50% curve, relative to IE7.
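As a rough illustration of the slope half of that calculation (again, the real tally happens in Excel), here's a sketch of an ordinary least-squares fit over one component's grid; the times below are invented purely for illustration:

```javascript
// A rough stand-in for the Excel step: ordinary least-squares slope of
// elapsed time (y, in ms) against workload size (x, in iterations).
// A flatter slope means the browser scales better.
function bestFitSlope(xs, ys) {
    var n = xs.length, sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    for (var i = 0; i < n; i++) {
        sumX  += xs[i];
        sumY  += ys[i];
        sumXY += xs[i] * ys[i];
        sumXX += xs[i] * xs[i];
    }
    return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
}

// Hypothetical grid numbers and raw times for one component:
var iterations = [1000, 10000, 100000, 1000000];
var rawTimesMs = [2, 15, 140, 1350];
var slope = bestFitSlope(iterations, rawTimesMs);
// The component's scalability score compares this slope (and the exponential
// curve's coefficient) against the same figures measured for IE7.
```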
The ten tests in the (current) DFScale suite are as follows:
- The "Rob Weir Shuffle" - or rather, the fairer array shuffling algorithm made popular by Weir's disassembly of the standard random number generator in IE.
- The Sieve of Eratosthenes (which I used so often during the '80s that several readers suggested it be officially renamed "The D. F. Sieve"). It's the algorithm used to identify prime numbers in a long sequence; and here, the sequence lengths are determined by the grid numbers.
- The Fibonacci sequence, which is an algorithm that produces integers that are each the sum of the previous two. Of the many ways of expressing this algorithm, we picked two with somewhat different construction. The first compresses the entire comparison operation into a single instruction, using an alternative syntax first introduced in C. The second breaks down the operation into multiple instructions -- and because it's longer, you'd think it would be harder to run. It's not, because the alternative syntax ends up being more difficult for JIT compilers to digest, which is why the longer form has been dubbed the "Fast Fib."
- The native JavaScript sort() method, which isn't really a JavaScript algorithm at all, but a basic test of each browser's ability to sort an array using native code. In the real world, the sort() method is the typical way developers will handle large arrays; they don't include their own QuickSort or Heap Sort scripts. The reason we test these other sorting algorithms (to be explained further) is to examine the different ways JavaScript interpreters handle and break down reiterative tasks. The native method can conceivably act as a benchmark for relative efficiency; and amazingly, external scripts can be faster than the native method, especially in Google Chrome.
- Radix Sort is an algorithm that simulates the actions of an old IBM punch-card sorter, which sorted a stack of cards based on their 0 - 9 digits at various locations. No one would dare use this algorithm today for real sorting, but it does test each browser's ability to scale up a fixed and predictable set of rules, when the complexity of the problem scales exponentially. Nearly all browsers are somewhat slower with the Radix at larger values, with IE8 dramatically slower.
- QuickSort is the time-tested "divide-and-conquer" approach to sorting lists or arrays, and it often remains the algorithm of choice for large indexes. It picks a random array value, called a "pivot point," and then shuffles the others to one side or the other based on their relative value. The result is groups of lower and higher values that are separated from one another, and can in turn be sorted as smaller groups.
- Merge Sort is a more complex approach to divide-and-conquer that is often used for sorting objects with properties rather than just arrays. It efficiently divides a complex array into smaller arrays within a binary tree, sorts the smaller ones, and then integrates the results with bigger ones as it goes along. It's also very memory intensive, because it breaks up the problem all over the place.
- Bubble Sort is the quintessential test of the simplest, if least efficient, methodology for attacking a problem: It compares two adjacent values in the array repetitively, and keeps moving the highest one further forward...until it can't do that anymore, at which point, it declares the array sorted. As a program, it should break down extremely simply, which should give interpreters an excuse to find that extra gear...if they have one. If they don't, the sort times will scale exponentially as the problem size scales linearly.
- Heap Sort has been said to be the most stable performer of all the sort algorithms. Essentially, its task is to weed out high values from a shrinking array; the weeds are piled atop a heap, and the pile forms a sorted array. Naturally, this is the one that gives our browsers the most fits -- the stack overflow errors happen here. Safari handles the error condition by simply refusing to crash, but ending the loop -- which speaks well for its security. It cannot perform this test at ten million iterations. Other browsers...can force a reboot if the problem size is too high.
- Project Euler's Problem #14 is a wonderful discovery I made while searching for a test that was similar in some ways to, but different in others from, the Fibonacci sequence. The page where I discovered it grabbed my attention for its headline, "Yet Another Meaningless JavaScript Benchmark," which somehow touched my soul. The challenge for this problem is to find a way of demonstrating the "unproven conjecture" (the Collatz conjecture) that a particular chain of numbers, starting with any given integer value and following only two rules, will eventually pare down to 1. The rules are: 1) if the value is even, the next value in the chain is the current one divided by 2; 2) if the value is odd, the next one is equal to three times the current one, plus 1. Our test runs the conjecture with the first 1,000, 10,000, 25,000 integers, and so on. The problem should scale almost linearly, giving interpreters the opportunity to push the envelope and accelerate. In our first test of the IE9 Tech Preview, it appeared to be accelerating very well, running the first 250,000 integer sequence in 0.794 seconds versus 10.9 seconds for IE8. Alas, IE9 is not yet stable enough to run the test with higher values.
Problem 14 is also interesting for us because we test it two ways, suggested by blogger and developer Motti Kanzkron: He realized that at some point, future sequences would always touch upon a digit used in previous sequences, whose successive values in the chain would always be the same. So there's no need to complete the rest of the sequence if it's already been done once; his optimization borrows memory to tack old sequences onto new ones and move forward. We compare the brute-force "naïve" version to the smart "optimized" version, to judge the ability of a JIT compiler to optimize code compared to an optimization that would be more obvious to a programmer.
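Here's a sketch of the two forms -- our own condensed illustration of the approach, not the DFScale source:

```javascript
// Naive form: follow every chain all the way down to 1.
function collatzLengthNaive(n) {
    var length = 1;
    while (n !== 1) {
        n = (n % 2 === 0) ? n / 2 : 3 * n + 1;
        length++;
    }
    return length;
}

// Memoized form (Kanzkron's idea): once a chain touches a value we've
// already resolved, reuse its stored length instead of recomputing it.
var chainCache = { 1: 1 };
function collatzLengthMemoized(n) {
    if (chainCache[n]) { return chainCache[n]; }
    var next = (n % 2 === 0) ? n / 2 : 3 * n + 1;
    var length = 1 + collatzLengthMemoized(next);
    chainCache[n] = length;
    return length;
}

// Run the conjecture over the first 25,000 integers, as one grid step does.
function longestChainUpTo(limit, lengthFn) {
    var best = 0;
    for (var i = 1; i <= limit; i++) {
        var len = lengthFn(i);
        if (len > best) { best = len; }
    }
    return best;
}
longestChainUpTo(25000, collatzLengthNaive);
longestChainUpTo(25000, collatzLengthMemoized);
```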
The other third-party test batteries
Since we started testing browsers in early 2009, Betanews has maintained one very important methodology: We take a slow Web browser that you might not be using much anymore, and we pick on its sorry self as our test subject. We base our index on the assessed speed of Microsoft Internet Explorer 7 on Windows Vista SP2 -- the slowest browser still in common use. For every test in the suite, we give IE7 a 1.0 score. Then we combine the test scores to derive an RPI index number that, in our estimate, best represents the relative performance of each browser compared to IE7. So for example, if a browser gets a score of 6.5, we believe that once you take every important factor into account, that browser provides 650% of the performance of IE7.
We believe that "performance" means doing the complete job of providing rendering and functionality the way you expect, and the way Web developers expect. So we combine computational efficiency, rendering efficiency (coupled with standards compliance tests), and scalability. This way, a browser with a 6.5 score can be thought of as doing the job more than five times faster and better.
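To make the arithmetic concrete, here's the relative-score idea in miniature; the timings below are invented for illustration only:

```javascript
// IE7's elapsed time on a test is the 1.0 baseline, and every other browser
// is expressed as a multiple of it. These numbers are made up to show the math.
function relativeScore(ie7TimeMs, browserTimeMs) {
    return ie7TimeMs / browserTimeMs;    // higher means faster than IE7
}

var ie7Time = 5200;      // hypothetical elapsed time for IE7 on one test
var otherTime = 800;     // hypothetical elapsed time for another browser
relativeScore(ie7Time, otherTime);       // 6.5 -- that is, 650% of IE7's performance
```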
Here now are the other third-party batteries we use for our Browser Test Suite 3.0, and how we've modified them where necessary to suit our purposes:
- Nontroppo CSS rendering test. Up until recently, we were using a modified version of a rendering test used by HowToCreate.co.uk, whose two purposes have been to time how long it takes to re-render the contents of multiple arrays of <DIV> elements and to time the loading of the page that includes those elements. The reason we modified this page is that the JavaScript onLoad event fires at different times for different browsers -- despite its documented purpose, it doesn't necessarily mean the page is "loaded." There's a real-world reason for these variations: In Apple Safari, for instance, some page contents can be styled the moment they're available, but before the complete page is rendered, so firing the event early enables the browser to do its job faster -- in other words, Apple doesn't just do this to cheat. But the actual creators of the test, at nontroppo.org, did a better job of compensating for the variations than we did: Specifically, the new version now tests to see when the browser is capable of accessing that first <DIV> element, even if (and especially when) the page is still loading.
Here's how we developed our new score for this test battery: There are three loading events: one for Document Object Model (DOM) availability, one for first element access, and one for the conventional onLoad event. We counted DOM load as one sixth, first access as two sixths, and onLoad as three sixths of the rendering score. Then we adjusted the re-rendering part of the test so that it iterates 50 times instead of just five. This is because some browsers do not count milliseconds properly on some platforms -- which is why Opera mysteriously mis-reported its own speed in Windows XP as slower than it was. (Opera users everywhere...you were right, and we thank you for your persistence.) By running the test for 10 iterations per loop over five loops, we can get a more accurate estimate of the average time for each iteration, because the millisecond timer will have updated correctly. The element loading and re-rendering scores are averaged together for a new and revised cumulative score -- one which readers will discover is much fairer to both Opera and Safari than our previous version.
- Celtic Kane JSBenchmark. The new JSBenchmark from Sean P. Kane is a modern version of the classic math tests first made popular, if you can believe it, by folks like myself who tested compilers for computer magazines. QuickSort is covered here too, and in this case, JSBenchmark renders relative throughput during a given interval. There are other problems too, including one called the "Genetic Salesman," which finds the shortest route through a geometrically complex maze. It's good to see a modern take on my old favorites. Rather than run a fixed number of iterations and time the result, JSBenchmark runs an undetermined number of iterations within a fixed period of time, and produces indexes that represent the relative efficiency of each algorithm during that set period -- higher numbers are better.
- SunSpider JavaScript benchmark. Maybe the most respected general benchmark suite in the field, SunSpider focuses on computational JavaScript performance rather than rendering -- the raw ability of the browser's underlying JavaScript engine. It comes from the folks who produce the WebKit open source rendering engine, which currently has closer ties with Safari, but we've found SunSpider's results to appear fair and realistic, and not weighted toward WebKit-based browsers. There are nine categories of real-world computational tests (3D geometry, memory access, bitwise operations, complex program control flow, cryptography, date objects, math objects, regular expressions, and string manipulation). Each test in this battery is much more complex, and more in tune with real functions that Web browsers would perform every day, than the more generalized, classic approach now adopted by JSBenchmark. All nine categories are scored and averaged relative to IE7 in Vista SP2.
- Mozilla 3D cube by Simon Speich, also known as Testcube 3D, is an unusual discovery from an unusual source: an independent Swiss developer who devised a simple and quick test of DHTML 3D rendering while researching the origins of a bug in Firefox. That bug has been addressed already, but the test fulfills a useful function for us: It tests only graphical dynamic HTML rendering -- which is finally becoming more important thanks to more capable JavaScript engines. And it's not weighted toward Mozilla -- it's a fair test of anyone's DHTML capabilities.
There are two simple heats whose purpose is to draw an ordinary wireframe cube and rotate it in space, accounting for forward-facing surfaces. Each heat produces a set of five results: total elapsed time, the amount of that time spent actually rendering the cube, the average time each loop takes during rendering, and the elapsed times in milliseconds of the fastest and slowest loops. We average those last two to obtain a single figure, which is compared, along with the other three times, against scores in IE7 to yield a comparative index score. We also now extrapolate a scalability score, which compares the results from the larger cube to the smaller one to see whether the interpreter accelerated, and by how much.
- SlickSpeed CSS selectors test suite. As JavaScript developers know, there are a multitude of third-party libraries, in addition to the browser's native JS library, that enable browsers to access elements of a very detailed and intricate page (among other things). For our purposes, we've chosen a modified version of SlickSpeed by Llama Lab, which covers many more third-party libraries, including Llama's own. This version tests no fewer than 56 shorthand methods that are supposed to be commonly supported by all JavaScript libraries, for accessing certain page elements. These methods are called CSS selectors (one of the tested libraries, called Spry, is supported by Adobe).
So Llama's version of the SlickSpeed battery tests 56 selectors from 10 libraries, including each browser's native JavaScript (which should follow prescribed Web standards). Multiple iterations of each selector are tested, and the final elapsed times are rendered. Here's the controversial part: Some have said the final times are meaningless because not every selector is supported by each browser; although SlickSpeed marks each selector that generates an error in bold black, the elapsed time for an error is usually only 1 ms, while a non-error can be as high as 1,000 ms. We compensate for this by creating a scoring system that penalizes each error by 1/56 of the total, so only the good selectors are scored and the rest "get zeroes."
Here's where things get hairy: As some developers already know, IE7 got all zeroes for native JavaScript selectors. It's impossible to compare a good score against no score, so to fill the hole, we use the geometric mean of IE7's positive scores with all the other libraries as the base number against which to compare the native JavaScript scores of the other browsers, including IE8. The times for each library are compared against IE7, with penalties assessed for each error (Firefox, for example, can generate 42 errors out of 560, for a penalty of 7.5%). Then we assess the geometric mean, not the average, of each battery -- we do this because we're comparing the same functions for each library, not different categories of functions as with the other suites. Geometric means account better for fluctuations and anomalies.
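Sketched in code, those two scoring ideas look something like this -- our own illustration with invented names, not the spreadsheet we actually use:

```javascript
// A geometric mean over the selectors that succeeded, with a penalty of
// 1/56 of the total per failed selector. Only successful selectors are
// passed in; errors "get zeroes" and are handled by the penalty instead.
function geometricMean(values) {
    var logSum = 0;
    for (var i = 0; i < values.length; i++) {
        logSum += Math.log(values[i]);
    }
    return Math.exp(logSum / values.length);
}

function libraryScore(goodSelectorScores, errorCount, totalSelectors) {
    // e.g., 42 errors out of 560 selector runs works out to a 7.5% penalty
    var penalty = 1 - (errorCount / totalSelectors);
    return geometricMean(goodSelectorScores) * penalty;
}
```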
Table rendering and standards compliance
- Nontroppo table rendering test. As has already been proven in the field, CSS is the better platform for rendering complex pages using magazine-style layout. Still, a great many of the world's Web pages continue to use HTML's old <TABLE> element (created to render data in formal tables) for dividing pages into grids. We heard from you that if IE7 is still important (it is our index browser after all), old-style table rendering should still be tested. And we concur.
The creator of our CSS rendering test has created a similar platform for testing not only how long it takes a browser to render a huge table, but how soon the individual cells (<TD> elements) of that table are available for manipulation. When the test starts, it times how long the browser takes to begin rendering the table and how long to finish that rendering, both from the same starting mark, for two index scores. It also times the loading of the page, for a third index score. Then we have it re-render the contents of the table five times, and average the time elapsed for each pass, for a fourth score. The four items are then averaged together for a cumulative score.
- Nontroppo standard browser load test. (That Nontroppo gets around, eh?) This may very well be the most generally boring test of the suite: It's an extremely ordinary page with ordinary illustrations, followed by a block full of nested <DIV> elements. But it allows us to take away all the variable elements and concentrate on straight rendering and relative load times, especially when we launch the page locally. It produces document load time, document plus image load times, DOM load times, and first access times, all of which are compared to IE7 and averaged.
- Canvas rendering test. The canvas object in JavaScript is a local memory segment where client-side instructions can plot complex geometry or even render detailed, animated text, all without communicating with the server. The Web page contains all the instructions the object needs; the browser downloads them, and the contents are plotted locally. We discovered on the blog of Web developer Ernest Delgado a personal test originally meant to demonstrate how much faster the Canvas object was than using Vector Markup Language in Internet Explorer, or Scalable Vector Graphics in Firefox. We'd make use of the VML and SVG test ourselves if Apple's Safari -- in the interest of making things faster -- hadn't implemented a system that replaces them with Canvas by default. (To embrace HTML 5, newer browsers will have to incorporate SVG; and IE9 is finally taking big steps in that direction.) We use Delgado's rendering test to grab two sets of plot points from Yahoo's database -- one outlining the 48 contiguous United States, and one set outlining Alaska complete with all its islands. Those plot points are rendered on top of Google Maps projections of the mainland US and Alaska at equal scale, and both renderings are timed separately. Those times are compared against IE7, and the two results are averaged with one another for a final score.
- Acid3 standards compliance test. To the extent that browsers don't follow the most commonly accepted Web standards, we penalize them in the rendering category. Think of it like a Winter biathlon where, for every target the skier fails to shoot, he has to ski a mini-lap. The function of the Web Standards Project's Acid3 test has changed dramatically, especially as most of our browsers become fully compliant. IE7 only scored 12% on the Acid3, and IE8 scored 20%; but today, most of the alternative browsers are at 100% compliance, with Firefox at 93% and its 3.7 Alpha 1 scoring 96%. So it means less now than it did in earlier months for Acid3 to yield an index score of 8.33 -- the score any browser earning 100% receives relative to IE7's 12%. Now that cumulative index scores are closer to 20, having an eight-and-a-third in the mix has become a deadweight rather than a reward.
So now, for the other batteries that have to do with rendering (all three Nontroppos and TestCube 3D), plus the native JavaScript library portion of the SlickSpeed test, we're multiplying the index score by the Acid3 percentage. As a result, the amount of any non-compliance with the Web Standards Project's assessment is applied as a penalty against those rendering scores. Today, only Mozilla and Microsoft browsers are affected by this penalty, and Firefox only slightly -- all the others are unaffected.
- The CSS3.info compliance test. Microsoft asked us, as long as we're calling Acid3 a litmus test for standards, why don't we also pay attention to the CSS3.info test for CSS compliance. It was a fair question; we checked into it and couldn't find anything against it. There are 578 standard selectors in the CSS3 test, and the fraction of them that each browser supports constitutes its percentage. That percentage is then combined with the Acid3 percentage to yield a total rendering penalty. (Yes, Microsoft argued its way out of a few points: IE8 scores 20% on the Acid3, but 60.38% on the CSS3.info, for a 40% allowable percentage instead of 20%.)
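The combined penalty works out to simple arithmetic -- the 20% and 60.38% figures above combine to roughly 40%, which reads as an average of the two percentages -- sketched here for illustration:

```javascript
// A sketch of the combined compliance multiplier: averaging the Acid3 and
// CSS3.info percentages reproduces the roughly 40% figure cited above for IE8.
// The multiplier is then applied to the affected rendering index scores.
function complianceMultiplier(acid3Pct, css3Pct) {
    return (acid3Pct + css3Pct) / 2;
}

var ie8Multiplier = complianceMultiplier(0.20, 0.6038);   // roughly 0.40
// A hypothetical rendering index of 1.5 would then count as 1.5 * 0.40 = 0.60.
```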
Our physical test platform, and why it doesn't matter
The physical test platform we've chosen for browser testing is a triple-boot system, which enables us to boot different Windows platforms from the same large hard drive. For the Comprehensive Browser Test (CRPI), our platforms are Windows XP Professional SP3, Windows Vista Ultimate SP2, and Windows 7 RTM. For the standard RPI test, we use just Windows 7.
All platforms are always brought up to date using the latest Windows updates from Microsoft prior to testing. We realize, as some have told us, this could alter the speed of the results obtained. However, we expect real-world users to be making the same changes, rather than continuing to use unpatched and outdated software. Certainly the whole point of testing Web browsers on a continual basis is that folks want to know how Web browsers are evolving, and to what degree, on as close to real-time a scale as possible. When we update Vista, we re-test IE7 on that platform to ensure that all index scores are relative to the most recent available performance, even for that aging browser on that old platform.
Our physical test system is an Intel Core 2 Quad Q6600-based computer using a Gigabyte GA-965P-DS3 motherboard, an Nvidia 8600 GTS-series video card, 3 GB of DDR2 DRAM, and a 640 GB Seagate Barracuda 7200.11 hard drive, among other components. The Windows XP SP3, Vista SP2, and Windows 7 partitions all reside on this drive. Since May 2009, we've been using a physical platform for browser testing, replacing the virtual test platforms we had been using up to that time. Although there are a few more steps required to manage testing on a physical platform, you've told us you believe the results of physical tests will be more reliable and accurate.
But the fact that we perform all of our tests on one machine, and render their results as relative speeds, means that the physical platform is actually immaterial here. We could have chosen a faster or slower computer (or, frankly, a virtual machine) and you could run this entire battery of tests on whatever computer you happen to own. You'd get the same numbers because our indexes are all about how much faster x is than y, not how much actual time elapsed.
The speed of our underlying network is also not a factor here, since all of our tests contain code that is executed locally, even if it's delivered by way of a server. The download process is not timed, only the execution.
Why don't we care about download speeds, especially how long it takes to load certain pages? We do, but we're still in search of a scientifically reliable method to test download efficiency. Web pages change by the second, so any test that measures the time a handful of browsers consumes to download content from any given set of URLs is almost pointless today. And the speed of the network can vary greatly, so a reliable final score would have to factor out the speed at the time of each iteration. That's a cumbersome approach, and that's why we haven't embarked on it yet.
There are three major benchmark suites that we have evaluated and re-evaluated, and, with respect to their authors, have chosen not to use. Dromaeo comes from a Firefox contributor whom we respect greatly, named John Resig. We appreciate Resig's hard work, but we don't yet feel his result numbers correspond to the differences in performance that we see with our own eyes, or that we can time with a stopwatch. The browsers just aren't that close together. Meanwhile, we've currently rejected Google's V8 suite -- built to test its V8 JavaScript engine -- for the opposite reason: Yes, we know Chrome is more capable than IE. But 230 times more capable? No. That's overkill. There's a huge exponential curve there that's not being accounted for, and once it is, we'll reconsider it.
We've also been asked to evaluate Futuremark's Peacekeeper. I'm very familiar with Futuremark's tests from my days at Tom's Hardware. Though it's obvious to me that there's a lot going on in each of the batteries of the Peacekeeper suite, it doesn't help much that the final result is rendered only as a single tick-mark. And while that may sound hypocritical coming from a guy who's pushing a single performance index, the point is, for us to make sense of it, we need to be able to see into it -- how did that number get that high or that low? If Futuremark would just break down the results into components, we could compare each of those components against IE7 and the modern browsers, and we could determine where each browser's strengths and weaknesses lie. Then we could tally an index based on those strengths and weaknesses, rather than an artificial sum of all values that blurs all those distinctions.
This is as complete and as thorough an explanation of our testing procedure as you're ever going to get.
Copyright Betanews, Inc. 2010