News
 
Unicode nearing 50% of the web
2010-02-01 15:52
Administrator

About 18 months ago, we published a graph showing that Unicode had just overtaken all other text encodings on the web. The growth since then has been even more dramatic.

Web pages can use a variety of character encodings, such as ASCII, Latin-1, Windows-1252, or Unicode. Most encodings can only represent a few languages, but Unicode can represent thousands: from Arabic to Chinese to Zulu. We have long used Unicode as the internal format for all the text we search: any other encoding is first converted to Unicode for processing.
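To make the "convert everything to Unicode first" step concrete, here is a minimal Python sketch. It is not Google's pipeline: the sample bytes and the `to_unicode` helper are hypothetical, and it assumes the source encoding of each page is already known (real crawlers also have to detect it).

```python
# Hypothetical sample: the word "cafe" with an acute accent on the "e",
# stored as raw bytes in three different encodings.
SAMPLES = {
    "latin-1": b"caf\xe9",
    "cp1252": b"caf\xe9",      # Windows-1252
    "utf-8": b"caf\xc3\xa9",
}


def to_unicode(raw: bytes, encoding: str) -> str:
    """Decode bytes in a known source encoding into a Unicode string."""
    return raw.decode(encoding)


decoded = {encoding: to_unicode(raw, encoding) for encoding, raw in SAMPLES.items()}

# Once decoded, the three inputs are the same Unicode text, so later
# processing no longer needs to care which encoding the page used.
all_equal = len(set(decoded.values())) == 1
```

The byte sequences differ between encodings, but after decoding there is a single internal representation to index and compare.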

This graph is from Google internal data, based on our indexing of web pages, and thus may vary somewhat from what other search engines find. However, the trends are pretty clear, and the continued rise in use of Unicode makes it even easier to do the processing for the many languages that we cover.

Searching for "ﬁnancials"?

Unicode is growing both in usage and in character coverage. We recently upgraded to the latest version of Unicode, version 5.2 (via ICU and CLDR). This adds over 6,600 new characters: some of mostly academic interest, such as Egyptian Hieroglyphs, but many others for living languages.

We're constantly improving our handling of existing characters. For example, the letters "fi" can be represented either as two characters ("f" and "i") or as a single special display form, the ligature "ﬁ" (U+FB01). A Google search for [financials] or [office] used to not see these as equivalent: to the software, the ligature versions would just look like *nancials and of*ce. There are thousands of characters like this, and they occur in surprisingly many pages on the web, especially in generated PDF documents.

But no longer: after extensive testing, we recently turned on support for these and thousands of other characters, so your searches will now also find these documents. These are further steps in our mission to organize the world's information and make it universally accessible and useful.
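The ligature problem described above can be illustrated with Unicode compatibility normalization. The sketch below uses Python's standard `unicodedata` module and NFKC normalization as a stand-in; the article does not say which mechanism Google's search actually uses, and the `fold_compatibility` and `matches` helpers are hypothetical names.

```python
import unicodedata


def fold_compatibility(text: str) -> str:
    """Replace compatibility characters (such as ligatures) with their
    ordinary equivalents using NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


def matches(query: str, document_word: str) -> bool:
    """Compare a query and a word from a document after folding both."""
    return fold_compatibility(query) == fold_compatibility(document_word)


# Words as they might appear in a generated PDF, using U+FB01
# (LATIN SMALL LIGATURE FI) instead of the two letters "f" and "i".
ligature_financials = "\ufb01nancials"
ligature_office = "of\ufb01ce"

# A plain string comparison treats the two spellings as different words.
naive_match = "financials" == ligature_financials

# After folding, the query and the document word compare equal.
folded_match = matches("financials", ligature_financials)
```

Folding both the query and the indexed text keeps the comparison symmetric, so it does not matter which side contains the ligature.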

And we're angling for a party when Unicode hits 50%!

Source: googleblog

Latest news
 
Weekly updates
2010-03-13

By closely tracking software updates released by different manufacturers, our resource makes your work easier. As usual, we are glad to present a list of drivers, firmware, BIOSes, benchmarks, and flash, tweaking, and information utilities for the past week.

All updated versions are collected in one list, grouped by equipment type to make the information easier to scan.

 
 
Safari 4.0.5
2010-03-13

Apple Safari is the most popular Mac OS X browser. With the release of its third version, Apple decided to extend it to Windows. As a result, every Safari update is released for both operating systems.

 
 
ASRock Announces UCC-Featuring 890GX Extreme3 Motherboard
2010-03-12

Do you want a free CPU upgrade? Of course you do! ASRock Inc. today launched the UCC (Unlock CPU Core) Series motherboard, which delivers added core processing power by enabling the ASRock-exclusive UCC function on AMD CPUs. Built around the exclusive ASRock UCC chip, the UCC function is designed to provide an extra performance boost on ASRock UCC Series motherboards. In conjunction with AMD's 8XX Series chipsets, the ASRock 890GX Extreme3, the first UCC Series motherboard, comes with the unique UCC feature and supports advanced rendering from HD 4290 graphics with DX10.1 technology as well as ATI Quad CrossFireX and 3-Way CrossFireX technologies.

 
 
MSI Announces GE700 Performance Notebook
2010-03-12

MSI's newest 17" gaming notebook, the GE700, has just made its debut. It comes equipped with Intel's latest Core i5 processor, an ATI Radeon HD 5730 discrete graphics card (with 1GB of DDR3 VRAM), two cinema-class speakers and a subwoofer, an HD webcam, and a dual hard disk architecture, so you can install two hard drives for maximum storage.

 
 
Creative Announces Sound Blaster X-Fi Titanium HD and USB Sound Blaster X-Fi HD Audio
2010-03-12

Creative Technology Ltd. today announced the PCI-E Creative Sound Blaster X-Fi Titanium HD and USB Creative Sound Blaster X-Fi HD, setting the gold standard for PC audio with the first discrete audio card and USB digital audio system to include THX TruStudio PC audio technology.

 