AlbertPacino
Explorer
As we've become completely immersed in the Internet era, Google has become a verb, librarians are increasingly lonely, and most of us have mastered the basics of Boolean logic--without even knowing exactly what it is. We've become a society of information managers, navigating huge amounts of data with ease and expertly tracking down obscure facts and figures.
But as far as we've come, all we've really done is become good at finding needles in haystacks. There's no sophistication, no wisdom involved, and it's largely because our search tools are pretty dumb.
Imagine you were suffering from a bad case of tennis elbow and wanted to find a doctor who could see you on Saturday.
A simple Google search for "doctors" would find some referral services, but it would also produce pages of doctor jokes and medical associations. More significantly, you'd miss all kinds of "physicians" and "therapists" who might be able to help, simply because you didn't choose that word. Search on "tennis elbow" and you're not going to find help for "athletic injuries." And searching for offices that are "open Saturdays" won't help you find the ones with "weekend hours."
To solve that problem, we need a search system that doesn't just process and parse our language, but understands it; programs that don't just match your search terms but intuitively recognize context to deliver what you're really looking for. Fortunately, engineers and researchers around the world are already at work to bring about this system, and they call it the semantic Web.
Conceived by Tim Berners-Lee, a computer scientist generally considered the father of the World Wide Web, the semantic Web isn't an entirely new network. It's a vision of a world where "tags," or pieces of code, are hidden inside Web pages to help computers understand meaning. Individual terms like "doctor" would be tagged with identifying code, allowing a program reading the document to refer back to a central dictionary and learn that a "doctor" is the same as a "physician."
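To make the "central dictionary" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the dictionary contents and the names SAME_AS, canonical and matches are hypothetical and do not come from any semantic Web standard.

```python
# Toy "central dictionary": maps surface terms to one shared concept.
# Every entry and name here is hypothetical, for illustration only.
SAME_AS = {
    "doctor": "physician",
    "doctors": "physician",
    "physician": "physician",
    "physicians": "physician",
}


def canonical(term):
    """Return the shared concept for a term, or the lowercased term itself."""
    key = term.lower()
    return SAME_AS.get(key, key)


def matches(query, page_terms):
    """True if any term tagged on the page means the same thing as the query."""
    return canonical(query) in {canonical(t) for t in page_terms}
```

With this lookup in place, a search for "doctors" would match a page tagged "Physician" even though the words differ, which is the gap a plain keyword match leaves open.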
But the semantic Web isn't just a fancy thesaurus. It also defines the relationships between words, allowing a program to understand that "price" is measured in "dollars," which can be converted into "yen," and that both of those words refer to different kinds of "money."
The resulting system bears the same relationship to today's Web as a pile of books does to a well-cataloged library. "The World Wide Web, as we know it today, is mostly unstructured content," says Burton Group analyst Peter O'Kelly. "The general idea is to infuse more meaning, try to provide more of a sense of structure about the world."
At its most basic level, a semantic Web would allow search engines to act more intelligently, making it easier to find specific things. That's good news for Web surfers, as well as for the companies that develop search engines, like Google, Yahoo! and Microsoft.
But semantic technology also holds great promise for all kinds of businesses. "We need to collect data to conduct business for all sorts of reasons," says Gartner Group analyst Alexander Linden. With data growth rates averaging between 20% and 30% annually, many businesses are drowning under the weight of their own files and devoting huge resources to processing and handling them. It's becoming increasingly important to automate the process so businesses don't have to keep throwing staff at the problem. "We need to describe the data better so machines can take over," he says. "We want to get the human out of the loop for obvious reasons--they cost money, and they make errors."
To some extent, businesses are doing this already. In financial services, companies are tagging financial data with a language called XBRL, which helps identify related items in different financial documents and allows computers to automatically generate complex financial reports.
"The field of financial reporting is prone to error, and it's costly," explains Linden. "You've got a lot of humans running around trying to get one piece of data, and interoperability is achieved through re-keying information or through copy and paste." But if documents are tagged with XBRL, computers can do the job automatically, even if the databases use different formats or terminology, reducing errors and saving money.
A more familiar example of tagging might be Froogle, Google's comparison shopping service. Retailers who want their Web sites to show up in Froogle searches have to update their product pages with hidden labels on things like price, name and manufacturer. Everyone uses the same tag for price, regardless of what they actually call it, so Google can easily collect product information from thousands of different stores, even if they're in different languages.
Froogle's product labels are written in a language called XML, which will also serve as the basic language of the semantic Web. It's already well developed and adopted, so the first steps toward realizing Berners-Lee's vision have already occurred.
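As a rough sketch of why shared labels help, the Python below reads two made-up store feeds that use the same XML tags and collects them into one list. The tag names (product, name, price, manufacturer), the currency attribute and the sample data are assumptions for illustration; they are not Froogle's actual feed format.

```python
import xml.etree.ElementTree as ET

# Two hypothetical store feeds. The stores describe products in different
# human languages but share the same tag names.
STORE_A = """<products>
  <product>
    <name>Tennis elbow brace</name>
    <price currency="USD">19.99</price>
    <manufacturer>Acme</manufacturer>
  </product>
</products>"""

STORE_B = """<products>
  <product>
    <name>Coudière de tennis</name>
    <price currency="EUR">17.50</price>
    <manufacturer>Exemple SA</manufacturer>
  </product>
</products>"""


def collect(feeds):
    """Gather every product from every feed into a list of plain dicts."""
    rows = []
    for feed in feeds:
        root = ET.fromstring(feed)
        for product in root.iter("product"):
            price = product.find("price")
            rows.append({
                "name": product.findtext("name"),
                "price": float(price.text),
                "currency": price.get("currency"),
                "manufacturer": product.findtext("manufacturer"),
            })
    return rows
```

Because both stores put the amount in a price tag, the collector needs no per-store rules, whatever language the product name is written in.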
The next step is to agree on how to define the relationships between words. Developers at organizations including the nonprofit World Wide Web Consortium are already working on a new language, called the Resource Description Framework, which will help computers understand that a "price" can be listed in "dollars" or "yen." After that, they'll need to invent yet another language to express logical concepts, and allow users to query semantically tagged data.
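RDF describes data as statements of the form subject, predicate, object. The Python sketch below imitates that shape with plain tuples and a simple wildcard query. It is a toy under stated assumptions, not RDF syntax or a real query language, and the predicate names listed_in and kind_of are invented for the example.

```python
# Hypothetical statements in (subject, predicate, object) form.
TRIPLES = [
    ("price", "listed_in", "dollars"),
    ("price", "listed_in", "yen"),
    ("dollars", "kind_of", "money"),
    ("yen", "kind_of", "money"),
]


def query(subject=None, predicate=None, obj=None):
    """Return every statement matching the pattern; None matches anything."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in TRIPLES
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]
```

Asking query("price", "listed_in") returns the statements for both dollars and yen, and query(predicate="kind_of", obj="money") finds everything the data calls a kind of money.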
Those could be significant hurdles. "It's hard enough for people to communicate with each other even when there's no technology involved," says Burton Group's O'Kelly. "It's not easy to create the body of information that these things work on." It will likely be years before there's any sort of useful consensus.
And even once the standards are agreed upon, many companies may find there's no need to use them; Froogle, for instance, has its own set of tags that work just fine, so why move to something else? "A lot of this stuff is something that's being done today, it's just not being done as a sort of galactic standard," says O'Kelly.
Companies might also find that going through all their existing data and inserting semantic tags is simply too daunting a job to tackle, even if it might save time and money somewhere down the line.
Nonetheless, it's not too early for businesses to start thinking about the semantic Web, or to begin experimenting with tagging their corporate documents and files. Companies that don't understand how best to handle the data that is their lifeblood risk slowly bleeding to death. "[Executives] have to say this is important," says Linden. "They have to understand it."
Source